Number portions generated for lists, numbering and bullets in Writer can now have their own formatting, and this formatting is preserved in ODT files as well.
First, thanks to Docmosis for funding this work by Collabora.
Motivation
Word and DOCX files support explicit character properties for the paragraph marker, and these are also used for the formatting of a number portion if the paragraph has one. This was already loaded from / saved to DOCX, but it was lost when saving to ODT.
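To make the distinction concrete, here is a minimal Python sketch that reads a hand-written WordprocessingML fragment. In DOCX markup, the paragraph marker's character properties live in `w:pPr/w:rPr`, while each run has its own `w:r/w:rPr`. The fragment and the helper names `marker_is_bold` and `bold_runs` are made up for illustration, and the check is simplified: it ignores the case where `<w:b>` is explicitly switched off with a `w:val` attribute.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written fragment (not taken from a real document):
# the run is plain, but the paragraph marker is bold.
FRAGMENT = f"""
<w:p xmlns:w="{W}">
  <w:pPr>
    <w:rPr><w:b/></w:rPr>
  </w:pPr>
  <w:r><w:t>Heading text</w:t></w:r>
</w:p>
""".strip()


def marker_is_bold(paragraph):
    """True if the paragraph marker's run properties contain <w:b/>.

    Simplification: a <w:b> with w:val set to false is not handled.
    """
    return paragraph.find("w:pPr/w:rPr/w:b", NS) is not None


def bold_runs(paragraph):
    """Return the text of every run that carries its own <w:b/>."""
    return [
        "".join(t.text or "" for t in run.findall("w:t", NS))
        for run in paragraph.findall("w:r", NS)
        if run.find("w:rPr/w:b", NS) is not None
    ]


paragraph = ET.fromstring(FRAGMENT)
print(marker_is_bold(paragraph))  # True
print(bold_runs(paragraph))       # []
```

In this fragment a number portion would be bold even though no character of the paragraph text is, which is the case that was lost on ODT export.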
Results so far
First, we got a bug document, where the reference rendering and our rendering differed:
In this case, part of the heading text was covered by a bookmark, so we first created multiple character ranges (outside the bookmark, inside the bookmark), then, as an optimization, merged them into a single formatted character range covering the entire paragraph. The resulting document model differed from the bookmark-free version, where the character formatting was set on the paragraph itself.
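The merge step can be pictured with a toy model. The sketch below is not Writer's code: `CharRange`, `merge_adjacent` and `covers_whole_paragraph` are hypothetical names, and formatting is reduced to a single bold flag. It only shows how ranges split by a bookmark can collapse into one range that spans the whole paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharRange:
    start: int   # index of the first character
    end: int     # exclusive end index
    bold: bool   # stand-in for the full set of character properties


def merge_adjacent(ranges):
    """Merge neighbouring ranges that carry identical formatting."""
    merged = []
    for r in sorted(ranges, key=lambda r: r.start):
        if merged and merged[-1].end == r.start and merged[-1].bold == r.bold:
            merged[-1] = CharRange(merged[-1].start, r.end, r.bold)
        else:
            merged.append(r)
    return merged


def covers_whole_paragraph(ranges, length):
    """True if a single range spans the paragraph from start to end."""
    return len(ranges) == 1 and ranges[0].start == 0 and ranges[0].end == length


text = "Heading with bookmark"
# Assume a bookmark over "with" (characters 8..12) split the bold
# formatting into three ranges: before, inside and after the bookmark.
ranges = [
    CharRange(0, 8, True),
    CharRange(8, 12, True),
    CharRange(12, len(text), True),
]
merged = merge_adjacent(ranges)
print(merged)
print(covers_whole_paragraph(merged, len(text)))  # True
```

After merging, the formatting sits on one character range rather than on the paragraph itself, so code that only inspects paragraph-level properties does not see it.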
This mismatch was fixed at render time and at DOCX export time by considering both full-paragraph character ranges and in-paragraph character properties. For a while, this looked like the entire story, since this now looks good in Writer:
A bit later another, related bug was discovered. Given a reference document:
Just opening this DOCX file in Writer, it looked like this:
Note how the first number portion turned bold! This was expected after the above layout change to consider full-paragraph formatted character ranges, but it also showed that Word can have one set of character formatting for the entire character range of a paragraph, and another for the paragraph marker.
To make matters worse, this second document showed that even the ODT export/import feature still had problems:
The fix to solve all of the above was to undo the previous render / DOCX export change, then teach the ODT export to explicitly save the paragraph marker formatting (as an empty span at the end of the text node) to ODT, and also to load it back.
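As a rough illustration of the "empty span at the end of the text node" idea, the Python sketch below looks for a trailing empty `text:span` in a paragraph and returns its style name. The fragment is hand-written based on the description above, not copied from a file saved by Writer. The style names `T1` and `T2` and the helper `marker_style` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SPAN = f"{{{TEXT_NS}}}span"
STYLE_NAME = f"{{{TEXT_NS}}}style-name"

# Hypothetical content.xml fragment: T1 formats the text,
# the trailing empty span (T2) stands for the paragraph marker.
FRAGMENT = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    '<text:span text:style-name="T1">First item</text:span>'
    '<text:span text:style-name="T2"/>'
    "</text:p>"
)


def marker_style(paragraph):
    """Return the style name of a trailing empty span, or None."""
    children = list(paragraph)
    if not children:
        return None
    last = children[-1]
    is_empty_span = last.tag == SPAN and not last.text and len(last) == 0
    has_trailing_text = bool(last.tail and last.tail.strip())
    if is_empty_span and not has_trailing_text:
        return last.get(STYLE_NAME)
    return None


paragraph = ET.fromstring(FRAGMENT)
print(marker_style(paragraph))  # T2
```

An empty span has no characters to format, so it can carry the marker's properties without affecting the visible text of the paragraph.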
This means that now Writer can render the second document correctly, without breaking the first document:
How is this implemented?
If you would like to know a bit more about how this works, continue reading... :-)
As usual, the high-level problem was addressed by a series of small changes:
- sw, numbering portion format: consider full-para char formats as well
- DOCX export, numbering portion format: consider full-para char formats as well
- sw, numbering portion format: ignore char formats covering the entire paragraph
- sw: ODT import/export of DOCX's paragraph marker formatting
- sw: fix ODT import of paragraph marker formatting
Want to start using this?
You can get a snapshot / demo of Collabora Office 22.05 and try it out yourself right now: try the unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice. The code is merged, so we expect all of this work to be available in TDF's next release too (7.6).