Improved interactivity for LOK clients in Writer's layout

Posted on: Tue 03 September 2024

Estimated read time: 3 minutes

Writer now has support for doing partial layout passes when LOK clients have pending events, which sometimes improves interactivity a lot.

This work is primarily for Collabora Online, but the feature is useful for any LOK clients.

Motivation¶

I recently worked with a document that has relatively simple structure, but it has 300 pages, and most of the content is part of a numbered list. Pasting a simple string (like an URL) into the end of a paragraph resulted in a short, but annoying hang. It turns out we updated Writer's layout for all the 300 pages before the content was repainted on the single visible page. In theory, you could reorder events, so you first calculate the first page, you paint the first page, then you calculate the remaining 299 pages. Is this possible in practice? Let's try!

Results so far¶

The relevant part of the test document is simple: just an empty numbered paragraph, so we can paste somewhere:

Bugdoc: empty paragraph, part of a numbered list and then pasting an URL there

This is a good sample, because pasting into a numbered list requires invalidating all list items in that list, since possibly the paste operation created a new list item, and then the number portion has to be updated for all items in the rest of the list. So if you paste into a numbered list, you need to re-calculate the entire document if all the document is just a numbered list.

The first problem was that Writer tracks its visible area, but LOK needs two kinds of visible areas. The first kind decides if invalidations are interesting for part of the document area. LOK wants to get all invalidations, so in case we cache some document content in the client that is near the visible area, we need to know when to throw away that cache. On the other hand, we want to still track the actually visible viewport of the client, so we can prioritize visible vs hidden parts of the document. Writer in LOK mode thought that all parts of the document are a priority, but this could improved by taking the client's viewport into account.

The second problem was that even if Writer had two layout passes (first is synchronous, for the visible area; second is async, for the rest of the document), both passes were performed before allowing a LOK client to request tiles for the issued invalidations.

This is now solved by a new registerAnyInputCallback() API, which allows the LOK client to signal if it has pending events (e.g. unprocessed callbacks, tiles to be painted) or it's OK for Writer layout to finish its idle job first.

The end result for pasting a URL into this 300 pages document, when measuring end-to-end (from sending the paste command to getting the first updated tile) is a decrease in response time, from 963 ms to 14 ms.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

The tracking issue for this problem was cool#9735.

Want to start using this?¶

You can get a development edition of Collabora Online 24.04 and try it out yourself right now: try the development edition. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF's next release too (25.2).

Improved font fallback in the DOCX import of Writer

Posted on: Wed 07 August 2024

Estimated read time: 2 minutes

Writer now has improved support for font fallback when you open a DOCX file that refers to fonts which are not available currently.

This work is primarily for Collabora Online, but the feature is fully available in desktop Writer as well.

Motivation¶

Font embedding is meant to solve the problems around missing fonts, but you can also find documents with stub embedded fonts that are to be ignored and our code didn't have any sanity check on such fonts, leading to unexpected glyph-level fallbacks. Additionally, once font-level fallback happened, we didn't take the font style (e.g. sans vs serif) into account, which is expected to work when finding a good replacement for the missing font.

Results so far¶

Here is how to the original rendering looked like:

Bugdoc, before: ugly glyph-level fallback

Once the handler for the embedded fonts in ODT/DOCX was improved to ignore stub fonts where even basic glyphs were not available, the result was a bit more consistent, but still bad. Here is a different document to show the problem:

Bugdoc, first improvement: no glyph fallback but the result is sans

Note how now we used the same font, but the glyphs are always sans, not serif. So the final step was to import the font type from DOCX and consider that while deciding font fallback:

Bugdoc, second improvement: no glyph fallback and the result is sans / serif

With this, we ignore stub embedded fonts from DOCX, we import the font type and in general font fallback on Linux takes the font type into account while deciding font fallback.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

Want to start using this?¶

Fixing handling of line object transformations in the DOCX import of Writer

Posted on: Tue 09 July 2024

Estimated read time: 2 minutes

Writer now has improved support for toplevel line shapes when you import those from DOCX.

This work is primarily for Collabora Online, but the feature is fully available in desktop Writer as well.

Motivation¶

As described in a post from 2014, Writer reads the drawingML markup for shapes in DOCX files, including line shapes. While investigating a simple-looking problem around a horizontal vs vertical line, it turns out that there is a deeper issue here, and it looks like now have proper fix for this bug.

Results so far¶

Imagine that your company template has a nice footer in two columns, and the content in the columns are separated by a vertical line shape, but when you open your DOCX in Writer, it crosses the text of that footer instead:

Bugdoc, before: reference is red, Writer result is painted on top of it

While researching how line shapes are represented in our document model and how ODT import works, it turned out that the proper way to create a line shape is to only consider size / scaling when it comes to the individual points of the line, everything else (e.g. position / translation) should go to the transform matrix of the shape, then the render result will be as expected:

Bugdoc, after: reference is red, Writer result is painted on top of it

It was also interesting to see that this also improved other, existing test documents, e.g. core.git sw/qa/extras/ooxmlimport/data/line-rotation.docx looked like this before:

3 rotated lines, before: reference is red, Writer result is painted on top of it

And the same fix makes it perfect:

3 rotated lines, after: reference is red, Writer result is painted on top of it

Just stick to the rule: scaling goes to the points -- translation, rotation and horizontal shear goes to the shape.

For now, this is only there for toplevel Writer lines, but in-groupshape and Calc/Impress lines could also follow this technique if there is a practical need.

The "after" screenshots show ~no red, which means there is ~no reference output, where the Writer output would be missing.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

The bugfix commit was tdf#161779 DOCX import, drawingML: fix handling of translation for lines.

The tracking bug was tdf#161779.

Want to start using this?¶

Section-based continuous endnotes in Writer

Posted on: Mon 03 June 2024

Estimated read time: 3 minutes

Writer now has much better support for continuous / inline endnotes (not on a separate page) in Writer, enabled by default for DOCX files.

This work is primarily for Collabora Online, but the feature is fully available in desktop Writer as well.

Motivation¶

As described in a previous post, Writer already had minimal support for not rendering endnotes on a separate endnote page, but it was not mature enough to enable is by default for DOCX files.

Results so far¶

What changed from the previous "continuous endnotes" approach is that instead of trying to map endnotes to footnotes, we now create a special endnotes section, which only exists at a layout level (no section node is backing this one), and this hosts all endnotes at the end of the document. It turns out this is a much more scalable technique, for example a stress-test with 72 endnotes over several pages is now handled just fine.

Here are some screenshots:

Before: reference is red, Writer result is painted on top of it

After: reference is red, Writer result is rendered on top of it

As you can see, there were various differences for this document, but the most problematic one was that the entire endnote was missing from the (originally) last page, as it was rendered on a separate page.

Now it's not only on the correct page, but also its position is correct: the endnote is after the body text, while the footnote is at the bottom of the page, as expected. The second screenshot shows ~no red, which means there is ~no reference output, where the Writer output would be missing.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

The tracking bug was tdf#160984.

Want to start using this?¶

Improved copy & paste in Collabora Online: the copy side

Posted on: Fri 03 May 2024

Estimated read time: 4 minutes

Collabora Online now uses the Clipboard API if it's available in the used browser, getting rid of many annoying dialogs.

Motivation¶

The paste side was covered by an earlier post, but in short, if you're on Chrome (or Safari), you won't see the "can't paste" dialog:

Also you won't see a "Can't paste special" dialog:

"Can't paste special" dialog in COOL 23.05

Wouldn't it be nice to improve the copy side in a similar way?

Results so far¶

The primary reason why we needed similar, annoying dialogs for copy in the past is that the clipboard API was synchronous but the network API is async. This means that writing to the clipboard ("copy") is only possible with data that we have in the browser when the copy is executed. This is a problem, since in general we don't have data for your selection in the browser.

With the Clipboard API, the copy side can be improved as well. Instead of always fetching the HTML for a simple selection (even if you don't copy) and having a three step copy for complex selections, async clipboard write is now possible. This gets us rid of the "You need to download" dialog:

"You need to download" dialog in COOL 23.05

Also it eliminates the "Download completed dialog:

"Download completed" dialog in COOL 23.05

Instead, we still need to nominally write to the clipboard in the special keyboard or click context initiating the copy (Chrome doesn't require this, Safari does), but the written object can be a callback that will do the network operation for us.

Unfortunately it's hard to screenshot the new code, since part of the result is that all these dialogs are now eliminated, copy & paste just works. :-)

Note that this can be used also in Firefox, but first you need to set dom.events.asyncClipboard.clipboardItem to true in about:config.

The last part was to adapt tests to this new world, because previously it was handy to just create a selection and assert what would be copied to the clipboard as HTML, but now we don't download the HTML anymore every time you create a selection.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

The tracking issue was cool#8648.

Want to start using this?¶

You can get a development edition of Collabora Online 24.04 and try it out yourself right now: try the development edition.

Improve copy&paste in Calc and elsewhere

Posted on: Thu 04 April 2024

Estimated read time: 3 minutes

Calc now supports much better copy&paste when you transfer data between Google Sheets and Calc.

This work is primarily for Collabora Online, but the feature is fully available in desktop Calc as well.

Motivation¶

First, Collabora Online was using the deprecated document.execCommand() API to paste text, which is problematic, as the "paste" button on the toolbar can't behave the same way as pressing Ctrl-V on the keyboard.

Second, it turns out Google Sheets came up with some additional HTML attributes to represent spreadsheet data in HTML in a much better way, and Calc HTML import/export had no support for this, while this is all fixable.

Results so far¶

In short, Collabora Online now uses the Clipboard API to read from the system clipboard -- this has to be supported by the integration, and Calc's HTML filter now support the subset of the Google Sheets markup I figured out so far. This subset is also documented.

Note that the default behavior is that the new Clipboard API is available in Chrome/Safari, but not in Firefox.

For the longer version, here are some screenshots:

The new paste code in action, handling an image

Import from Google Sheets to Calc: text is auto-converted to a number, bad

Import from Google Sheets to Calc: text is no longer auto-converted to a number, good

HTML import into an active cell edit, only RTF was working there previously

Paste from Google Sheets to Calc: text is no longer auto-converted to a number, good

Paste from Google Sheets to Calc: booleans are now also preserved

Paste from Google Sheets to Calc: number formats are now also preserved

Paste from Google Sheets to Calc: formulas are now also preserved

Paste from Google Sheets to Calc: also handling a single cell

Copy from Calc to Google Sheets: text is now handled, no longer auto-converted to a number

Copy from Calc to Google Sheets: booleans are now handled

Cross-origin iframes also block clipboard access, now fixed

Copy from Calc to Google Sheets: number formats are now also preserved

Copy from Calc to Google Sheets: formulas are now also preserved

Copy from COOL Writer to a text editor: much better result, new one on the right hand side

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

The tracking issue on the Online side was cool#8023.

Want to start using this?¶

You can get a snapshot / demo of Collabora Office 24.04 and try it out yourself right now: try the unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF's next release too (24.8).

Legal numbering in Writer: DOC and RTF support

Posted on: Fri 01 March 2024

Estimated read time: 3 minutes

Writer now supports legal numbering for two more formats: DOC and RTF (ODT and DOCX were working already.)

This work is primarily for Collabora Online, done as a HackWeek project, but the feature is fully available in desktop Writer as well.

Motivation¶

Legal numbering is a way to influence the number format of values inherited in a multi-level numbering. Say, the outer numbering uses Roman numerals and the inner numbering uses X.Y as the number format, but the inner level wants to display the outer values as Arabic numerals. If this is wanted (and guessing from the name, sometimes lawyers do want this), then the inner number portion will expand to values like "2.01" instead of "II.01", while the outer number portions will remain values like "II".

Mike did 80% of the work, what you can see here is just the RTF/DOC filters.

Picking a smaller feature task like this looked like a good idea, since I wanted to spend some of the time on regression fixing around last year's multi-page floating table project.

Results so far¶

For (binary) DOC, the relevant detail is the fLegal bit in the LVLF structure. Here is the result:

Improved handling of legal numbering from DOC: old, new and reference rendering

It shows how the outer "II" gets turned into "2", while it remained "II" in the past. This works for both loading and saving.

The same feature is now handled in the RTF filter as well. There the relevant detail is the \levellegal control word, which has an odd 1 default value (the default is usually 0). Here is the result:

Improved handling of legal numbering from RTF: old, new and reference rendering

It shows that the RTF filter is up to speed with the DOC one by now.

As for the multi-page floating tables, I looked at tdf#158986 and tdf#158801.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

Want to start using this?¶

Fixing multi-view programming challenges in Calc and elsewhere

Posted on: Tue 06 February 2024

Estimated read time: 3 minutes

This post describes some challenges around having multiple views of one opened document in LibreOffice core, when those views belong to LOK views, representing different users, with their own language, locale and other view settings.

This work is primarily for Collabora Online, but is useful for all clients of the LOK API.

Motivation¶

LOK views are meant to represent separate users, so we need to make sure that when a user sets their preferences and trigger an action, then the response to that action goes to the correct view, with the correct view settings.

This is different from the desktop LibreOffice use-case, where multiple windows are still meant to share the same user name, language, undo stack and so on.

Results so far¶

In this post, I would like to present 4 small improvements that recently happened to the LOK API to provide this wanted separation of views.

The first was an issue where two users were editing the same document, one busily typing and the other clicked on a link in Calc. What could happen sometimes is the link popup appeared for the user who typed, not for the user who clicked on the link:

Link popup is actually on the left, should be on the right, now fixed

This specific problem can be fixed by making sure that link click callbacks are invoked synchronously (while the clicking view is still active) and not later, when the current view may or may not be the correct one.

It turns out the same problem (async command dispatch) affects not only hyperlinks, but many other cases as well, where we want to stay async, for example, when one dialog would invoke another dialog, like the Calc conditional format -> add dialog:

Calc conditional format add dialog appearing on the left, should be on the right, now fixed

There you don't want to change async commands into sync commands, because that may mean spinning the main loop inside a dialog, resulting in nested main loops. This can be fixed by making sure that async commands to be dispatched (sfx2 hints in general) are processed in a way that the current view at dispatch & processing is the same, which is now the case.

The third problem was around wrong language & locale in the status bar:

Unexpected English strings in localized statubar UI, now fixed

This is not simply a problem of missing translation, the trouble was that the status bar update is also async and by the time the update happened, the locale of the view on the left was used, for a string that appears on the right.

The way to fix this is to perform the update of toolbars/statusbar/etc (in general: SfxBindings) in a way that the language at job schedule time and at UI string creation time is the same.

The last problem was quite similar, still about bad language on the UI, but this time on the sidebar:

Unexpected English strings in localized sidebar UI, now fixed

This is similar to the statusbar case, but the sidebar has its own executor for its async jobs, so that needed a fix similar to what the statusbar already had, now done.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

Want to start using this?¶

You can get the latest Collabora Online Development Edition 23.05 and try it out yourself right now: try the development edition. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF's next release too (24.8).

Multi-page floating tables in Writer: tables wrapping tables

Posted on: Wed 03 January 2024

Estimated read time: 3 minutes

This post is part of a series to describe how Writer now gets a feature to handle tables that are both floating and span over multiple pages.

This work is primarily for Collabora Online, but is useful on the desktop as well. See the 10th post for the previous part.

Motivation¶

Previous posts described the hardest part of multi-page floating tables: making sure that text can wrap around them and they can split across pages. In this part, we'll look at a case where that content is not just text, but the wrapping content itself is also a table.

Results so far¶

Regarding testing of the floating table feature in general, the core.git repository has 92 files now which are focusing on correct handling of floating tables (filenames matching floattable-|floating-table-). This doesn't count cases where the document model is built using C++ code in the memory and then we assert the result of some operation.

Here are some screenshots from the improvements this month:

Improved click handling near the first page of a floating table

The first screenshot shows a situation where the mouse cursor is near the right edge of the first page of a floating table. What used to happen is we found this position close to the invisible anchor of the floating table on that page, then corrected this position to be at the real anchor on the last page. In short, the user clicked on one page and we jumped to the last page. This is now fixed, we notice that part of the floating table is close to the click position and we correct the cursor to be at the closest position inside the table's content.

A floating table, wrapped by an inline table: old, new and reference rendering

The next screenshot shows a floating table where the content wrapping around the table happens to be an inline table. You can see how such wrapping didn't happen in the past, and the new rendering is close to the reference now.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

Want to start using this?¶

You can get a snapshot / demo of Collabora Office 23.05 and try it out yourself right now: try the unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF's next release too (24.8).

Multi-page floating tables in Writer: UI improvements

Posted on: Mon 04 December 2023

Estimated read time: 3 minutes

This post is part of a series to describe how Writer now gets a feature to handle tables that are both floating and span over multiple pages.

This work is primarily for Collabora Online, but is useful on the desktop as well. See the 9th post for the previous part.

Motivation¶

Previous posts described the hardest part of multi-page floating tables: reading them from documents, so we layout and render them. In this part, you can read about UI improvements when it comes to creating, updating and deleting them in Writer.

Results so far¶

Regarding testing of the floating table feature in general, the core.git repository has 89 files now which are focusing on correct handling of floating tables (filenames matching floattable-|floating-table-). This doesn't count cases where the document model is built using C++ code in the memory and then we assert the result of some operation.

Here are some screenshots from the improvements this month:

The first screenshot shows that the underlying LibreOffice Insert Frame dialog is now async (compatible with collaborative editing) and is now exposed in the Collabora Online notebookbar.

There were other improvements as well, so in case you select a whole table and insert a new frame, the result is close to what the DOCX import creates to floating tables. This includes a default frame width that matches the table width, and also disabling frame borders, since the table can already have one.

The next screenshot shows an inserted floating table, where the context menu allows updating the properties of an already inserted floating table, and also allows to delete ("unfloat") it.

Several of these changes are shared improvements between LibreOffice and Collabora Online, so everyone benefits. For example, inserting a frame when a whole table was selected also cleared the undo stack, which is now fixed. Or unfloating table was only possible if some part of the table was clipped, but now this is always possible to do.

How is this implemented?¶

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes: