Index ¦ Archives ¦ RSS

Multi-page floating tables in Writer: from row deletion to page breaks

Estimated read time: 6 minutes

Writer now has continued steps to handle tables that are both floating and span over multiple pages.

This work is primarily for Collabora Online, but is useful on the desktop as well. See the second post for background.

Motivation

The previous post finished with cursor traversal: if a floating table is on both page 1 and page 2, then you expect Writer to be able to move between the rows of the table, even if those are not on the same page. In this post, we'll see what else started to work during the past month.

Results so far

The feature is enabled by default and now the DOCX/DOC/RTF import makes use of it if. This allows stress-testing the layout code with complex user documents, hopefully with the found breakage fixed before it would be released in a stable version.

On the positive side, core.git repository has has 19 files now which are focusing on correct handling of floating tables. Also, there are additional tests that quickly build a specific multi-page floating table in the memory and do some operation on it, e.g. delete the last row and assert what happens.

Here are some screenshots from the effort so far:

Editing of floating tables: delete the last row of a table with 3 rows

The first case is about editing: if a floating table had a first, middle and last page, then deleting the last row of a table lead to incorrect layout, which is now fixed.

Selection & dragging of split floating tables

An odd problem is that the vertical position of tables on non-first pages is generated by the layout, which means that normal drag&move to position them won't work, leading to annoying jumps. This is now fixed by selecting the first (master) fly frame on click, and you can always reposition that table (even vertically.)

Bad binary DOC import

Once DOCX import/export was there, the next step is binary DOC import, which gives us access to a larger corpus of test documents, to stress-test the layout code. This shows how the binary DOC import looked before the work.

Good binary DOC import

And this one shows how it works now.

Good binary DOC export

DOC import is not enough, e.g. Collabora Online will save your documents automatically, so we really want to export everything that is possible to import. Here is how good DOC export looks like in Word.

In-footer floating table

At this point the first crashtest results arrived (we try to import about 280 thousand documents and see what crashes). The first problem was floating tables in footers. Well, we should not try to split such tables (even if they don't fit): adding one more page does not give us more footer space.

Bad RTF import

Similar to the DOC filter, RTF can express floating tables. Here is how we did a bad rendering of an RTF document before.

Good RTF import

And here is how we import it currently. The RTF control words are quite close to the binary DOC markup semantically, just the syntax is different.

Bad RTF export

The RTF export side was also missing, as visible in Word, before the work.

Good RTF export

And this is how the good RTF export result looks like in Word.

Floating table in a section

Another crashtest find was that sometimes we map Word's continuous section breaks to Writer sections, so we can't assume that tables are anchored directly in body frames. This is now fixed.

Correct handling of the TableRowKeep flag in floating tables

A related problem was that non-floating tables have a trick, that we call the TableRowKeep mode. If this is on (which is the default for documents imported form Word), a table row will stick to the next table row (we try to keep them on the same page) if the first cell's first paragraph in that row has the "keep with next" paragraph property specified. It turns out, this should be ignored when the table is floating.

Page break before a floating table

A next problem was that some page breaks simply disappeared. It turns out that we need to transfer the "break before" property from the table to the table anchor (paragraph) to get the desired layout, since page breaks are generally ignored inside text frames.

Handling of 2 times nested tables, middle one is floating

All combinations of nesting with floating tables is not yet handled, but at least we should not crash when the user tries to do that. Here is 3 tables, nested in each other, the second table is marked to be floating.

Handling of a floating table, immediately followed by an other table

The last fixed problem is when a floating table is immediately followed by an other, non-floating table. Given that we try to anchor the floating table in the next paragraph, the layout could not handle this previously, but now we ensure that each floating table is followed by a paragraph.

And that's where we stand. Hope to address all problems reported by crashtesting soon. Once that happens, it may be possible to switch from bugfixing mode to feature mode again, e.g. better handling of overlapping or nested tables could be done.

How is this implemented?

If you would like to know a bit more about how this works, continue reading... :-)

As usual, the high-level problem was addressed by a series of small changes:

Want to start using this?

You can get a snapshot / demo of Collabora Office 23.05 and try it out yourself right now: try the unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF's next release too (7.6).

© Miklos Vajna. Built using Pelican. Theme by Giulio Fidente on github.