Document Liberation Project regression testing

Posted on: Sat 14 March 2015

Estimated read time: 3 minutes

https://lh5.googleusercontent.com/-0Qh5cUx4gGA/VQR_RkRe2jI/AAAAAAAAFU8/5KIPToOna4Q/s0/

Earlier I wrote about my setup to hack libvisio. One missing bit was testing the contributed code. Testing can be performed at various levels; so far, DLP libraries were tested by recording the output of the various foo2raw tools and then comparing the current output to some previously known good state. This has a number of benefits:

If you know that the current state is good, then there is no need to write testcases; you can just record your state automatically.
Any change in the output will signal instant failure, so it gives pretty good test coverage.

The same technique was used in LibreOffice for Impress testcases initially, however we saw a drawback there: being automatically generated, you have no control over what part of the output is important and what part is not. Both parts are recorded, and when some part changes, you have to carefully evaluate on a case by case basis if the change is OK or not. The upshot is that from time to time you just end up regenerating your reference testsuite, and until the maintainer does that, everyone can only ignore the test results, so it doesn’t really scale.

In short, both techniques have some benefits, but given that the libvisio test repo is quite empty, I thought it’s a good time to give another method (one we use quite successfully in LO code) a go, too. This method is easy: instead of recording the whole output of some test tool, output a structured format (in this case XML), and then just assert the interesting part of it using XPath. Additionally, these tests are in libvisio.git, so you can nicely put the code change and the testcase in the same commit. So the hope is that this is a more scalable technique:

Provided that make distcheck is run before committing, you can’t forget to clone and run the tests.
Writing explicit assertions means that it’s rarely needed to adjust existing tests. Which is a good thing, as there are no tests for the tests, so touching existing tests should be avoided, if possible. ;-)
Having testcase + code change in the same commit is one step closer to the ideal that e.g. the git.git guys follow: they usually require documentation, code and test parts in each patchset. :-)

Technically this method is implemented using a librevenge::RVNGDrawingInterface implementation that generates XML. For now, this is part of libvisio, so in case you want to re-use it in some other DLP library, you need to copy it to your import library, though if indeed multiple importers start to use it, perhaps it’ll be moved to librevenge. The rest of the test framework is a simple testsuite runner and a cppunit TestFixture subclass that contains the actual test cases.
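To make the "assert only the interesting part" idea concrete, here is a minimal sketch in Python rather than in the actual cppunit fixture. The XML snippet, its element and attribute names, and the assert_xpath helper are all made up for illustration; they are not what the libvisio generator really emits.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, standing in for what a drawing-interface-to-XML
# generator might emit; element and attribute names are invented.
GENERATED_XML = """
<document>
  <page>
    <textSpan style="bold">Hello</textSpan>
    <textSpan style="normal">world</textSpan>
  </page>
</document>
"""


def assert_xpath(root, path, attribute, expected):
    """Assert that exactly one node matches path and that the given
    attribute of that node has the expected value."""
    nodes = root.findall(path)
    assert len(nodes) == 1, f"expected 1 node for {path!r}, got {len(nodes)}"
    actual = nodes[0].get(attribute)
    assert actual == expected, (
        f"{path}@{attribute}: expected {expected!r}, got {actual!r}"
    )


root = ET.fromstring(GENERATED_XML)
# Only the interesting part is checked: the first span must be bold.
# Everything else in the output is free to change without breaking the test.
assert_xpath(root, "./page/textSpan[1]", "style", "bold")
```

Note that ElementTree only supports a subset of XPath; a real test would more likely use a full XPath engine, but the shape of the assertion stays the same.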

So in case you are planning how to test your import library, you now have two options: evaluate them and choose what seems to be the better tool for your purpose.

Tiled editing: from input handling to selections

Posted on: Fri 27 February 2015

Estimated read time: 3 minutes

In "from a living document to input handling", I wrote about how we handle touch and on-screen keyboard events in the LibreOffice Android app. A next step in this TDF-funded project is to provide more UI elements which are specific to touch devices: selections are one of them.

Here are the problems we had to solve to get this working:

Long push is not an event core would recognize.
If you use the mouse and have a selection in Writer, it’s only possible to extend the end of it. If you use the keyboard, then it’s possible to shrink the end of it, but still no adjustment of the start. On touch devices, it’s natural to have selection handles at the start and end of the selection and be able to adjust both, in both directions.
Additionally, when the user drags the selection handles, the expected behavior is that the position of the selection and the handle are never the same: the handle is placed below the selection position and when you drag the handle, the new selection position is above the handle… ;-)

Long push is reasonable to map to double mouse click, as in both cases e.g. in Writer the user expects to have a select word action. But for the adjustment of selections, we really had to define a new API (lok::Document::setTextSelection()) to allow setting the start or end of the selection to a new logical (in document coordinates, not paragraph / character indexes) point.

If you are interested in what this looks like, here is a demo:

Another direction we’re working towards is to have the same features in other applications as well: Impress and Calc. Perhaps not so surprisingly, we hit problems in these applications similar to the ones we had to solve in Writer. The typical problems are:

LibreOffice assumes a given portion of the document is visible (visual area), but the Android view is independent from what LO thinks is visible. Example: LO thinks a table is not visible, so it doesn’t send the selection events for the text inside the table, even if it’s in fact visible on the Android app.
Instead of calling Invalidate() and waiting for a timer to call Paint(), at some places direct Paint() is performed, so the tile invalidation notification triggered by Invalidate() is missing → lack of content on Android.
We render each tile into a VirtualDevice — kind of an off-screen rendering — and at some places LO assumed that certain content like the actively edited shape’s text is not interesting, as it’s not interesting "during printing".
LO’s mouse events are in pixels, and then this is translated to mm100 (hundredths of a millimeter) or twips in core. So counting in pixels is the common language, while the Android app counts everything in twips, and doesn’t want to care about what would be visible at what pixel on the screen if LO ran in desktop mode. So we had to make sure that we can pass in event coordinates in twips, and get invalidation coordinates in twips, even if previously it was a mix of mm100, twips and pixels.
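For readers not used to juggling these units, the conversions between them can be sketched as follows. This is not code from LibreOffice; the function names are made up, and the default of 96 DPI for the pixel conversion is only an assumption for the example. What is fixed is that an inch is 1440 twips and 2540 hundredths of a millimeter.

```python
TWIPS_PER_INCH = 1440   # 1 twip = 1/1440 inch
MM100_PER_INCH = 2540   # 1 inch = 25.4 mm = 2540 hundredths of a millimeter


def pixels_to_twips(pixels, dpi=96.0):
    """Convert a pixel length to twips; the DPI is an assumed screen resolution."""
    return pixels * TWIPS_PER_INCH / dpi


def twips_to_pixels(twips, dpi=96.0):
    """Convert a length in twips to pixels at the given DPI."""
    return twips * dpi / TWIPS_PER_INCH


def mm100_to_twips(mm100):
    """Convert hundredths of a millimeter to twips (DPI-independent)."""
    return mm100 * TWIPS_PER_INCH / MM100_PER_INCH


def twips_to_mm100(twips):
    """Convert twips to hundredths of a millimeter (DPI-independent)."""
    return twips * MM100_PER_INCH / TWIPS_PER_INCH


# A 256 pixel wide tile at the assumed 96 DPI, expressed in twips.
tile_width_twips = pixels_to_twips(256)
```

Only the pixel conversions depend on the screen, which is one way to see why a client that counts everything in twips does not want pixels in the API.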

Here is what Impress looks like, with working tile invalidation, touch and keyboard handling:

Calc is lagging a bit behind, but it also has working tile invalidation and keyboard handling:

That’s it for now. As usual, the commits of me and Tomaž Vajngerl are in master (a few of them are only in feature/tiled-editing for now), so you can try this right now, or wait till next Tuesday and get the Android daily build. :-)

Tiled editing: from a living document to input handling

Posted on: Mon 09 February 2015

Estimated read time: 2 minutes

In "from viewing only to a living document", I wrote about how tile invalidation can handle updates in the Android app in case what should be displayed on the screen changes. A next step in this TDF-funded project is to handle more than blinking text: keyboard and mouse/touch events from the user.

First let me enumerate the issues we had to face:

Gtk, Android and LibreOffice’s VCL use different key codes for the same physical keys. We solved this by mapping the special keys manually on the Gtk/Android side (using the C++ and Java UNO binding), and for the rest, we simply use the Unicode representation of the keys.
Special keys: while "return" was easy to map, getting "backspace" to work was more challenging. It worked fine on the Gtk side, but on Android we had to make sure that the whole sfx2 dispatching framework works properly; only then could we map the backspace key to the correct UNO command, which is .uno:SwBackspace in case of Writer.
Mouse handling: VCL sends pixel coordinates to the editing windows, they then calculate the offset of the editing area (think about toolbars and menus that have to be excluded), and then convert the pixel values to document coordinates. In case of tiled editing, we always work with document coordinates in logical units (twips), so we had to add the possibility to send the coordinates in document ones. This allows core not to know where exactly the user is (in case the tiles are already ready, swiping can be handled without any LOK calls), and also allows Android not to know the implementation details of the desktop app (where menus and toolbars would be).
Cursor caret overlay: we wanted to be sure that it’s not necessary to re-render the affected tiles each time the cursor blinks, so we added a LOK API to send the rectangle (its width is nearly zero) of the cursor to Android, and then it can handle the blinking cursor itself in a transparent overlay. This overlay will be useful for presenting selections as well.
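The key code mapping from the first item can be sketched like this. It is only an illustration of the idea, not the actual Gtk or Android client code: the two GDK key values are the real ones from gdkkeysyms.h, but the "RETURN" / "BACKSPACE" strings are hypothetical stand-ins for whatever key code constants the core side expects, and translate_key is a made-up name.

```python
# GDK key values of two special keys (as defined in gdkkeysyms.h).
GDK_KEY_BACKSPACE = 0xFF08
GDK_KEY_RETURN = 0xFF0D

# Manual mapping of special keys. The values are hypothetical placeholders
# for the key codes the core expects, not real constants.
SPECIAL_KEYS = {
    GDK_KEY_RETURN: "RETURN",
    GDK_KEY_BACKSPACE: "BACKSPACE",
}


def translate_key(toolkit_keyval, unicode_char):
    """Return a (char_code, key_code) pair for a key press.

    Special keys are looked up in the manual mapping and carry no
    character; every other key is passed on as its Unicode code point
    and carries no key code.
    """
    if toolkit_keyval in SPECIAL_KEYS:
        return (0, SPECIAL_KEYS[toolkit_keyval])
    return (ord(unicode_char), None)


# 'a' is not special, so only its Unicode value is forwarded.
plain = translate_key(0x61, "a")
# Return is special, so it is mapped by hand.
special = translate_key(GDK_KEY_RETURN, "\r")
```

The nice property of this split is that the hand-written table stays small: only keys without a useful Unicode representation need an entry.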

As usual the commits of me and Tomaž Vajngerl are in master, so you can try this right now, or wait till tomorrow and get the Android daily build. However, if you are just interested in what this looks like, here are some demos:

Keyboard handling in gtktiledviewer:

Same on Android, including newlines and backspace handling:

Mouse handling in gtktiledviewer:

Same on Android, including the transparent selection overlay that can efficiently blink the cursor:

That’s it for now — next on our list are selections, so you can delete and overwrite more easily. :-)

LibreOffice on Android FOSDEM talk

Posted on: Wed 04 February 2015

Estimated read time: 1 minute

(via LOfCollabora)

Today is my last day in Brussels, where I gave a "TextBoxes: complex shapes with complex content" and a "LibreOffice on Android" talk at FOSDEM 2015, in the Open document editors devroom. The devroom was well-crowded, with about 100 people in the audience (proof pictures above and below). ;-)

(via deneb_alpha)

We also had a Hackfest with about 20 hackers attending, (again) kindly hosted by Betacowork on Monday and Tuesday:

(via floeff)

There were a few topics I hacked on:

tdf#88583: a fallout from the Writer fillattributes work introduced in LibreOffice 4.4
tdf#68183: a small new feature missing since the RSID GSoC project ended
build on HiDPI screens should now no longer fail
tdf#88811: an RTF regression fix, so that now the counter is down to zero again
we no longer produce invalid ODF output when writing character borders

A full list of achievements is available. If you were at the hackfest and you did not contribute to that section, please write a line about what you hacked on. :-)

Quite a few other slides are now available on Planet, don’t miss them. Mine are also uploaded.

Tiled editing: from viewing only to a living document

Posted on: Tue 27 January 2015

Estimated read time: 3 minutes

As was announced last week, an Android port of LibreOffice in the form of a viewer app is now available for download. What’s next? Editing, naturally. First, thanks again to The Document Foundation and all the donors who made this (ongoing) work possible. In this post I would like to explain what Tomaž Vajngerl and I have done at Collabora so far in that direction.

If you ever touched the Android port of LibreOffice, you probably noticed that sadly developing for Android is much harder compared to Linux (desktop). On Linux, if you just touch a single module, it’s possible to rebuild just that module in a few seconds, and then you can run soffice again with your modifications included. On Android, this is much harder:

due to a limitation of the Android linker, we link all the native code into a single shared object, that has to be re-linked after each native code modification
the native + the Java code has to be packed into a .apk archive
the .apk archive has to be uploaded to the device (or emulator) and installed there

and only then can you test your changes. To partly sidestep this problem, we split the "Android editing" into two:

tiled editing: this can be tested on Linux using the gtktiledviewer test application (and ideally any core problem can be seen here already)
Android LibreOfficeKit client: replacing gtktiledviewer with the real Android client code, and this time testing it on the device

One problem with this approach was that while Android properly rendered small tiles of 256x256 pixels, gtktiledviewer rendered a single huge tile. This means that in case part of the document changes and we need to re-draw it, we always repainted the whole document in gtktiledviewer, while we only repainted the necessary parts on Android. Guess what, if the area to be repainted is wrong, it’ll be visible on Android but not in gtktiledviewer. So the first task we solved was to let gtktiledviewer also render small tiles. For debugging purposes, small red rectangles are painted at the top left corner of each tile, so the size and position of the tiles can be seen easily:

https://lh5.googleusercontent.com/-VvQFF-Kg270/VMYqe9G76-I/AAAAAAAAFL4/Fnh9_ig03Ww/s420/
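Why small tiles make a wrong repaint area visible can be seen from a small sketch: given a changed rectangle, only the tiles it overlaps are repainted, so a wrong rectangle leaves stale tiles behind. The function below is a made-up illustration (not gtktiledviewer code), working in integer pixel coordinates with the 256 pixel tile size mentioned above.

```python
TILE_SIZE = 256  # tile width and height in pixels, as mentioned in the post


def tiles_for_rectangle(x, y, width, height, tile_size=TILE_SIZE):
    """Return the (column, row) index of every tile that overlaps the
    rectangle given in integer pixel coordinates."""
    if width <= 0 or height <= 0:
        return []
    first_col = x // tile_size
    last_col = (x + width - 1) // tile_size
    first_row = y // tile_size
    last_row = (y + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A small rectangle that straddles the border between the first two tiles
# of the first row: both of them have to be repainted.
dirty_tiles = tiles_for_rectangle(250, 10, 20, 20)
```

With a single huge tile the answer is always "repaint everything", which is exactly what hid the problems on the desktop side.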

The next step was to somehow start work on real editing — but where to start? We identified two critical building blocks:

there should be some way for the user to provide input (e.g. press a key on the software keyboard)
once the document changed, the application has to redraw the changed part of the view

To avoid solving two problems at the same time, we first went after the second. One use case that only requires the update of the view is blinking text. Even if no touch or key events are available, a blinking text wants to update the view using a timer, so it’s a good testcase. It’s now possible for LibreOfficeKit clients to register a notification callback, and using that, LibreOffice can notify clients if part of the view has to be redrawn. Here is how it looks using gtktiledviewer:

This demonstrates that the LibreOfficeKit implementation in LibreOffice core and also the gtktiledviewer client code handle tile invalidations correctly. Once that was done, we could also implement similar client code in the Android app. It looks like this:

That’s it for now — next on our list is adding support for input handling, so it’s possible to type in some text. :-)

Perfect WW8 comment import

Posted on: Sat 24 January 2015

Estimated read time: 2 minutes

TL;DR: Import of annotated text ranges from binary DOC format was a problem for quite some time, now it should be as good as it always was in the ODT/DOCX/RTF filter.

Longer version: the import of annotation marks from binary DOC was never perfect. My initial implementation had a somewhat hidden, but important shortcoming, in the form of a "Don’t support ranges affecting multiple SwTxtNode for now." comment. The underlying problem was that annotation marks have a start and end position, and this is described as an offset into the piece table (so the unit was a character position, CP) in the binary DOC format, while in Writer, we work with document model positions (text node and content indexes, SwPosition), and it isn’t trivial to map between these two.

Tamás somewhat improved this homegrown CP → SwPosition mapping code, but it was still far from perfect. Here is an example. This is how this demo document looks now in LibreOffice Writer:

https://lh6.googleusercontent.com/-SYW-7l2Otpo/VMQDQ-Fme1I/AAAAAAAAFLg/nkGHlfIV85Y/s800/

And this is how it looked before the end of last year:

https://lh4.googleusercontent.com/-geD82nPpzC4/VMQDQ9souvI/AAAAAAAAFLk/Mhuqrib2DEs/s800/

Notice how "Start" is commented now, while it wasn’t before. Which one is correct? Here is the reference:

https://lh5.googleusercontent.com/-L_LmD_wIZks/VMQDQ76Jn3I/AAAAAAAAFLo/mMHr5h5p4oM/s800/

The reason is that the document has fields and tables, and the homegrown CP → SwPosition mapping did not handle this. A much better approach is to handle the mapping as we do it for bookmarks: even if at the end annotation marks and bookmarks are entries in sw::mark::MarkManager, it’s possible to set the start position as a character attribute during import (since mapping the current CP to the current SwPosition is easy) and when we know both the start and end, delete the character attribute and turn it into a mark manager entry. That’s exactly what I’ve done. The first screenshot is the result of 3 changes:

Hopefully this makes LibreOffice not only avoid crashing on such complex annotated contents, but also puts an end to the long story of "annotation marks from binary DOC" problems.
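The "remember the start during import, create the mark once the end is known" approach described above can be sketched in a few lines. This is a toy model in Python, not the WW8 filter: the Importer class and all of its method names are invented, a position is simply a (text node index, content index) pair, and a dictionary plays the role of the temporary character attribute.

```python
class Importer:
    """Toy importer that only ever needs the *current* position."""

    def __init__(self):
        self.paragraphs = [""]     # document model: one string per text node
        self.pending_starts = {}   # annotation id -> start position (the temporary "attribute")
        self.marks = []            # finished (id, start, end) entries (the "mark manager")

    def current_position(self):
        """Position of the insertion point: (text node index, content index)."""
        return (len(self.paragraphs) - 1, len(self.paragraphs[-1]))

    def text(self, content):
        self.paragraphs[-1] += content

    def paragraph_break(self):
        self.paragraphs.append("")

    def annotation_start(self, annotation_id):
        # Mapping "where we are now" to a model position is trivial,
        # so just remember it until the end of the range shows up.
        self.pending_starts[annotation_id] = self.current_position()

    def annotation_end(self, annotation_id):
        # Both ends are known: drop the temporary entry and create the mark.
        start = self.pending_starts.pop(annotation_id)
        self.marks.append((annotation_id, start, self.current_position()))


importer = Importer()
importer.annotation_start(1)
importer.text("Start")
importer.paragraph_break()
importer.text("more")
importer.annotation_end(1)
```

The range here spans two text nodes without any CP arithmetic, because the importer never has to translate an arbitrary character position after the fact.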

Note

Just like how C++11 perfect forwarding isn’t perfect (if you think it is, see "Familiarize yourself with perfect forwarding failure cases." in this post of Scott), the above changes may still not result in a truly perfect import result of DOC annotation marks. But I think the #1 problem in this area is now solved. :-)

Export validation as a new year's resolution

Posted on: Sat 10 January 2015

Estimated read time: 2 minutes

TL;DR: If you touch the ODF and/or OOXML filters in LibreOffice, please use the --with-export-validation configure option after you have run the setup.sh script.

Markus Mohrhard did an excellent job with adding the --with-export-validation build switch to LibreOffice. It does the following:

it validates every Calc and Impress zipped XML document (both ODF and OOXML) produced during the build by export filters
it does the same for Writer, except there only a subset of documents is validated

One remaining problem was that it required setting up both odfvalidator and officeotron, neither of which is a standard GNU project; both are Java beasts. So even if I and a number of other developers do use this option, it happens from time to time that we need to fix new validation regressions, as others don’t see the problem; and even if we point it out, it’s hard to reproduce for the author of the problematic commit.

This has just changed: all you need is to get export-validation/setup.sh from dev-tools.git, and run it like this:

./setup.sh ~/svn /opt/lo/bin

I.e. the first parameter is a working directory and the second is a directory that’s writable by you and is already in your path. And then wait a bit… ODF validator uses Maven as a build system, so how long you have to wait depends on how many of the Maven dependencies you already have in your local cache… it’s typically 5 to 15 minutes.

Once it’s done, you can add --with-export-validation to your autogen.input and then toplevel make will invoke odfvalidator and officeotron for the above mentioned documents.

The new year is here. If you don’t have a new year’s resolution yet (or if you hate those, but you’re willing to adopt a new habit from time to time), then please consider --with-export-validation, so that such regressions can be detected before you publish your changes. Thanks! ;-)

Fixing the cloud problem

Posted on: Sat 27 December 2014

Estimated read time: 2 minutes

https://lh3.googleusercontent.com/-0aWZ5rvDWI4/VJ3tCCh9wcI/AAAAAAAAFHA/A1P8Un5ksrw/s400/

TL;DR: see above -- a number of preset shapes are now rendered correctly at any scale factor, where previously rendering problems occurred.

fdo#87448 has a reproducer document that shows rendering errors with the scaled cloud preset shape definition. At first I thought that the OOXML spec has a wrong definition for this shape type, but that turned out not to be the case. The actual problem was our implementation of the drawingML arcTo command. This implementation defines how we render such arcs as polygons when the shape is to be painted, and given that LibreOffice has native support for the drawingML arcTo / ODF G command, this implementation is invoked during rendering, so it’s not an import/export problem.

The rendering result looked like this before:

https://lh3.googleusercontent.com/-tYg4cifemAs/VJ3tCEHtf9I/AAAAAAAAFG8/WzioMo1AkMA/s400/

The cloud is drawn using a set of moveTo and arcTo commands. MoveTo is easier, as it uses explicit coordinates, but arcTo is more complex. It has 4 parameters: the height and width of a "circle", and the start / end angle of an arc on that circle. (Of course if height and width are not equal, then that’s no longer a circle… ;-) ) The problem is that due to this, the distance vector between the arc’s start and end points is implicit, so if something is miscalculated, errors are nicely added to each other as more and more arcs are drawn. This is especially a problem if you later return to the end of an earlier arc using moveTo: if arcTo has some problem, then it’ll be clearly visible.
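To show why the end point of an arc is implicit, here is a simplified sketch of the calculation. It is not the LibreOffice implementation: the function names are made up, the width and height parameters are assumed to be radii, and the angles are treated as plain parametric angles of the ellipse in degrees, which is a simplification.

```python
import math


def arc_end_point(start_x, start_y, width_radius, height_radius,
                  start_angle, end_angle):
    """Return the end point of an arc that begins at (start_x, start_y).

    The start point is assumed to sit on the ellipse at start_angle, so the
    center has to be derived from it first; only then is the end point known.
    """
    start = math.radians(start_angle)
    end = math.radians(end_angle)
    center_x = start_x - width_radius * math.cos(start)
    center_y = start_y - height_radius * math.sin(start)
    return (center_x + width_radius * math.cos(end),
            center_y + height_radius * math.sin(end))


def draw_arcs(start_point, arcs):
    """Chain several arcs: each one starts where the previous one ended."""
    point = start_point
    for arc in arcs:
        point = arc_end_point(point[0], point[1], *arc)
    return point


# Four quarter arcs of a unit circle should lead back to the start point.
quarters = [(1, 1, 0, 90), (1, 1, 90, 180), (1, 1, 180, 270), (1, 1, 270, 360)]
final_point = draw_arcs((1.0, 0.0), quarters)
```

Because every arc starts at the computed end of the previous one, any systematic error in arc_end_point would be carried along the chain, and a later moveTo to an explicit coordinate would no longer line up with it.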

After fixing UNO ARCANGLETO to take care of scaling / translation only after computing the actual arc, we started to produce correct end points for the arcs and shapes started to appear correctly at any scale factor, yay! :-)

One remaining problem was how to test this from cppunit. In the above commit I exported the shape to a metafile, and then I could use Tomaž’s excellent MetafileXmlDump to assert that the end of an arc (implicit location) and the parameters of a moveTo command (explicit location) are equal. When they are not, that’s what your eyes call a "rendering problem".

Document Liberation Project hacking experience

Posted on: Sat 29 November 2014

Estimated read time: 4 minutes

As someone who usually hacks on LibreOffice, external import filters produced by the Document Liberation Project cut both ways: they are great, as they deal with obscure formats and we get them for free, OTOH hacking such code is more complex than the usual LO code. I recently contributed a few patches to libvisio and libodfgen, but before I was able to do actual code changes, I had to set up a number of repositories and configure them to talk to each other — this post describes one possible setup that suited my needs.

Building blocks

DLP’s central project is librevenge and everything builds on top of that, either by calling it or by being called by it. In case the task is to turn VSDX files into ODG ones, it looks like this:

https://lh3.googleusercontent.com/-cxQ9QnWmyAo/VHoyKdHXSAI/AAAAAAAAFAY/Rqqr8xPorNM/s400/

libvisio can build a librevenge document model from Visio files (more on the various librevenge-based libraries here), libodfgen can generate ODF output from such document models (one other possibility would be e.g. libepubgen), and the writerperfect module provides kind of a controller for the remaining modules, e.g. for our purpose, a vsd2odg binary.

Alternatives considered

One possibility is to build LibreOffice, use --with-system-libvisio and similar switches, then clone the repos, install them system-wide (possibly with your modifications), and then you can test your changes just by building the various libs, without changing your LO build (more here). The drawback is that this way you pollute your system with unstable versions of those libs.

Another possibility is to build LibreOffice as usual, and then use the external libraries patching mechanism to hack on the code. The drawback is that you have to work without git on the code, and also you can only work with a released version.

The pkg-config approach

So here is what I did to avoid the above mentioned drawbacks: all DLP projects use pkg-config to find the required libraries, so you can configure them in a way that allows building as a user, avoid installing them at all, and still execute vsd2odg using the libs with your changes. Here is how to do it:

librevenge:

git clone git://git.code.sf.net/p/libwpd/librevenge
cd librevenge
./autogen.sh
./configure --enable-debug
make

libvisio:

git clone git://gerrit.libreoffice.org/libvisio
cd libvisio
./autogen.sh
./configure REVENGE_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-0.0" REVENGE_GENERATORS_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_GENERATORS_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-generators-0.0" REVENGE_STREAM_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_STREAM_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-stream-0.0" --enable-debug --enable-werror
make

libodfgen:

git clone git://git.code.sf.net/p/libwpd/libodfgen
cd libodfgen
./autogen.sh
./configure REVENGE_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-0.0" REVENGE_STREAM_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_STREAM_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-stream-0.0" --enable-debug
make

writerperfect:

git clone git://git.code.sf.net/p/libwpd/writerperfect
cd writerperfect
./autogen.sh
./configure REVENGE_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-0.0" REVENGE_STREAM_CFLAGS="-I/home/vmiklos/git/libreoffice/librevenge/inc" REVENGE_STREAM_LIBS="-L/home/vmiklos/git/libreoffice/librevenge/src/lib/.libs/ -lrevenge-stream-0.0" ODFGEN_CFLAGS="-I/home/vmiklos/git/libreoffice/libodfgen/inc" ODFGEN_LIBS="-L/home/vmiklos/git/libreoffice/libodfgen/src/.libs -lodfgen-0.1 -lrevenge-0.0 -lrevenge-stream-0.0" VISIO_CFLAGS="-I/home/vmiklos/git/libreoffice/libvisio/inc" VISIO_LIBS="-L/home/vmiklos/git/libreoffice/libvisio/src/lib/.libs -lvisio-0.1 -lrevenge-0.0" --enable-debug --with-libvisio
make

Of course, replace /home/vmiklos/git/libreoffice/ with any other directory you like, just be consistent. ;-)
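If you don’t want to edit those long configure lines by hand, the librevenge-related variables can be generated for any base directory. The small Python helper below is just a convenience sketch I’m adding for illustration (the function name is made up); it only reproduces the REVENGE_* pattern of the commands above and assumes the same directory layout under the base directory.

```python
def revenge_configure_args(base_dir, components=("", "-generators", "-stream")):
    """Build the REVENGE_* variable assignments used in the configure
    calls above, for a librevenge checkout located under base_dir."""
    librevenge = base_dir.rstrip("/") + "/librevenge"
    args = []
    for suffix in components:
        # "" -> REVENGE, "-generators" -> REVENGE_GENERATORS, and so on.
        prefix = "REVENGE" + suffix.replace("-", "_").upper()
        args.append(f'{prefix}_CFLAGS="-I{librevenge}/inc"')
        args.append(
            f'{prefix}_LIBS="-L{librevenge}/src/lib/.libs/ -lrevenge{suffix}-0.0"'
        )
    return args


# The arguments for the libvisio configure call, ready to paste.
libvisio_args = revenge_configure_args("/home/vmiklos/git/libreoffice/")
print("./configure " + " ".join(libvisio_args) + " --enable-debug --enable-werror")
```

For libodfgen and writerperfect, pass components=("", "-stream") and append the remaining variables from the commands above.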

Now you can hack on any of these libraries: you just need to build your changes, and then vsd2odg will produce a flat ODG that you can quickly test with any ODF processor, like LibreOffice. One remaining trick (in case you’re not an autotools expert) is that vsd2odg is a libtool shell script, not a binary. If you still want to run the underlying binary in gdb, here is how you can do that:

libtool --mode=execute gdb --args vsd2odg /home/vmiklos/git/libreoffice/test.vsdx

In case the two alternatives considered above are not sufficient for your purposes, then I hope you find this setup useful. ;-)

The yellow border around the pig

Posted on: Sat 25 October 2014

Estimated read time: 1 minute

https://lh3.googleusercontent.com/-IhaYMXbDVyo/VEufBghNiHI/AAAAAAAAE4g/8n2DuQ_Edeo/s400/

It turns out LibreOffice’s RTF and DOCX import filters ignored borders around Writer pictures. Given that this worked in the RTF case in the past, it’s a bit amusing that now the very same commit implements a new feature for the DOCX case and at the same time fixes a regression in the RTF filter. Code sharing FTW! :-)