I think this is my first Calc bugfix. :-) The problem I wanted to fix is that while LibreOffice 4.4 learned advanced fill attributes (gradients, hatches, etc) for page headers / footers in Writer, this broke the saving of simple graphic header backgrounds in Calc. Seeing that no-one stepped up to fix this, I tried to do this myself — and luckily the problem was in the ODF export filter, which is much more familiar to me, compared to Calc core.
Part of that larger feature was changes to the ODF filter, and the bug was exactly about touching shared ODF filter code to please Writer without testing other LibreOffice applications.
The actual problem was overlapping constants: multiple constants had the same numeric value. Such issues are sometimes hard to track down, but in this case it wasn’t that hard: the context filter that tries to make sure we don’t write duplicated XML attributes removed the background property when it tried to guard header repeat offsets.
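To make the failure mode concrete, here is a minimal sketch of how two constants sharing a numeric value can make a duplicate guard drop an unrelated property. All names and values below (CTF_HEADER_REPEAT_OFFSET, CTF_HEADER_BACKGROUND, dropDuplicateContexts) are made up for illustration and are not the real identifiers from the ODF export filter.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical context IDs; names and values are assumptions for this sketch.
constexpr std::int16_t CTF_HEADER_REPEAT_OFFSET = 0x1001;
constexpr std::int16_t CTF_HEADER_BACKGROUND    = 0x1001; // bug: same value as above

struct Property {
    std::string name;
    std::int16_t contextId;
};

// Keeps only the first property seen for each context ID, the way a
// duplicate guard might. Colliding IDs make it drop unrelated properties.
std::vector<Property> dropDuplicateContexts(const std::vector<Property>& props) {
    std::set<std::int16_t> seen;
    std::vector<Property> kept;
    for (const Property& p : props) {
        if (seen.insert(p.contextId).second)
            kept.push_back(p);
    }
    return kept;
}

// Builds the two properties a header style might carry in this toy model.
std::vector<Property> sampleHeaderProperties() {
    return {
        {"repeat-offset", CTF_HEADER_REPEAT_OFFSET},
        {"background-image", CTF_HEADER_BACKGROUND},
    };
}
```

Passing the result of sampleHeaderProperties() through dropDuplicateContexts() keeps only the repeat offset: the background is silently treated as a duplicate. Giving the two constants distinct values makes both properties survive.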
Given that this affected the LibreOffice 4.4 and 5.0 series, both branches got a backport of the commit, and so the next release from those lines will have the fix.
If you are into C++ programming, you probably know that smart pointers are not just literally strange things like the above ones. ;-) LibreOffice 5.0 got VclPtr, which is a smart pointer specialized to VCL’s needs.
Such refactorings are good things, except that there is no huge rework without regressions. In this case the WMF filter had 3 places where we were leaking VirtualDevices due to a misconversion to the new VclPtr API. The problem was that the document had 135 images, WMF files to be exact. Given that the leaks happened during the parsing of WMF records (a WMF file consists of multiple WMF records), in the end we leaked 8884 VirtualDevices. At first the problem was seen as Windows-specific, as at least X on Linux has no problem with creating that many VirtualDevices, but Windows' default resource limits are hit in this case.
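As a generic illustration of why a per-record leak adds up so quickly, here is a toy model of intrusive reference counting. It is not VCL’s real code and does not reproduce the actual misconversion: Device, DevicePtr, parseRecordLeaky and parseRecordFixed are all hypothetical names, and the unbalanced acquire() is just one assumed way such a leak can arise.

```cpp
// Toy intrusive reference counting; these are not VCL's real classes.
struct Device {
    static int live;   // number of devices currently allocated
    int refs = 0;
    Device() { ++live; }
    ~Device() { --live; }
    void acquire() { ++refs; }
    void release() { if (--refs == 0) delete this; }
};
int Device::live = 0;

// RAII handle: acquires on construction, releases on destruction.
class DevicePtr {
    Device* p_;
public:
    explicit DevicePtr(Device* p) : p_(p) { p_->acquire(); }
    ~DevicePtr() { p_->release(); }
    DevicePtr(const DevicePtr&) = delete;
    DevicePtr& operator=(const DevicePtr&) = delete;
    Device* get() const { return p_; }
};

// Leaky variant: takes a manual reference that is never released.
void parseRecordLeaky() {
    Device* dev = new Device;
    dev->acquire();
    // ... use dev, then return without calling release()
}

// Fixed variant: the handle releases the device on scope exit.
void parseRecordFixed() {
    DevicePtr dev(new Device);
    // ... use dev.get()
}

// Returns how many devices stay allocated after parsing `records` records.
int liveAfter(int records, bool leaky) {
    const int before = Device::live;
    for (int i = 0; i < records; ++i) {
        if (leaky)
            parseRecordLeaky();
        else
            parseRecordFixed();
    }
    return Device::live - before;
}
```

In this model every parsed record leaves one device behind in the leaky variant, so the number of leaked devices grows with the record count rather than with the image count.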
A note about testing this bugfix effectively, so it never happens again. The problem was that I wanted to create a minimal reproducer, but I also needed a document with lots of WMF images, each complex enough to trigger the resource limit. In the end I manually created a DOCX file that had the same image copy-pasted multiple times: that way we really imported them multiple times (normally we notice that they’re the same, save the image to the file only once, and put multiple references to it), and because DOCX is also a ZIP container, the test file can still be only 99KB instead of the original 17MB RTF.
Thanks to the 4.4 → 5.0 Windows bibisect repo, it was immediately obvious that this was a VclPtr problem. From there it was possible to identify the root cause, and finally to see that the RTF mentioned in the bug title was just the container of the WMF images in this case: the problem had nothing to do with RTF, and the leak wasn’t Windows-specific, either.
If you ever used the mail merge wizard with a Calc data source, then you know how it worked in the past: you’ve got 3 files: the .odt mail template, the .ods data source and a .odb data source definition that defines how to access the .ods.
The target of this LHM-funded project was to get rid of the .odb file and just embed it into the .odt mail template. Why?
Here is the problem description from a user’s point of view: "When a mail merge document is being saved a separate database file is being automatically created. It links the Writer file to the spreadsheet (ODS, XLS, XLSX etc.) as data source. This additional 2kB ODB file confuses users, they might delete it without knowing to break the connection to the data source." An additional problem is that because the non-embedded data source definition is part of the user profile, you can’t just move the three files to another machine, as the .odb registration will be missing there.
If you are interested in what this looks like, here is a demo (click on the image to see the video):
That’s it for now — as usual the commits are in master, so you can try this right now with a 5.1 daily build. :-)
What was problematic is that since C++11, override is a valid keyword
after a member function declaration, and we have our SAL_OVERRIDE macro
to be able to use it before all our supported compilers recognize it.
Unsupported parsers include ctags, so if a member function has SAL_OVERRIDE,
ctags only indexed the definition, not the declaration.
The hope is that the latter will go away in the long run, so it won’t really be a problem that ctags does not recognize that macro out of the box. :-)
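For readers who have not seen this pattern, here is a minimal sketch of such a compatibility macro. MY_OVERRIDE and the `__cplusplus` check are assumptions made for this example, not the actual definition of SAL_OVERRIDE.

```cpp
// Hypothetical macro in the spirit of SAL_OVERRIDE; the name and the
// feature test are assumptions, not LibreOffice's actual definition.
#if __cplusplus >= 201103L
#define MY_OVERRIDE override
#else
#define MY_OVERRIDE
#endif

struct Shape {
    virtual ~Shape() {}
    virtual int corners() const { return 0; }
};

struct Square : Shape {
    // A tool that does not expand macros sees an unknown token after
    // the declaration here, which can confuse its parser.
    virtual int corners() const MY_OVERRIDE;
};

// The out-of-line definition carries no macro, so it parses normally.
int Square::corners() const { return 4; }

// Calls through the base class to show the override takes effect.
int cornersOf(const Shape& s) { return s.corners(); }
```

The compiler sees either `override` or nothing, while a tool that reads the source without preprocessing only sees the macro name. As far as I know, Exuberant Ctags has an `-I` option for listing identifiers it should skip, which is one way to teach it about such a macro.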
The libreoffice-5-0 branch is created, and in each release cycle there is at
least one topic that was a long overdue cleanup. In this post, I’m describing
how and why the
writerfilter/source/resourcemodel/ directories disappeared — though
probably nobody will miss them. :-)
The resourcemodel building block of writerfilter (that handles Writer’s DOCX and RTF import in LibreOffice) was basically a bucket of old and unused stuff. After the removal of the unused .DOC tokenizer, it turned out that most of that code was just referring to itself or template code that was used with a single type only (hello TableManager). resourcemodel was about 6000 lines of code at the time LibreOffice was started, and after some manual cleanup and moving the still needed small part to dmapper (the shared part of the RTF / DOCX import), tools like loplugin:unreffun and callcatcher helped to detect what became truly unused — at the end resulting in the complete removal of these directories.
That means that after folding the last remaining header into dmapper, the relevant documentation can hopefully now describe the source contents more easily, having just 4 directories: the RTF and the DOCX tokenizer, the shared part and the UNO service implementations. One less cryptic leftover that nobody really knows the purpose of! ;-)
The first ever UK LibreOffice Hackfest took place in the city of Cambridge on May 21st to 23rd (Thursday → Saturday), kindly hosted by Collabora.
My starter idea was to fix tdf#90315, i.e. to support both nested tables and multiple columns with the proper spacing in between them in the RTF import. For comparison, here is how this looked in LibreOffice 3.4:
The table borders looked OK due to correct column spacing, but the nested table is missing. Then here is the LibreOffice 4.4 state:
The nested table is OK, but the table borders are strange due to incorrect column spacing. Finally here is how it looks now, when the import result is correct:
Other than this, here is a list of other topics I hacked on:
After fixing two more less interesting regressions, now it seems we’re down to 0 for the regressions having RTF in their summary, which is promising. :-)
Last week I went to Zaragoza to give a talk on how LibreOffice handles interoperability at Protocols Plugfest Europe 2015 on Tuesday. Although I was told this conference is a successor of the previous Zentyal Summit (and I was not there), the conference seemed well-attended: proof above. :-)
Jacobo also gave a LibreOffice-related talk on Wednesday.
On the same day, there was some explicit spare time, so I took the opportunity to walk in the historical parts of the city; see my photos and a panorama if that kind of picture is of interest to you. FWIW, Hotel Sauce has free wifi in the rooms, which is kind of impressive for a two-star category. ;-)
As usual, thanks Collabora for sponsoring this travel!
TL;DR: Old-style (pre-2010 for RTF, pre-2007 for DOCX) math equations embedded into text documents should now be imported as editable embedded math objects.
Longer version: if you want to embed math equations into RTF or DOCX files, you have two choices. The older approach is to embed a MathType OLE object into the file; the newer one is native OOXML markup, which has an RTF markup equivalent as well. Handling of the latter was implemented by Luboš Luňák for DOCX a long time ago, and I contributed the RTF equivalent almost 3 years ago.
What remains is the handling of the older version, the embedded OLE object. Previously only the replacement graphic was imported, so regardless of the Tools → Options → Load / Save → Microsoft Office → MathType to Math checkbox, the result was never editable.
Here is how it looks now:
Given that the RTF and the DOCX importers share lots of code in the
writerfilter/ module, I implemented the same for the DOCX import at the same
time, too. The interesting challenge was that writerfilter wants an
implementation for the embedded object if it is to be handled internally by
LibreOffice, but the MathType filter (originally created to handle math
objects inside binary DOC files) didn’t have one. Once I implemented one, the
rest wasn’t too hard.
While I’m at describing features new in the LibreOffice Writer 5.0 file filters, here are a few more:
Automatic hyphenation at a document level and exceptions to it are now imported from RTF. I also adjusted the exporter, so now Word sort of understands our hyphenation rules, replacing the OOo-specific custom hyphenation RTF extension that Word just ignored.
Picture wrap distance properties are now handled in the RTF importer; previously that was only handled for shapes.
And a number of bugfixes for the RTF filter:
Do these sound interesting? Look at what others did for LibreOffice 5.0 on the TDF wiki, even if it’s far from complete, as the 5.0 branch is not yet created. :-)
Thanks to Óbuda University for hosting us, it was a great event! Other than talking to the usual suspects like Tamás Zolnai or Gábor Kelemen, I enjoyed two OpenStreetMap talks: it was extremely cool to hear that the turistautak.hu community changed their license in February so that all their free maps can be imported to OpenStreetMap. Finally one pointless fight ends.
My uploaded slides are available here.