
A LibreOffice / AddressSanitizer setup

Estimated read time: 5 minutes

Sanitizers (ASAN, UBSAN, etc.) are a collection of tools that detect memory corruption bugs, undefined behavior and more by instrumenting the code generated by the compiler. (That’s the main difference from valgrind.) From LibreOffice’s perspective, one more important difference is that there is a Jenkins_Linux_Ubsan tinderbox that makes sure the master branch is kept clean from errors detected by a given configuration.

So when the tinderbox failed after a commit of mine, I wanted to set up a similar environment locally, reproduce and fix the bug, and push the fix once I saw that it indeed solves the problem. You can set many options both at build time and at runtime, so while we have some documentation on the TDF wiki on how to use these sanitizers (and Stephan was also kind enough to share his config), it wasn’t clear to me what to do step by step. So here is one possible setup that worked for me; in my case I wanted to reproduce a stack-use-after-return problem. If you haven’t ever built LibreOffice before, then go to the Development wiki page, first do a normal build, and if everything went fine, come back here.

Build options

My autogen.input looks like this:

CC=clang -fsanitize=address
CXX=clang++ -fsanitize=address
--enable-dbgutil
--disable-firebird-sdbc

Which is a normal clang debug build, except:

  • you need to add -fsanitize=... to CXX (not to CXXFLAGS), as explained on the wiki

  • you need to explicitly disable Firebird integration for now

Building

My first attempt failed at build time, as even the tools used only during the build are instrumented, and some memory leak was detected there, which means the build aborted before reaching the problem I was interested in. To disable leak detection during build, and disable parallelism (I needed this, as I did the build in the background while using the machine for something else):

make build-nocheck ASAN_OPTIONS=detect_leaks=0 PARALLELISM=1

This also means that I explicitly disabled running any tests, as I knew which is the single unit test I want to run for the purposes of reproducing and fixing the problem.

Testing

Once the build completed, it turned out that stack-use-after-return detection is disabled at runtime by default, which meant I could not see any problem locally. Here is the command line to run one specific CppunitTest with this detection on:

cd sw; make -sr CppunitTest_sw_tiledrendering ASAN_OPTIONS=detect_leaks=0:detect_stack_use_after_return=1
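ASAN_OPTIONS is a colon-separated list of name=value pairs, as the command line above shows. If you drive such test runs from a script, composing that string could look like the minimal sketch below. The helper names are hypothetical, and passing the value through the environment (rather than as a make variable, as above) is an assumption of this sketch.

```python
import os


def format_asan_options(options):
    """Join name/value pairs into the colon-separated ASAN_OPTIONS form."""
    return ":".join(f"{name}={value}" for name, value in options.items())


def env_with_asan_options(options, base_env=None):
    """Return a copy of the environment with ASAN_OPTIONS set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["ASAN_OPTIONS"] = format_asan_options(options)
    return env


# The two runtime options used in the post.
test_options = {"detect_leaks": 0, "detect_stack_use_after_return": 1}
env = env_with_asan_options(test_options)
command = ["make", "-sr", "CppunitTest_sw_tiledrendering"]
# To actually run the test from the core.git checkout (not executed here):
# import subprocess; subprocess.run(command, cwd="sw", env=env, check=True)
print(env["ASAN_OPTIONS"])
```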

Again, this is just one possible setup: you can use other -fsanitize=... options and other environment variables during build and during testing. Hopefully it still helps in the future to avoid blindly pushing fixes for such problems detected by sanitizers.

Update, 2019-01-25

The above described "do it yourself" way doesn’t work with LibreOffice master (towards 6.3) and openSUSE Leap 15.0 anymore. I tried to debug what the exact problem is, but there are many moving parts here:

  • gcc version (providing libstdc++)

  • clang version (5.0.2 is too old; 7 failed to build the plugins, trunk towards 9 also generated false positives for me)

  • various sanitizer-related environment variables

  • various sanitizer-related compiler options

So it’s much easier to just use the combination used by the Jenkins_Linux_Ubsan tinderbox than something custom. Doing that is reasonably straightforward, but still there are a couple of non-trivial steps. What worked for me is:

  • Set up LODE according to its wiki page.

  • Instead of plain ./setup --dev, do:

./setup --jenkins
./setup --jenkins-san
./setup --dev
cd $LODE_HOME/dev/core

This will build a working version of both gcc and clang for you.

  • Instead of manually setting up the environment variables, do:

. $LODE_HOME/bin/lode_ubsan_env

  • Instead of manually setting autogen options, use this autogen.input:

--with-distro=Jenkins/Linux_ubsan_master

  • Finally to build the code and run a specific test based on tinderbox mail:

make build-nocheck
cd sw; make -sr CppunitTest_sw_unowriter CPPUNIT_TEST_NAME="testPasteListener"

Update, 2019-11-14

The 2019-01-25 setup can be fine-tuned further. Sadly LODE mixes two goals: setting up dependencies / environment and cloning repos. I recently added support for bypassing any cloning, so you can use LODE even if you already built the code and want to keep using your own way. The tricks are:

  • Use autogen.env-san in two ways: first, with the second half commented out, to run the above ./setup --jenkins && ./setup --jenkins-san; then, in full, to source the complete environment. This means you get all the up to date environment variables from LODE, but you don’t need a chroot to avoid LODE messing up your environment.

  • Use autogen.input-san to again inherit all up to date autogen switches from LODE, but avoid the submodule pain that would be the default. Now you can do a make check gb_SUPPRESS_TESTS=y to build (but not run) all the code & tests.

  • Use up.sh to build and run all Online code & tests — with new enough core.git master and online.git master the online.git make check passes for me in this environment.

I believe this setup mostly delegates maintenance (of the environment, config switches and toolchain) to LODE, while still allowing you to avoid a chroot and to manage your git clones without LODE.

Update, 2023-12-02

These days it works to use the system clang compiler, so there is no need to execute ./setup in lode.git. This means:

  • Use (source) autogen.env-san to set the compiler flags and environment variables from LODE.

  • Use autogen.input-san as autogen.input in the core.git build to set the autogen options.

  • Use up.sh to build Online the usual way.


On LibreOffice's ViewContact/ViewObjectContact/ObjectContact

Estimated read time: 1 minute

I’ve recently fixed a missing-repaint problem in LibreOffice’s headless backend, but the root cause wasn’t close to the symptom I saw first. Part of the debugging process was to understand the relation between sdr::contact::ViewContact, sdr::contact::ViewObjectContact and sdr::contact::ObjectContact.

See this old presentation and the review of my documentation update for the details, but the short version is that:

  • somewhat confusingly, sdr::contact::ViewContact is part of the model, and there is one sdr::contact::ViewContact object per shape

  • sdr::contact::ViewObjectContact is part of a view, and there is one sdr::contact::ViewObjectContact per shape, per view

  • finally sdr::contact::ObjectContact is part of a view, and there is one sdr::contact::ObjectContact per view

So the answer to my original question, "Is it normal that I have two object contacts and a single view contact for a shape and two views?", is: yes, that’s expected. ;-) Hopefully the updated documentation is now clearer; the incorrect 1:N relation in the original class diagram confused me at first.
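The cardinalities from the list above can be checked with a small counting sketch. The Python classes below only borrow the names of the C++ classes for illustration; their fields and the build_contacts() helper are hypothetical and don't mirror the real API.

```python
from dataclasses import dataclass


@dataclass
class ViewContact:
    """Model side: one per shape."""
    shape_name: str


@dataclass
class ObjectContact:
    """View side: one per view."""
    view_name: str


@dataclass
class ViewObjectContact:
    """View side: one per (shape, view) pair."""
    view_contact: ViewContact
    object_contact: ObjectContact


def build_contacts(shape_names, view_names):
    """Create the contact objects implied by the given shapes and views."""
    view_contacts = [ViewContact(name) for name in shape_names]
    object_contacts = [ObjectContact(name) for name in view_names]
    view_object_contacts = [
        ViewObjectContact(vc, oc) for vc in view_contacts for oc in object_contacts
    ]
    return view_contacts, object_contacts, view_object_contacts


# One shape shown in two views, as in the question above.
vcs, ocs, vocs = build_contacts(["rectangle"], ["view1", "view2"])
print(len(vcs), len(ocs), len(vocs))
```

For one shape and two views this gives a single view contact, two object contacts and two view-object contacts, which matches the "yes, that’s expected" answer.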


RTF shape import: group scaling and flip in LibreOffice Writer

Estimated read time: 1 minute

A simple logo was reported to be mis-imported by the RTF filter; it looked like this:

https://farm8.staticflickr.com/7451/27320164444_f437d0418e_o.png

which is interesting, but the reference output is different:

https://farm8.staticflickr.com/7317/27898228066_f5918e59bf_o.png

With a bit of investigation, it turned out that this was a flipped group shape with a few rectangles, so the mis-rendering of the logo was due to two independent problems. The first is that the child shapes inside a group shape were scaled incorrectly. See the commit for the exact details; after fixing the scaling, it looked closer to the original:

https://farm8.staticflickr.com/7301/27320164424_4ce2ced16c_o.png

The second problem was that the group itself was flipped, and this was again ignored on import. After fixing that problem:

https://farm8.staticflickr.com/7228/27898228076_f89fabeb00_o.png

the result is basically the same as the reference. Both fixes are not only on master (towards LibreOffice 5.3) but also backported to LibreOffice 5.2. :-)
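To make the two steps concrete, here is a minimal geometry sketch of scaling a child rectangle from a group's own coordinate space into the group's real bounds, then mirroring it inside the group. This is not the code from the commits: the Rect type, both function names and the choice of a horizontal flip are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float


def scale_into_group(child, logical, actual):
    """Map a child from the group's logical coordinate space to its actual bounds."""
    sx = actual.width / logical.width
    sy = actual.height / logical.height
    return Rect(
        actual.x + (child.x - logical.x) * sx,
        actual.y + (child.y - logical.y) * sy,
        child.width * sx,
        child.height * sy,
    )


def flip_horizontally(child, group):
    """Mirror a child around the vertical center line of the group."""
    group_right = group.x + group.width
    child_right = child.x + child.width
    return Rect(group.x + (group_right - child_right), child.y, child.width, child.height)


logical = Rect(0, 0, 100, 50)   # coordinate space the children are expressed in
actual = Rect(0, 0, 200, 100)   # where the group really is on the page
scaled = scale_into_group(Rect(10, 0, 20, 10), logical, actual)
flipped = flip_horizontally(scaled, actual)
print(scaled, flipped)
```

Skipping either step gives a plausible-looking but wrong result, which is why the two problems could be fixed independently.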


Classification toolbar in LibreOffice: Multiple Categories

Estimated read time: 2 minutes

I explained the concept of the classification toolbar appearing in LibreOffice 5.2 in a previous post. One incremental update on top of that is support for multiple categories, which I’m describing here.

TSCP in its BAILSv1 spec defines 3 different policy types (IntellectualProperty, NationalSecurity and ExportControl), and you can set different classification categories for different policy types. To give a practical example: if you’re communicating with someone, then you can declare which policy type you will be using for that communication, and tag a single document multiple times, once for each policy type used.

This multiple-categories feature wasn’t supported by LibreOffice previously: we simply read the IntellectualProperty type from the document, and also only wrote that. Now the user interface still reacts to the IntellectualProperty policy type (since if there are multiple policies and each of them wants a different watermark, for example, the UI has to pick one in some way), but other than that we read all types from the document, all values are shown on the toolbar and of course you can also set all of them.

All internal APIs and the .uno command that can be used from macros take a type parameter to get/set a given type of category, if wanted. As usual, you can try this right now with a 5.2 or 5.3 daily build. :-)
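As a rough sketch of the data model this implies: a document carries at most one category per policy type, and only the IntellectualProperty entry drives the UI. The three policy type names come from the text above; the enum, the dictionary layout, the helper name and the category names are made up for illustration and are not the real API.

```python
from enum import Enum


class PolicyType(Enum):
    INTELLECTUAL_PROPERTY = "IntellectualProperty"
    NATIONAL_SECURITY = "NationalSecurity"
    EXPORT_CONTROL = "ExportControl"


def ui_driving_category(categories):
    """Return the category the UI reacts to, or None if that type is not set."""
    return categories.get(PolicyType.INTELLECTUAL_PROPERTY)


# One document, tagged once per policy type (category names are illustrative).
document_categories = {
    PolicyType.INTELLECTUAL_PROPERTY: "Internal Only",
    PolicyType.EXPORT_CONTROL: "Confidential",
}

print(ui_driving_category(document_categories))
print(sorted(policy.value for policy in document_categories))
```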


Recent undo/redo fixes in LibreOffice Impress

Estimated read time: 2 minutes

I’ve recently spent some time fixing a few bugs around undo/redo in Impress, in the area of table shapes. I’m mentioning these here as they’re all bugfixes, so they are backported to LibreOffice 5.1, and no major release notes will point them out. So if you are using Impress table shapes and you consider their usability suboptimal, then read on, I have some great news. :-)

The first problem is tdf#99396, where there were actually two problems:

  1. Vertical alignment is a cell property, but when setting that property, the undo code was simply missing.

  2. When editing cell text (the user is inside "text edit") the undo stack is in a special mode, and ending text edit made the cell property undo items go away. This wasn’t only a problem for vertical alignment; it also happened when, for example, the background color of the cell was changed. These cell property changes are now added to the undo stack after finishing text edit, so you can still undo them later.
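The idea behind the second fix can be sketched as deferring undo items while text edit is active and appending them once it ends. The class and method names below are hypothetical; this shows the deferral pattern only, not the real Impress undo code.

```python
class UndoStack:
    """Toy undo stack that defers cell property items during text edit."""

    def __init__(self):
        self.items = []
        self.in_text_edit = False
        self._deferred = []

    def begin_text_edit(self):
        self.in_text_edit = True

    def add_cell_property_change(self, description):
        if self.in_text_edit:
            # Keep the item aside instead of losing it when text edit ends.
            self._deferred.append(description)
        else:
            self.items.append(description)

    def end_text_edit(self):
        self.in_text_edit = False
        # Deferred items land on the regular stack, so they stay undoable.
        self.items.extend(self._deferred)
        self._deferred.clear()


stack = UndoStack()
stack.begin_text_edit()
stack.add_cell_property_change("vertical alignment")
stack.add_cell_property_change("background color")
stack.end_text_edit()
print(stack.items)
```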

The second bug report is tdf#99452, where resizing a table shape row separator and then undoing the resize didn’t restore the original state. See the commit for all the details, but the bottom line is: it isn’t a good idea to automatically re-layout the table when we’ve already resized the shape as part of undo, but the table rows were not yet resized to reflect their original sizes.

As usual, you can try this right now with a 5.2 daily build. :-) (Or even with a 5.1 one, actually.)


Classification toolbar in LibreOffice

Estimated read time: 3 minutes

In the past few posts in this blog I wrote about various digital signing-related improvements that will land in LibreOffice 5.2. In this post I would like to cover another aspect of helping secure document handling: classification. First, thanks to the Dutch Ministry of Defense who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice) in cooperation with Nou&Off. The basic idea is that if the user is required to follow a policy when editing a document, then LO can help the user respect these rules, provided LO is informed about the rules.

Luckily TSCP produced a number of open standards around this, which LO can implement without going after a specific vendor. For the scope of this post, two of them are interesting:

So what does this look like? View → Toolbars → Classification can enable a toolbar that’s disabled by default:

It has a list box that contains the categories described by the BAF policy. LO comes with such an example policy by default, that’s why you can see categories there already. If you want to use your own policy, you can do so: Tools → Options → LibreOffice → Paths has a Classification row to configure a custom policy:

And if you select the Internal Only category, you’ll see most of the features described by a category: it can add an info-bar (UI only), header/footer fields and a watermark (stored in the document) as well:

I would like to point out that the watermark is a proper scalable customshape, not a poor bitmap. :-) Perhaps this part could be extracted to a separate Add Watermark feature later, as I think it’s quite useful on its own as well.

Finally, one feature is that LO knows how secure the document is once it has a classification category, which means a classification scale and level. For two documents that have the same scale, LO can detect if the user would accidentally try to leak sensitive content from a document with higher classification level to a document that has a lower one. This is implemented when copy&pasting:
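The copy&paste check described above boils down to a comparison of levels within the same scale. Here is a minimal sketch of just that rule; the Classification type, the function name and the scale identifier are hypothetical, and what happens for unclassified documents or differing scales is deliberately left out, as the text above doesn't specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Classification:
    scale: str   # identifier of the classification scale
    level: int   # higher means more sensitive


def paste_would_leak(source: Optional[Classification], target: Optional[Classification]) -> bool:
    """Flag a paste from a higher level to a lower one on the same scale."""
    if source is None or target is None:
        # Not covered by this sketch: only two classified documents are compared.
        return False
    if source.scale != target.scale:
        # Levels on different scales are not comparable here.
        return False
    return source.level > target.level


confidential = Classification(scale="urn:example:scale", level=3)
internal = Classification(scale="urn:example:scale", level=1)
print(paste_would_leak(confidential, internal))
print(paste_would_leak(internal, confidential))
```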

Most of these features work in all of Writer, Calc and Impress. The header/footer fields and the watermark are Writer-only, and Calc/Impress do classification checks only in their internal copy&paste code (e.g. not when doing paste special and choosing RTF).

Putting all of these together, LO can now help users required to follow classification rules in a number of different ways, as long as the rules they have to follow are available as a BAF XML policy. As usual, you can try this right now with a 5.2 daily build. :-)


OOXML signature export in LibreOffice

Estimated read time: 4 minutes

After adding support for reading OOXML signatures in LibreOffice, I continued with implementing OOXML signature export (as in: not only verification, but signing).

By verification, I mean that I calculate the signature of the input document, then compare it with an existing signature, and if they match, it is verified. This can also be called "import", as I only read an existing signature, I don’t create one. By signing, I mean the creation of a new signature, which is always good; if it isn’t, that’s a programming error. This can also be called "export", as I write the new signature into the document.

First, thanks to the Dutch Ministry of Defense who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice). This included:

  • signing a previously unsigned document

  • appending a signature to an already signed document

  • removing a signature from a document with multiple signatures

  • removing the last signature of a signed document, turning it into an unsigned one

Obviously the hardest part was the initial success: signing a previously unsigned document in a way that is accepted by both LibreOffice and MSO. One trick here is that while in ODF the signature stream is simply added to an existing document storage, in OOXML the storage first has to refer to the signature sub-storage (it’s not a stream, as it has a stream for each individual signature), then it has to be signed, and finally the signature can be added to the document storage. So instead of reading the document and then appending the signature, here we need to modify the document first, and only then can we append the signature.

By referring to the signature sub-storage, I mean that it is necessary to modify [Content_Types].xml (so it contains a mime type for both the .sigs extension and the individual /_xmlsignatures/sigN.xml streams), and the _rels/.rels stream also has to refer to _xmlsignatures/origin.sigs, which will contain the list of actual signatures.

A surprising detail is that the signature is required to contain quite a few software and hardware details about your environment, like monitor resolution, Windows version and so on. For a cross-platform project like LibreOffice this isn’t meaningful, not to mention we have no interest in leaking such information. So what I did instead was to write hardcoded values based on what my test environment would produce, just to please MSO. ;-)
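The [Content_Types].xml part of that modification can be sketched with Python's standard XML module. This is not LibreOffice's code: the function name is hypothetical, and the namespace and content type strings are quoted from memory of the Open Packaging Conventions, so treat them as assumptions and check them against the spec.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
# Assumed content types for the signature origin and the individual signatures.
ORIGIN_TYPE = "application/vnd.openxmlformats-package.digital-signature-origin"
SIGNATURE_TYPE = "application/vnd.openxmlformats-package.digital-signature-xmlsignature+xml"

ET.register_namespace("", CT_NS)


def add_signature_content_types(content_types_xml, signature_index):
    """Register the .sigs extension (once) and one /_xmlsignatures/sigN.xml stream."""
    root = ET.fromstring(content_types_xml)
    defaults = root.findall(f"{{{CT_NS}}}Default")
    if not any(el.get("Extension") == "sigs" for el in defaults):
        ET.SubElement(
            root, f"{{{CT_NS}}}Default",
            {"Extension": "sigs", "ContentType": ORIGIN_TYPE},
        )
    ET.SubElement(
        root, f"{{{CT_NS}}}Override",
        {"PartName": f"/_xmlsignatures/sig{signature_index}.xml", "ContentType": SIGNATURE_TYPE},
    )
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    "</Types>"
)
print(add_signature_content_types(SAMPLE, 1))
```

The _rels/.rels stream needs a similar extra entry pointing at _xmlsignatures/origin.sigs; that part is omitted here to keep the sketch short.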

After the initial OOXML signature exporter was ready, the next challenge was adding multiple signatures. The problem here is that you have to roundtrip the existing signatures perfectly. And when I write perfectly, I really mean it: if a single character is written differently, then the hash of the signature will be different, so the roundtrip (when we write back an existing and a new signature to the document) will invalidate the signature. And there is no way around that: the very point of the signature is that only the original signer can re-calculate the signature hash. :-) So what we do is simply treat the existing signatures as a byte array, and when writing back, we don’t try to re-construct the signature stream based on the xmlsecurity data model, but simply write back the byte array. This way it’s enough to extract the parts of the signature which are presented to the user (date, certificate, comment), and we don’t need to parse the rest.
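Why a byte-exact roundtrip matters can be shown with a few lines of hashing. This is a simplified sketch: it hashes raw bytes with SHA-256, while real XML signatures hash a canonicalized form, and the markup and function names below are made up for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def roundtrip_as_bytes(stored: bytes) -> bytes:
    """Write back exactly what was read, without re-serializing it."""
    return bytes(stored)


original = b"<Signature><SignatureComment>approved</SignatureComment></Signature>"
# A re-serialization that differs by a single space inside the comment.
reserialized = b"<Signature><SignatureComment>approved </SignatureComment></Signature>"

print(digest(original) == digest(roundtrip_as_bytes(original)))
print(digest(original) == digest(reserialized))
```

The first comparison holds and the second doesn't, which is the whole argument for keeping existing signatures as opaque bytes.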

Removing one of multiple existing signatures isn’t particularly hard: you just need to update _xmlsignatures/_rels/origin.sigs.rels and [Content_Types].xml, which refer to each and every signature stream. It’s a good idea to truncate them before writing, otherwise the result may not even be well-formed XML.
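The truncation advice is easy to reproduce with an in-memory stream: overwriting a longer XML stream with a shorter one leaves the old tail behind unless the stream is truncated. The markup below is simplified (no namespaces), and the helper names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def overwrite(stream, data, truncate):
    """Write data from the start of the stream, optionally cutting off the old tail."""
    stream.seek(0)
    stream.write(data)
    if truncate:
        stream.truncate()  # cuts at the current position, right after the new data


def is_well_formed(data):
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


two_signatures = b'<Relationships><Relationship Id="rId1"/><Relationship Id="rId2"/></Relationships>'
one_signature = b'<Relationships><Relationship Id="rId1"/></Relationships>'

for truncate in (False, True):
    stream = io.BytesIO(two_signatures)
    overwrite(stream, one_signature, truncate)
    print(truncate, is_well_formed(stream.getvalue()))
```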

Finally, removing the last signature is a matter of undoing all the changes we did while adding the first signature (the content type list and the toplevel relation list), then removing the signature sub-storage altogether. I also factored out all this signature management code from DigitalSignaturesDialog (which is a graphical dialog) to DocumentSignatureManager, so that all the above mentioned features can be unit-tested.

Putting all of these together, LO can now do all the signature add, append, remove and clean operations a user would expect from what is referred to simply as OOXML signature support. As usual, you can try this right now with a 5.2 daily build. :-)


OOXML signature import in LibreOffice

Estimated read time: 3 minutes

(via ascertia)

After adding support for SHA-256 hashes in LibreOffice, I turned towards implementing OOXML signature import (as in: verification, not signing) in LibreOffice. First, thanks to the Dutch Ministry of Defense who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice). I collected a list of building blocks needed for this to work:

  • support for the Relationships Transform Algorithm (described in ISO/IEC 29500-2:2012) in xmlsec

  • an actual XML parser for the OOXML signature in xmlsecurity/

  • a new filter flag, so that our code no longer assumes "is ODF" means "supports digital signing" and

  • some refactoring in xmlsecurity/, so that our digital signature code doesn’t assume that multiple signatures are always written to a single file

The xmlsec bits are now upstream. It seems to me that the new algorithm is needed so that MSO can avoid signing a number of streams (files in ZIP containers), while still being able to verify that all normal streams are signed. Given that MSO by default doesn’t sign all streams (so that e.g. the metadata of the document can be modified without invalidating signatures), this is in use even for a hello-world document. This implies that a typical OOXML signature will never gain the best "signed" category in LO, as we’ll always warn that even though the signature is valid, not all streams are signed. This is a bit of a rant, but better not hide the reality: a default ODF signature covers more than a default OOXML signature.

The OOXML signature parser had to extract all information from the signature markup that’s interesting for LibreOffice, like the certificate, the signature date or the signature description. I considered extending the ODF signature parser instead of implementing a new one for OOXML, since both markups are based on the same W3C signing spec, but they are different enough that the benefit of code sharing doesn’t outweigh the added complexity here.

The next step was to add a new SUPPORTSSIGNING filter flag in filter/, mark the DOCX, XLSX and PPTX file filters as such, and then of course find places (mostly in sfx2/ and xmlsecurity/) that assume only ODF files can be signed, and modify those checks to also handle this new flag.
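The change in those checks can be pictured as moving from "is this ODF?" to "does the filter carry the signing flag?". In the sketch below only the SUPPORTSSIGNING name comes from the text above; the other flags, the filter names, the table and the function are placeholders, not the real filter configuration.

```python
from enum import Flag, auto


class FilterFlags(Flag):
    NONE = 0
    IMPORT = auto()
    EXPORT = auto()
    SUPPORTSSIGNING = auto()


# Hypothetical filter table: ODF and OOXML filters both carry the new flag.
FILTERS = {
    "odt": FilterFlags.IMPORT | FilterFlags.EXPORT | FilterFlags.SUPPORTSSIGNING,
    "docx": FilterFlags.IMPORT | FilterFlags.EXPORT | FilterFlags.SUPPORTSSIGNING,
    "rtf": FilterFlags.IMPORT | FilterFlags.EXPORT,
}


def supports_signing(filter_name):
    """Decide based on the flag rather than on the file format family."""
    flags = FILTERS.get(filter_name, FilterFlags.NONE)
    return bool(flags & FilterFlags.SUPPORTSSIGNING)


print(supports_signing("docx"), supports_signing("rtf"))
```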

Finally, a difference between ODF and OOXML signatures is that ODF puts all of them in a single stream, and all the signing and verifying code works with that stream. However, in case of OOXML, all signatures are in separate streams, so if we want to work with a single object as kind of a signature context, we need a storage (a sub-directory inside the ZIP container), and work with that.
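The layout difference can be seen by listing the signature-related entries of a package. The sketch below builds two tiny ZIP packages in memory; the ODF stream name META-INF/documentsignatures.xml and the helper names are assumptions of this example, and the entries are empty placeholders rather than real signatures.

```python
import io
import zipfile


def make_package(names):
    """Build an in-memory ZIP with empty entries (placeholder content only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in names:
            package.writestr(name, b"")
    return buffer.getvalue()


def signature_streams(package_bytes):
    """Return the entries that hold signatures in either layout."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        names = package.namelist()
    odf = [n for n in names if n == "META-INF/documentsignatures.xml"]
    ooxml = [n for n in names if n.startswith("_xmlsignatures/sig") and n.endswith(".xml")]
    return odf + ooxml


odf_package = make_package(["content.xml", "META-INF/documentsignatures.xml"])
ooxml_package = make_package([
    "word/document.xml",
    "_xmlsignatures/origin.sigs",
    "_xmlsignatures/sig1.xml",
    "_xmlsignatures/sig2.xml",
])
print(signature_streams(odf_package))
print(signature_streams(ooxml_package))
```

With two signatures the ODF package still has one signature stream, while the OOXML package has one stream per signature under the same sub-storage.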

Putting all of these together, we now have unit tests that take test documents having "good" and "bad" signatures, and the verification result in LO will match with the one of MSO. As usual, you can try this right now with a 5.2 daily build. :-)


SHA-256 hashes for ODF signatures in LibreOffice

Estimated read time: 2 minutes

As it happened with MD5 hashes in the past, the world is currently moving from SHA-1 hashes to SHA-256 hashes. This affects LibreOffice’s ODF signing feature as well, where we previously wrote and read SHA-1 hashes, but not SHA-256 ones. First, thanks to the Dutch Ministry of Defense who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice). I could start work on tdf#76142, which also had a reproducer document attached, helping the implementation of this feature.

If you’re not into the digital signature details, SHA-256 is relevant in two aspects here:

  • it can be a signature method, denoted by the http://www.w3.org/2001/04/xmldsig-more#rsa-sha256 URI, and

  • it can be a digest method, denoted by the http://www.w3.org/2001/04/xmlenc#sha256 URI

Hashing is interesting in the context of digital signatures because typically not the whole document is signed, just a hash of it, and crypto frameworks like nss or mscrypto typically tie these two together, so you just say you sign with rsa-sha256, which in more detail means hashing with SHA-256 and then signing using RSA.
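The digest half of this can be sketched with the standard library alone. A reader selects the hash function based on the digest method URI and fails if it doesn't know the URI, which is roughly the situation before SHA-256 support. The SHA-256 URI is the one listed above; the SHA-1 URI, the mapping and the function name are assumptions of this sketch, and the RSA signing step is left out entirely.

```python
import base64
import hashlib

# Digest method URI -> hashlib algorithm name (the SHA-1 URI is an assumption here).
DIGEST_METHODS = {
    "http://www.w3.org/2000/09/xmldsig#sha1": "sha1",
    "http://www.w3.org/2001/04/xmlenc#sha256": "sha256",
}


def compute_digest(method_uri, data):
    """Return the base64 digest of data for a known digest method URI."""
    try:
        algorithm = DIGEST_METHODS[method_uri]
    except KeyError:
        raise ValueError(f"unsupported digest method: {method_uri}") from None
    raw = hashlib.new(algorithm, data).digest()
    return base64.b64encode(raw).decode("ascii")


stream = b"<office:document-content/>"
print(compute_digest("http://www.w3.org/2001/04/xmlenc#sha256", stream))
```

Verification then means recomputing this value for each referenced stream and comparing it with the one stored in the signature.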

A valid signed document using SHA-256 hashing looked like this before:

I.e. we failed to validate the signature, and presented a dialog that suggested the signature is not valid. After my changes, it looks like this:

I.e. no error on loading, and the status bar icon tells the user that everything is fine, except that we can’t validate the certificate used for signing.

As for when LibreOffice should start writing (not reading) SHA-256 hashes when creating signatures, it’s an open question. It’s probably best to wait until most users already have a version that can read those hashes. Then we would still keep support for reading SHA-1 hashes, but we would use SHA-256 when creating new signatures.

Another detail is that the hard work of signing in LibreOffice is done by libxmlsec. We bundled a heavily patched version from 2009, and it wasn’t clear how much work it would be to port our patches to a newer upstream version, so I initially backported the SHA-256 patches to our older version (for the nss and mscrypto backends of libxmlsec, as that covers what LibreOffice uses on Linux, Windows and OS X). In the end I managed to update our bundled libxmlsec to a newer (even if not the newest yet) version, so latest master got rid of those custom backports. As usual, you can try this right now with a 5.2 daily build. :-)


Signature descriptions in LibreOffice

Estimated read time: 1 minute

LibreOffice’s user interface prohibited creating multiple signatures by the same author on a document, because there was no semantic meaning of signing the same document multiple times. I’ve recently extended the user interface to be able to provide a signature description: this way it makes sense to allow multiple signatures from the same author, because now each signature can have a different meaning. First, thanks to the Dutch Ministry of Defense who made this work possible.

When the user selects File → Digital Signatures, the dialog lists existing signatures together with their description (if they have any):

When the user clicks on the Sign Document button, the dialog for certificate selection now also asks for an optional description:

Changing the value of the description invalidates the signature. For this feature to work, I have extended LibreOffice’s ODF signature markup to store not only a <dc:date> element as signature metadata, but also the <dc:description>. Given that the metadata of an ODF signature is not part of the ODF specification, it is allowed to extend the metadata with custom child elements, so it was not necessary to submit an ODF enhancement proposal for this file format change at this stage. As usual the commits are in master, so you can try this right now with a 5.2 daily build. :-)
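A minimal sketch of the metadata extension: alongside the date, an optional description child is written and read back. The Dublin Core namespace is the usual one for the dc: prefix; the container element name and both helper functions are simplifications made up for this example and don't reproduce the exact ODF signature markup.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def signature_metadata(date, description=None):
    """Build a simplified metadata element with dc:date and optional dc:description."""
    metadata = ET.Element("SignatureMetadata")  # hypothetical container element
    ET.SubElement(metadata, f"{{{DC_NS}}}date").text = date
    if description is not None:
        ET.SubElement(metadata, f"{{{DC_NS}}}description").text = description
    return metadata


def read_description(metadata):
    """Return the description, or None for signatures written without one."""
    element = metadata.find(f"{{{DC_NS}}}description")
    return None if element is None else element.text


with_description = signature_metadata("2016-01-01T00:00:00", "Approved by legal")
without_description = signature_metadata("2016-01-01T00:00:00")
print(ET.tostring(with_description, encoding="unicode"))
print(read_description(without_description))
```

Since the description is optional, signatures created before this change (with only a date) still read fine.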

© Miklos Vajna. Built using Pelican. Theme by Giulio Fidente on github.