OOXML signature export in LibreOffice

Estimated read time: 4 minutes

After adding support for reading OOXML signatures in LibreOffice, I continued with implementing OOXML signature export (as in: not only verification, but signing).

By verification, I mean that I compute the hash of the input document, then compare it with the one recorded in the existing signature, and if they match, the signature is verified. This can also be called "import", as I only read an existing signature, I don’t create one. By signing, I mean the creation of a new signature, which is always good; if it isn’t, that’s a programming error. This can also be called "export", as I write the new signature into the document.

First, thanks to the Dutch Ministry of Defense, who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice). The work included:

  • signing a previously unsigned document

  • appending a signature to an already signed document

  • removing a signature from a document with multiple signatures

  • removing the last signature of a signed document, turning it into an unsigned one

Obviously the hardest part was the initial success: signing a previously unsigned document in a way that is accepted by both LibreOffice and MSO. One trick here is that while in ODF the signature stream is simply added to an existing document storage, in OOXML the storage first has to refer to the signature sub-storage (it’s not a stream, as it has a stream for each individual signature), then it has to be signed, and finally the signature can be added to the document storage. So instead of reading the document and then appending the signature, here we need to modify the document first, and only then can we append the signature.

By referring to the signature sub-storage, I mean it is necessary to modify [Content_Types].xml (so it contains a mime type both for the .sigs extension and for the individual /_xmlsignatures/sigN.xml streams), and the _rels/.rels stream also has to refer to _xmlsignatures/origin.sigs, which will contain the list of actual signatures.

A surprising detail is that the signature is required to contain quite some software and hardware details about your environment, like monitor resolution, Windows version and so on. For a cross-platform project like LibreOffice this isn’t meaningful, not to mention we have no interest in leaking such information. So what I did instead was to write hardcoded values based on what my test environment would produce, just to please MSO. ;-)
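The [Content_Types].xml and _rels/.rels changes mentioned above can be sketched roughly like this. It is a minimal Python illustration, not the actual LibreOffice (C++) code: the function names and the relationship Id are made up, and the content type and relationship type strings are what I believe the OPC digital signature conventions use, so check them against ISO/IEC 29500-2 before relying on them.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Assumed values (not taken from the post); verify against the OPC spec.
ORIGIN_CT = "application/vnd.openxmlformats-package.digital-signature-origin"
SIG_CT = "application/vnd.openxmlformats-package.digital-signature-xmlsignature+xml"
ORIGIN_REL = REL_NS + "/digital-signature/origin"


def register_signature(content_types_xml: bytes, n: int) -> bytes:
    """Add the .sigs default (once) and an override for sigN.xml."""
    ET.register_namespace("", CT_NS)  # serialize without an nsN: prefix
    root = ET.fromstring(content_types_xml)
    defaults = root.findall(f"{{{CT_NS}}}Default")
    if not any(d.get("Extension") == "sigs" for d in defaults):
        ET.SubElement(root, f"{{{CT_NS}}}Default",
                      {"Extension": "sigs", "ContentType": ORIGIN_CT})
    ET.SubElement(root, f"{{{CT_NS}}}Override",
                  {"PartName": f"/_xmlsignatures/sig{n}.xml",
                   "ContentType": SIG_CT})
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def add_origin_relationship(rels_xml: bytes,
                            rel_id: str = "rIdSigOrigin") -> bytes:
    """Make _rels/.rels point at _xmlsignatures/origin.sigs."""
    ET.register_namespace("", REL_NS)
    root = ET.fromstring(rels_xml)
    ET.SubElement(root, f"{{{REL_NS}}}Relationship",
                  {"Id": rel_id, "Type": ORIGIN_REL,
                   "Target": "_xmlsignatures/origin.sigs"})
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

Both helpers have to run before the signature itself is computed, which is exactly the "modify first, sign later" ordering described above.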

After the initial OOXML signature exporter was ready, the next challenge was adding multiple signatures. The problem here is that you have to roundtrip the existing signatures perfectly. And when I write perfectly, I really mean it: if a single character is written differently, then the hash of the signature will be different, so the roundtrip (when we write back an existing and a new signature to the document) will invalidate the signature. And there is no way around that: the very point of the signature is that only the original signer can re-calculate the signature hash. :-) So what we do is simply treat the existing signatures as a byte array: when writing back, we don’t try to re-construct the signature stream based on the xmlsecurity data model, but simply write back the byte array. This way it’s enough to extract the parts of the signature which are presented to the user (date, certificate, comment), and we don’t need to parse the rest.
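To see why a parse-and-reserialize roundtrip is risky, here is a toy Python sketch. The markup is invented, and real XML signatures hash a canonicalized form rather than the raw bytes, so take this only as an illustration of the general principle: the same logical XML written out differently yields a different hash, while a byte copy never does.

```python
import hashlib
import xml.etree.ElementTree as ET

# Invented signature markup; note the double space before the attribute.
original = b'<Signature  Id="sig1"><SignatureValue>abc</SignatureValue></Signature>'


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Roundtrip through a data model: the serializer normalizes the whitespace.
reserialized = ET.tostring(ET.fromstring(original))

# Roundtrip as an opaque byte array: nothing can change.
byte_copy = bytes(original)

print(sha256_hex(original) == sha256_hex(reserialized))  # different bytes
print(sha256_hex(original) == sha256_hex(byte_copy))     # identical bytes
```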

Removing one of multiple existing signatures isn’t particularly hard: you just need to update _xmlsignatures/_rels/origin.sigs.rels and [Content_Types].xml, which refer to each and every signature stream. It’s a good idea to truncate them before writing, otherwise the result may not even be well-formed XML.
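The truncation problem is easy to reproduce outside LibreOffice. The following Python sketch uses an in-memory stream and simplified, namespace-free markup (both are assumptions for the sake of the example): overwriting a longer stream with shorter content leaves the tail of the old content behind unless the stream is truncated.

```python
import io
import xml.etree.ElementTree as ET

# Simplified relation lists before and after removing the second signature.
old = b'<Relationships><Relationship Id="rId1"/><Relationship Id="rId2"/></Relationships>'
new = b'<Relationships><Relationship Id="rId1"/></Relationships>'


def overwrite(stream: io.BytesIO, data: bytes, truncate: bool) -> None:
    stream.seek(0)
    stream.write(data)
    if truncate:
        stream.truncate()  # cut the stream at the current position


def is_well_formed(data: bytes) -> bool:
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


stale = io.BytesIO(old)
overwrite(stale, new, truncate=False)  # old tail survives after the new content

clean = io.BytesIO(old)
overwrite(clean, new, truncate=True)
```

Here `stale` ends up with leftover bytes after the closing root element, so it no longer parses, while `clean` contains exactly the new content.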

Finally, removing the last signature is a matter of undoing all changes we did while adding the first signature (the content type list and the toplevel relation list), then removing the signature sub-storage altogether. I also factored out all this signature management code from DigitalSignaturesDialog (which is a graphical dialog) to DocumentSignatureManager, so that all the above mentioned features can be unit-tested.

Putting all of these together, LO can now do all the signature add, append, remove and clean operations a user would expect from what is referred to simply as OOXML signature support. As usual, you can try this right now with a 5.2 daily build. :-)


OOXML signature import in LibreOffice

Estimated read time: 3 minutes

(via ascertia)

After adding support for SHA-256 hashes in LibreOffice, I turned towards implementing OOXML signature import (as in: verification, not signing) in LibreOffice. First, thanks to the Dutch Ministry of Defense, who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice). I collected a list of building blocks needed for this to work:

  • support for the Relationships Transform Algorithm (described in ISO/IEC 29500-2:2012) in xmlsec

  • an actual XML parser for the OOXML signature in xmlsecurity/

  • a new filter flag, so that our code no longer assumes "is ODF" means "supports digital signing" and

  • some refactoring in xmlsecurity/, so that our digital signature code doesn’t assume that multiple signatures are always written to a single file

The xmlsec bits are now upstream. It seems to me that the new algorithm is needed so that MSO can avoid signing a number of streams (files in ZIP containers), while still being able to verify that all normal streams are signed. Given that MSO by default doesn’t sign all streams (so that e.g. the metadata of the document can be modified without invalidating signatures), this is in use even for a hello-world document. This implies that a typical OOXML signature will never gain the best "signed" category in LO, as we’ll always warn that even though the signature is valid, not all streams are signed. This is a bit of a rant, but better not hide the reality: a default ODF signature covers more than a default OOXML signature.

The OOXML signature parser had to extract all information from the signature markup that’s interesting for LibreOffice, like the certificate, the signature date or the signature description. I considered extending the ODF signature parser instead of implementing a new one for OOXML, since both markups are based on the same W3C signing spec, but they are different enough that the benefit of code sharing doesn’t outweigh the added complexity here.

The next step was to add a new SUPPORTSSIGNING filter flag in filter/, mark the DOCX, XLSX and PPTX file filters as such, and then of course find the places (mostly in sfx2/ and xmlsecurity/) that assume only ODF files can be signed, and modify those checks to also handle this new flag.
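The shape of such a check can be sketched as below. This is only an illustration in Python: apart from SUPPORTSSIGNING, the flag names, the enum itself and the helper function are hypothetical and not how the sfx2/ code is actually written.

```python
from enum import Flag, auto


class FilterFlags(Flag):
    # Hypothetical subset of filter flags, for illustration only.
    NONE = 0
    IMPORT = auto()
    EXPORT = auto()
    SUPPORTSSIGNING = auto()


def can_be_signed(is_odf: bool, flags: FilterFlags) -> bool:
    """Old logic was effectively 'is_odf'; the flag widens it."""
    return is_odf or bool(flags & FilterFlags.SUPPORTSSIGNING)


docx_filter = FilterFlags.IMPORT | FilterFlags.EXPORT | FilterFlags.SUPPORTSSIGNING
rtf_filter = FilterFlags.IMPORT | FilterFlags.EXPORT
```

With these definitions, `can_be_signed(False, docx_filter)` is true while `can_be_signed(False, rtf_filter)` stays false, so the decision depends on the filter instead of on a hardcoded "is this ODF?" test.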

Finally, a difference between ODF and OOXML signatures is that ODF puts all of them in a single stream, and all the signing and verifying code works with that stream. However, in the case of OOXML, all signatures are in separate streams, so if we want to work with a single object as a kind of signature context, we need a storage (a sub-directory inside the ZIP container), and have to work with that.
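The difference in layout can be seen by simply listing the ZIP container. Below is a small Python sketch with a made-up helper; the ODF path META-INF/documentsignatures.xml is the usual location of document signatures as far as I know, and the test packages are empty dummies, not real documents.

```python
import io
import zipfile

ODF_SIGNATURES = "META-INF/documentsignatures.xml"
OOXML_SIG_DIR = "_xmlsignatures/"


def signature_streams(package: zipfile.ZipFile) -> list[str]:
    """Return the stream(s) that hold signatures in the package."""
    names = package.namelist()
    if ODF_SIGNATURES in names:
        return [ODF_SIGNATURES]  # ODF: all signatures live in one stream
    # OOXML: one sigN.xml stream per signature inside the sub-storage
    return sorted(n for n in names
                  if n.startswith(OOXML_SIG_DIR) and n.endswith(".xml"))


def make_package(names: list[str]) -> zipfile.ZipFile:
    """Build an in-memory ZIP with empty streams, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        for name in names:
            z.writestr(name, b"")
    buf.seek(0)
    return zipfile.ZipFile(buf)
```

For an OOXML package the helper returns several streams, which is why the signing code needs the whole _xmlsignatures storage as its context rather than a single stream.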

Putting all of these together, we now have unit tests that take test documents having "good" and "bad" signatures, and the verification result in LO matches that of MSO. As usual, you can try this right now with a 5.2 daily build. :-)


SHA-256 hashes for ODF signatures in LibreOffice

Estimated read time: 2 minutes

As happened with MD5 hashes in the past, the world is currently moving from SHA-1 hashes to SHA-256 hashes. This affects LibreOffice’s ODF signing feature as well, where we previously wrote and read SHA-1 hashes, but not SHA-256 ones. Thanks to the Dutch Ministry of Defense, who made this work possible (as part of a project implementing trusted signing and communication in LibreOffice), I could start work on tdf#76142, which had a reproducer document attached as well, helping the implementation of this feature.

If you’re not into the digital signature details, SHA-256 is relevant in two aspects here:

  • it can be a signature method, denoted by the http://www.w3.org/2001/04/xmldsig-more#rsa-sha256 URI, and

  • it can be a digest method, denoted by the http://www.w3.org/2001/04/xmlenc#sha256 URI

Hashing is interesting in the context of digital signatures because typically not the whole document is signed, just a hash of it. Crypto frameworks like nss or mscrypto typically tie these two together, so you just say you sign with rsa-sha256, which in more detail means hashing with SHA-256 and then signing the hash using RSA.
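The digest half of this is easy to sketch with Python’s standard library (the RSA signing step is left out, as it needs a crypto backend). The SHA-256 URI is the one listed above; the SHA-1 URI is the one from the W3C xmldsig core spec, and the function name is made up for the example.

```python
import base64
import hashlib

# Digest method URI -> hash constructor.
DIGEST_METHODS = {
    "http://www.w3.org/2000/09/xmldsig#sha1": hashlib.sha1,
    "http://www.w3.org/2001/04/xmlenc#sha256": hashlib.sha256,
}


def digest_value(algorithm_uri: str, data: bytes) -> str:
    """Return the base64 digest, as it would appear in a DigestValue."""
    try:
        hash_fn = DIGEST_METHODS[algorithm_uri]
    except KeyError:
        raise ValueError(f"unsupported digest method: {algorithm_uri}") from None
    return base64.b64encode(hash_fn(data).digest()).decode("ascii")
```

A reader that only knows the SHA-1 entry ends up in the error branch for SHA-256 documents, which is roughly the situation the old code was in.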

A valid signed document using SHA-256 hashing looked like this before:

I.e. we failed to validate the signature, and presented a dialog that suggested the signature is not valid. After my changes, it looks like this:

I.e. no error on loading, and the status bar icon tells the user that everything is fine, except that we can’t validate the certificate used for signing.

As for when LibreOffice should start writing (not reading) SHA-256 hashes when creating signatures, it’s an open question. It is probably best to wait till most users already have a version that can read those hashes. Then we would still keep support for reading SHA-1 hashes, but we would use SHA-256 when creating new signatures.

Another detail is that the hard work of signing in LibreOffice is done by libxmlsec. We bundled a heavily patched version from 2009, and it wasn’t clear how much work it would be to port our patches to a newer upstream version, so I initially backported the SHA-256 patches to our older version (for the nss and mscrypto backends of libxmlsec, as that covers what LibreOffice uses on Linux, Windows and OS X). In the end I managed to update our bundled libxmlsec to a newer (even if not the newest yet) version, so latest master got rid of those custom backports. As usual, you can try this right now with a 5.2 daily build. :-)


Signature descriptions in LibreOffice

Estimated read time: 1 minute

LibreOffice’s user interface prohibited creating multiple signatures by the same author on a document, because there was no semantic meaning in signing the same document multiple times. I’ve recently extended the user interface to be able to provide a signature description: this way it makes sense to allow multiple signatures from the same author, because now each signature can have a different meaning. Thanks to the Dutch Ministry of Defense, who made this work possible.

When the user selects File → Digital Signatures, the dialog lists existing signatures together with their description (if they have any):

When the user clicks on the Sign Document button, the dialog for certificate selection now also asks for an optional description:

Changing the value of the description invalidates the signature. For this feature to work, I have extended LibreOffice’s ODF signature markup to store not only a <dc:date> element as signature metadata, but also the <dc:description>. Given that the metadata of an ODF signature is not part of the ODF specification, it is allowed to extend the metadata with custom child elements, so it was not necessary to submit an ODF enhancement proposal for this file format change at this stage. As usual the commits are in master, so you can try this right now with a 5.2 daily build. :-)
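Why changing the description invalidates the signature can be illustrated with a simplified Python sketch. The element layout is stripped down (the real markup has more wrapper elements and attributes), the helper names are made up, and real signatures digest a canonicalized form; the point is only that the description is part of the signed data.

```python
import hashlib
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def signature_metadata(date: str, description: str = "") -> bytes:
    """Build simplified signature metadata with dc:date and dc:description."""
    prop = ET.Element("SignatureProperty")
    ET.SubElement(prop, f"{{{DC_NS}}}date").text = date
    if description:  # the description is optional
        ET.SubElement(prop, f"{{{DC_NS}}}description").text = description
    return ET.tostring(prop)


def metadata_digest(date: str, description: str = "") -> str:
    return hashlib.sha256(signature_metadata(date, description)).hexdigest()


signed = metadata_digest("2016-03-01T10:00:00", "Approved by legal")
tampered = metadata_digest("2016-03-01T10:00:00", "Approved by finance")
```

Since `signed` and `tampered` differ, a verifier comparing against the stored digest would reject the modified description.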


Import of DOCX and RTF linked graphic into LibreOffice Writer

Estimated read time: 1 minute

As it has been reported, the RTF includepicture field was ignored on import. As writerfilter has quite some shared code for DOCX and RTF import, I also looked at the state of linked graphics in the DOCX import, and that wasn’t better, either.

The root causes were different, though. ;-) Regarding DOCX, a linked and a non-linked graphic have quite similar drawingML markup: the only difference is whether the graphic has a relationship alias (embedded case) or a (possibly relative) external URL. Relative external URLs were broken, as the writerfilter → oox call (to import the graphic) did not forward the base URL, so oox had no chance to properly resolve a relative URL.
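The role of the base URL is easy to demonstrate with Python’s standard urllib. The helper name and the example paths are made up; the sketch only shows that a relative link target cannot be resolved unless the document’s own URL is passed along.

```python
from urllib.parse import urljoin, urlsplit


def resolve_graphic_url(target: str, base_url: str = "") -> str:
    """Resolve a linked graphic's target against the document URL."""
    if urlsplit(target).scheme:
        return target  # already absolute, nothing to do
    if not base_url:
        return target  # no base URL forwarded: the link stays unresolved
    return urljoin(base_url, target)


doc_url = "file:///home/user/docs/report.docx"
print(resolve_graphic_url("images/logo.png", doc_url))
print(resolve_graphic_url("../shared/logo.png", doc_url))
print(resolve_graphic_url("images/logo.png"))  # the broken case
```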

Regarding RTF, a linked graphic is represented as an includepicture field, and now the RTF tokenizer resolves that to a real graphic. As you can see on the above screenshot series (new Writer behavior, old Writer, and reference), we now behave the same way as the reference (or the Writer DOC import).

A related interesting fact I noticed is that includepicture fields in OOXML are valid, but it seems Word never writes them: either their expanded field result is outdated (e.g. it’s some text), or if the user updates the field, then their implementation instantly replaces the field with a drawingML markup that links the graphic.


Mail merge embedding in LibreOffice Writer FOSDEM talk

Estimated read time: 1 minute

Yesterday I gave a Mail merge embedding in LibreOffice Writer talk at FOSDEM 2016, in the Open document editors developer room. The room was quite crowded; it seems this year LibreOffice Online was a hot topic. ;-)

We also had a hackfest with about 20 hackers attending, (again) kindly hosted by Betacowork on Thursday and Friday, before FOSDEM.

There were a few topics I hacked on:

  • .uno:Paste AnchorType param for Writer

  • tdf#97371 DOCX import regression fix about TextBoxes

  • tdf#96175 RTF export feature about company doc property

  • refactoring around Writer’s new (in 5.1) hide-whitespace feature, as requested by Ashod

  • code coverage: RtfExport::WriteRevTab() was completely untested previously, now fixed

A full list of achievements is available. If you were at the hackfest and you did not contribute to that section, please write a line about what you hacked on. :-)

Quite some other slides are now available on Planet, don’t miss them.


RTF page background export in LibreOffice Writer

Estimated read time: 1 minute

While I added support for page background colors in the RTF import back in 2013, the export part was missing up to now.

If you set a solid color fill for a page style, and you export it to RTF, here is what the reference rendering output looks like:

However, in LibreOffice only the background of the paragraph reflected the color set by the user:

After implementing this feature in the RTF export filter, it now looks much closer to the reference:

At the moment only solid fill is implemented, so other advanced fill types like graphics or gradients are still missing.


Rich RTF comment export in LibreOffice Writer

Estimated read time: 1 minute

As it has been reported in tdf#94377, the state of Writer comment contents in the RTF export filter wasn’t great.

With two recent changes, however, the situation is now much better:

  • I’ve added support for multiple paragraphs

  • I’ve added support for both paragraph and text portion formatting

It wasn’t necessary to implement this from scratch, because comment contents use the same editeng store as shape text, where formatting was already handled. A benefit of this code sharing is that shape text also handles multiple paragraphs without a problem now. :-)

The commits are backported to libreoffice-5-1, so users will see them already in the upcoming 5.1.0 release.


Sanitizing member variable names in LibreOffice Writer

Estimated read time: 1 minute

Robinson just branched off libreoffice-5-1 from master in LibreOffice’s core.git repository, so time to talk about what happened behind the scenes in the 5.0 → 5.1 development cycle from my side.

One stylistic detail that annoyed me for a while was the inconsistency around naming class member variables. In new code it’s common to give them an m_ (or at least an m) prefix, but in older code that wasn’t that common, and various custom hacks were invented to differentiate between pointers which point to the same memory address, but one being a parameter of a member function, and the other being a member variable.

Probably the worst scenario is when one was an abbreviation of the other, like pTable and pTbl or pCursor and pCrsr. I took this as an opportunity to play with Clang’s LibTooling, and I wrote two tools back during the Cambridge hackfest to automate the process of finding and fixing missing prefixes.

To scope the renaming, I changed all classes in the sw module having more than 20 unprefixed members to follow the above convention. Hopefully this nicely improves code readability, together with the mass-rename of pointless abbreviations. Both were done before the branch-off, so they affect both libreoffice-5-1 and master. :-)


PNG export in LibreOffice Calc

Estimated read time: 1 minute

Both LibreOffice Writer and Impress have the ability to export the document as PNG, which is one way to create thumbnails for documents, i.e. being able to preview them before the real loading of the document happens. It turns out Calc did not have this feature, and given that ScModelObj also supports the css::view::XRenderable interface (just like Writer), I hoped that it wouldn’t be too complex to add one.

You can refer to Fridrich’s overview blog post for the complete list of steps on how to add a new filter to LibreOffice; here, the following steps were needed:

  • improve DocumentToGraphicRenderer, so that it can handle that Calc does not implement the text::XTextViewCursorSupplier interface (Writer uses this one to expose which page the cursor is on)

  • register png_Portable_Network_Graphic as filter type for Calc

  • create a new calc_png_Export filter fragment

  • register a Calc graphic filters configuration type and filter group in Configuration_filter and CustomTarget_registry

  • testcase

If you can’t wait till LibreOffice 5.1 is released to try out this new feature, you can get a daily build. :-)

© Miklos Vajna. Built using Pelican. Theme by Giulio Fidente on github.