vmiklos.hu
shameless self-promoting website
  • Tuesday, 04 July 2017
    Using LibreOffice with xmlsec from the system

    LibreOffice uses a number of external libraries, and most of them can be configured to use a bundled version or a system version. libxmlsec was an exception previously (only the bundled version was usable), but LibreOffice master (towards 6.0) doesn’t have this limitation anymore.

    Using a bundled version is a good choice in case:

    • you want to create self-contained binaries

    • you want to bisect a regression, where possibly the regression was introduced by upgrading one of the external libraries

    • the system (e.g. macOS, Windows) doesn’t provide the relevant library

    Using a system version is a good thing in case:

    • you want to work with the system, not against it (if a Linux distro already provides libxmlsec, why ship a duplicated copy inside LibreOffice?)

    • being able to use a system version means our bundled version does not have custom patches which affect the functionality of the library

    • not having custom patches also means that upstream benefits from our submitted patches, that these patches are reviewed by competent maintainers, and that upgrading the external is easier, as there is no patchset to rebase.

    With that in mind, let’s have a look at what blocked using system-xmlsec in the past:

    • LibreOffice inherited a large patchset from OpenOffice.org, commit 694a2c53810dec6d8e069d74baf51e6cdda91faa (2012-11-30) had 16 patches, with this scary diffstat:

     43 files changed, 5888 insertions(+), 1885 deletions(-)
    • I even increased this when I added the SHA256 patches, as back then I wasn’t sure if it would ever be possible to upgrade to a newer libxmlsec version.

    • Step by step I got rid of most of these patches, either by upstreaming them or realizing they are no longer necessary. Upstreaming wasn’t always trivial, as for our purposes it was always easy to patch something, but for upstream non-compatible changes always have to be conditional. Today we have 3 build-specific patches, 1 backport and 1 feature patch that is (at least) not necessary when signing / verifying documents with software-based certificates.

    • At the end two more commits were necessary to support building against system-xmlsec, only adding minimal differences when using the system or the bundled xmlsec variants.

    If you are a Linux distro packager then --with-system-libs already implies --with-system-xmlsec, so you probably don’t have to do anything. If you build LO for static analysis (e.g. Coverity), this should also be useful, so that irrelevant issues in 3rd-party code don’t have to be ignored manually.


  • Wednesday, 31 May 2017
    LibreOffice Perugia HackFest 2017

    (via ogervasi)

    Last weekend I attended the LibreOffice Perugia HackFest 2017, with the primary goal of mentoring students (together with Eike and Christian): provided they manage to contribute at least one non-trivial easy hack, they get university credits for their work.

    I worked with Arianna, Claudio, Francesco and Gian, all of them managed to achieve something by the end of the third day.

    When I was not helping others, I also fixed a few bugs:

    • tdf#107976 sw: let a view handle multiple transferables

    • tdf#107837 DOCX export: fix balanced multi-col section at doc end

    • tdf#107684 DOCX export: fix duplicated <w:outlineLvl> element for styles

    • tdf#106950 sw: support CharShadingValue property on paragraph styles

    Some photos I took during the event are available.

    Thanks to the organizers for the great event, also kudos to Collabora, Red Hat and TDF for allowing mentors to come! :-)


  • Wednesday, 17 May 2017
    xmlsec improvements in LibreOffice 5.4

    This post summarizes the plumbing work around ODF/OOXML digital signatures that I did on LibreOffice master after the 5.3 branch-off up to now. The big thing is the integration of the libxmlsec 1.2.24 release. Among other things, this contains 2 larger changes that I contributed upstream triggered by the needs of LibreOffice:

    • The ECDSA-SHA256 feature is something I already mentioned, but I did not bother to backport the SHA1 and the SHA256 part, so those have now arrived in LibreOffice as well.

    • xmlsec’s XMLSEC_KEYINFO_FLAGS_X509DATA_DONT_VERIFY_CERTS flag (while verifying signatures) was there, but its behavior was not clear (neither for nss nor for mscrypto). I’ve changed it to be in sync with what other commands offer to skip certificate validation (like wget --no-check-certificate or curl -k), which means that as a next step there will be one less xmlsec patch in LibreOffice that prevents us from using xmlsec from the system on Linux. (Adding tests also revealed that in the nss case verification was accidentally skipped even without that flag; this is now fixed as well.)

    After the release I also noticed that creating signatures on Windows was broken, this is now fixed on xmlsec master and also backported to LibreOffice.

    All this is available in LibreOffice master, towards 5.4.


  • Tuesday, 18 April 2017
    Improved roundtrip of PDF images in LibreOffice

    This is a follow-up to the previous post that described how it is now possible to insert a PDF file as an image in LibreOffice and export that back to PDF, while keeping the original PDF contents. I’ve recently improved this feature so the resulting file is smaller and the vector image can be viewed in more viewers. First, thanks to PMG who made this work possible.

    Let’s look at the previously mentioned front page of a magazine sample when it’s viewed in okular. (A KDE pdf viewer, i.e. something that’s not Adobe Acrobat). The previously used reference XObject PDF markup is not handled by it, so the bitmap fallback was displayed:

    https://farm4.staticflickr.com/3947/34031939205_5315a9afb4_o.png

    Compare it with the new result:

    https://farm3.staticflickr.com/2830/34031939425_24b9a126ee_o.png

    Notice the sharp text in the first line.

    Also the size of this sample is smaller now, since we write neither a large bitmap nor the second page of the PDF image (which is not shown anyway): 2 385 984 → 1 605 558 bytes (about one third of the output is avoided).

    Both techniques have pros and cons, here is a summary:

    • The reference XObject approach allows you to preserve the full PDF data of the image, even if it had multiple pages. Also, the LibreOffice code for this is simple: we just preserve a byte array, which can hardly go wrong. The problem is that no non-Acrobat PDF viewer implements this, most probably including e.g. your printer.

    • The new approach uses the tokenizer I originally wrote for PDF signature verification purposes — it extracts the page stream of the first page from the original file and uses it as a form XObject in the export result — this is the same as how e.g. pdfcrop works. This markup is handled by almost all PDF viewers and also the resulting size is smaller, since the data of other pages is dropped and there is no fallback bitmap. The problem may be that this is a much more complex scenario, so it may go wrong (as usual, bugreports are welcome).

    Nevertheless, the new approach seems like a much better default, so LibreOffice no longer writes the reference XObject approach unless you explicitly request it in the PDF export dialog.

    Some perhaps interesting details:

    • PDF page streams may be provided by multiple objects, but form XObjects must have a single stream, so we handle the case when different parts of the page stream are compressed in different ways.

    • LibreOffice writes PDF-1.4 by default, in case you insert a PDF image that uses PDF-1.5+, we use pdfium to downgrade that markup to 1.4, and only then insert it.

    • Copying the page stream of the image is not enough, we also recursively copy all referenced objects from the source PDF, while rewriting all contained references, since the objects IDs in the old and new files differ. We also take care of proper scoping of named references in the resource dictionary, so you can use this feature recursively (insert a document as a PDF image, even if that document itself contains PDF images already). :-)
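
    The recursive copy with reference rewriting described in the last item can be sketched as follows. This is only an illustration and not the LibreOffice code: the in-memory model (a dict from object number to raw dictionary bytes), the function name copy_object and the id_map parameter are assumptions made for the example, and the regular expression is only safe on dictionary text, not on binary stream data.

```python
import re

# Matches an indirect reference such as "12 0 R" (object number, generation, R).
REF_RE = re.compile(rb"(\d+) (\d+) R\b")


def copy_object(src_objects, obj_id, dst_objects, id_map):
    """Recursively copy obj_id from src_objects into dst_objects.

    src_objects: dict, source object number -> raw dictionary text (bytes)
    dst_objects: dict, destination object number -> raw dictionary text
    id_map: dict, source number -> destination number; it ensures shared
            objects are copied once and that reference cycles terminate
    Returns the object number used in the destination file.
    """
    if obj_id in id_map:
        return id_map[obj_id]

    # Reserve the new number before recursing, so cycles resolve to it.
    new_id = max(dst_objects, default=0) + 1
    id_map[obj_id] = new_id
    dst_objects[new_id] = b""  # placeholder until the body is rewritten

    def rewrite(match):
        # Copy the referenced object first, then point at its new number.
        target = int(match.group(1))
        new_target = copy_object(src_objects, target, dst_objects, id_map)
        return b"%d 0 R" % new_target

    dst_objects[new_id] = REF_RE.sub(rewrite, src_objects[obj_id])
    return new_id


# Hypothetical source file: page -> resources -> font, with a back-reference.
source = {
    1: b"<< /Resources 2 0 R >>",
    2: b"<< /Font 3 0 R /Parent 1 0 R >>",
    3: b"<< /Type /Font >>",
}
# Hypothetical export result that already contains two objects.
export = {1: b"<< /Type /Catalog >>", 2: b"<< /Type /Pages >>"}
mapping = {}
new_root = copy_object(source, 1, export, mapping)
```

    Reserving the destination number before recursing is what keeps the back-reference from the resources to the page from looping forever.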

    All this is available in LibreOffice master, towards 5.4.


  • Monday, 20 March 2017
    LibreOffice now uses pdfium to render inserted PDF images

    pdfium is the rendering library used in Chromium’s pdf viewer. It’s based on the foxit pdf renderer and its rendering quality is much better compared to the pre-existing "convert PDF to ODG, then to an image" code when it comes to just viewing a PDF file. First, thanks to PMG who made this work possible.

    Let’s look at a few samples that compare the old pdfimport rendering result and the new pdfium-based one. One important feature is that embedded fonts are handled. This is what this inserted PDF looked like previously:

    https://farm4.staticflickr.com/3727/33163219940_3a2a3278a0_o.png

    Compare it with the new result:

    https://farm3.staticflickr.com/2927/33547029855_92c1a5150d_o.png

    Now let’s see the front page of a magazine, you can see 4 unexpected artifacts:

    https://farm4.staticflickr.com/3948/33563793222_8a6b8e8a6b_z.jpg

    New result:

    https://farm3.staticflickr.com/2809/33547029645_de7cbcd800_z.jpg

    Finally, a problem with pdfium was that LibreOffice got bitmaps from it, so in case you re-exported to PDF, the quality of these PDF images was worse than in the original PDF file. The PDF specification has a reference XObject feature that helps in this case: it allows the PDF export to still write the bitmap to the exported PDF, but in case the reader supports this feature, the vector-based original file will be shown, not the bitmap.

    Here is a simple hand-crafted star in a PDF file, as it looked initially:

    https://farm3.staticflickr.com/2915/33163219680_30f63b4a82_z.jpg

    This is how it looks after LibreOffice’s PDF export learned to emit reference XObjects:

    https://farm4.staticflickr.com/3933/33547029485_4f487bb26c_z.jpg

    All this is available in LibreOffice master, towards 5.4.


  • Monday, 13 March 2017
    ECDSA support in xmlsec-nss, bundled by LibreOffice

    Last month a LibreOffice bugreport was filed, as LibreOffice could not verify ODF signatures created with Hungarian citizen eID cards. After a bit of research it seemed that LibreOffice and NSS (what we use for crypto work on Linux/macOS) were not the problem; rather, xmlsec’s NSS backend did not recognize ECDSA keys (RSA or DSA keys work fine).

    The xmlsec improvements happened in these pull requests:

    After this the xmlsec code looked good enough. I had to request an update of the bugdoc in the TDF bug twice, as the signature itself looked also incorrect initially:

    • an attribute type in the signature that had no official abbreviation was described as "UNDEF" instead of the dotted decimal form

    • RFC3279 specifies that an ECDSA signature value should in general be ASN1-encoded, but RFC4050 is specific to XML digital signatures and says it should not be ASN1-encoded. The bugdoc was initially ASN1-encoded.
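
    To make the difference between the two encodings concrete, here is a minimal sketch that converts an ASN.1 DER-encoded ECDSA signature (a SEQUENCE of the two INTEGERs r and s) into a plain concatenation of r and s. It assumes the non-ASN.1 form is the fixed-width big-endian r followed by s; the function name and the coord_len parameter are made up for this example, and this is not how xmlsec or LibreOffice implement it.

```python
def der_to_raw_ecdsa(der_sig, coord_len):
    """Convert DER SEQUENCE { INTEGER r, INTEGER s } into fixed-width r || s.

    coord_len is the byte length of one value, e.g. 32 for a 256-bit curve.
    """

    def read_len(buf, pos):
        # Short form: one byte. Long form: 0x80 | count, then count bytes.
        first = buf[pos]
        pos += 1
        if first < 0x80:
            return first, pos
        count = first & 0x7F
        return int.from_bytes(buf[pos:pos + count], "big"), pos + count

    def read_int(buf, pos):
        if buf[pos] != 0x02:
            raise ValueError("expected INTEGER tag")
        length, pos = read_len(buf, pos + 1)
        # r and s are positive, so a leading 0x00 padding byte is harmless.
        value = int.from_bytes(buf[pos:pos + length], "big")
        return value, pos + length

    if der_sig[0] != 0x30:
        raise ValueError("expected SEQUENCE tag")
    _, pos = read_len(der_sig, 1)
    r, pos = read_int(der_sig, pos)
    s, pos = read_int(der_sig, pos)
    return r.to_bytes(coord_len, "big") + s.to_bytes(coord_len, "big")


# Toy values (r = 1, s = 255), far too small for a real signature.
toy_der = bytes.fromhex("3007020101020200ff")
toy_raw = der_to_raw_ecdsa(toy_der, 4)
```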

    Finally, a warning still remains: while trying to parse the text of the <X509IssuerName> element, the dotted decimal form is still not parsed (see this NSS bugreport). The bug is confirmed on the mailing list, but no other progress has been made so far.

    Oh, and of course: Windows is still untouched, there a bigger problem remains: we use CryptoAPI (not CNG) there, and that does not support ECDSA at all. Hooray for open-source libs where you can add such support yourself. ;-)


  • Thursday, 16 February 2017
    LibreOffice PDF export now supports videos

    https://farm4.staticflickr.com/3924/32549564340_4d0990cfa4_o.png

    PDF supports screen annotations, which means it’s possible to play embedded and linked videos on top of a static image. Given that LibreOffice also supports videos, it made sense to add support for this in our PDF export filter. First, thanks to PMG who made this work possible. This is currently added for Writer and Impress.

    Linked videos

    Linked videos are the situation when the video is not part of the document itself, but it’s located somewhere else, e.g. a http:// location. This is helpful if you want to email around a PDF file, and want to avoid sending large files when it has video content.

    tdf#104841 is about this situation, first I added support for linked videos in Impress, then also in Writer.

    The result can be played using Adobe Acrobat Reader — for some reason okular on Linux is a bit confused about http:// URLs, wants to convert them to relative ones, and then fails as of today.

    Embedded videos

    https://farm3.staticflickr.com/2666/32115175413_ec6f64243a_z.jpg

    tdf#105093 is the embedded video case, this is handy in case you want to create an entirely self-contained PDF, where even the video content is inside the PDF file as an embedded file.

    After Impress support (and a trick around Draw vs Impress shapes) the Writer part wasn’t too complicated.

    Regarding the situation around various video containers and codecs, the above code is quite agnostic. :-) On the LibreOffice side all we require is to be able to extract a key frame from the video to provide a preview image, so e.g. on Linux the support depends on what gstreamer plugins you have installed. The video content is written to the PDF file as-is, so again whether it works in the PDF reader is up to the reader’s codec support. On Linux e.g. okular uses vlc for video playback, so the range of supported formats is quite wide. The same is true on Windows; what I personally tested is LibreOffice’s VLC backend and the embedded QuickTime player in Acrobat Reader.

    All of this is available on LibreOffice master towards 5.4.


  • Tuesday, 31 January 2017
    Impress bugfixes, in time for FOSDEM 2017

    https://farm1.staticflickr.com/334/32605456735_ac88121be8_o.png

    FOSDEM 2017 is here this weekend, and as Michael Stahl pointed out, FOSDEM and the LibreOffice annual conference are the two periods each year when lots of Impress bugfixes are made, as people start dogfooding. ;-) So below you can read about a pair of Impress bugs I fixed recently.

    Changing font size now takes table selection into account

    tdf#105502 is a situation where you have an Impress table shape, and you select part of the cells, then you click on the sidebar to change the font size. Previously this affected all cells of the table shape, now only the selected cells are updated.

    Background fill for shapes

    https://farm1.staticflickr.com/277/31761747774_4b1e6b8d38_o.png

    tdf#105150 is a PPT(X) filter bug where a shape was previously imported as transparent, but it actually has to have the same fill type as the slide background. In case of PPTX this was already handled in general, but not in case the slide had no explicit background. The result was that in case the shape was used to cover other shapes, they were visible, leading to e.g. this unexpected red rectangle on the screenshot.

    The same bug was present in the PPT import, though there existing support was even more limited: just the "background colored objects" were collected, but nothing was done to them. Now the above use-case should be as good for PPT as it is for PPTX.


  • Tuesday, 17 January 2017
    Hack-(rest-of-the)-week at Collabora

    https://farm1.staticflickr.com/726/32306648426_b4ee93f6a1_o.png

    As mentioned in the blog post of Mike already, last month we were allowed to hack on anything we want in LibreOffice for a few days. I used this time to progress with 3 different topics.

    Stepping through TextBoxes using the keyboard

    Given that a Writer shape with a TextBox is internally two shapes, this needed explicit support. After my TextBox bugfix it’s possible to have two such shapes in a document, and once you select one of them, tab properly jumps between the two shapes; previously nothing happened.

    What did happen is we tried to activate the TextBox of the selected shape, which selected the shape itself, so at the end nothing happened.

    RTF improvements

    For some time it has been possible to import and export custom document properties from/to RTF, but only if the value type of the property was string. I have now extended support for these custom properties, so the remaining types are also handled: numbers, bools, doubles and dates.
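
    As a rough illustration of what typed custom properties look like in RTF markup, the sketch below builds a \userprops group from a Python dict. The \proptype codes (3 integer, 5 real number, 11 boolean, 30 text) follow my reading of the RTF specification and should be checked against it; dates and non-ASCII text are left out, the helper names are invented for this example, and none of this is the Writer export code.

```python
def rtf_escape(text):
    """Escape the three characters that are special in RTF (ASCII input only)."""
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def user_property(name, value):
    """Return the RTF markup for one custom document property."""
    # bool must be tested before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        prop_type, static = 11, "1" if value else "0"
    elif isinstance(value, int):
        prop_type, static = 3, str(value)
    elif isinstance(value, float):
        prop_type, static = 5, repr(value)
    else:
        prop_type, static = 30, rtf_escape(str(value))
    return "{\\propname %s}\\proptype%d{\\staticval %s}" % (
        rtf_escape(name), prop_type, static)


def user_props_group(props):
    """Wrap all properties into a single \\userprops destination group."""
    body = "".join(user_property(key, value) for key, value in props.items())
    return "{\\*\\userprops " + body + "}"


example_group = user_props_group({"reviewed": True, "revision": 42})
```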

    xmlsec patch upstreaming

    Last, I’ve started working on upstreaming external/libxmlsec/xmlsec1-noverify.patch.1. xmlsec has no ability to disable the verification of certificates (think of curl -k or wget --no-check-certificate), so in LibreOffice currently we just patch out that code as we don’t need it. So I wanted to add a new verification flag to avoid patching, but it turns out that in the NSS case xmlsec didn’t do the verification, so as a first step I fixed that instead in this xmlsec GitHub pull request. Now that it’s merged, the next step will be to add such a flag, and then LibreOffice can get rid of the patch after the next xmlsec release.


  • Tuesday, 20 December 2016
    PAdES support for PDF files in LibreOffice

    Building on top of the previously mentioned signing of existing PDF files work, one more PDF feature coming in LibreOffice 5.3 is initial support for the PDF Advanced Electronic Signatures (PAdES) standard. First, thanks to the Dutch Ministry of Defense in cooperation with Nou&Off who made this work possible.

    Results

    PAdES is an extension of the ISO PDF signature with additional constraints, so that it conforms to the requirements of the European eIDAS regulation, which in turn makes it more likely that your signed PDF document will actually be legally binding in many EU member states.

    The best way to check if LibreOffice produces such PDF signatures is to use a PAdES validator. So far I found two of them:

    As it can be seen above, the PDF signature produced by LibreOffice 5.3 by default conforms to the PAdES baseline spec.

    Implementation

    I implemented the following in LO to make this happen:

    • PDF signature creation now defaults to the stronger SHA-256 (instead of the previously used weaker SHA-1), and the PDF verifier understands SHA-256

    • the PDF signature creation now embeds the signing certificate into the PKCS#7 signature blob in the PDF, so the verifier can check not only the key used for the signing, but the actual certificate as well

    • the PDF signature import can now detect if such an embedded signing certificate is present in the signature or not

    Note
    Don’t get confused: LO does signature verification (checks if the digest matches and validates the certificate) and now shows if the signing certificate is present in the signature or not, but it doesn’t do more than that. The above-mentioned DSS tool is still superior when it comes to doing a full validation of a PAdES signature.

    As usual, this works both with NSS and MS CryptoAPI. In the previous post I noted that one task was easier with CryptoAPI. Here I experienced the opposite: when writing the signing certificate hash, I could provide templates to NSS on how the ASN.1 encoding of it should happen, and NSS did the actual ASN.1 DER encoding for me. In the CryptoAPI case there is no such API, so I had to do this encoding manually (see CreateSigningCertificateAttribute()), which is obviously much more complicated.
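
    To give an idea of what encoding such a value manually involves, here is a minimal sketch of ASN.1 DER tag-length-value encoding that wraps the SHA-256 hash of a certificate into a SEQUENCE containing one OCTET STRING. The structure is deliberately simplified and hypothetical: it is not the full signing certificate attribute and not a port of CreateSigningCertificateAttribute(), and all helper names are invented for the example.

```python
import hashlib


def der_length(n):
    """Encode a content length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_tlv(tag, content):
    """Build one tag-length-value triple."""
    return bytes([tag]) + der_length(len(content)) + content


def der_octet_string(data):
    return der_tlv(0x04, data)  # 0x04 is the OCTET STRING tag


def der_sequence(*items):
    return der_tlv(0x30, b"".join(items))  # 0x30 is the SEQUENCE tag


def cert_hash_sequence(cert_der):
    """SEQUENCE { OCTET STRING sha256(cert) }, a simplified stand-in."""
    digest = hashlib.sha256(cert_der).digest()
    return der_sequence(der_octet_string(digest))


# The input bytes stand in for a real DER-encoded certificate.
encoded = cert_hash_sequence(b"not a real certificate")
```

    Even in this toy form the length bookkeeping has to be done from the inside out, which is the part a template-driven encoder such as the one in NSS takes care of.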

    Another pain was that the DSS tool doesn’t really separate the validation of the signature itself and of the certificate. The above screenshot was created using a non-self-signed certificate, hence the unclear part in the signed-by row.

    If you want to try these out yourself, get a daily build and feel free to play with it. This work is part of both master and libreoffice-5-3, so those builds are of interest. Happy testing! :-)

