
Handling PDF digital signatures with PDFium FOSDEM talk

Estimated read time: 2 minutes

Figure 1. Slides of the talk

The next step in the recent PDFium-based signature verification story is my "Handling PDF digital signatures in LibreOffice with PDFium" talk at FOSDEM 2021, in the LibreOffice devroom (pre-recorded video). The talk gives you an overview of digital signing in general, the ODF/OOXML/PDF handling for signing and verification, and various related past Collabora projects, and then goes into detail on how PDFium was improved and is now used to provide better PDF signature verification in LibreOffice when opening PDF files in Draw.

The virtual room had around 150 participants and the Matrix-based online conference was well-organized. Speakers even got a free t-shirt before the event, and I appreciated the "bring your own beer" joke :-)

Another benefit of this unusual setup was avoiding the dreaded "room is full" problem, where you carefully select a talk to attend and then fail to hear it.

I expect quite a few more slides from other Collaborans and the wider community will be available on Planet, don’t miss them.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.2).


Shadow for tables from PPTX in Impress

Estimated read time: 2 minutes

Impress now has much better support for the shadow of table shapes: not only shape styles can result in table shadows, but it’s also possible to add this as direct formatting. Also the shadow result is PowerPoint-compatible in the direct formatting case.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

We got a PPTX document that has a table shape with a blurry shadow. The shadow was completely missing in Impress. It turned out that if you configure the default shape style to have a shadow, then there is some initial table shadow support in Impress, but that was not used in the PPTX case.

The request was to improve the shadow rendering to be PowerPoint-compatible and in general support table shadows as direct formatting as a new feature.

Results so far

The table shadow now looks like this:

https://share.vmiklos.hu/blog/sd-table-shadow/new.png
Figure 1. New render result in Impress

Matching the reference rendering:

https://share.vmiklos.hu/blog/sd-table-shadow/ref.png
Figure 2. Reference render result

While shadow was just missing previously:

https://share.vmiklos.hu/blog/sd-table-shadow/old.png
Figure 3. Old render result in Impress

You can see that not only is the shadow there, but the cell backgrounds and the blurry shadow are also rendered properly.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

As usual, the high-level problem was addressed by a series of small fixes:

With these, it’s now possible to add, edit, render and delete these table shadows, while preserving them during ODP and PPTX import/export.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.2).


Better PDF signature verification in Draw

Estimated read time: 2 minutes

Draw now has much better support for detecting unsigned incremental updates between signatures at the end of PDF documents. We now also make sure that incremental updates introduced for adding signatures really just add annotations and don’t change the actual content.

Motivation

There has been a recent evaluation of PDF signature verification, which included Draw. While we got a checkmark on their Shadow Hide test, their Shadow Replace test found conditional problems and their Shadow Hide-and-Replace test was not happy, either.

So it was time to look at what those corner cases are and how the situation can be improved, so people keep trusting that if Draw says a signature is valid, it’s indeed valid.

Results so far

There were 4 incremental improvements in this area:

These were enough: the authors of that evaluation have now confirmed that these problems are all gone.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

PDF signature verification works by using a custom PDF tokenizer. You can read about that code in the PDF tokenizer section of this post. The bottom line is that we now have both PDFium and this custom tokenizer, somewhat duplicating the functionality.

After talking to the PDFium developers (see the relevant mailing list thread), it turned out they were open to adding all the high-level API needed to allow PDF signature verification based on PDFium, and not via our own tokenizer. See this header file for the set of relevant APIs added. A combination of those allowed adapting the code on our side to use PDFium for signature verification, not our own tokenizer.
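
To illustrate one kind of check involved in detecting unsigned incremental updates, here is a minimal Python sketch. It is not the LibreOffice or PDFium code: it only assumes the standard /ByteRange layout of a PDF signature (four integers: offset and length of the two signed ranges), and all function names are hypothetical. If no signature's second range ends exactly at the end of the file, then some bytes were appended after the last signature. A real verifier has to do much more than this, for example checking what the later incremental updates actually change.

```python
import re

# Matches "/ByteRange [ a b c d ]" as typically written by PDF signers.
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def signature_byte_ranges(pdf):
    """Return every /ByteRange found in the raw bytes as a 4-tuple of ints."""
    return [tuple(int(g) for g in m.groups()) for m in BYTE_RANGE.finditer(pdf)]


def covers_whole_file(byte_range, file_size):
    """True if the signed ranges start at offset 0 and end at the end of the file."""
    offset1, _length1, offset2, length2 = byte_range
    return offset1 == 0 and offset2 + length2 == file_size


def some_signature_covers_file(pdf):
    """False means there is data after the last signed byte (or no signature at all)."""
    return any(covers_whole_file(r, len(pdf)) for r in signature_byte_ranges(pdf))
```

Working on raw bytes with a regular expression is a simplification chosen to keep the sketch short; it is exactly the kind of ad-hoc parsing that a proper PDF library is meant to replace.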

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Better handling of cached field results in Writer

Estimated read time: 2 minutes

Writer now has much better support for preserving the cached result of fields in documents. This is especially beneficial for Word formats where the input document may have a field result which is not only a cache, but re-calculating the formula would yield a different result, even in Word.

Motivation

A Collabora Office customer gave us a DOCX document, which is essentially a calendar for planned IT maintenance windows at some organization. These calendars are tables with fields in them. The document is halfway through being changed to a newer year: the formulas are already changed to calculate a newer year, but all the cached field results are still for the old year.

The request was to keep showing these results and not throw them away during save, either. Their primary workflow is to fill the calendar with manual entries, not to tweak the calendar layout itself.

Results so far

The calendar now looks like this:

https://lh3.googleusercontent.com/6o7pvix-dJ9QhCX65FUkWeQZ60B89sHqDpBvd7WVRLtAzBW1323odrQ13aV_CgEFvgh7Iee-ePq95oPOf1Q-jMxvX1MBsz9FhgKd9vymyrdMBIZbF459hNKE1dM4XLcwXkGYh8ksmok=w1920
Figure 1. New render result in Writer

Matching the reference rendering:

https://lh3.googleusercontent.com/GJd2zcnspXDb7Wa2p32TInf9C8MAgt92h3G6PYuUwUvpQi5f3AdRbl5yGq8FN7kUPMcZwuFpohTKmX33s8u-AxFSO9rZFgH4X-fwrg8jShtJoA1KyGws_-ymUvINmK-5xo2_hd7YmLI=w1920
Figure 2. Reference render result

While it looked like a broken calendar previously:

https://lh3.googleusercontent.com/bpOVqcZX2CcKouuADNyPx1PMyI3I6CyjIDIAnUbylsT-ZimxSkPcUaRbMDd8MzHlG3Uqw2d-TunD4m7U4DUlm_O_esJt6CAY-H7Z5tdQxZ6q_MYxgJphutr_-JRVYh8uLmspiiI532U=w1920
Figure 3. Old render result in Writer

You can see that the day numbers were broken previously and now they line up properly.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

As usual, the high-level problem was addressed by a series of small fixes:

With these, it’s now possible to edit these calendars, without breaking the fields which provide the day numbers.
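
The idea of preferring a stored result over a recalculated one can be sketched in a few lines of Python. This is a hypothetical model, not Writer's field classes: the `Field` type, its attribute names and the `evaluate` callback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Field:
    formula: str                  # field instruction as stored in the file
    cached_result: Optional[str]  # result text stored next to it, if any


def displayed_text(field: Field, evaluate: Callable[[str], str]) -> str:
    """Show the cached result when the file provides one; only evaluate otherwise."""
    if field.cached_result is not None:
        return field.cached_result
    return evaluate(field.formula)
```

In this model, saving would write both `formula` and `cached_result` back unchanged, so a calendar whose formulas already point to a newer year keeps showing the old day numbers until the user explicitly updates the fields.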

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Detecting 0-byte files based on extension in Impress and elsewhere

Estimated read time: 3 minutes

Impress (and Writer and Calc) now has support for detecting 0-byte files on open/import based on their extension. This builds on top of the previous language-independent template improvements. This means that e.g. a 0-byte PPTX file will open as an empty Impress presentation, not in Writer.

Motivation

We regularly see customers wanting minimal templates, which are language independent and have no content. Such files are handy if your workflow is to first name an empty document (create it) and only then edit it (and not the other way around: first create the document, then save it by giving it a name). This is easy for .txt files: if it’s zero bytes, it’s empty. But then this approach is also expected to work for other file formats as well, where our original approach was more technical: if it’s an empty file, then it can only be plain text, so we (almost) always opened it in Writer, not matching the user expectations.

Instead of explaining the problem to people again and again (that a literally empty PPTX file is not a PPTX template), there is value in just adapting the code instead to "do what I mean".

Results so far

An empty PPTX file is now handled like this:

https://lh3.googleusercontent.com/zk3b0f2Rx3t5vFVuKiimujSJWYwPNH05PCf5Indih3OwMDeBrOUH1X7N22PO46kIbxTVzI0V3IV-bE0sMycTHGj2eRqKT6K7eQkZ0Py9QVCPIhV0pdKdGPLGH08xpw72wFQ-3eGyX4k=w1920
Figure 1. Empty PPTX file opening in Impress

You can see this is no longer opening in Writer as plain text but in Impress, which is clearly a less surprising behavior.

Here is what happens if you open an empty DOTX (template):

https://lh3.googleusercontent.com/cVB_kK2wDyNIJjLt9v9UcNS4AagRCifwBofp70mHfNVzopvrN1cxcsVLhWfEArhab_PwSFkAvLlMUS1witevRcKeEn9UXYtw5o4VeGSztvnNUi6YMtR3t2DUIu1k2LLOUhnpckAnrwQ=w1920
Figure 2. Empty DOTX file creates a new Writer document

You can see that it is even recognized that this is a template format, so a new document is created, rather than the template itself being opened for editing.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

You can see the code change in this commit. First, we restrict this trick to file URLs, and also to empty files.

Second, we look at the extension of the file and try to match an import filter that usually handles that extension. This helps, because then nominally the correct filter will be used for the import, so save will not ask for a filename (as it happens for new documents), but it will know what target filename and export filter to use.

Finally, we need to avoid actually invoking the import filter: an import filter is not required to cope with empty content when its own filter detection would reject the file. (E.g. PPTX is expected to be a valid ZIP file.) This is important because we want to avoid touching each and every file filter so that it does not fail on empty content; instead, we want to handle this centrally, in a single place.
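
The three steps above can be sketched in Python as follows. This is only an illustration of the logic, not the actual commit: the extension table and the function name are hypothetical, and the table lists just a few formats.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname

# Hypothetical table: extension -> (application, is_template). Illustrative only.
TYPES_BY_EXTENSION = {
    ".odt": ("Writer", False),
    ".docx": ("Writer", False),
    ".dotx": ("Writer", True),
    ".xlsx": ("Calc", False),
    ".pptx": ("Impress", False),
}


def detect_empty_file(url):
    """Return (application, is_template) for an empty local file, else None."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return None  # step 1a: only file URLs are considered
    path = url2pathname(parsed.path)
    if not os.path.isfile(path) or os.path.getsize(path) != 0:
        return None  # step 1b: only 0-byte files are considered
    extension = os.path.splitext(path)[1].lower()
    # Step 2: pick the type that usually handles this extension.
    return TYPES_BY_EXTENSION.get(extension)
```

Step 3 is then up to the caller: when `detect_empty_file()` returns a match, it creates an empty document of that type (or a new document from the "template") and remembers the file name and format for saving, without ever handing the empty content to the import filter. A `None` result means the normal type detection should run instead.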

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


OOXML / PDF Digital Signing in Draw and elsewhere conference talk

Estimated read time: 1 minute

Today I gave an OOXML / PDF Digital Signing in Draw and elsewhere talk at the LibreOffice Conference 2020. The (virtual) room was well-crowded; somehow people find digital signatures interesting. ;-)

It contains an overview of the ODF/OOXML/PDF signing feature set and also details the latest improvements, like visible PDF signing.

I expect quite a few more slides from other Collaborans and the wider community will be available on Planet, don’t miss them.

You can get a snapshot / demo of Collabora Office and try the presented features out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


SmartArt improvements in Impress, part 6

Estimated read time: 3 minutes

Impress now has support for an improved auto-fit-of-text layout across multiple shapes. In addition, the snake algorithm now handles width requests from constraints much better for SmartArt graphics from PPTX files. This builds on top of the previous improvements around SmartArt support.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

SmartArt allows declaring your content and requirements for a graphic, then the layout will take care of arranging that in a suitable way. It is allowed to ask for an automatic font size, which is small enough so that all the content fits into the shape. At the same time, you can ask that the font size is the same in multiple shapes. Impress lacked the ability to do the latter, leading to different font sizes in different shapes, all automatic inside a single shape.

Results so far

Here is how the automatic text scaling across multiple shapes works in practice:

https://lh3.googleusercontent.com/5f-rH0nKGed-6GhBn3bAOMH6sVUeZUeqt2TsFydVSFlL_185Hj6BjNkchKn7DVKpAQmRsg6bGNwKyBIN9bR1sRYacqcKnLYOeqasGZB2IWRohN8mtgFG9aNN_k5ofC_ZqunSeHqIYTc=w640
Figure 1. Autofit synchronization, new output
https://lh3.googleusercontent.com/lncpRp13-vUBJH5Kt4ccYHMULGQ8U1Qw8v5z7LmRSE9bv6yjukFMfuiJolCKbVOpjT-85zw_BQMj72dKJLVnMI242CQlIxR7tDUbhBuVaYDuGPRVnAqhCsGbDmGLmyu-7ueA39kNXIg=w640
Figure 2. Autofit synchronization, old output
https://lh3.googleusercontent.com/Go9LGPftmbtFnQXgzxITJVLhEVJF1B13Ge3PGbyKPNEzCJ2zi2DfYBMak92v127PJGYyzjL8V9fTh8Fb_vZXpAdBrBRQizd2onXM8dBka38BkBEi2FE8UP3JCPecKN1m9u8fR591GMM=w640
Figure 3. Autofit synchronization, reference output

You can see how the old output used to have unexpectedly large text in shape A, but now has the same text size as shape B. This is not applied unconditionally: shape C can request to have an independent, fixed font size.
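
A minimal Python sketch of such a synchronization could look like this. It assumes that the shared size is the smallest of the sizes each shape would pick on its own, which is my assumption rather than something taken from the Impress code, and the `Shape` type and its attribute names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shape:
    name: str
    fitted_size: float                  # size at which this shape's own text fits
    fixed_size: Optional[float] = None  # set when the shape opts out of syncing


def synchronized_font_sizes(shapes):
    """Map shape name -> font size, syncing all shapes without a fixed size."""
    synced = [s.fitted_size for s in shapes if s.fixed_size is None]
    common = min(synced) if synced else None
    return {
        s.name: (s.fixed_size if s.fixed_size is not None else common)
        for s in shapes
    }
```

For example, with shape A fitting at 24pt, shape B at 12pt and shape C fixed at 18pt, A and B both end up at 12pt while C keeps its 18pt. The numbers are made up for the example.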

https://lh3.googleusercontent.com/7IBC-z9NfhP0mjutFPQLPN312AH5Jch6Gss-75kROjLksQ3MnSZnhTodrPDJBm3MmkcQ-rHKvzozgB1O8j8rDBJEkzCf9vgmgrSYa3kH7GqnDS0BgBnlSOWC0GQxVBCIMYX0-Blf_F8=w640
Figure 4. Snake rows, new output
https://lh3.googleusercontent.com/4y0pEF3utBcpXMcCsrvrkvnNCdKKyhVlwejiwsI6cMUrA1nV4u1VuE4l1Xhuw60jQYrkeQD54Y0JuB4NR571kwtluUGceclQPZPcYITEyqf0GF1Y7fr_GXNnSRCtnXO1jjtcO_nSLS0=w640
Figure 5. Snake rows, old output
https://lh3.googleusercontent.com/JqIugvyKapfY6Hw0bs7OtWMJ2sj5mdFOv8ebJwZac_BgmuJXKyHxDUdzCj0xZl9zcksXDjdqthce1xrHJzZdGG_024CLbVBSoCmR-X_qFxdWupFwXBa281LId18qezAU80vuT69kGl0=w640
Figure 6. Snake rows, reference output

You can see that the old output laid shapes all over the place, while the new output puts them in a 3 by 2 matrix. This works because we now parse width requests from constraints correctly. This means we give spacings a smaller width and real shapes a larger width, so the content fits in fewer rows and the layout looks like a grid, matching the reference rendering.
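
The effect of the spacing width on the number of rows can be shown with a small Python sketch. This is a simplified greedy row fill, not the snake algorithm from the Impress code; the function name and all the numbers are made up for illustration.

```python
def snake_rows(shape_count, shape_width, spacing_width, canvas_width):
    """Fill rows greedily; return the number of shapes placed in each row."""
    rows = []
    in_row = 0   # shapes already placed in the current row
    used = 0.0   # width consumed in the current row
    for _ in range(shape_count):
        # Every shape after the first one in a row needs a spacing before it.
        needed = shape_width if in_row == 0 else spacing_width + shape_width
        if in_row > 0 and used + needed > canvas_width:
            rows.append(in_row)          # row is full, start a new one
            in_row, used = 0, 0.0
            needed = shape_width
        used += needed
        in_row += 1
    if in_row:
        rows.append(in_row)
    return rows
```

With six shapes of width 30 on a canvas of width 100, a spacing of 5 gives `snake_rows(6, 30, 5, 100) == [3, 3]`, while treating spacings as wide as the shapes gives `snake_rows(6, 30, 30, 100) == [2, 2, 2]`, so narrower spacings indeed lead to fewer rows.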

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

As for the autofit synchronization:

Beyond that, for the snake rows:

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Locale-independent Writer templates

Estimated read time: 2 minutes

The problem

Users create new documents in various ways. When they do so in Online or via Windows Explorer’s context menu (New → …) then actual templates are not involved in the process, technically. What happens instead is that there is a plain empty Writer (or Calc, Impress) document that gets copied. The reason for this is that by the time the document gets created, the WOPI-like protocol or Windows Explorer doesn’t have a running soffice process to create a document instance from a template: it’ll just copy a file.

With that aside, users expect that when they create new documents, the language of their new document matches the locale of Writer itself. This conflicts with the idea that languages in the documents are explicit, so if a German user writes a piece of German text, the spellcheck passes and the next user is English, then the text should remain German, not introducing new spellcheck errors.

Result

https://lh3.googleusercontent.com/OnDdNBGLsYhicnEbt_G6XW3Tmrn17XUT4XyBczgm0eETha9ZQ0y62t74QxeUFi3BfzfZrbBzZaMikglblqQBqTnWdYQzEQ72iBh3gZMHb9akFpQRVztOW7_0pK1Uyn9fvaNhLfugHfQ=w640
Figure 1. Locale-independent Writer template

The solution to this problem is what Mike and Ezinne implemented: make these "templates" minimal, so they don’t refer to any language. Then Calc or Impress will fill the language from the locale of the soffice process and it’ll be part of the document on the first save. This solves the problem of multi-language templates while it does not break the spellcheck use-case.

Andras copied the same templates to various Online integrations to have the same problem solved in that use-case as well.

Writer was still problematic, though. sw: default to UI locale when language is missing now fixes this. You can see on the above screenshot that the stock soffice.odt was opened with a Hungarian locale and the status bar shows that the document language is Hungarian, not the confusing "multiple languages", as before.
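
The fallback itself is simple enough to express as a one-line rule. Here is a Python sketch, with a hypothetical function name and language tags passed as plain strings:

```python
from typing import Optional


def effective_document_language(stored_language: Optional[str], ui_locale: str) -> str:
    """Use the language stored in the document; fall back to the UI locale if missing."""
    return stored_language if stored_language else ui_locale
```

A minimal template with no language opened under a Hungarian UI gives `effective_document_language(None, "hu-HU") == "hu-HU"`, while a document that explicitly says German stays German: `effective_document_language("de-DE", "en-US") == "de-DE"`.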

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


SmartArt improvements in Impress, part 5

Estimated read time: 3 minutes

Impress now has support for considering rules next to constraints when laying out SmartArt graphics from PPTX files. This builds on top of the previous improvements around SmartArt support.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

SmartArt allows declaring your content and requirements for a graphic, then the layout will take care of arranging that in a suitable way. It is allowed to declare conflicting requirements, and rules can specify how to resolve those conflicts. The below example document has shape widths defined in a way that multiple child shapes want to have a width of 100%, but simply scaling down all child shapes does not give a correct result. Rules define what to scale down and what to leave unchanged.

Results so far

Here is how this works in practice:

https://lh3.googleusercontent.com/_AL6ARVsbgdaovqKPxr0n0I1kSn2zX_5xGg5y_4M8whkT6K0-mXIsGXeYI2Uo6u2YQAVwfLtbfy8XeYHggaPWpIHV4yaA4CaaIFUK4LQLRbV-JIbhy9A-Xz5JEEbcXp3TRWK4CzVcl0=w640
Figure 1. Linear layout with multiple 100% width shapes, new output
https://lh3.googleusercontent.com/UmK7-j0WxUHamDA-g3FepAOYYgbD5LJJhssleqv2jLnfXX-62fP82uA_5t__9HOQWIZfJUl6hoZVVQX5-LuIdOxz2M0HS90zcaoov_SbxQHuv4DN48be8dZkvySb_QtAbmNOTcMpJ5c=w640
Figure 2. Linear layout with multiple 100% width shapes, old output
https://lh3.googleusercontent.com/i2ScJOwjQfQeeFrw-yu6EQt67nt5Xx7o325WnaOeprXH4jc_CPLuXt0Mwb2iiT9rBamjooEA271HY48P6v8ieuWMUcoSq5HTjMsJkJnUOcrCrF_7uutebYGfO2WOZzAJRh6k-ibbglc=w640
Figure 3. Linear layout with multiple 100% width shapes, reference output

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • The initial heavy-lifting is done in this commit, which parses the rules from the XML input.

  • Then once we had rule info around, the linear algorithm was improved to scale down child shapes based on rules (and not just all of them, equally).

  • Then it was necessary to scale spacings (between child shapes) based on rules as well.

  • It was also necessary to limit the height request of a shape, since shapes should not leave the canvas of the SmartArt.

  • Finally it was necessary to support the "top" child order. This can be declared using the following markup:

<dgm:layoutNode ... chOrder="t">

This declares that an earlier shape in a linear layout is on top of later shapes, not the opposite. The default is that newer shapes are on top of older shapes. This is not a visible problem usually, but once you start using negative widths in a linear layout, you can have overlapping shapes. The above example has 3 text shapes, which are overlapping with the "background" arrow shape. This is expressed by having 100% width for child shapes (OK to scale down), then a -100% width for a dummy shape (not scaling) and finally a 100% width for the background arrow (not scaling).

All in all, now the background arrow shape has a good position and size, and the text on the arrow is readable.
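
The rule-based scaling described above can be sketched in Python. This is a simplified model and not the actual layout code: the `Child` type, the `scalable` flag standing in for parsed rules, and the single shared scale factor are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    width: float    # requested width as a fraction of the parent (1.0 = 100%)
    scalable: bool  # whether a rule allows shrinking this child


def layout_linear(children, available=1.0):
    """Return (name, x, width) for each child, shrinking only scalable children."""
    fixed = sum(c.width for c in children if not c.scalable)
    flexible = sum(c.width for c in children if c.scalable)
    factor = 1.0
    if flexible > 0 and fixed + flexible > available:
        # Shrink scalable children so that the total request fits again.
        factor = max(0.0, (available - fixed) / flexible)
    placed, cursor = [], 0.0
    for c in children:
        width = c.width * factor if c.scalable else c.width
        start, end = cursor, cursor + width
        # A negative width moves the cursor backwards, which allows overlaps.
        placed.append((c.name, min(start, end), abs(width)))
        cursor = end
    return placed
```

Feeding in the example from the text (three scalable text shapes of 100%, a non-scaling dummy of -100% and a non-scaling arrow of 100%) gives each text shape a third of the width, then the dummy moves the position back to the start, so the arrow spans the full width behind the text shapes.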

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


Adding visible signatures to existing PDF files in Draw

Estimated read time: 3 minutes

Draw now has support for adding visible signatures to an existing PDF file. This is in contrast with the old functionality which was limited to invisible signatures.

First, thanks to the Dutch Ministry of Defense in cooperation with Nou&Off who made this work by Collabora possible.

Motivation

The PDF format allows assigning a shape (a form xobject) to a digital signature in the PDF file, and if you use e.g. Adobe Acrobat, then it fills this shape with some visible information about the digital signature. Draw used to write a placeholder widget there (a 0x0-sized rectangle on the first page, at position 0x0). This is valid, but it’s not close to real-world signatures, where signing has a visual effect as well.

Results so far

Here is how this works in practice:

Figure 1. Demo of adding a visible signature to an existing PDF file in Draw

You can see how the 2 added signatures are visible and Adobe Acrobat confirms they are valid, too.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • Signature lines were already working in Writer and Calc, this effort brings them to Draw, improving consistency.

  • Signing existing PDFs was already possible; this allows adding a visible signature with the correct markup. This is important for automated processing of PDFs, and maybe even helps accessibility. (I think DocuSign doesn’t get this right currently.)

  • This uses the existing "export selected shape to PDF" code to produce that object, so it’s not a bitmap, but a scalable format. (As far as I know, DocuSign doesn’t do this, either.)

  • If you didn’t get the signature rectangle right the first time, you can still move and resize it before the actual signing happens. (Acrobat doesn’t support this currently, I believe.)

  • The generated object is locale-aware when it comes to the actual signature string and date format.

  • The feature works for multiple signatures and multiple pages as well.

  • The final step was this commit, with much more groundwork before that one.

  • Note that the signing is a two-step process: first you draw the signature rectangle and optionally finalize its position / size, and only then do you use the Finish Signing button on the infobar to trigger the actual signing:

https://lh3.googleusercontent.com/TMPrD20O0PvPLB7Uru_mmxfeQTaWhJwNQ80jgLj23TWLNqkm44Ww8F9Azce0sEN1TzmjmmVW7MvHZTwtR6Us2H7qpzOSC07CQ0p_myEsM1WRQOToAEus0vsgpTh1yeD65YemFQvv_A=w640
Figure 2. After drawing a signature rectangle, before finishing the signing.

If you use a HW-based certificate, this second step will ask for your certificate PIN.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).

© Miklos Vajna. Built using Pelican. Theme by Giulio Fidente on github.