Index ¦ Archives ¦ RSS > Category: libreoffice ¦ RSS

Better PDF signature verification in Draw

Estimated read time: 2 minutes

Draw now has much better support for detecting unsigned incremental updates between signatures at the end of PDF documents. We now also make sure that incremental updates introduced for adding signatures really just add annotations and don’t change the actual content.

Motivation

There has been a recent evaluation of PDF signature verification, which included Draw. While we got a checkmark on their Shadow Hide test, their Shadow Replace test found conditional problems and their Shadow Hide-and-Replace test was not happy, either.

So time to look at what are those corner-cases and how the situation can be improved, so people keep trusting that if Draw says a signature is valid, it’s indeed valid.

Results so far

There were 4 incremental improvements in this area:

These were enough so that talking to the authors of that evaluation now confirmed that these problems are all gone.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

PDF signature verification works by using a custom PDF tokenizer. You can read about that code in the PDF tokenizer section of this post. The bottom line is that we now have both PDFium and this custom tokenizer, somewhat duplicating the functionality.

After talking to the PDFium developers (see the relevant mailing list thread), there were open regarding adding all the high level API to allow PDF signature verification based on PDFium, and not via our own tokenizer. See this header file for the set of relevant APIs added. A combinations of those allowed to adapt the code on our side and use PDFium for signature verification, not the own tokanizer.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Better handling of cached field results in Writer

Estimated read time: 2 minutes

Writer now has much better support for preserving the cached result of fields in documents. This is especially beneficial for Word formats where the input document may have a field result which is not only a cache, but re-calculating the formula would yield a different result, even in Word.

Motivation

A Collabora Office customer gave us a DOCX document, which is essentially a calendar for planned IT maintenance windows at some organization. These calendars are tables with fields in it. The document is halfway through towards changing it to a newer year: the formulas are already changed to calculate a newer year, but all the cached field results are still for the old year.

The request was to keep showing these results and not throw them away during save, either. Their primary workflow is to fill the calendar with manual entries, not to tweak the calendar layout itself.

Results so far

The calendar now looks like this:

https://lh3.googleusercontent.com/6o7pvix-dJ9QhCX65FUkWeQZ60B89sHqDpBvd7WVRLtAzBW1323odrQ13aV_CgEFvgh7Iee-ePq95oPOf1Q-jMxvX1MBsz9FhgKd9vymyrdMBIZbF459hNKE1dM4XLcwXkGYh8ksmok=w1920
Figure 1. New render result in Writer

Matching the reference rendering:

https://lh3.googleusercontent.com/GJd2zcnspXDb7Wa2p32TInf9C8MAgt92h3G6PYuUwUvpQi5f3AdRbl5yGq8FN7kUPMcZwuFpohTKmX33s8u-AxFSO9rZFgH4X-fwrg8jShtJoA1KyGws_-ymUvINmK-5xo2_hd7YmLI=w1920
Figure 2. Reference render result

While it looked like a broken calendar previously:

https://lh3.googleusercontent.com/bpOVqcZX2CcKouuADNyPx1PMyI3I6CyjIDIAnUbylsT-ZimxSkPcUaRbMDd8MzHlG3Uqw2d-TunD4m7U4DUlm_O_esJt6CAY-H7Z5tdQxZ6q_MYxgJphutr_-JRVYh8uLmspiiI532U=w1920
Figure 3. Old render result in Writer

You can see that the day numbers were broken previously and now they line up properly.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

As usual, the high-level problem was addressed by a series of small fixes:

With these, it’s now possible to edit these calendars, without breaking the fields which provide the day numbers.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Detecting 0-byte files based on extension in Impress and elsewhere

Estimated read time: 3 minutes

Impress (and Writer and Calc) now has support for detecting 0-byte files on open/import based on their extension. This builds on top of the previous language-independent template improvements. This means that e.g. a 0-byte PPTX file will open as an empty Impress presentation, not in Writer.

Motivation

We regularly see customers wanting minimal templates, which are language independent and have no content. Such files are handy if your workflow is to first name an empty document (create it) and only then edit it (and not the other way around: first create the document, then save it by giving it a name). This is easy for .txt files: if it’s zero bytes, it’s empty. But then this approach is also expected to work for other file formats as well, where our original approach was more technical: if it’s an empty file, that that can be only plain text, so we (almost) always opened it in Writer, not matching the user expectations.

Instead of explaining the problem to people again and again (that a literally empty PPTX file is not a PPTX template), there is value in just adapting the code instead to "do what I mean".

Results so far

An empty PPTX file is now handled like this:

https://lh3.googleusercontent.com/zk3b0f2Rx3t5vFVuKiimujSJWYwPNH05PCf5Indih3OwMDeBrOUH1X7N22PO46kIbxTVzI0V3IV-bE0sMycTHGj2eRqKT6K7eQkZ0Py9QVCPIhV0pdKdGPLGH08xpw72wFQ-3eGyX4k=w1920
Figure 1. Empty PPTX file opening in Impress

You can see this is no longer opening in Writer as plain text but in Impress, which is clearly a less surprising behavior.

Here is what happens if you open an empty DOTX (template):

https://lh3.googleusercontent.com/cVB_kK2wDyNIJjLt9v9UcNS4AagRCifwBofp70mHfNVzopvrN1cxcsVLhWfEArhab_PwSFkAvLlMUS1witevRcKeEn9UXYtw5o4VeGSztvnNUi6YMtR3t2DUIu1k2LLOUhnpckAnrwQ=w1920
Figure 2. Empty DOTX file creates a new Writer document

You can see that it is even recognized that this is a template format, so a new document is created, not the template itself is opened for editing.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

You can see the code change in this commit. First, we restrict this trick to file URLs, and also to empty files.

Second, we look at the extension of the file and try to match an import filter that usually handles that extension. This helps, because then nominally the correct filter will be used for the import, so save will not ask for a filename (as it happens for new documents), but it will know what target filename and export filter to use.

Finally we need to avoid actually invoking the import filter, because no file content is not something an import filter has to handle if its filter detection would reject the file. (E.g. PPTX is expected to be a valid ZIP file.) This is important, because we want to avoid touching each & every file filter to not fail for empty file content — instead we want to handle this centrally, at a single place.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


OOXML / PDF Digital Signing in Draw and elsewhere conference talk

Estimated read time: 1 minutes

Today I gave a OOXML / PDF Digital Signing in Draw and elsewhere talk at the LibreOffice Conference 2020. The (virtual) room was well-crowded — somehow people find digital signatures interesting. ;-)

It contains an overview of the ODF/OOXML/PDF signing feature set and also details the latest improvements, like visible PDF signing.

I expect quite some other slides from other Collaborans and the wider community will be available on Planet, don’t miss them.

You can get a snapshot / demo of Collabora Office and try the presented features out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


SmartArt improvements in Impress, part 6

Estimated read time: 3 minutes

Impress now has support for an improved auto-fit-of-text layout across multiple shapes, also the snake algorithm now handles width requests from constraints much better for SmartArt graphics from PPTX files. This builds on top of the previous improvements around SmartArt support.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

SmartArt allows declaring your content and requirements for a graphic, then the layout will take care of arranging that in a suitable way. It is allowed to ask for an automatic font size, which is small enough so that all the content fits into the shape. At the same time, you can ask that the font size is the same in multiple shapes. Impress lacked the ability to do the latter, leading to different font sizes in different shapes, all automatic inside a single shape.

Results so far

Here is how the automatic text scaling across multiple shapes works in practice:

https://lh3.googleusercontent.com/5f-rH0nKGed-6GhBn3bAOMH6sVUeZUeqt2TsFydVSFlL_185Hj6BjNkchKn7DVKpAQmRsg6bGNwKyBIN9bR1sRYacqcKnLYOeqasGZB2IWRohN8mtgFG9aNN_k5ofC_ZqunSeHqIYTc=w640
Figure 1. Autofit synchronization, new output
https://lh3.googleusercontent.com/lncpRp13-vUBJH5Kt4ccYHMULGQ8U1Qw8v5z7LmRSE9bv6yjukFMfuiJolCKbVOpjT-85zw_BQMj72dKJLVnMI242CQlIxR7tDUbhBuVaYDuGPRVnAqhCsGbDmGLmyu-7ueA39kNXIg=w640
Figure 2. Autofit synchronization, old output
https://lh3.googleusercontent.com/Go9LGPftmbtFnQXgzxITJVLhEVJF1B13Ge3PGbyKPNEzCJ2zi2DfYBMak92v127PJGYyzjL8V9fTh8Fb_vZXpAdBrBRQizd2onXM8dBka38BkBEi2FE8UP3JCPecKN1m9u8fR591GMM=w640
Figure 3. Autofit synchronization, reference output

You can see how the old output used to have unexpected large text in shape A, but now has the same text size as shape B. This is not applied unconditionally, shape C can request to have an independent, fixed font size.

https://lh3.googleusercontent.com/7IBC-z9NfhP0mjutFPQLPN312AH5Jch6Gss-75kROjLksQ3MnSZnhTodrPDJBm3MmkcQ-rHKvzozgB1O8j8rDBJEkzCf9vgmgrSYa3kH7GqnDS0BgBnlSOWC0GQxVBCIMYX0-Blf_F8=w640
Figure 4. Snake rows, new output
https://lh3.googleusercontent.com/4y0pEF3utBcpXMcCsrvrkvnNCdKKyhVlwejiwsI6cMUrA1nV4u1VuE4l1Xhuw60jQYrkeQD54Y0JuB4NR571kwtluUGceclQPZPcYITEyqf0GF1Y7fr_GXNnSRCtnXO1jjtcO_nSLS0=w640
Figure 5. Snake rows, old output
https://lh3.googleusercontent.com/JqIugvyKapfY6Hw0bs7OtWMJ2sj5mdFOv8ebJwZac_BgmuJXKyHxDUdzCj0xZl9zcksXDjdqthce1xrHJzZdGG_024CLbVBSoCmR-X_qFxdWupFwXBa281LId18qezAU80vuT69kGl0=w640
Figure 6. Snake rows, reference output

You can see that the old output laid shapes all over the place, while the new output puts them to a 3 by 2 matrix. The reason this works is because now we parse width requests from constraints correctly. This means we give spacings a smaller width, real shapes a larger width, so the content fits in less rows and the layout looks like a grid, matching the reference rendering.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

As for the autofit synchronization:

Beyond that, for the snake rows:

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora intends to continue supporting and contributing to LibreOffice, the code is merged so we expect all of this work will be available in TDF’s next release too (7.1).


Locale-independent Writer templates

Estimated read time: 2 minutes

The problem

Users create new documents in various ways. When they do so in Online or via Windows Explorer’s context menu (New → …) then actual templates are not involved in the process, technically. What happens instead is that there is a plain empty Writer (or Calc, Impress) document that gets copied. The reason for this is that by the time the document gets created, the WOPI-like protocol or Windows Explorer doesn’t have a running soffice process to create a document instance from a template: it’ll just copy a file.

With that aside, users expect that when they create new documents, the language of their new document matches the locale of Writer itself. This conflicts with the idea that languages in the documents are explicit, so if a German users writes a piece of German text, the spellcheck passes and the next user is English, then the text should remain German, not introducing new spellcheck errors.

Result

https://lh3.googleusercontent.com/OnDdNBGLsYhicnEbt_G6XW3Tmrn17XUT4XyBczgm0eETha9ZQ0y62t74QxeUFi3BfzfZrbBzZaMikglblqQBqTnWdYQzEQ72iBh3gZMHb9akFpQRVztOW7_0pK1Uyn9fvaNhLfugHfQ=w640
Figure 1. Locale-indepentent Writer template

The solution to this problem is what Mike and Ezinne implemented: make these "templates" minimal, so they don’t refer to any language. Then Calc or Impress will fill the language from the locale of the soffice process and it’ll be part of the document on the first save. This solves the problem of multi-language templates while it does not break the spellcheck use-case.

Andras copied the same templates to various Online integrations to have the same problem solved in that use-case as well.

Writer was still problematic, though. sw: default to UI locale when language is missing now fixes this. You can see on the above screenshot that the stock soffice.odt was opened with a Hungarian locale and the status bar shows that the document language is Hungarian, not the confusing "multiple languages", as before.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


SmartArt improvements in Impress, part 5

Estimated read time: 3 minutes

Impress now has support for considering rules next to constraints when it comes to lay out SmartArt graphics from PPTX files. This builds on top of the previous improvements around SmartArt support.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

SmartArt allows declaring your content and requirements for a graphic, then the layout will take care of arranging that in a suitable way. It is allowed to declare conflicting requirements, and rules can specify how to resolve those conflicts. The below example document has shape widths defined in a way that multiple child shapes wants to have a width of 100%, but simply scaling down all child shapes does not give a correct result. Rules define what to scale down and what to leave unchanged.

Results so far

Here is how this works in practice:

https://lh3.googleusercontent.com/_AL6ARVsbgdaovqKPxr0n0I1kSn2zX_5xGg5y_4M8whkT6K0-mXIsGXeYI2Uo6u2YQAVwfLtbfy8XeYHggaPWpIHV4yaA4CaaIFUK4LQLRbV-JIbhy9A-Xz5JEEbcXp3TRWK4CzVcl0=w640
Figure 1. Linear layout with multiple 100% width shapes, new output
https://lh3.googleusercontent.com/UmK7-j0WxUHamDA-g3FepAOYYgbD5LJJhssleqv2jLnfXX-62fP82uA_5t__9HOQWIZfJUl6hoZVVQX5-LuIdOxz2M0HS90zcaoov_SbxQHuv4DN48be8dZkvySb_QtAbmNOTcMpJ5c=w640
Figure 2. Linear layout with multiple 100% width shapes, old output
https://lh3.googleusercontent.com/i2ScJOwjQfQeeFrw-yu6EQt67nt5Xx7o325WnaOeprXH4jc_CPLuXt0Mwb2iiT9rBamjooEA271HY48P6v8ieuWMUcoSq5HTjMsJkJnUOcrCrF_7uutebYGfO2WOZzAJRh6k-ibbglc=w640
Figure 3. Linear layout with multiple 100% width shapes, reference output

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • The initial heavy-lifting is done in this commit, which parses the rules from the XML input.

  • Then once we had rule info around, the linear algorithm was improved to scale down child shapes based on rules (and not just all of them, equally).

  • Then it was necessary to scale spacings (between child shapes) based on rules as well.

  • It was also needed to limit the height request of a shape, since they should not leave the canvas of the SmartArt.

  • Finally it was necessary to support the "top" child order. This can be declared using the following markup:

<dgm:layoutNode ... chOrder="t">

This declares that an earlier shape in a linear layout is on top of a later shapes, not the opposite. The default is that newer shapes are on top of older shapes. This is not a visible problem usually, but once you start using negative widths in a linear layout, you can have overlapping shapes. The above example has 3 text shapes, which are overlapping with the "background" arrow shape. This is expressed by having 100% width for child shapes (OK to scale down), then a -100% width for a dummy shape (not scaling) and finally a 100% width for the background arrow (not scaling).

All in all, now the background arrow shape has a good position and size, and the text on the arrow is readable.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


Adding visible signatures to existing PDF files in Draw

Estimated read time: 3 minutes

Draw now has support for adding visible signatures to an existing PDF file. This is in contrast with the old functionality which was limited to invisible signatures.

First, thanks to the Dutch Ministry of Defense in cooperation with Nou&Off who made this work by Collabora possible.

Motivation

The PDF format allows assigning a shape (a form xobject) to a digital signature in the PDF file, and if you use e.g. Adobe Acrobat, then it fills this shape with some visible information about the digital signature. Draw used to write a placeholder widget there (a 0x0-sized rectangle on the first page, at position 0x0). This is valid, but it’s not close to real-world signatures, where signing has a visual effect as well.

Results so far

Here is how this works in practice:

Figure 1. Demo of adding a visible signature to an existing PDF file in Draw

You can see how the 2 added signatures are visible and Adobe Acrobat confirms they are valid, too.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • Signature lines were already working in Writer and Calc, this effort brings them to Draw, improving consistency.

  • Signing existing PDFs were already possible, this allows adding a visible signature with the correct markup. This is important for automated processing of PDFs, maybe even helps accessibility. (I think DocuSign doesn’t get this right currently.)

  • This uses the existing "export selected shape to PDF" code to produce that object, so it’s not a bitmap, but a scalable format. (As I know, DocuSign doesn’t do this, either.)

  • If you didn’t get the signature rectangle right for the first time, you can still move and resize it before the actual signing happens (Acrobat doesn’t support this currently, I believe.)

  • The generated object is locale-aware when it comes to the actual signature string and date format.

  • The feature works for multiple signatures and multiple pages as well.

  • The final step was this commit, with much more grounding before that one.

  • Note that the signing is a two step process: first you draw the signature rectangle and optionally finalize its position / size, and only then you use the Finish Signing button on the infobar to trigger the actual signing:

https://lh3.googleusercontent.com/TMPrD20O0PvPLB7Uru_mmxfeQTaWhJwNQ80jgLj23TWLNqkm44Ww8F9Azce0sEN1TzmjmmVW7MvHZTwtR6Us2H7qpzOSC07CQ0p_myEsM1WRQOToAEus0vsgpTh1yeD65YemFQvv_A=w640
Figure 2. After drawing a signature rectangle, before finishing the signing.

If you use a HW-based certificate, this second step will ask for your certificate PIN.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


Page-content-bottom vertical relation in Writer

Estimated read time: 3 minutes

Writer now has support for anchoring shapes relative to the bottom of the page content frame.

First, thanks to our partner SUSE for working with Collabora to make this possible.

Motivation

Users sometimes want to specify the vertical position of their shapes in text documents in a way that is relative from the bottom of the page content area. Also, this improves consistency, specifying a position that is relative from the top of the page content area is already possible.

Alternatively, it is possible to have the same calculated position when positioning from the top of the page content area. The downside of this approach is that the position changes when the page height changes. So if the user intention is to position a shape 2 cm above the bottom of the page content area and the page height changes, the shape has to be manually re-positioned. This manual re-positioning is not needed with the new page-content-bottom vertical relation.

For example, if a shape has a height of 10 cm and a 2 cm spacing is wanted between the bottom of the shape and the bottom of the page content area, the position can be set to -12 cm, and then the 2 cm spacing will be maintained, even after the page height changes.

Results so far

Here is how this works in practice:

Figure 1. Demo of a stable vertical position (relative to the bottom of the page content frame), during a page height change

You can see how the distance of the shape from the bottom is 2 cm, even if the page height changes.

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • the UNO API now supports a new PAGE_PRINT_AREA_BOTTOM relative orientation

  • the have the expected layout, SwToContentAnchoredObjectPosition::CalcPosition() now considers this from-bottom anchoring when handling the vertical relative orientation

  • the DOCX format already had a markup for this:

<wp:positionV relativeFrom="bottomMargin">
  <wp:posOffset>...</wp:posOffset>
</wp:positionV>

now the import/export code maps this markup to the new feature.

  • The UI allows setting a new "Page text area bottom" frame position. It turns out that the UI is quite nice here, it tries to prevent you from setting positions which would be outside the limits of the current page. This logic in SwFEShell::CalcBoundRect() now handles the new relative orientation.

  • The ODF filter now has a markup to represent the new vertical relation:

<style:graphic-properties ... loext:vertical-rel="page-content-bottom" .../>

There is a proposal to promote this from our extension namespace to normal ODF (thanks to Regina for the help there!).

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.1).


Export larger pages from Draw using PDF 1.6

Estimated read time: 2 minutes

Draw/Impress now has support for exporting larger page sizes into PDF. The previous limit was 200 " (508 cm), and now practically there is no such limit.

First, thanks Vector who made this work by Collabora possible.

Motivation

You can use Draw with a document which has a single page, which more or less acts as a canvas with unlimited size to handle vector graphics. The current limit of such a canvas in size is 600 x 600 cm. (And that can be increased further if there is demand without too large problems.)

Exporting such a document to PDF is a different matter, though. The specification (up to, and including version 1.5) says that the unit to specify sizes is points, and the maximum allowed value is 14 400. This means that there is no markup to describe that your page is 600 cm wide. PDF 1.6 (and newer versions) introduce a UserUnit markup to allow unlimited page size, and now Draw (and other apps) can use this to describe the increased size.

Another use-case can be a large sheet in Calc, exporting it to a single PDF page, so you can pan around easily on a touch device. If you have enough rows, then getting rid of this limit is helpful to deal with the large page height.

Results so far

Here is how a large page looks like now in Draw and then in Adobe Reader:

Figure 1. Export of 6m-wide.odg to PDF

You can see how both Draw and Adobe Reader show that the page width is larger than 200 ".

How is this implemented?

If you would like to know a bit more about how this works, continue reading… :-)

  • The PDF export already converts from an internal unit (e.g. Draw uses 100th millimeters, Writer uses twips) to PDF’s unit

  • The trick is that now PDF’s unit is no longer points all the time, but we can dynamically switch to a larger unit as needed.

Here is how the PDF markup looks like for a 600 cm wide page:

1 0 obj
<</Type/Page/Parent 4 0 R/Resources 11 0 R/MediaBox[0 0 8503.93700787402 396]
/UserUnit 2/Group<</S/Transparency/CS/DeviceRGB/I true>>/Contents 2 0 R>>
endobj

Notice how we still avoid values larger than 14 400, but now the UserUnit says that 1 unit means 2 points.

Want to start using this?

You can get a snapshot / demo of Collabora Office and try it out yourself right now: try unstable snapshot. Collabora is a major contributor to LibreOffice and all of this work will be available in TDF’s next release too (7.0).

© Miklos Vajna. Built using Pelican. Theme by Giulio Fidente on github.