Fixing an Interoperability Bug in LibreOffice: Missing Lines from DOCX (part 1/3)

By Hossein Nourikhah, Developer Community Architect at The Document Foundation

In LibreOffice, interoperability is considered a very important aspect of the software. Today, LibreOffice can load and save various file formats from many different office applications from different companies across the world. But bugs are an inevitable part of every piece of software: there are situations where the application does not behave as it should, and a developer has to step in and fix it so that it behaves as the user expects.

What should you do if you encounter a bug in LibreOffice, and how does a developer fix the problem? In this series of articles, we discuss the steps needed to fix a bug. In the end, we will provide a test to make sure that the same problem does not happen again in the future.

The article is presented in three parts:

  1. Understanding the Bugs and QA
  2. Developing a Bug Fix
  3. Writing the Tests and Finishing the Task

This blog post is the first part.

1. Understanding the Bugs / QA

Bugs can be found in any software and, in a broader view, in every technological product. There are situations in which a system behaves in a way that the user does not like or expect, and then the engineers come in, take action and fix that. In practice, no non-trivial software can be guaranteed to be completely bug-free, because there is no exact way to describe every aspect of a piece of software, even mathematically using formal methods.

Although we cannot avoid bugs completely, we can improve the situation. The important thing here is to document these bugs, reproduce them using sample documents, investigate the cause(s), and hopefully fix them as soon as possible while considering the priority. The whole process of dealing with bugs and improving the quality of the software is called “Quality Assurance”, or simply “QA”.

You can take a look at the QA section of the LibreOffice Wiki, and ask QA questions in the IRC channel #libreoffice-qa on the Libera.Chat network, which you can connect to via webchat.

1.1. Bug Report

So, you have encountered a problem in LibreOffice! In this case, you should first of all report the bug to Bugzilla. If a bug is not recorded in Bugzilla, there is little chance that it will be fixed.

So, everything starts with a bug report in TDF’s Bugzilla. A detailed explanation of what a good bug report should contain is available in the TDF Wiki: https://wiki.documentfoundation.org/BugReport

In short, a good bug report should have a suitable summary of the problem, an appropriate description, and the specific component, hardware and version that give the context of the problem. Steps to reproduce the bug are an important part of every report, because a developer should be able to reproduce the bug in order to fix it.

The bug reporter should carefully describe the “actual results” and why they differ from the “expected results”. This is also important because the desired behaviour of the software is not always as obvious to others as it seems to the bug reporter.

Let’s talk about a regression bug that was recently fixed. The “NISZ LibreOffice Team” reported this bug. The National Infocommunications Service Company Ltd. (NISZ) has an active team in LibreOffice development and QA.

This is the bug title and description provided by the NISZ LibreOffice Team:

Bug 123321 – FILEOPEN | DOC, missing longer line when saved in LO (shorter line remains)

https://bugs.documentfoundation.org/show_bug.cgi?id=123321

Description:
When the attached original document gets saved in LO as doc the middle line gets its length resized.

Steps to Reproduce:

  1. Open the attached doc in LO.
  2. Save it as doc.
  3. Reload.
  4. Notice the changes.

Actual Results: The middle arrow gets smaller.
Expected Results: It should stay the same size even after saving in LO.
Reproducible: Always
User Profile Reset: No

Every bug has a number associated with it, and bugs in TDF’s Bugzilla are referred to by that number, like tdf#123321.
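As a small illustration of this naming convention, the following Python sketch turns tdf# references found in a piece of text into Bugzilla links. The URL pattern is the one shown above for bug 123321; the function name tdf_links and the constant BUGZILLA_URL are hypothetical names chosen for this example, not part of any LibreOffice tool.

```python
import re
from typing import List

# URL pattern taken from the bug link quoted in this article.
BUGZILLA_URL = "https://bugs.documentfoundation.org/show_bug.cgi?id={}"


def tdf_links(text: str) -> List[str]:
    """Return a Bugzilla URL for every tdf#NNNN reference found in text."""
    bug_ids = re.findall(r"\btdf#(\d+)", text)
    return [BUGZILLA_URL.format(bug_id) for bug_id in bug_ids]


# Example: a commit summary line that mentions a bug number.
for url in tdf_links("Watermark: tdf#91687 correct size in the .doc"):
    print(url)
```

Such references appear in commit messages as well, which makes it easy to jump from a commit to the bug report that it addresses.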

1.2. Bug Confirmation

After a bug is reported, someone other than the original reporter needs to check whether it is reproducible. If it is, the bug is confirmed and its status is set to “New”. In this case, a user named Timur from the QA team of volunteers confirmed the bug.

Here, the bug reporter has provided several examples:

Attachments

  • The original file (39.50 KB, application/msword)
  • The saved file. (24.00 KB, application/msword)
  • Screenshot of the original and exported document side by
    side in Writer. (298.50 KB, image/png)
  • Minimized test document in docx format (19.66 KB,
    application/vnd.openxmlformats-officedocument.wordprocessingml.document)

Opening the first file, we see that it contains several shapes: 4 ellipses, 2 diagonal lines, and a vertical line. But if we look closely, we find out that the vertical line actually consists of 3 different vertical lines. You can check this by selecting the line and then pressing the Tab key to select the other lines.

The minimized test document, Shapelinelength_min.docx, contains only 3 overlapping vertical lines (the overlap is not important):

  • First one on the top with the length 0.17″ (Verified in Word 2007)
  • Second one in the middle with the length of 1″ (Verified in Word 2007)
  • Third one at the bottom with the length of 2.77″ (Verified in Word 2007)

When you save it in LibreOffice and reload it, the first and the third vertical lines disappear. But if you select the second one (the only one visible after save and reload), you can select the two other lines by pressing the Tab key. If you look at the size of these two lines, you will see that both have a length of 0″.

By opening the examples, saving and reloading them, we can verify that the bug is present even in the latest master build.

[Images showing the bug: (Good) and (Bad)]

Figure 1. The visible lines in the middle become smaller after save and reload
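The save-and-reload check described above can also be scripted. The following Python sketch builds a headless conversion command using the soffice command-line options --headless, --convert-to and --outdir. It assumes that soffice is available on the PATH, and the function names roundtrip_command and roundtrip are hypothetical. A scripted conversion may not behave exactly like saving from the user interface, so treat it only as a quick first check.

```python
import subprocess
from pathlib import Path
from typing import List


def roundtrip_command(source: Path, out_dir: Path, fmt: str = "doc") -> List[str]:
    """Build the command that re-saves source in format fmt into out_dir."""
    return [
        "soffice",            # assumption: the LibreOffice launcher is on PATH
        "--headless",         # run without a user interface
        "--convert-to", fmt,  # target format, e.g. "doc" or "docx"
        "--outdir", str(out_dir),
        str(source),
    ]


def roundtrip(source: Path, out_dir: Path, fmt: str = "doc") -> Path:
    """Run the conversion and return the path of the re-saved file."""
    subprocess.run(roundtrip_command(source, out_dir, fmt), check=True)
    return out_dir / (source.stem + "." + fmt)


# Only print the command here; call roundtrip() to actually run it.
print(" ".join(roundtrip_command(Path("Shapelinelength_min.docx"), Path("out"), "docx")))
```

Using a separate output directory keeps the original attachment untouched, so the original and the re-saved file can be opened side by side and compared.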

1.3. Bisect / Bibisect

Regression bugs are a special kind of bug. They describe a feature that was previously working well, but at some point a change in the code caused it to stop working. They are a source of disappointment for the user, but they are easier for developers to fix than other bugs! Why? Because every single change to the LibreOffice code is kept in the source code management system (Git), it is possible to find out which change actually introduced the bug.

Git has a special tool for this purpose, called bisect. It uses a binary search to find the commit that introduced the bug. But for LibreOffice, a huge piece of software consisting of roughly 10 million lines of code, this can take a lot of time, because each step would require building the tested commit from source. So, a trick is used here: bisecting with the binaries! If you have access to prebuilt LibreOffice binaries for the commits, you can use the git bisect command to find the source of the problem in a very short time. This is called bibisect!

A very detailed description is on the wiki: https://wiki.documentfoundation.org/QA/Bibisect
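To make the idea of the binary search concrete, here is a minimal Python sketch of the logic behind bisecting. It is not the implementation of git bisect. It assumes a linear history ordered from oldest to newest, in which the first commit is known to be good, the last one is known to be bad, and every commit after the culprit is bad as well. The function first_bad, the is_bad callback and the commit names are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return the first bad commit and the number of commits that were tested.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad, all later commits are bad too.
    """
    good, bad = 0, len(commits) - 1  # indices of a known-good and a known-bad commit
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid    # the culprit is at mid or earlier
        else:
            good = mid   # the culprit is after mid
    return commits[bad], tested


# Hypothetical history of 1000 commits in which commit c0637 introduced the bug.
history = [f"c{i:04d}" for i in range(1000)]
culprit, tested = first_bad(history, lambda commit: commit >= "c0637")
print(culprit, "found after testing", tested, "commits")
```

Each test halves the range that can still contain the culprit, so even a history of 1000 commits needs only about 10 tests. With bibisect, each of these tests means starting a prebuilt binary instead of compiling LibreOffice, which is what makes the process fast.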

Aron Budea from Collabora Productivity Ltd’s core engineering team did the bibisect, and now we know the exact commit that caused the problem:

Bibisected to the following commit using repo bibisect-win32-6.0.

https://git.libreoffice.org/core/+/d72e0cadceb0b43928a9b4f18d75c9d5d30afdda

Watermark: tdf#91687 correct size in the .doc

Export:
* Watermarks saved using Writer were very small in the MSO.
  Export fUsegtextFStretch property in the Geometry Text
  Boolean Properties.
* tdf#91687: SnapRect contains size of Watermark after rotation.
  We have to export size without rotation.

Import:
* When import set height depending on used font and width.
  Text will keep the ratio. Remember the padding for export.

* added unit test
* introduced enum to avoid magic numbers for stretch and best fit
  properties.

Change-Id: I3427afe78488d499f13c543ca401c096161aaf34
Reviewed-on: https://gerrit.libreoffice.org/38979
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Andras Timar <andras.timar@collabora.com>

This bug is a regression. The DOCX import/export was working well before this commit, but it has not worked correctly since then. Good catch!

In the second part of this series, we will talk about how to create a fix for the bug.


Hossein Nourikhah is the Developer Community Architect at The Document Foundation (TDF), the non-profit behind LibreOffice. If you need assistance getting started with LibreOffice development, you can get in touch with him:

E-Mail: hossein@libreoffice.org

IRC: hossein in the IRC channel #libreoffice-dev on the Libera.Chat network, which you can connect to via webchat
