An artificially complex XML schema as a lock-in tool

A document format is a tool for sharing knowledge and, as such, should be as simple and accessible as possible in relation to the complexity of the document content itself. This remains true even when the format is based on an XML schema that is hidden from users when the document is displayed on screen.

Unfortunately, while an XML schema can be simple, it can also be unnecessarily complex, bloated, convoluted and difficult to implement without specific knowledge of its features. This is true even if the on-screen documents are identical. In this case, complexity is an intentional tactic used to lock users into a vendor, as is the case with the Microsoft 365 document format.

An XML schema comprises the structure, data types and rules of an XML document and is described in an XML Schema Definition (XSD) file. The XSD tells software what to expect, and a validator uses it to check that the data follows the rules. In theory, XML and XSD together form the basis of the concept of interoperability. However, in practice, an XML schema can be made so complex that it becomes a barrier rather than a bridge.

An “artificially complex” XML schema goes beyond the level of complexity needed to display even the most intricate content on screen. In fact, it is completely disconnected from the actual complexity of the content, to the extent that even a simple sentence such as “To be, or not to be, that is the question” becomes an inextricable sequence of tags that users cannot access.

This artificial complexity is characterised by a deeply nested tag structure with excessive abstraction, dozens or even hundreds of optional or overloaded elements, non-intuitive naming conventions, the widespread use of extension points and wildcards, the multiple import of namespaces and type hierarchies, and sparse or cryptic documentation.
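
As a rough illustration of the first characteristic, deep nesting, the following Python sketch counts elements and measures nesting depth for two fragments that carry the same sentence. Both fragments are hand-written and hypothetical: neither is the output of a real office suite, and the tag names (`doc`, `block`, `run`, `props` and so on) are invented for the example.

```python
import xml.etree.ElementTree as ET

SENTENCE = "To be, or not to be, that is the question"

# Two hypothetical, hand-written fragments carrying the same sentence.
# The tag names are invented; this is not real office-suite markup.
SIMPLE = f"<doc><p>{SENTENCE}</p></doc>"
NESTED = (
    "<doc><body><block><props><style/><lang/></props>"
    "<run><props><font/><size/></props>"
    f"<text>{SENTENCE}</text>"
    "</run></block></body></doc>"
)


def max_depth(element, depth=1):
    """Return the depth of the deepest element at or below `element`."""
    return max([depth] + [max_depth(child, depth + 1) for child in element])


def summarise(xml_text):
    """Return (element count, maximum nesting depth, visible text)."""
    root = ET.fromstring(xml_text)
    count = sum(1 for _ in root.iter())
    text = "".join(root.itertext())
    return count, max_depth(root), text
```

Calling `summarise(SIMPLE)` and `summarise(NESTED)` returns the same visible text in both cases, while the element count and depth differ. That gap between what is displayed and what is stored is the point being made here.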

In the case of the Microsoft 365 document format, the only characteristic not present is sparse or cryptic documentation, given that we are talking about a set of documents totalling over 8,000 pages. All the other characteristics are present to a greater or lesser extent, making life almost impossible for a developer trying to implement the schema.

To illustrate how this translates into a lock-in strategy, consider a railway system where the tracks are accessible to all, but the main train manufacturer imposes its own incredibly complicated control system. In theory, anyone could build a train compatible with the tracks, but the control system specifications are so convoluted that only the main train manufacturer can ultimately offer rail services.

The worst thing is that passengers don’t realise they are being held hostage by technical constraints that they cannot understand until ticket prices rise or the number of cities served declines. At this point, the main manufacturer can dictate its terms, which passengers are forced to accept.

This is very similar to what is happening in the world of information technology, where Microsoft is effectively forcing its customers to switch from Windows 10 to Windows 11 against their will. This switch has no technical justification and locks customers into using Windows 11 and Microsoft 365. Microsoft can impose it because customers have completely ignored the problems that arise from using proprietary technologies.

If the millions of Microsoft users had taken a critical stance towards this monopoly over the years, instead of uncritically accepting a narrative that was credible to non-technical users but divorced from technological reality, we would be in a very different situation today. A monopoly of this kind would have raised doubts in any other sector.

Instead, these users – including governments and supranational organisations – have allowed lock-in strategies, in which Microsoft 365’s artificially and unnecessarily complex XML document schema plays a fundamental strategic role, to become increasingly sophisticated and pervasive.

Therefore, if you are developing or choosing an XML-based system, bear in mind that complexity imprisons people, whereas simplicity and clarity set them free.

The Document Foundation announces LibreOffice 25.2.5

LibreOffice 24.8 has now reached the end of life, so all users have to update their free office suite to the latest release

Berlin, 17 July 2025 – The Document Foundation announces the release of LibreOffice 25.2.5, the fifth maintenance release of the LibreOffice 25.2 family for Windows (Intel, AMD and ARM), macOS (Apple Silicon and Intel) and Linux OSs, available for download at www.libreoffice.org/download [1].

LibreOffice 24.8 has reached the end of life, which means that this release – which includes dozens of fixes and enhancements that further improve reliability, performance and interoperability – is suggested for production environments, and all users should update their installation as soon as possible.

LibreOffice 25.2.5 is based on the LibreOffice Technology, which enables the development of desktop, mobile and cloud versions – either from TDF or from the ecosystem – that fully support the two ISO standards for document formats: the open ODF or Open Document Format (ODT, ODS and ODP) and the closed and proprietary Microsoft OOXML (DOCX, XLSX and PPTX).

Products based on the LibreOffice Technology are available for all major desktop operating systems (Windows, macOS, Linux and ChromeOS), mobile platforms (Android and iOS) and the cloud.

For enterprise-class deployments, TDF recommends a LibreOffice Enterprise-optimized version from one of the ecosystem companies, with dedicated value-added features and other benefits such as SLAs and security patch backports for three to five years.

English manuals for LibreOffice 25.2 Writer, Calc, Impress, Draw and Math are available for download at books.libreoffice.org/en/. End users can get first-level technical support from volunteers on the user mailing lists and the Ask LibreOffice website: ask.libreoffice.org.

Downloading LibreOffice

All available versions of LibreOffice for the desktop can be downloaded from the same website: www.libreoffice.org/download/.

LibreOffice users, free software advocates and community members can support The Document Foundation and the LibreOffice project by making a donation: www.libreoffice.org/donate.

[1] Fixes in RC1: wiki.documentfoundation.org/Releases/25.2.5/RC1. Fixes in RC2: wiki.documentfoundation.org/Releases/25.2.5/RC2.

LibreOffice Podcast, Episode #4 – Documentation in Free and Open Source Software

Good software needs good documentation. But how do we define “good” in this sense? And what does the future hold? Find out in episode 4 of the LibreOffice Podcast! (This episode is also available on PeerTube.)

The Role of XML in Interoperability

When different systems, applications or organisations need to communicate with each other and actually understand what is being said, interoperability is key. It enables a hospital’s software to communicate with an insurance company, for example, or one vendor’s inventory system to synchronise with another’s logistics platform.

At the heart of many of these data exchanges is XML.

XML (Extensible Markup Language) may not be new or flashy, but it remains one of the most powerful tools for achieving reliable, structured interoperability across diverse platforms.

Why is interoperability so hard?

Systems are built using different programming languages, data models and communication protocols. Without a shared format or structure, exchanging data can result in a complex web of custom APIs, ad hoc conversions, and manual adjustments.

To get systems working together seamlessly, you need:

  • A standardised structure for data.
  • A way to validate that structure.
  • A format that is language-agnostic and platform-neutral.

XML ticks all these boxes.

How XML enables interoperability

1. Self-describing structure

XML uses tags to clearly label data:

<customer>
   <name>Maria Ortega</name>
   <id>87234</id>
</customer>

This means that a receiving system doesn’t have to guess what each field means, as it is explicitly defined. This reduces the risk of misinterpretation and supports automated parsing.
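
A minimal sketch of what automated parsing can look like on the receiving side, using Python's standard `xml.etree.ElementTree` module. The function name `read_customer` is an assumption made for this example, as is the decision to treat `<id>` as an integer.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """<customer>
   <name>Maria Ortega</name>
   <id>87234</id>
</customer>"""


def read_customer(xml_text):
    """Return the labelled fields of a <customer> record as a dict."""
    root = ET.fromstring(xml_text)
    return {
        # Fields are looked up by tag name, not by position in the file.
        "name": root.findtext("name"),
        # Assumption for this sketch: the id is always a whole number.
        "id": int(root.findtext("id")),
    }
```

Because each value is looked up by its tag, the order of `<name>` and `<id>` in the file does not matter to the receiver.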

2. Schema validation

Using XSD (XML Schema Definition) or DTD (Document Type Definition), you can define the rules that an XML document must adhere to, such as which elements are required, which data types are valid and what the structure must be.

This is critical for:

  • verifying incoming data
  • preventing malformed or incomplete exchanges
  • ensuring consistency across multiple systems
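
Python's standard library parses XML but does not validate it against an XSD or DTD, so the sketch below is a simplified, hand-rolled stand-in rather than real schema validation. It checks the same kinds of rule (required elements and data types) for the customer record shown earlier. The rule table `CUSTOMER_RULES` and the function `check_customer` are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a <customer> record: child tag -> type converter.
# A real XSD would state the same constraints declaratively.
CUSTOMER_RULES = {"name": str, "id": int}


def check_customer(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Malformed input is rejected before any rule is applied.
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != "customer":
        problems.append(f"unexpected root element <{root.tag}>")
    for tag, convert in CUSTOMER_RULES.items():
        value = root.findtext(tag)
        if value is None or not value.strip():
            problems.append(f"missing required element <{tag}>")
            continue
        try:
            convert(value.strip())
        except ValueError:
            problems.append(f"<{tag}> is not a valid {convert.__name__}")
    return problems
```

In production, a proper XSD validator would replace this function. The value of a schema is that the rules live in a shared file that every party validates against, rather than in each receiver's code.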

3. Namespaces for avoiding collisions

XML namespaces prevent tag name conflicts when data from different sources is combined.

<doc xmlns:h="http://www.w3.org/TR/html4/" xmlns:f="http://www.w3schools.com/furniture">
   <h:table>…</h:table>
   <f:table>…</f:table>
</doc>

Without namespaces, systems could misinterpret elements with the same name but different meanings.
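
The sketch below shows how a parser keeps the two `table` elements apart, again using Python's `xml.etree.ElementTree`. The namespace URIs are the ones from the example above; the text inside each table is placeholder content added so that there is something to read back, and `tables_by_namespace` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Same structure as the example above; the element text is placeholder content.
DOC_XML = """<doc xmlns:h="http://www.w3.org/TR/html4/"
     xmlns:f="http://www.w3schools.com/furniture">
   <h:table>rows and cells</h:table>
   <f:table>oak, four legs</f:table>
</doc>"""

# Prefix-to-URI map used for lookups. Only the URIs have to match the
# document; the prefixes here are local to this code.
NAMESPACES = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}


def tables_by_namespace(xml_text):
    """Return the text of each <table>, keyed by what kind of table it is."""
    root = ET.fromstring(xml_text)
    return {
        "html": root.findtext("h:table", namespaces=NAMESPACES),
        "furniture": root.findtext("f:table", namespaces=NAMESPACES),
    }
```

Internally the parser stores each tag together with its full namespace URI, so the two elements have different names even though both are written as `table`.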

4. Cross-platform compatibility

XML is plain text. Any system that can read a file can read it, whether it’s written in Java, .NET, Python or COBOL. This makes it ideal for long-term data exchange and integration between legacy and modern systems.

XML in real-world interoperability

Healthcare: HL7 CDA/FHIR

Hospitals, clinics, insurance providers and pharmacies rely on XML-based formats to exchange clinical records, billing data and prescriptions. HL7’s CDA (Clinical Document Architecture) is a strict XML schema that is used worldwide.

Government: e-government forms and tax data

Tax filings, business registrations and compliance documents are often submitted in XML format. This ensures consistent structure across various jurisdictions and software vendors.

Publishing: DITA and JATS

XML standards are used for modular content creation and journal publishing to allow interoperability between authors, editors, publishers, and archive systems, even if they are using different tools.

Finance: XBRL

XBRL (eXtensible Business Reporting Language) uses XML to standardise financial reports, enabling regulators, investors and analysts to automatically process and compare data from thousands of companies.

Summary

Interoperability isn’t just about convenience. It’s about accuracy, consistency and trust. XML’s rigidity helps to enforce that trust.

XML may not be trendy, but it remains the backbone of system-to-system interoperability. Its structured format, validation tools and long track record make it essential wherever precision and compatibility are non-negotiable.

If your systems need to communicate reliably and seamlessly across platforms, XML is one of the best languages they can use.

Danish Ministry switching from Microsoft Office/365 to LibreOffice

Following the example of the German state of Schleswig-Holstein, which is moving 30,000 PCs from Microsoft Office/365 to LibreOffice, the Danish Ministry of Digitalisation is doing the same.

Caroline Stage Olsen, the country’s Digitalisation Minister, plans to move half of the employees to LibreOffice over the summer, and if all goes as expected, the entire Ministry will be free from Microsoft Office/365 later in the year.

In a LinkedIn post, Olsen summarised the reasons for switching to LibreOffice:

We must never make ourselves so dependent on so few that we can no longer act freely. Because far too much public digital infrastructure is today tied up with very few foreign suppliers. This makes us vulnerable. Also financially.

That is why we are now testing in parallel at the Ministry of Digitization how it works in practice when we work with open source solutions. Several municipalities are doing the same.

Not because we think it’s easy – but because we know it’s necessary to lead the way if we want to create more competition and innovation – and reduce our dependence on the few.

We in the LibreOffice project welcome this move, and look forward to seeing more governments and organisations getting control of their digital sovereignty and using public money for public code.

XML: a technology at the heart of our daily lives

In my last article, I mentioned XML several times, perhaps assuming that all users had a basic understanding of it. Rereading it, I realised that an introduction to XML was needed for non-technical users, those who use XML every day without realising it, when they open a document, check the weather, place or receive an order online, or issue a digital invoice. XML works silently behind the scenes.

But what exactly is XML and why should it matter to non-techies? I will try to explain it in simple terms.

XML stands for eXtensible Markup Language, a way of organising information in a format that is easy for both people and computers to understand, helping different applications communicate and exchange data using a common language. Put simply, XML is a digital container that clearly labels information.

For example, this is a shopping list in XML format:


<groceryList>
  <item>
    <name>Bread</name>
    <quantity>1 loaf</quantity>
  </item>
  <item>
    <name>Milk</name>
    <quantity>2 litres</quantity>
  </item>
</groceryList>

Labelling helps computers and software understand exactly what each piece of information means.
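
For readers curious about what "understanding" looks like in practice, here is a short Python sketch that reads the shopping list above. It uses the standard `xml.etree.ElementTree` module; the function name `read_groceries` is made up for this example.

```python
import xml.etree.ElementTree as ET

GROCERY_XML = """<groceryList>
  <item>
    <name>Bread</name>
    <quantity>1 loaf</quantity>
  </item>
  <item>
    <name>Milk</name>
    <quantity>2 litres</quantity>
  </item>
</groceryList>"""


def read_groceries(xml_text):
    """Return a list of (name, quantity) pairs, one per <item>."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("name"), item.findtext("quantity"))
        for item in root.findall("item")
    ]
```

The program never has to guess which value is the name and which is the quantity, because the labels say so.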

In a hyperconnected world like ours, where apps and systems share data, XML allows that data to move between very different systems, such as credit card management apps and online shops. Without a common language like XML, communication between these systems would be much more complicated and slower, or even impossible.

So, XML is integrated into most everyday activities, even though it is completely hidden from users:

  • All documents created by all office suites use XML, in some cases to facilitate transparency and interoperability, and in other cases to create a hidden layer of complexity with the aim of preventing transparency and interoperability.
  • All apps that provide weather forecasts obtain updates by reading XML data issued by weather agencies.
  • Almost all e-commerce applications use XML to manage communication between the website, the payment system, the bank and the shipping service.
  • All blogs and news sites use XML to automatically transmit new content to readers.
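
One widely used XML format for the last item in this list is RSS. The Python sketch below reads a small, hand-written RSS 2.0 feed; the feed title, post titles and example.org links are placeholders rather than a real feed, and `list_posts` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written RSS 2.0 feed with placeholder titles and example.org links.
FEED_XML = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.org/</link>
    <description>Placeholder feed</description>
    <item>
      <title>First post</title>
      <link>https://example.org/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second</link>
    </item>
  </channel>
</rss>"""


def list_posts(feed_text):
    """Return (title, link) pairs for every <item> in the feed's channel."""
    channel = ET.fromstring(feed_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]
```

A feed reader does essentially this at regular intervals and shows the reader any items it has not seen before.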

XML is clear and easy to read because it organises data in an orderly manner with labels that are understandable to both humans and computers; it is flexible, as it is not limited to a single type of information and can be customised for different scenarios, from cooking recipes to flight schedules; and it is compatible with all platforms.

To appreciate the value of XML, you don’t need to have a deep understanding of the language, just know that it exists and that – when used properly, as in the case of the ODF format – it has the potential to help users achieve and protect their digital sovereignty.

Of course, it is equally important to know that XML can be used in exactly the opposite way, as is the case with the OOXML format used by Microsoft 365 (and previously by Microsoft Office), to limit users’ digital sovereignty and perpetuate lock-in through artificial file complexity.

In summary, XML is a silent enabler that ensures that users’ apps, services and data all speak the same language.

The next time you open a document, check your favourite news site or follow an online delivery, remember that XML is working silently behind the scenes to ensure that everything runs smoothly. And try to imagine a digital world without XML, where a single company controls the data and, through it, the users.