ODF vs OOXML, an issue that should never have existed

A number of journalists read last week’s piece as an attack on Microsoft. We want to explain what they walked past.

Whenever we address the contrast between ODF and OOXML, some people perceive it as a campaign against a company. It is not. We are trying to do something far more useful: to make the structural problem with the standard document format clear to those who have to live with it, namely public officials, educators and, above all, individual citizens.

All these people find themselves facing a problem they did not create, but which affects them daily, and of which they are often the unwitting victims, every time they create a document or receive one.

The least we can do – and in fact we have been doing it for twenty years, though until now almost no one has listened – is to explain, clearly and without drama, how the problem arose, why it persists, and why ODF is the only way out. It is an educational and selfless goal – we do not sell software, so we have no commercial interest to protect – and not an attack on a company.

The problem concerns the current document landscape, based almost exclusively on a proprietary format controlled by a single company, and what we could have had instead: a standard format controlled by an independent community of stakeholders.

Microsoft features in this story because of the rational-monopolist behaviour it has exhibited since 2006, during and after the standardization of the proprietary OOXML format: first promising the standard and then doing everything possible to ensure it was first ignored and then forgotten, quietly but with extreme determination. All of this to protect a market share now worth over $30 billion, which would have been at risk of erosion if the document format had been genuinely standardized: migration to any other office suite would then have been free of cost and complexity.

Today, most organizations – public agencies, supranational bodies, companies – and most individual users face a problem that, had everyone listened to independent experts between 2006 and 2008, would never have existed. The international standards system and national governments allowed a single vendor – rather than the community of developers, systems analysts and standards scholars who raised objections – to set the terms under which documents would be archived. That vendor chose its own proprietary format.

The problem, in other words, was created by institutions – ISO, national standards bodies, public officials and ultimately politicians – who approached the choice of format for public documents in a completely uncritical manner. They trusted the process despite repeated and legitimate protests about its transparency, and never thought to perform a simple file analysis that would, in a few minutes, have raised more than a few doubts. The industry then followed the vendor’s lead, for convenience, because it expanded the business – without weighing the medium- and long-term consequences for institutions and individual users. What is troubling is that even a segment of the open-source industry went with the flow, and continues to do so, as shown by the fact that today only two open-source office suites – LibreOffice and Collabora Office – use ODF as their native file format.

If between 2006 and 2008 everyone had done their part, today there would be a single open, multi-vendor interoperability standard for office documents – our ODF – governed neutrally and implemented by all. Everyone would have benefited, because document exchange based on a true standard is completely transparent and independent of operating system and application software. Microsoft could have kept its own internal proprietary format as a mere implementation detail, invisible to users, because documents would have flowed seamlessly through the standard. An ideal world that never became reality.

Instead, the accelerated standardization of OOXML through ISO in 2008, against all technical objections, produced the OOXML Transitional format we use today: a temporary compatibility mode, explicitly defined as a bridge to be crossed once and then dismantled. It was not dismantled. It became the only variant used, at every level, by the majority of office suites. Today the vast majority of office documents worldwide – including the public documents of public institutions and of governments everywhere – are saved in a format that its own designers had declared provisional.

Even OOXML Strict would not solve the problem. Microsoft has never promoted it – part, as we have explained, of an understandable strategy – and none of those who were supposed to oversee the process ever requested or verified its implementation by the deadlines promised at standardization, from 2010 onward. But the deeper point is this: Strict is simply a different variant of the same single-vendor format. A standard is not open because its specification has been published. It is open when it is developed through a transparent process that no single company can control, and maintained by an independent community of users and implementers. Replacing Transitional with Strict changes the variant but leaves governance – which is what determines sovereignty – exactly where it was.

So when we advocate for ODF, we are not criticizing anything. We are trying to clarify a problem that was artificially created, and to ask why a problem that was artificially created is treated by most stakeholders – organizations, governments, companies and individuals – as an established fact of nature.

Attention to digital sovereignty is growing, even if resistance remains strong, because awareness of this issue – which should never have arisen in the first place – is still virtually nonexistent, not only among users but among industry professionals themselves.

We continue to believe ODF can regain the role it should have had after 2006, when it was approved – rightly – as an ISO standard, because it had every characteristic of an open standard. The Deutschland Stack restores that role to ODF, and we hope the German government’s decision will not remain isolated.

New Web and Mobile Strategy for LibreOffice

LibreOffice is a desktop application, and we will continue making it. But we have constant requests for web and mobile versions, so here is our updated plan. These are minutes from the TDF Team and Board of Directors meetings on web and mobile strategy for LibreOffice:

Who was present

Team: Michael Weghorn, Jonathan Clark, Sophie Gautier, Neil Roberts, Mike Saunders, Guilhem Moulin, Heiko Tietze, Ilmari Lauhakangas, Dan Williams, Xisco Fauli, Christian Lohmaier, Vissarion Fysikopoulos, Juan José Gonzalez, Olivier Hallot, Florian Effenberger, Hossein Nourikah

Board: Eliane Domingos, Mike Saunders, Paolo Vecchi, Simon Phipps, Sophie Gautier

Summary

The meetings, which took place on April 20, April 22 and May 19, focused on LibreOffice and TDF strategies for the evolving development landscape and the future of LibreOffice across all platforms – desktop, mobile, and cloud. Team roles were reviewed, and new assignments were proposed.

Status of the current foundation team activities

Since 2020, LibreOffice development within the foundation has focused almost exclusively on the desktop version (and, to a lesser extent, the Android viewer app), and that work will continue unchanged. The foundation will therefore continue to deliver two major LibreOffice releases per year.

Engineering Steering Committee (ESC)

The current ESC members and activities remain unchanged, and weekly meetings continue with reports on activities, releases, topics and project management. The meeting, as always, is open to the development community.

Community support

No changes in vision for community support. Regional events and special projects remain as they are, and require proper and timely project submission and available budget. Google Summer of Code and Outreachy will continue as before. The LibreOffice Conference 2026 is planned and will take place in Pordenone, northern Italy.

Marketing and communications

Marketing and communications will adapt to the current situation of the foundation and LibreOffice. More communication of team activities and product development is needed, as well as better use of social networks for mass communication. Unification of the several different blogs is under consideration.

Challenges ahead

The foundation is challenged to address the following areas:

  • Develop an online and mobile version of the suite. The challenge is to select technology that meets both end-user and server-side management needs
  • Innovate in collaboration such as peer-to-peer document editing
  • Continue to produce two releases per year of the desktop and Android viewer versions
  • Improve the user interface and usability of LibreOffice
  • Keep the quality and security of the office suite
  • Develop new features and improve current features
  • Cherry-pick relevant features and improvements from other software producers
  • Full support of the Open Document Format (ODF)
  • Produce adequate documentation for development processes and the current and new products
  • Be an active participant in the major open source communities and in government initiatives for FOSS and nations’ sovereignty
  • Preserve donation inflow and pursue corporate or government donations through development projects

New assignments of the team

It was suggested that the team be split into two groups, with proper interaction between them. Additional headcount and external contracts are being considered to fulfil the mission. New community developers will be assigned to tasks upon demand.

Of critical importance, suite security and CVE management were assigned to Christian Lohmaier (Release engineer) and Xisco Fauli (Quality Control). Coverity and OSS-Fuzz services are assigned to Xisco Fauli. These new missions require additional manpower, and provision for hiring an additional QA specialist is needed.

The team will select valuable technology and code under FOSS licenses, and from companies using LibreOffice Technology.

Mobile, cloud and peer-to-peer development

Mobile and cloud development management is assigned to Jonathan Clark (leader), with support from Dan Williams, Michael Weghorn and Neil Roberts. The planning and priority goals established are based on Jonathan Clark’s “Web and Mobile Development Strategy Proposal” for the remainder of 2026, and include:

  • WebAssembly (WASM) Optimization: Enhancing and polishing our functional prototype based on Qt 6 and WebAssembly. This technological route will run the application robustly and natively inside the user’s browser, without overloading hosting servers.
  • Accelerating the mobile project: The goal for 2026 involves technical advancement in the graphical user interface (GUI) code and testing builds on Android and iOS emulators, with advisory support from Dan Williams for iOS-specific topics.
  • Smart collaborative editing: We will initiate practical collaborative editing tests using a stable client-server architecture (via direct TCP/IP connections), paving the way before advancing to peer-to-peer (P2P) network research.

Conclusion

The Document Foundation is challenged to evolve and expand LibreOffice to other computing platforms, and to include collaborative editing. This requires changes in the current team activities, mission and organization. The Board and the team are fully committed to addressing these challenges and to reporting publicly on development progress and achievements. Freedom has never been so valuable for the LibreOffice community.

Discuss our plan and strategy on our forum here

LibreOffice Native Language Projects – TDF Annual Report 2025

LibreOffice is available in over 120 languages, thanks to the work of localisation communities around the world. We asked them to summarise their work in 2025 – here’s what they had to say…

Czech

The Czech community maintained an active presence both online and in person. Their localisation efforts remained strong, keeping the UI fully translated and the Help files at 95% completion. The team also stayed connected with their user base through the Czech Ask LibreOffice site, along with a social media presence across X, Facebook, Instagram and Mastodon.

There was also outreach at events. The team hosted dedicated LibreOffice booths at InstallFest in April and LinuxDays in October, both held in Prague. Documentation also saw significant updates, with the publication of the Getting Started Guide (24.8), the Calc Guide (25.2), and the Draw Guide (25.8).

LibreOffice booth at LinuxDays 2025 in Prague

Danish

The Danish community focused on multimedia education and consistent localisation in 2025. It launched the @libreofficeskolen (“LibreOffice School”) YouTube channel, which provides the Danish-speaking public with a series of instructional videos designed to lower the barrier to entry for new users. Alongside this output, the community kept the UI and Help files fully translated at 100%, and ensured that LibreOffice promotional videos were accessible via localised subtitles.

Dutch

Beyond maintaining the local website and providing assistance via the Ask LibreOffice website and mailing lists, the Dutch-speaking community worked on many documentation updates.

Beginning in January with the Calc Guide for 24.8, the community then published a steady stream of translated manuals for version 25.2, including the Writer, Impress, Math, and Getting Started Guides. This effort then led to the release of the updated 25.2 Calc Guide in July. On the localisation front, the Dutch team continued their work on Weblate, successfully maintaining 100% translation coverage for both the User Interface (UI) and the Help system, following upstream changes.

Finnish

The Finnish community focused on steady and ongoing translation efforts. The team prioritised localisation of the UI, with secondary work continuing on the Help system. To ensure the long-term sustainability of these efforts, the community has been proactive in outreach, utilising the vapaaehtoistyo.fi online platform to recruit new volunteers.

French

On the technical front, the French-speaking team maintained 100% translation coverage for both the UI and Help systems across all versions of LibreOffice. Their localisation work extended to the new Hugo-based website, release notes, and the Extensions wiki page. Significant progress was also made on the translation of Calc functions on the wiki and the subtitling of promotional videos.

Outreach was a major topic in 2025, with the community representing LibreOffice at events like Capitole du Libre in Toulouse, and Open Source Experience in Paris. The team also worked on academic ties, coordinating with UBO University to involve translation students in user guide writing. Beyond documentation and QA, the French team supported users through the Ask LibreOffice site and published various articles on LinuxFR. In addition, there were REGEX tutorials for civil servants and introductory presentations at public media libraries.

German

Throughout the year, the German-speaking community wrote blog posts (and translated others from the English-language blog), maintained its social media activity on Mastodon, and worked on user interface translations. Community members also attended local events on behalf of the LibreOffice project, such as the Augsburger Linux-Infotag 2025 and Digitaltag 2025 in Duisburg.

LibreOffice booth at the Augsburger Linux-Infotag 2025

Irish

The Irish-speaking community made significant steps in 2025 to bring the suite to native speakers. Currently, the UI and website translations are nearing 100%, with the LibreOffice 26.2 user interface already reaching a 96% completion rate. The team’s primary focus is now on finalising these remaining strings and resolving technical checks.

Italian

The Italian-speaking community maintained 100% translation status for the UI and Help files across all active versions of the suite. The team helped with localising the project’s new Hugo-based website and kept the Italian-speaking public informed by translating all release notes and press releases. Current efforts are focused on the ongoing translation of Calc functions on the wiki and a comprehensive revision of various wiki pages.

In 2025, the Associazione LibreItalia organised a full-day LibreItalia conference in Gradisca d’Isonzo, following the adoption of a regional law mandating the use of free open source software in Friuli Venezia Giulia, an eastern Italian region bordering Slovenia. The politician who signed the law provided an overview of the approval process.

The event was organised by Marco Marega, a long-standing member of LibreItalia who is active in the localisation team and other areas of the project. Several members of the Pordenone LUG attended the conference and initiated a discussion about organising the 2026 LibreOffice Conference in their city. This discussion then evolved into an official proposal.

Japanese

The Japanese community had a busy year in terms of events. There was the LibreOffice Asia Conference 2025 in Tokyo, a two-day event that brought together 70 attendees. Outreach extended internationally as Japanese members travelled to COSCUP 2025 in Taiwan to deliver three talks and strengthen ties with the Taiwanese community.

The community also organised:

  • Online Hackfests: Held 46 times via Jitsi and YouTube Live
  • Online Study Parties: Three sessions dedicated to user knowledge sharing
  • LibreOffice Days: Monthly offline meetups in Osaka, co-hosted with the Open Data Mokumoku-kai
  • Open Source Conferences (OSC): Booths and hackfests at seven locations across Japan, from Hokkaido to Fukuoka

On the documentation front, the team published the Writer Guide for LibreOffice 25.2 in Japanese. Localisation efforts currently stand at 91% for the UI and 46% for Help. The team also remained responsive to end users, answering nearly 50 new questions on Ask LibreOffice, publishing 20 blog articles, and maintaining a steady presence on X, Facebook and Bluesky.

LibreOffice Asia Conference 2025 logo

Kazakh

Starting in late 2025, the community launched a refresh of its translation efforts, achieving 100% UI completeness in time for the LibreOffice 26.2 release. This work extended to the localisation of the official website and the activation of the Help master branch, preparing for future documentation projects.

To improve consistency across other open source projects, the team is currently developing a unified Kazakh glossary derived from various localisation projects. Furthermore, the community has begun testing the use of AI-assisted translations, reporting high-quality results to improve their workflows in 2026.

Tagalog

The Tagalog community made steps forward in localisation, maintaining the user interface and Help files at a high completion rate of 98–99% across all versions. The team continued to integrate Deep Language Modeling to automate accuracy verification. While the community experiences a natural ebb and flow of contributors, there is growing interest in expanding support to regional dialects, such as Ilocano.

The team also wishes to extend a special note of gratitude to the dedicated group of US-based translation helpers whose contributions were vital to success in 2025.

TDF says: many thanks to all native-language projects for their work in 2025! Of course, this is just a selection of their activities, based on communities that reported their activities, but there are many more too.

Like what we do? Support the LibreOffice project and The Document Foundation – get involved and help our volunteers, or make a donation. Thank you!

How your donations help the LibreOffice project and community

LibreOffice is free thanks to your donations. Here’s how your support helps us to improve the software and grow the community that makes it 😊 (Note: this video is also available on PeerTube.)

There is no digital sovereignty without ODF

Any other choice is a choice of dependence on a single vendor

Digital sovereignty begins with the document format. Everything else – server location, hosting jurisdiction, procurement clauses – is downstream of this single decision. If the format is standard and open, the user controls the document. If the format is proprietary, the vendor controls it, even when the file sits on the user’s own hard drive.

This is why LibreOffice, and its derivatives such as Collabora Office and Online, are today the only legitimate choice for governments, supranational bodies, businesses and organisations that want to protect the digital freedom of their users. Only software based on the LibreOffice source code – the LibreOffice Technology – uses ODF as its native document format. Every document saved, stored, retained and exchanged in ODF remains the exclusive property of its author, and remains so over the years.

ODF – Open Document Format, as the name says – was designed and developed in accordance with the characteristics of a true open standard: clearly documented, transparently developed by an independent body, properly versioned, built on existing standards, and stored in XML files that any user can read.

None of this applies to OOXML. The name is itself an oxymoron: XML stands for eXtensible Markup Language, which is open by definition, but OOXML’s syntax is so complex that it is unreadable even to advanced users. The format was deliberately designed to become a sophisticated lock-in tool at a moment when Microsoft’s other strategies had already been uncovered and analysed.

The Transitional/Strict bait-and-switch

OOXML was approved as an ISO standard through a process that was an affront to transparency, ethics, common sense and respect for users. The format is documented in a way that discourages consultation – over 7,500 pages – and is developed by Microsoft behind closed doors in Redmond.

It is not versioned. It uses no independent standards. On the contrary, it relies on proprietary Microsoft formats wherever possible, in some cases formats that Microsoft itself had deprecated because the market rejected them. It is not even compatible with the Gregorian calendar. The XML schemas are nearly absurd in their complexity.

The bait-and-switch worked like this: “I swear it will be Transitional until 2010, very proprietary and very little of a standard, and after that only Strict, not very proprietary and very much a standard.”

The catch: Strict never materialised in practice. For years it lingered as a last-resort option that no one was meant to use, and it has now disappeared from the Save As options altogether. The standardised version of OOXML – the one ISO was told would become the real format – no longer exists as a user choice. Only Transitional remains.

A pity, because we would have had a laugh with Strict’s bugs. Excel has a thing for getting dates wrong (the (in)famous 1900 leap-year bug, inherited from Lotus 1-2-3 and never fixed), and when Excel gets dates wrong, no other software does it worse.
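The leap-year bug can be made concrete with a short sketch. It assumes the widely documented 1900 date system, in which serial 1 is 1 January 1900 and serial 60 is shown as 29 February 1900, a day that never existed. The function name is hypothetical and this is an illustration, not any suite's actual conversion code.

```python
import calendar
from datetime import date, timedelta
from typing import Optional


def serial_1900_to_date(serial: int) -> Optional[date]:
    """Map a 1900-system spreadsheet serial to a real calendar date.

    Assumption: serial 1 is 1900-01-01 and serial 60 is the fictitious
    1900-02-29. Returns None for serial 60 because no such day exists.
    """
    if serial < 1:
        raise ValueError("serial must be >= 1")
    if serial == 60:
        return None  # the phantom leap day
    if serial < 60:
        # Before the phantom day the count is still in step with the calendar.
        return date(1899, 12, 31) + timedelta(days=serial)
    # After it, every serial is one higher than the true day count.
    return date(1899, 12, 30) + timedelta(days=serial)


if __name__ == "__main__":
    print(calendar.isleap(1900))      # 1900 is not a leap year
    for s in (59, 60, 61):
        print(s, serial_1900_to_date(s))
```

Any consumer of such serials has to carry this special case forever, which is the kind of legacy behaviour the paragraph above refers to.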

The political consequences

All of this is hard to grasp by looking at what happens on screen, because the document seems entirely harmless in its apparent simplicity. And yet all of it has been documented in detail since OOXML was first introduced, by independent experts who should have been heard, both by ISO and by those working in advanced technology.

Instead, ISO bought the Transitional/Strict story. And once ISO believed it, governments and politicians believed it too, rushing to adopt OOXML as a document format for fear that Bill Gates and Steve Ballmer might take offence and act accordingly.

In doing so, they placed citizens’ private data in Microsoft’s hands and reinforced a monopoly that was already evident before OOXML’s arrival, and that has become increasingly difficult to dismantle ever since.

The Microsoft ecosystem played its part in all this, and partner companies – SAP foremost among them – have always done everything in their power to push their users toward OOXML for data exchange, openly obstructing the use of the standard ODF format. An uneven struggle, by design.

Worse still, with just a few exceptions, even those who by virtue of their expertise should have recognised OOXML as the cornerstone of Microsoft’s new lock-in strategy fell for it. Some still write today: “we have to accept it, OOXML is an ISO standard.” This is not a serious position.

It is a deference with no rational basis.

Microsoft’s monopoly position is not founded on technological superiority but on the strategic foresight of Bill Gates and the lobbying machinery that flowed from it, deployed well ahead of its time.

The same deference has had consequences in the scientific community as well.

The HUGO Gene Nomenclature Committee was forced in 2020 to rename dozens of human genes – including SEPT1 and MARCH1 – because Excel kept silently converting their symbols to dates. Rather than going to Microsoft and demanding a bug fix, scientists preferred to throw years of established nomenclature down the drain to avoid upsetting Redmond. A revealing precedent.
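As a rough illustration of why symbols such as SEPT1 and MARCH1 are at risk, the sketch below flags strings that consist of a month name or abbreviation followed by a day number. The pattern is an assumption made for this example; it is not Excel's actual parsing logic, and the function name is hypothetical.

```python
import re

# Month names and abbreviations that a date parser might plausibly accept.
# This list is an illustrative assumption, not a vendor specification.
_MONTHS = "JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC"
_DATE_LIKE = re.compile(rf"^({_MONTHS})-?(\d{{1,2}})$", re.IGNORECASE)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol reads as 'month + day' (for example SEPT1)."""
    match = _DATE_LIKE.match(symbol.strip())
    if not match:
        return False
    return 1 <= int(match.group(2)) <= 31


if __name__ == "__main__":
    symbols = ["SEPT1", "MARCH1", "SEPTIN1", "MARCHF1", "TP53"]
    print([s for s in symbols if looks_like_date(s)])
```

SEPTIN1 and MARCHF1 are the replacement symbols for the two genes named above; they no longer match the pattern, which is the whole point of the renaming.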

Supporting ODF is not choosing ODF

There is a distinction that needs to be made plainly, because it is too often blurred, sometimes inadvertently, sometimes by design. Supporting a format is not the same as choosing it.

An office suite that saves OOXML by default is not supporting digital sovereignty, regardless of its level of ODF support. It is an OOXML suite with an ODF import/export filter, which inherits all the OOXML-based lock-in mechanisms: proprietary schemas, vendor-controlled evolution, hidden binary fragments, format-level dependencies on Microsoft’s roadmap.

Digital sovereignty lives at the native-format layer. Support describes what a piece of software can read. Native format describes what it is. The native format determines the legal and technical character of every document the user creates.

A commitment to “improve ODF support” is not a commitment to digital sovereignty. It is a commitment to keep ODF as a guest in someone else’s house.

This distinction matters for any project, coalition, or procurement decision that claims a digital sovereignty objective. The meaningful question is never whether ODF is supported – it almost always is, at some level – but whether ODF is the native format, chosen and committed to as such.

If the answer is anything other than yes, the sovereignty claim is provisional at best.

What digital sovereignty actually requires

The only viable path to digital sovereignty today is to use ODF as the native document format, and OOXML as the interoperability format for exchange with users who – out of lack of information, or pure convenience – continue to use the proprietary format, and share ownership of their own files with the vendor.

Anything else is false digital sovereignty. Control over a document and over the information it contains depends first on the format and only afterwards on the location of the server.

Standard, open format: the user is in control. Proprietary format: the vendor is in control, even if the document sits on a PC on the user’s desk.

This should be self-evident to anyone working in open source software, because it follows directly from its principles.

A proprietary document respects neither Freedom 1 (the freedom to study and modify) nor Freedom 3 (the freedom to improve and redistribute), as it is not documented in a way that makes the source code readable and it is not developed through a transparent process.

The decision to adopt OOXML as the native format runs counter to the interests of governments, supranational bodies, organisations of every kind and enterprises. But above all, it runs counter to the interests of users as it exploits their lack of information rather than investing in their education and in their digital sovereignty.

The choice of native format is not a technical detail to be deferred or finessed. It is the choice. Any project that treats it as something less is not supporting digital sovereignty. Full stop.

Why a digital document is a piece of software, and what that means for your freedom

Most people, including many competent software developers, think of a digital document the way they think of a sheet of paper: an inert object that holds words and pictures, indifferent to the tool used to open it. This intuition is wrong, and the consequences of getting it wrong shape everything from vendor lock-in to cybersecurity to the long-term readability of public records.

A digital document is not paper. It is a piece of software.

The HTML parallel

The clearest way to see this is to think about a web page. When you visit a website, your browser receives a file – an HTML document – and executes it. It parses the markup, applies styling rules, runs embedded scripts, fetches additional resources, and assembles the result into something you can read. The page you see on screen is not a static image transmitted from the server; it is the output of a small program that your browser ran on your behalf.

Nobody disputes that a web browser is software. Yet the HTML file it consumes is also, in a meaningful sense, software: a set of instructions describing what should happen when the file is opened. Change the instructions, and the rendered page changes. Withhold the specification of how the instructions should be interpreted, and only the party holding the specification can guarantee a faithful rendering.

It is worth remembering that the openness of HTML did not happen by accident, and was nearly lost. In the early 2000s, Internet Explorer 6 commanded around ninety per cent of the browser market, and Microsoft used that dominance to push proprietary extensions to HTML, CSS, and the document model: non-standard tags, behaviours, and filters that worked only in their browser.

Web developers, desperate to reach users, began coding both to Internet Explorer and to the standard, carrying the cost of that double work themselves, while the vendor reaped the benefit of lock-in either way. The open web did not fragment, but only because developers absorbed the cost of holding it together. Had they stopped, HTML would have quietly become whatever Microsoft shipped next.

It took a sustained effort by the W3C, by competing browsers such as Firefox, and by the community of standards-conscious developers to pull the web back onto open ground. Had that effort failed, HTML today would not be a shared language, but a Microsoft product. The web survived because the standard was defended. Document formats have not always been so lucky.

An office document – a DOCX, an ODT, a PPTX, a PDF – works exactly the same way. It is a structured file containing instructions: this text in this font at this size, this image embedded here, this table laid out this way, this field recalculated automatically, this macro executed on opening. When you “open” the document, an application reads those instructions and runs them. The page you see on screen is the output of a program – the office suite – executing the instructions contained in the document.

The document is the code. The office suite is the interpreter. Together they are a software system, and the user is the one running it, usually without realising.
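The "document as code, suite as interpreter" idea can be shown with a toy example. The sketch below builds a deliberately incomplete ODF-style package in memory (a real ODT file also needs a manifest, styles and more) and then "interprets" it by walking the XML and emitting plain text. The namespace URIs are the ones ODF uses; the function names and the sample content are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# The "program": markup instructions describing a heading and two paragraphs.
CONTENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h text:outline-level="1">Minutes</text:h>
      <text:p>The document is the code.</text:p>
      <text:p>The office suite is the interpreter.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def build_toy_odf_package() -> bytes:
    """Create a minimal, NOT fully valid, ODF-like zip package in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", CONTENT_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def interpret(package: bytes) -> List[str]:
    """A tiny 'interpreter': turn heading and paragraph elements into text lines."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    lines = []
    for element in root.iter():  # document order
        if element.tag == f"{{{TEXT_NS}}}h":
            lines.append("# " + (element.text or ""))
        elif element.tag == f"{{{TEXT_NS}}}p":
            lines.append(element.text or "")
    return lines


if __name__ == "__main__":
    for line in interpret(build_toy_odf_package()):
        print(line)
```

Because the grammar of the instructions is public, anyone can write an interpreter like this one; a real office suite does the same job at a vastly larger scale.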

Why this matters: lock-in is a software property

Once you see a document as software, the question of file formats becomes the question of programming languages. A proprietary file format is a programming language whose specification is owned, controlled, and modifiable at will by a single vendor. The “programs” written in that language – your contracts, your invoices, your books, your public administration archives – can only be reliably executed by software that vendor authorises.

This is the structural mechanism of lock-in. It is not a side effect of user habit or training cost. It is the direct consequence of writing your documents in a language whose grammar belongs to someone else. The moment the vendor changes the grammar – and proprietary formats change with every new product release, and often between releases – your existing documents may render differently, lose features, or stop opening altogether. You do not own the language in which your own records are written.

Open standards such as ODF exist precisely to break this dependency. ODF is a publicly specified, independently maintained format whose grammar belongs to no single vendor. Any developer can build a faithful interpreter. Your documents, written in an open language, remain readable regardless of what any single company decides.

Why this matters: attack surface is a software property

The second consequence is security. Software has vulnerabilities; paper does not. The moment we admit that a document is software, the long catalogue of OOXML-related security advisories becomes unsurprising, and indeed inevitable.

Office document formats are ferociously complex. OOXML in particular runs to thousands of pages of specification, with macro languages, embedded OLE objects, external references, conditional formatting logic, and a substantial layer of binary legacy compatibility. Each of these is a way in for an attacker. A document that arrives by email and “just opens” can run hidden code, download malicious content from the internet, exploit weaknesses in how the file is read, and from there take control of the computer itself. The pattern recurs year after year, vulnerability after vulnerability, because the document is doing what software does: running.
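
As a rough illustration of how visible some of these entry points are, the following Python sketch lists the parts of a package and flags names that conventionally indicate active or embedded content. The markers are assumptions based on common packaging conventions as we understand them (vbaProject.bin for VBA macros and an embeddings folder for OLE objects in OOXML, Basic/ and Scripts/ folders for macros in ODF, TargetMode="External" in OOXML relationship files). The function names and the toy package are hypothetical. This is a heuristic for explanation, not a security scanner, and a clean result proves nothing about a real file.

```python
import io
import zipfile

# Heuristic markers (assumptions for illustration, not an exhaustive list).
SUSPICIOUS_NAME_PARTS = ("vbaproject.bin", "/embeddings/", "basic/", "scripts/")
EXTERNAL_MARKER = b'TargetMode="External"'

# Toy relationship file pointing at a remote template (hypothetical URL).
TOY_RELS = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate" '
    'Target="https://example.invalid/template.dotm" TargetMode="External"/>'
    "</Relationships>"
)


def build_toy_macro_package() -> bytes:
    """Build an in-memory ZIP that mimics the layout of a macro-enabled file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/vbaProject.bin", b"\x00placeholder, not real VBA")
        package.writestr("word/_rels/settings.xml.rels", TOY_RELS)
    return buffer.getvalue()


def active_content_hints(package_bytes: bytes) -> list[str]:
    """Return human-readable hints about parts that may carry active content."""
    hints = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            lowered = name.lower()
            # Part names that conventionally hold macros or embedded objects.
            if any(part in lowered for part in SUSPICIOUS_NAME_PARTS):
                hints.append(f"active or embedded part: {name}")
            # Relationship files that declare a target outside the package.
            if lowered.endswith(".rels") and EXTERNAL_MARKER in package.read(name):
                hints.append(f"external reference declared in: {name}")
    return hints


if __name__ == "__main__":
    for hint in active_content_hints(build_toy_macro_package()):
        print(hint)
```

Each marker in the list corresponds to one of the features named above. The more such features a format allows, the longer that list has to be, and the more code an interpreter must get right every time a file is opened.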

A simpler, more rigorously specified format is harder to weaponise. This is not a guarantee – any sufficiently expressive format has risks – but the principle holds: complexity is the friend of the attacker, and proprietary complexity, never fully documented to outside parties, is the best friend of all.

Why this matters: freedom is a software property

If a digital document is software, then the framework we apply to software ethics applies to documents. The Free Software Foundation defines four freedoms, numbered from zero: the freedom to run the program for any purpose (Freedom 0), to study and modify it (Freedom 1), to redistribute copies (Freedom 2), and to distribute modified versions (Freedom 3). Freedom 1 and Freedom 3 require access to the source.

A document in a proprietary format violates these freedoms in exactly the way proprietary software does. You cannot fully study how it will be interpreted, because the specification of the format is either secret, partial, or subject to unilateral change. You cannot reliably build or share modified tools to interpret it, because the format’s owner retains the right to declare your interpreter non-conformant. The “source code” of the document – the full and stable specification of what its instructions mean – is not in your hands.

This is not a metaphor. It is the same dependency, structurally, that makes proprietary software unacceptable for any organisation serious about digital sovereignty. The document, as software, inherits the politics of the format it is written in.

The conclusion is unavoidable

A digital document is a small program. It runs every time it is opened. The language it is written in determines who controls it, who can attack it, and whether its readers are free.

Treating documents as paper has allowed a generation of policymakers, public administrators, and even technologists to overlook the fact that the choice of document format is a choice of software dependency, and a choice of whose grammar governs our written record. There is no neutral format, just as there is no neutral programming language. There are only formats whose specifications are open, stable, and collectively governed, and formats that are not.

We have learned, slowly and at cost, to demand openness in our software. The document is software. The demand is the same.