The Document Foundation: the name that pointed at the right thing, 16 years early

When The Document Foundation was announced sixteen years ago, some people found the name a little flat. It didn’t sparkle. It named an object — the document — rather than a product, a movement, or an aspiration. Today, that same name is worth a second look, because it turns out to have pointed at exactly the place the digital sovereignty debate would eventually arrive.

To see why, it helps to ask a simple question: when you are locked into a piece of software, where does the lock actually live?

The intuitive answer is “in the application.” You feel trapped by the program — its menus, its habits, the licence you keep renewing. But the application is replaceable. You can install a different one tomorrow. What you cannot so easily replace is your documents — the years of contracts, records, reports, and correspondence you have produced. And if those documents are saved in a format that only one company’s software can fully read, then the lock was never really in the application at all. It was in the file.

This is the quiet mechanism behind most document lock-in. The format does the trapping. As long as your organisation’s memory is stored in a format controlled by a single vendor, you depend on that vendor to read your own past — and that dependency does not end when you switch programs, because the documents come with you.

This is also why “digital sovereignty” is not, at root, a question about geography or about which company you buy from. It is a question about control: whether you, and not a supplier, hold the keys to your own information over time. An organisation that cannot open its own archives without permission is not sovereign over them, wherever it happens to be located.

The answer is older and simpler than the debate that has grown up around it: open document standards. A document saved in an open, fully published format — one any software can implement, today or in fifty years — belongs to the person who wrote it, not to the company whose program happened to create it. The format stops being a lock and becomes what it should always have been: a neutral container for your own words.

The name said this all along. It put the document at the centre, because the document is where the question is decided. Sixteen years on, the rest of the conversation is catching up — and we have only just begun to scratch the surface.

Euro-Office, open standards, and native ODF

A welcome commitment to open standards — and why it should end with ODF as Euro-Office’s native document format.

The Euro-Office pre-announcement has generated considerable coverage across the European press over the past few days. The Document Foundation welcomes the attention that open standards are receiving — and welcomes still more the commitment the announcement makes to them. Before the discussion settles, we would like to clarify one point and state one expectation.

Several reports have described Euro-Office as “the first European open source office suite.” Reading the pre-announcement carefully, we do not find the coalition making that claim, and it is not one we would endorse. Europe has been building free and open source office software for many years: LibreOffice, developed by this Foundation and a worldwide community, is itself European, mature, and far from alone.

The “first” framing appears to have emerged in the speed of a launch day rather than in the text of the announcement. We note it not to claim precedence — precedence is not the point — but because accuracy serves the cause of open standards better than enthusiasm alone.

Read on its merits, the announcement gives a great deal to welcome. The promise to improve support for the OpenDocument Format is precisely what the European free software community has long asked for, and we take it in good faith and with genuine appreciation. We have always held that sovereignty begins with the format, not with the logo on the application — and a coalition that understands this is one worth encouraging.

We would also state an expectation, in the spirit of encouragement rather than demand. Improved support is a beginning, not a destination. A format that is merely supported is one a suite can read and write as a courtesy, while a native format is the one in which its documents are created, stored, and trusted across the years — and that is precisely where digital sovereignty is won or lost.

The only destination consistent with the sovereignty Euro-Office invokes is ODF as its native document format. A genuinely European, genuinely sovereign office suite cannot treat the open standard as a concession to outsiders; it has to speak ODF as its mother tongue. The Document Foundation looks forward to that moment, and will be glad to acknowledge it when it comes.

An open letter to office suite users, just before the Euro-Office announcement

Dear office suite users,

In recent days you will have read various articles announcing the arrival of Euro-Office, which is being “marketed” as the first open-source office suite developed in Europe. We feel compelled — reluctantly, since open source should rest on transparency, not deception — to correct this claim. The first open-source office suite developed in Europe was OpenOffice.org in 2001, based on StarOffice’s source code, followed by LibreOffice from 2010.

These are two genuine open-source office suites, built from source code that originated in Europe. They are not a freeware clone of MS Office whose code provenance is undisclosed, nor a product that has rebranded itself out of pure opportunism to ride today’s wave of Digital Sovereignty.

It is worth remembering that many of those who champion Digital Sovereignty today were silent back in 2006, when the open ISO/IEC ODF standard — the pillar of Digital Sovereignty — was announced: not only did they not listen to us during all these years, but in some cases they greeted us with a condescending smile.

If we can speak of Digital Sovereignty in Europe today, it is thanks to The Document Foundation and LibreOffice community members at large, who kept the flag of open-source office suites flying when everyone was predicting their demise, and who continued to develop the only truly open and standard format that guarantees Digital Sovereignty, as it provides full user control over content.

Document formats are a subject still rife with misinformation. This is understandable on the part of Microsoft, which developed and controls the horrible proprietary OOXML format, designed precisely to prevent Digital Sovereignty by maintaining content lock-in. It is far less understandable on the part of companies that claim to advocate open source, such as those promoting Euro-Office.

Euro-Office defaults to the fully proprietary OOXML document format, developed and controlled solely by Microsoft. This makes it a de facto ally of Microsoft in its content lock-in strategy, with control remaining firmly in Redmond and far from Europe.

So, despite what is being written in support of Euro-Office — the latest of the office suites developed in Europe, and not the first — the announcement is not against Microsoft. On the contrary, it strengthens Microsoft’s strategy against European Digital Sovereignty, or, if you prefer, against the freedom of European users to control and manage their own content.

A Standard in Name Only: What OOXML Transitional Tells Us About Format Sovereignty

When a public administration is told its documents are stored in “an ISO standard format,” the assumption is reasonable: an ISO standard ought to be a clean, implementable specification that any qualified software vendor can support. Standards exist precisely so that nobody is locked to a single supplier.

OOXML — ISO/IEC 29500, the format behind Microsoft’s docx, xlsx and pptx files — does not work this way.

The standard is split into two conformance classes. Strict is the clean version: a modern document format, free of legacy baggage, that an independent implementer could reasonably support. Transitional is everything else: a vast catalogue of compatibility features, deprecated elements, platform-specific behaviours, and references to undocumented quirks of Microsoft Office versions from the 1990s. The Transitional class exists to ensure that documents converted from the old binary doc, xls and ppt formats can be represented in XML without loss.

There is one detail that matters above all others: Microsoft Office has never produced Strict OOXML by default. The option to save in Strict format is available in the installed desktop applications but is absent from the browser-based versions of Microsoft 365 — and Microsoft’s various editions have long differed in which features they offer, with the macOS version historically providing a different set of options from the Windows version. The “ISO standard” that public administrations are actually storing their documents in, when they use Office, is Transitional — the messy one. Strict is a feature you can find if you know where to look, on the platforms where Microsoft has chosen to support it. That is not the treatment a serious open standard receives.
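Which class a given file belongs to can be checked directly, because a docx file is a ZIP package and the two conformance classes use different XML namespaces for the main document part. The Python sketch below is illustrative rather than authoritative: the function names are our own, it assumes the main part sits at the conventional path word/document.xml, and the sample package it builds is deliberately minimal and not a complete document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs of the main WordprocessingML part in each conformance class.
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STRICT_NS = "http://purl.oclc.org/ooxml/wordprocessingml/main"


def docx_conformance_class(source):
    """Return 'Strict', 'Transitional' or 'unknown' for a .docx path or file object."""
    with zipfile.ZipFile(source) as package:
        # Assumption: the main part lives at the conventional path.
        root = ET.fromstring(package.read("word/document.xml"))
    # ElementTree spells a qualified tag as '{namespace}localname'.
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    if namespace == STRICT_NS:
        return "Strict"
    if namespace == TRANSITIONAL_NS:
        return "Transitional"
    return "unknown"


def make_sample_docx(namespace):
    """Build a tiny in-memory package for demonstration; not a complete .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(
            "word/document.xml",
            f'<w:document xmlns:w="{namespace}"><w:body/></w:document>',
        )
    buffer.seek(0)
    return buffer


print(docx_conformance_class(make_sample_docx(TRANSITIONAL_NS)))
print(docx_conformance_class(make_sample_docx(STRICT_NS)))
```

Pointed at a folder of real documents instead of the synthetic samples, a check of this kind gives an administration a quick count of how much of its archive is stored in each class.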

This has consequences that go well beyond a technicality.

The standard codifies undocumented legacy behaviour. Transitional OOXML contains compatibility flags whose specification amounts to “behave like Word 95” or “lay out footnotes like Word 97.” These are not formal definitions. They are references to the behaviour of specific commercial software products released roughly thirty years ago, whose layout algorithms were never published. An independent implementer wishing to render such a document correctly must reverse-engineer software from the Windows 95 era. This is not standardisation in any meaningful sense; it is the codification of one vendor’s implementation history as a global norm.
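In a word-processing document these flags are stored as child elements of the compat element in the settings part. The sketch below lists them from a settings.xml string. It is a minimal illustration under stated assumptions: the function name is ours, and the sample XML is hand-written for the example (using two flag names from the Transitional schema) rather than taken from a real file.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def compat_flags(settings_xml):
    """Return the local names of the children of <w:compat> in a settings.xml string."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    # Strip the '{namespace}' prefix that ElementTree puts in front of each tag.
    return [child.tag.split("}", 1)[1] for child in compat]


# Hand-written sample for illustration only.
SAMPLE_SETTINGS = f"""
<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:footnoteLayoutLikeWW8/>
    <w:autoSpaceLikeWord95/>
  </w:compat>
</w:settings>
"""

print(compat_flags(SAMPLE_SETTINGS))
# prints ['footnoteLayoutLikeWW8', 'autoSpaceLikeWord95']
```

Each name in the output identifies a behaviour that a renderer has to reproduce, and the name is often the most precise description an implementer gets.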

The standard perpetuates known bugs. Excel famously treats 1900 as a leap year — it was not — because Lotus 1-2-3 did so in the 1980s, and Microsoft chose binary compatibility with Lotus over arithmetic correctness [1]. OOXML Transitional preserves this bug. The default workbook setting in every xlsx file you have ever opened encodes a date arithmetic error from the era of MS-DOS. A spreadsheet calculating durations across February 1900 will produce wrong answers, and the standard requires this.
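The arithmetic can be made concrete. In the 1900 date system a date is stored as a serial number, with 1 standing for 1 January 1900 and 60 standing for the 29 February 1900 that never happened. The following sketch, with function names of our own choosing, converts serial numbers to real calendar dates and measures how far plain serial subtraction overstates a duration. It is an illustration of the bug, not a reimplementation of Excel.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial):
    """Convert a 1900-date-system serial number to a real calendar date."""
    if serial < 1:
        raise ValueError("serial numbers start at 1 (1 January 1900)")
    if serial == 60:
        raise ValueError("serial 60 is 29 February 1900, which never existed")
    # Serials after the phantom day are one too high, so shift them back.
    offset = serial if serial < 60 else serial - 1
    return date(1899, 12, 31) + timedelta(days=offset)


def duration_error(start_serial, end_serial):
    """Days by which serial subtraction overstates the real elapsed time."""
    real_days = (
        excel_serial_to_date(end_serial) - excel_serial_to_date(start_serial)
    ).days
    return (end_serial - start_serial) - real_days


print(excel_serial_to_date(59))  # 1900-02-28
print(excel_serial_to_date(61))  # 1900-03-01
print(duration_error(59, 61))    # 1: two serial days, one real day
```

Any interval that starts before the phantom day and ends after it is one day too long; intervals entirely on one side of it are unaffected.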

The standard includes obsolete graphics formats. Vector Markup Language (VML) was submitted by Microsoft to the W3C in 1998 as a candidate vector graphics standard. The W3C rejected it in favour of SVG. VML should have died there. Instead, it lives on inside OOXML Transitional, because documents converted from doc files contain it, and Microsoft Office continues to emit it. Implementers must support both VML and its modern replacement, DrawingML, to handle real-world files.
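Whether a particular file still carries VML can be checked by counting elements in the VML namespace across the XML parts of the package. The sketch below is a rough illustration: the function names and the two-part sample package are our own inventions, and real legacy drawing parts may need a more tolerant parser than the strict one used here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

VML_TAG_PREFIX = "{urn:schemas-microsoft-com:vml}"


def vml_element_counts(source):
    """Map each XML part of an OOXML package to its number of VML elements."""
    counts = {}
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            # Legacy drawings are typically stored in parts with a .vml extension.
            if not name.endswith((".xml", ".vml")):
                continue
            root = ET.fromstring(package.read(name))
            found = sum(1 for el in root.iter() if el.tag.startswith(VML_TAG_PREFIX))
            if found:
                counts[name] = found
    return counts


def make_sample_package():
    """Hypothetical two-part package for demonstration; not a valid .docx."""
    w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    v = "urn:schemas-microsoft-com:vml"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(
            "word/document.xml",
            f'<w:document xmlns:w="{w}" xmlns:v="{v}">'
            "<w:body><w:pict><v:rect/></w:pict></w:body></w:document>",
        )
        # Declares the VML namespace but never uses it, so it is not counted.
        package.writestr("word/styles.xml", f'<w:styles xmlns:w="{w}" xmlns:v="{v}"/>')
    buffer.seek(0)
    return buffer


print(vml_element_counts(make_sample_package()))
# prints {'word/document.xml': 1}
```

Counting elements rather than namespace declarations matters here, because a part can declare the VML namespace without containing any VML.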

The conformance class problem is structural. Strict was meant to be the future and Transitional the temporary bridge. Two decades after standardisation, Transitional remains what Office produces, what users receive, and what any competing implementation must support to be useful. The clean standard exists on paper. The standard that exists in practice — and that Microsoft Office produces by default — is the messy one.

For public administrations, this matters in three specific ways.

For archives. A document format that depends on undocumented behaviour of 1990s applications is not a safe long-term archival format. The ISO label provides a false reassurance: the parts of the standard your documents actually use are precisely the parts that are least specified and most dependent on a single vendor’s tooling.

For procurement. Specifying “ISO/IEC 29500” in a tender does not guarantee interoperability or vendor neutrality. It guarantees that documents will conform to a specification of which the practically deployed variant is, in effect, whatever Microsoft Office does. This is the opposite of what an open standard is meant to deliver.

For sovereignty. European institutions, national governments, and regional administrations increasingly recognise that the choice of document format is a sovereignty question. A format whose definitive reference implementation is a single American company’s commercial product cannot serve as the technical foundation of European digital autonomy — whatever its ISO number.

The alternative is not hypothetical. The OpenDocument Format (ODF), ratified as ISO/IEC 26300 twenty years ago this month, was designed from the outset as an implementer-neutral standard. Its specification is complete, self-contained, and does not require knowledge of any specific commercial product’s history. Multiple independent implementations exist. It is, in the proper sense of the term, an open standard.

For administrations weighing format policy, the question is not whether OOXML is “a standard.” It is. The question is what compliance with that standard actually entails, what it demands of implementers, and whether that serves the long-term interests of the institutions storing their work in it.

For those interested in the technical detail behind these claims, we attach a companion deep-dive [2] cataloguing the Transitional features, their categories, and the specific structural problems they introduce.

[1] The history of the 1900 leap year bug is well documented. Joel Spolsky, who worked on the Excel team at Microsoft in the early 1990s, recounted in My First BillG Review how Excel inherited the bug from Lotus 1-2-3 to preserve binary compatibility. Microsoft’s own support documentation openly acknowledges the bug and explains why it will not be fixed: doing so would invalidate every date in every existing Excel worksheet.

[2] The companion deep dive document in PDF format, cataloguing the Transitional features, their categories, and the specific structural problems they introduce: A Standard in Name Only a Deep Dive

ODF vs OOXML, an issue that should never have existed

A number of journalists read last week’s piece as an attack on Microsoft. We want to explain what they walked past.

Whenever we address the contrast between ODF and OOXML, some people perceive it as a campaign against a company. It is not. We are trying to do something far more useful: to make the structural problem with the standard document format clear to those who have to live with it, namely public officials, educators and, above all, individual citizens.

All these people find themselves facing a problem they did not create, but which affects them daily, and of which they are often the unwitting victims, every time they create a document or receive one.

The least we can do – and in fact we have been doing it for twenty years, though until now almost no one has listened – is to explain, clearly and without drama, how the problem arose, why it persists, and why ODF is the only way out. It is an educational and selfless goal – we do not sell software, so we have no commercial interest to protect – and not an attack on a company.

The problem concerns the current document landscape, based almost exclusively on a proprietary format controlled by a single company, and what we could have had instead: a standard format controlled by an independent community of stakeholders.

Microsoft features in this story because of the rational-monopolist behaviour it has exhibited since 2006, during and after the standardization of the proprietary OOXML format: first promising the standard and then doing everything possible to ensure it was first ignored and then forgotten, quietly but with extreme determination. All of this to protect a market share now worth over $30 billion, which would have been at risk of erosion if the document format had been genuinely standardized: migration to any other office suite would then have been free of cost and complexity.

Today, most organizations – public agencies, supranational bodies, companies – and most individual users face a problem that, had everyone listened to independent experts between 2006 and 2008, would never have existed. The international standards system and national governments allowed a single vendor – rather than the community of developers, systems analysts and standards scholars who raised objections – to set the terms under which documents would be archived. That vendor chose its own proprietary format.

The problem, in other words, was created by institutions – ISO, national standards bodies, public officials and ultimately politicians – who approached the choice of format for public documents in a completely uncritical manner. They trusted the process despite repeated and legitimate protests about its transparency, and never thought to perform a simple file analysis that would, in a few minutes, have raised more than a few doubts. The industry then followed the vendor’s lead, for convenience, because it expanded the business – without weighing the medium- and long-term consequences for institutions and individual users. What is troubling is that even a segment of the open-source industry went with the flow, and continues to do so, as shown by the fact that today only two open-source office suites – LibreOffice and Collabora Office – use ODF as their native file format.

If between 2006 and 2008 everyone had done their part, today there would be a single open, multi-vendor interoperability standard for office documents – our ODF – governed neutrally and implemented by all. Everyone would have benefited, because document exchange based on a true standard is completely transparent and independent of operating system and application software. Microsoft could have kept its own internal proprietary format as a mere implementation detail, invisible to users, because documents would have flowed seamlessly through the standard. An ideal world that never became reality.

Instead, the accelerated standardization of OOXML through ISO in 2008, against all technical objections, produced the OOXML Transitional format we use today: a temporary compatibility mode, explicitly defined as a bridge to be crossed once and then dismantled. It was not dismantled. It became the only variant used, at every level, by the majority of office suites. Today the vast majority of office documents worldwide – including the public documents of public institutions and of governments everywhere – are saved in a format that its own designers had declared provisional.

Even OOXML Strict would not solve the problem. Microsoft has never promoted it – part, as we have explained, of an understandable strategy – and none of those who were supposed to oversee the process ever requested or verified its implementation by the deadlines promised at standardization, from 2010 onward. But the deeper point is this: Strict is simply a different variant of the same single-vendor format. A standard is not open because its specification has been published. It is open when it is developed through a transparent process that no single company can control, and maintained by an independent community of users and implementers. Replacing Transitional with Strict changes the variant but leaves governance – which is what determines sovereignty – exactly where it was.

So when we advocate for ODF, we are not criticizing anything. We are trying to clarify a problem that was artificially created, and to ask why a problem that was artificially created is treated by most stakeholders – organizations, governments, companies and individuals – as an established fact of nature.

Attention to digital sovereignty is growing, but resistance remains strong, because awareness of this issue – which should never have arisen in the first place – is still virtually nonexistent, not only among users but among industry professionals themselves.

We continue to believe ODF can regain the role it should have had after 2006, when it was approved – rightly – as an ISO standard, because it had every characteristic of an open standard. The Deutschland Stack restores that role to ODF, and we hope the German government’s decision will not remain isolated.

There is no digital sovereignty without ODF

Any other choice is a choice of dependence on a single vendor

Digital sovereignty begins with the document format. Everything else – server location, hosting jurisdiction, procurement clauses – is downstream of this single decision. If the format is standard and open, the user controls the document. If the format is proprietary the vendor controls it, even when the file sits on the user’s own hard drive.

This is why LibreOffice, and its derivatives such as Collabora Office and Online, are today the only legitimate choice for governments, supranational bodies, businesses and organisations that want to protect the digital freedom of their users. Only software based on the LibreOffice source code – the LibreOffice Technology – uses ODF as its native document format. Every document saved, stored, retained and exchanged in ODF remains the exclusive property of its author, and remains so over the years.

ODF – OpenDocument Format, as the name says – was designed and developed in accordance with the characteristics of a true open standard: clearly documented, transparently developed by an independent body, properly versioned, built on existing standards, and stored in XML files that any user can read.
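The claim that anyone can read these files is easy to demonstrate with nothing beyond a standard library. An ODF text document is a ZIP package containing a mimetype entry and a content.xml part, with paragraphs stored as text:p elements. The sketch below reads both. The function names are ours, and the sample package is a stripped-down illustration (it omits the manifest and styles a real file would contain).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def read_odt_paragraphs(source):
    """Return (mimetype, list of paragraph texts) for an .odt path or file object."""
    with zipfile.ZipFile(source) as package:
        mimetype = package.read("mimetype").decode("ascii")
        root = ET.fromstring(package.read("content.xml"))
    # itertext() also collects text nested inside spans within a paragraph.
    paragraphs = ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
    return mimetype, paragraphs


def make_sample_odt():
    """Minimal in-memory package for demonstration; not a complete .odt."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello, ODF.</text:p>"
        "<text:p>Second <text:span>paragraph</text:span>.</text:p>"
        "</office:text></office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


mimetype, paragraphs = read_odt_paragraphs(make_sample_odt())
print(mimetype)
print(paragraphs)
```

The same few lines should work on a document written by any conforming implementation, which is the practical meaning of an implementer-neutral format.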

None of this applies to OOXML. The name is itself an oxymoron: XML stands for Extensible Markup Language, which is open by definition, but OOXML’s syntax is so complex that it is unreadable even to advanced users. The format was deliberately designed to become a sophisticated lock-in tool at a moment when Microsoft’s other strategies had already been uncovered and analysed.

The Transitional/Strict bait-and-switch

OOXML was approved as an ISO standard through a process that was an affront to transparency, ethics, common sense and respect for users. The format is documented in a way that discourages consultation – over 7,500 pages – and is developed by Microsoft behind closed doors in Redmond.

It is not versioned. It uses no independent standards. On the contrary, it relies on proprietary Microsoft formats wherever possible, in some cases formats that Microsoft itself had deprecated because the market rejected them. It is not even compatible with the Gregorian calendar. The XML schemas are nearly absurd in their complexity.

The bait-and-switch worked like this: “I swear it will be Transitional until 2010, very proprietary and very little of a standard, and after that only Strict, not very proprietary and very much a standard.”

The catch: Strict never materialised in practice. For years it lingered as a last-resort option that no one was meant to use, and it has now disappeared from the Save As options altogether. The standardised version of OOXML – the one ISO was told would become the real format – no longer exists as a user choice. Only Transitional remains.

A pity, because we would have had a laugh with Strict’s bugs. Excel has a thing for getting dates wrong (the (in)famous 1900 leap-year bug, inherited from Lotus 1-2-3 and never fixed), and when Excel gets dates wrong, no other software does it worse.

The political consequences

All of this is hard to grasp by looking at what happens on screen, because the document seems entirely harmless in its apparent simplicity. And yet all of it has been documented in detail since OOXML was first introduced, by independent experts who should have been heard, both by ISO and by those working in advanced technology.

Instead, ISO bought the Transitional/Strict story. And once ISO believed it, governments and politicians believed it too, rushing to adopt OOXML as a document format for fear that Bill Gates and Steve Ballmer might take offence and act accordingly.

In doing so, they placed citizens’ private data in Microsoft’s hands and reinforced a monopoly that was already evident before OOXML’s arrival, and that has become increasingly difficult to dismantle ever since.

The Microsoft ecosystem played its part in all this, and partner companies – SAP foremost among them – have always done everything in their power to push their users toward OOXML for data exchange, openly obstructing the use of the standard ODF format. An uneven struggle, by design.

Worse still, with just a few exceptions, even those who by virtue of their expertise should have recognised OOXML as the cornerstone of Microsoft’s new lock-in strategy fell for it. Some still write today: “we have to accept it, OOXML is an ISO standard.” This is not a serious position.

It is a deference with no rational basis.

Microsoft’s monopoly position is not founded on technological superiority but on the strategic foresight of Bill Gates and the lobbying machinery that flowed from it, deployed well ahead of its time.

The same deference has had consequences in the scientific community as well.

The HUGO Gene Nomenclature Committee was forced in 2020 to rename dozens of human genes – including SEPT1 and MARCH1 – because Excel kept silently converting their symbols to dates. Rather than going to Microsoft and demanding a bug fix, scientists preferred to throw years of established nomenclature down the drain to avoid upsetting Redmond. A revealing precedent.
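The kind of symbol at risk can be illustrated with a simple pattern check. The sketch below flags identifiers that consist of a month name or abbreviation followed by a plausible day number. It is a rough heuristic of our own for illustration, and makes no claim to reproduce the actual conversion rules of any spreadsheet application. SEPTIN1 and MARCHF1 are the replacement symbols adopted for SEPT1 and MARCH1.

```python
import re

# Assumption: a rough approximation of "looks like a date" for illustration,
# not the conversion logic of any real spreadsheet application.
MONTH_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC)"
    r"-?([0-9]{1,2})$",
    re.IGNORECASE,
)


def looks_like_date(symbol):
    """Return True if the symbol reads as a month followed by a day number."""
    match = MONTH_LIKE.match(symbol)
    return bool(match) and 1 <= int(match.group(2)) <= 31


for symbol in ["SEPT1", "MARCH1", "SEPTIN1", "MARCHF1", "TP53"]:
    print(symbol, looks_like_date(symbol))
```

The old symbols match the pattern and the new ones do not, which is the whole point of the renaming.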

Supporting ODF is not choosing ODF

There is a distinction that needs to be made plainly, because it is too often blurred, sometimes inadvertently, sometimes by design. Supporting a format is not the same as choosing it.

An office suite that saves OOXML by default is not supporting digital sovereignty, regardless of its level of ODF support. It is an OOXML suite with an ODF import/export filter, which inherits all the OOXML-based lock-in mechanisms: proprietary schemas, vendor-controlled evolution, hidden binary fragments, format-level dependencies on Microsoft’s roadmap.

Digital sovereignty lives at the native-format layer. Support describes what a piece of software can read. Native format describes what it is. The native format determines the legal and technical character of every document the user creates.

A commitment to “improve ODF support” is not a commitment to digital sovereignty. It is a commitment to keep ODF as a guest in someone else’s house.

This distinction matters for any project, coalition, or procurement decision that claims a digital sovereignty objective. The meaningful question is never whether ODF is supported – it almost always is, at some level – but whether ODF is the native format, chosen and committed to as such.

If the answer is anything other than yes, the sovereignty claim is provisional at best.

What digital sovereignty actually requires

The only viable path to digital sovereignty today is to use ODF as the native document format, and OOXML as the interoperability format for exchange with users who – out of lack of information, or pure convenience – continue to use the proprietary format, and share ownership of their own files with the vendor.

Anything else is false digital sovereignty. Control over a document and over the information it contains depends first on the format and only afterwards on the location of the server.

Standard, open format: the user is in control. Proprietary format: the vendor is in control, even if the document sits on a PC on the user’s desk.

This should be self-evident to anyone working in open source software, because it follows directly from its principles.

A proprietary document respects neither Freedom 1 (the freedom to study and modify) nor Freedom 3 (the freedom to improve and redistribute), as it is not documented in a way that makes the source code readable and it is not developed through a transparent process.

The decision to adopt OOXML as the native format runs counter to the interests of governments, supranational bodies, organisations of every kind and enterprises. But above all, it runs counter to the interests of users as it exploits their lack of information rather than investing in their education and in their digital sovereignty.

The choice of native format is not a technical detail to be deferred or finessed. It is the choice. Any project that treats it as something less is not supporting digital sovereignty. Full stop.