Long-term archiving with ODF: a future-proof strategy

Digital documents in proprietary formats often become inaccessible within a few years because of undocumented changes to the XML schema, introduced intentionally for lock-in purposes. To avoid this problem, it is advisable to use the Open Document Format (ODF) not only for everyday tasks, but also for long-term storage. This helps documents remain accessible for years or even generations.

Without this approach, government documents, academic research, legal documents and corporate archives risk becoming true digital orphans: files that exist but cannot be read. The cause is not so much that the software that created them is obsolete, but that the XML schema has been modified so that the files can be read only by a specific version of a single software program. As these changes pile up, the files eventually become unreadable by any software.

Why is ODF suitable for archiving?

ODF (ISO/IEC 26300 and subsequent versions) is an open standard, managed transparently by OASIS. Its development process and specifications are documented and publicly accessible, unlike proprietary formats, where the process is undocumented and the ISO/IEC specifications do not reflect the reality of the format. This means that even if the current software disappeared, developers could write new programs that follow the standard to handle the files and access their content.

Furthermore, ODF files are compressed archives (ZIP) containing XML files based on a schema that can be easily read by non-technical users, enabling anyone to extract and interpret the content. This transparency of format is a fundamental element of its archival value. In contrast, the XML schema of proprietary files is intentionally designed to be unreadable. In this sense, proprietary formats are a perfect example of how a language created for simplification, such as XML, can become a subtle lock-in tool if used contrary to its nature.
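To make the "ZIP plus XML" point concrete, here is a minimal sketch that opens an ODF text document using only the Python standard library. The function names `list_members` and `extract_paragraphs` are illustrative, not part of any ODF tool, and the sketch assumes an unencrypted package whose body text is stored in `text:p` elements of `content.xml`.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of ODF text elements such as text:p and text:span.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def list_members(odf_file):
    """Return the names of the files stored inside an ODF package."""
    with zipfile.ZipFile(odf_file) as package:
        return package.namelist()


def extract_paragraphs(odf_file):
    """Return the plain text of every text:p element in content.xml."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() also collects text nested in child elements (e.g. text:span).
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
```

Calling `extract_paragraphs("report.odt")` (a hypothetical file name) would return the paragraphs as plain strings, with no office suite involved. Headings, tables and other structures use different elements and are ignored by this sketch.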

Finally, ODF maintains strong backwards compatibility between versions. This means that files created with ODF 1.0 in 2005 (immediately after standardisation by OASIS) can be opened without issue by applications released in 2025. This stability is intentional; the format was designed with long-term preservation in mind.

Best practices for archiving in the ODF format

Although newer versions add functionality, the best option for long-term archiving is to use a version recognised by ISO/IEC, such as ODF 1.2 (ISO/IEC 26300-1:2015) or, in the near future, ODF 1.3 (ISO/IEC 26300:2025). An ISO/IEC-recognised version is mature and well documented, and can be expected to remain compatible for decades, offering an excellent balance between functionality and breadth of support.
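An archive can be audited for the ODF version each file declares. The sketch below reads the `office:version` attribute from the root element of `content.xml`; the helper name `odf_version` is made up for this example, and the attribute may be missing in older files, in which case the function returns `None`.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the office:* elements and attributes in ODF.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odf_version(odf_file):
    """Return the declared office:version of content.xml, or None if absent."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    return root.get(f"{{{OFFICE_NS}}}version")
```

A file saved as ODF 1.2 would typically report `"1.2"`. Treat the value as a declaration made by the producing application rather than as proof of conformance.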

For documents where faithful visual reproduction is important, it is advisable to embed fonts in ODF files to avoid font substitution issues when files are opened years later in a different environment to the one used to create them.

Additionally, all resources related to the documents (images, graphics, etc.) should be embedded in the ODF file rather than linked externally, because external links can break over time if the linked file is moved, which could render the documents incomplete.
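One way to spot linked images before archiving is to compare every image reference in `content.xml` with the files actually present in the package. This is a rough sketch under a stated assumption: embedded images are referenced by a package-relative path such as `Pictures/photo.png`, so any `draw:image` reference that does not match a package member is reported as external. The function name `linked_images` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def linked_images(odf_file):
    """Return image references that do not point to a file inside the package."""
    with zipfile.ZipFile(odf_file) as package:
        members = set(package.namelist())
        root = ET.fromstring(package.read("content.xml"))
    external = []
    for image in root.iter(f"{{{DRAW_NS}}}image"):
        href = image.get(f"{{{XLINK_NS}}}href")
        if not href:
            continue  # no reference at all, e.g. inline binary data
        # Some producers write "./Pictures/x.png"; strip the leading "./".
        normalised = href[2:] if href.startswith("./") else href
        if normalised not in members:
            external.append(href)
    return external
```

An empty list suggests that every image travels with the document. The check only looks at `content.xml`, so images referenced from `styles.xml` (for example in headers) or embedded objects would need the same treatment.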

Finally, to make files recognisable years later, take advantage of ODF's rich metadata support by adding the creation date, author, subject, and any other contextual information that could help readers understand the document’s purpose and origin. In any case, even when using an open standard format such as ODF for long-term archiving, it is advisable to plan for the periodic migration of archives to the most recent version of the format, and to check the accessibility of files every few years.
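The metadata mentioned above is stored in the package's `meta.xml`, so it can be checked during a periodic review without opening the document in an office suite. The following is a small sketch: the `read_metadata` helper and the choice of four fields are assumptions made for illustration, and any field missing from the file is returned as `None`.

```python
import zipfile
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"

# Illustrative selection of fields; meta.xml can hold many more.
FIELDS = {
    "title": f"{{{DC_NS}}}title",
    "subject": f"{{{DC_NS}}}subject",
    "initial_creator": f"{{{META_NS}}}initial-creator",
    "creation_date": f"{{{META_NS}}}creation-date",
}


def read_metadata(odf_file):
    """Return selected metadata fields from meta.xml (None when missing)."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("meta.xml"))
    result = {}
    for key, tag in FIELDS.items():
        element = next(root.iter(tag), None)
        result[key] = element.text if element is not None else None
    return result
```

A review script could flag every archived file for which `title` or `creation_date` comes back as `None`, so that the gaps are filled while someone still remembers the context.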

ODF is an editable format, however, so it is not the right choice for documents that must keep their original form with no risk of being inadvertently edited. For these documents, a different approach based on PDF/A should be considered. PDF/A is specifically designed for archiving and complements ODF perfectly in a comprehensive archiving strategy, so it is ideal for final documents that are not expected to be modified over time.

Since no format can protect against media failure, it is best to keep multiple copies of each file on different storage media and in different locations, following the 3-2-1 backup rule: three copies on two types of media, with one copy off-site. In addition, the archiving process should be documented, and that documentation should be easy to find, so that anyone who takes over a role within the company can reproduce and update the process in a way that is consistent with the software tools in use and with earlier decisions on strategy and formats.
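Keeping three copies only helps if the copies are still identical. A common way to check this, sketched below, is to compare a SHA-256 digest of each copy. The function names and the idea of passing a list of paths are assumptions for this example and are not part of the 3-2-1 rule itself.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """Return True when all given copies have the same digest."""
    digests = {sha256_of(path) for path in paths}
    # An empty list of paths yields no digests and is reported as False.
    return len(digests) == 1
```

Run against the three locations of a file, `copies_match` returning `False` signals that at least one copy has changed or been damaged. Storing the digests alongside the process documentation makes it possible to tell which copy is the good one.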

Looking to the future

The digital preservation landscape continues to evolve, but ODF’s commitment to open standards, transparency and vendor independence positions it as the best long-term choice, because it is designed to keep information accessible beyond the lifespan of any single organisation.

Planned obsolescence is an increasingly common strategy and is sometimes imposed: the end of support for Windows 10, for example, forces the abandonment of perfectly functioning hardware, despite all the talk of sustainability and reducing digital waste. In such a world, this commitment is rare and valuable.
