George Orwell’s classic allegory, Animal Farm, presents many perspectives on human behaviour and society. One of these is how people can be led and manipulated through the control of information. In the story, the Seven Commandments formed a de facto constitution for the Animalist society. Since only a handful of animals could read, the rest depended on what they were told was written. Gradually, the writing was cunningly altered to benefit the pigs above all other animals, and the populace was taught not to trust their recollections of what had been written in the past.
What made this subversion possible was the inability of most animals to read. The two animals that could read (aside from the pigs) chose not to do anything about what they saw. The right to access and read information is, amongst other things, an important cornerstone of democracy.
This is where open file formats come in. As our lives become increasingly defined by electronic records, there needs to be a way to view and audit those records independently. Paper is easily read, but computer files require software to decipher them. Imagine if you needed special (and expensive) glasses just to read a letter that you yourself wrote only a few years ago.
There has been a fair amount of discussion in the press regarding the OpenDocument and the so-called ‘Open’ XML formats. The primary focus of this reporting thus far has been on the political and technical facets. This is slowly changing, as the importance of long-term data preservation and freedom of information becomes apparent to ordinary folk.
The BBC has published a report on the problem, which discusses how the UK National Archives are attempting to deal with it. Alas, it appears that they have opted for a short-sighted approach, relying on virtualisation of older operating systems and applications through a direct partnership with Microsoft. With this approach, the format decoders and viewers (not to mention the operating system and the software performing the virtualisation itself) remain closed in source and specification, and one must deal with a cumbersome virtual machine just to view a document.
Where is the guarantee that files can be read hundreds of years from now, just as we can do today with paper documents such as the historic Magna Carta? How does this partnership benefit me, an ordinary citizen who might wish to view ten- (or even two-) year-old public documents that are only available in a proprietary electronic format?
It’s both sad and frustrating to see that history is yet again repeating itself. Whilst the contents of the Domesday Book can still be read nearly 1000 years after completion, the digital BBC Domesday Project was rendered virtually unreadable a mere 16 years later.
Thankfully, there are efforts to create an infrastructure for long-term preservation and management of digital documents. To start with, there are open formats such as OpenDocument and PDF. The Australian National Archives have long been supporters of OpenDocument, to the extent that they are standardising upon it. Putting their money where their mouths are, they are building a completely open source (GPL, no less) data management system that anybody can use or improve to suit their needs. Michael Carden gave a great talk [Ogg video] at this year’s linux.conf.au about this technology, known as Xena [PDF]. Whilst their UK counterparts seem to have forgotten that access to data is not just a privilege for those able to make exclusive agreements with purveyors of lock-in technologies, the Australian National Archives have been striving to ensure that nobody is left out of the digital revolution.
Four legs good, two legs… better? Let’s prevent this subversion from happening.