George Orwell’s classic allegory, Animal Farm, presents many perspectives on human behaviour and society. One of these is how people can be led and manipulated through the control of information. In the story, the Seven Commandments formed a de facto constitution for the Animalistic society. Since only a handful of animals could read, the rest were dependent upon what they were told was written. Gradually, the writing was cunningly altered to the benefit of the pigs above all other animals, and the populace was taught not to trust their recollections of what had been written in the past.

What made this subversion possible was the inability of most animals to read. The two animals that could read (aside from the pigs) chose not to do anything about what they saw. Amongst other things, the right to access and read information is an important cornerstone of democracy.

This is where open file formats come in. As our lives become increasingly defined by electronic records, there needs to be a way to view and audit those records independently. Paper is easily read, but computer files require software to decipher them. Imagine if you needed special (and expensive) glasses just to read the letter that you yourself wrote only a few years ago.

There has been a fair amount of discussion in the press regarding the OpenDocument and the so-called ‘Open’ XML formats. The primary focus of this reporting thus far has been on the political and technical facets. This is slowly changing, as the importance of long-term data preservation and freedom of information becomes apparent to ordinary folk.

The BBC has published a report on the problem, which discusses how the UK National Archives are attempting to deal with it. Alas, it appears that they have opted for a short-sighted approach, relying on virtualisation of older operating systems and applications, through a direct partnership with Microsoft. With this approach, the format decoders/viewers (not to mention the operating system and software performing the virtualisation itself) remain closed in source and specification, and one must deal with a cumbersome virtual machine just to view a document.

Where is the guarantee that files can be read hundreds of years from now, just as we can do today with paper documents such as the historic Magna Carta? How does this partnership benefit me, an ordinary citizen who might wish to view ten- (or even two-) year-old public documents that are only available in a proprietary electronic format?

It’s both sad and frustrating to see that history is yet again repeating itself. Whilst the contents of the Domesday Book can still be read nearly 1000 years after completion, the digital BBC Domesday Project was rendered virtually unreadable a mere 16 years later.

Thankfully, there are efforts to create an infrastructure for long-term preservation and management of digital documents. To start with, there are open formats such as OpenDocument and PDF. The Australian National Archives have long been supporters of OpenDocument, to the extent that they are standardising upon it. Putting their money where their mouths are, they are building a completely open source (GPL, no less) data management system that anybody can use or improve to suit their needs. Michael Carden gave a great talk [Ogg video] at this year’s linux.conf.au about this technology, known as Xena [PDF]. Whilst their UK counterparts seem to have forgotten that access to data is not just a privilege for those able to make exclusive agreements with purveyors of lock-in technologies, the Australian National Archives have been striving to ensure that nobody is left out of the digital revolution.

Four legs good, two legs… better? Let’s prevent this subversion from happening.


