May 9 2009

The ABC has a piece by National Library of Australia web archiving manager Paul Koerbin about the importance of preserving digital records.

Of equal importance: how can we be sure that we can actually read those archives in the future? Literacy in Egyptian hieroglyphs was long gone by the 18th century, and it took the discovery of the Rosetta Stone for them to start making sense again.

It’s difficult enough deciphering human language. Understanding machine language is another thing entirely.

I’ve written about this in the past, contrasting the thousand-year-old Domesday Book (which is still legible) with the BBC Domesday Project (which was rendered virtually unreadable a mere sixteen years after production).

The way to preserve our culture digitally is to use open standards. If the means for ‘reading’ the information is widely documented and understood, without any encumbrances, we stand a much greater chance of being able to interpret it in a couple of hundred years.

I’ve got essays from school written only ten years ago, and I can’t read them any more as they’re stored in a proprietary file format that is no longer supported.

Imagine you ran a company that had important and valuable written records stretching back for decades. Storing vast libraries of paper is expensive and inefficient, so you decide to digitise them all. That’s great: you now have a system that is easy to manage and search. Ten years later, you want to migrate your now-ageing data management system to something more modern. Only, you can’t, because it’s all stored in a proprietary format that cannot be accessed by anything else.

If you had kept those paper records, you would still have had access to that information. Your choices now are to continue with your old, obsolete system for all eternity, or hire some clever hacker to decipher the file format. With no equivalent of a Rosetta Stone, that’s no mean task. After spending buckets of money on this avoidable problem, and losing even more due to inefficiencies and competitive disadvantage from the old system, you’d be wise to make sure it cannot happen again.

This is a very common kind of scenario. If our information can’t even last ten years, how can it last a thousand?

From a business perspective, open standards protect the independence of a company. They mean no vendor lock-in, so you are not stuck paying monopoly prices. By creating a free market around a method or technology, open standards give you the freedom to select the vendors, products, methods and technologies that best suit your requirements, or even to create your own. They are the ultimate in risk mitigation, and through their flexibility can also open avenues for competitive advantage. They just make good business sense.

LotD: “Vioxx maker Merck and Co drew up doctor hit list” and “Merck Makes Phony Peer-Review Journal”

One Response

  1. Michael Carden Says:

    Sridhar, you’ll be pleased to know that the Digital Preservation crew at the National Archives of Australia are also firm believers in Open Standards for long term access to data. So much so that we are engaged in FOSS development to make it possible for all to have access to an easy means of getting stuff into open formats.

    Our Xena software is a good start http://xena.sourceforge.net and for those doing this stuff on an industrial scale and looking for a digital preservation workflow management and audit tool, there’s our Digital Preservation Recorder http://dpr.sourceforge.net

    All released under the GPL and all cross platform (Java).

    We’re committed to Open Standards based formats and as an organisation we get involved in standards processes through Standards Australia. That we can implement all this through Free and Open Source Software is a bonus and in my opinion essential if this kind of work is to be trustworthy.

    Cheers,
    Michael Carden (with my NAA hat on)

Will it be Domesday or Doomsday for our information? / 'Til All Are One by Sridhar Dhanapalan is licensed under a Creative Commons Attribution-ShareAlike Australia CC BY-SA AU licence.