
Bit-rot and Digital History

John Rentoul

John Naughton has an interesting comment on a subject in which I have a professional interest. As a student of contemporary history, I worry about digital archiving.

At first, researching and writing about the first British government of the internet age was a thrill and a liberation. The Blair premiership was the first to have so many of its primary historical documents online: command papers, Hansard and select committee evidence, not to mention huge quantities of contemporary newspaper and BBC reporting.

So it was a shock to bump into the limit of the information utopia – something that happened quite abruptly on 27 June 2007. The No 10 website was rebuilt overnight, and searches for familiar material brought up an Error 404 File Not Found. Tony Blair had become a digital non-person.

Of course, the Brown supremacy was not quite engaged in Stalinist obliteration of the past. The entire No 10 website of the preceding years had been copied and stored in a series of snapshots (such as this, above) by The National Archives. It became harder to find things, although The National Archives has worked on improving the search functions since.

As time passed, it also became harder to find other older documents, as the growing power of Google’s algorithms could not quite keep up with the spread of broken links and defunct websites.

Anyway, Naughton says:

The longer I’ve been around, the more concerned I become about long-term data loss — in the archival sense. What are the chances that the digital record of our current period will still be accessible in 300 years’ time? The honest answer is that we don’t know. And my guess is that it definitely won’t be available unless we take pretty rigorous steps to ensure it. Otherwise it’s posterity be damned.

It’s a big mistake to think about this as a technical problem — to regard it as a matter of bit-rot, digital media and formats. If anything, the technical aspects are the trivial aspects of the problem. The really hard questions are institutional: how can we ensure that there are organisations in place in 300 years that will be capable of taking responsibility for keeping the archive intact, safe and accessible?

I’m not quite as gloomy as he, having a high opinion of Google’s ability to innovate, and of its institutional integrity. But it is undoubtedly an important question.

  • Dorothea

    I’m about to read this article, but before I do, let me say that just the expression “bit-rot” fills me with joy and excitement. I hope it means that all the nonsense we’re churning out is not going to float around the ether forever. Oh. Drat. It means that it’s all going to be there, we just won’t be able to find the bit we want. Just like all the bits of paper that we used to write notes on.

  • http://twitter.com/unixspiders Andrew MacFarlane

    John

    Have you tried the Wayback machine on the internet archive?

    cheers
    andy

  • http://www.searchofficespace.com/ Office Space

    I’ve just tried it, thanks :)

  • http://twitter.com/tawalker T.A. (Tim) Walker

    “What are the chances that the digital record of our current period will still be accessible in 300 years’ time?”

    Given my own personal experience of the subject (e.g. MS Word files from the mid-1990s that can’t now be opened because I used the “Quick Save” option), web hosts being shut down without much warning (RIP Geocities), the onward march of technology, etc…

    …I’m more concerned about our digital records being available/accessible in 30 years, let alone 300. If you think I’m joking, do a Web search for “BBC Domesday Project” – suffice it to say, I think we’d be unwise to move away from paper entirely…

  • http://pulse.yahoo.com/_FSMYUAS5ZQUB44MUQLOTUS6WZ4 Brian

    Many long-term capital projects need to keep records for the life of the finished item. In the case of a power station this could be 50 years, maybe 100 for a bridge or railway tunnel. Whilst it is still possible to read Brunel’s drawings, very few organisations can now handle punched tape, CNC machine tapes or even 5 1/4 floppies. And whilst most companies can justify keeping an archive room for the future, the payback on refreshing digital data is too far away for the accountants to justify. It isn’t just digital media, though: almost all of us must have gone to a file and found faxes faded to blank, whilst telexes are still readable. Although at present e-mails are (embarrassingly) persistent, format changes could lead to the records being lost from servers, robbing us of vast amounts of information.
    Will these be the new ‘dark ages’ to future historians?

