Mile End Institute

The digital archive: challenges for historians

Shane Dillon, who attended the Mile End Institute's conference Contemporary Political History in the Digital Age, reflects on the issues raised during the discussion.

29 February 2016

When history is served up to us in a book or on television we may not readily appreciate the legwork that has gone into making it feel seamless.

That legwork can involve going through paper archives, spending days in specialist libraries and newspaper archives, listening to oral accounts and reading other history books on the subject.

For a great many historians paper will be a constant, or vellum for those who research UK parliamentary history.

How does this all work in the digital age, where paper is not the primary place to find an account of history? Part of the answer is that historians will increasingly find accounts of history in what are termed ‘born digital records’: websites, social networks, blogs and organisations’ digital records.

What will be valuable for historians writing social history ten years from now will be the archive of content from the UK's first social network, Friends Reunited, which closed down in January 2016. MySpace, recently acquired by Time Inc. for its user data, is another source.

For political historians in the UK, websites and blogs, in particular those run by government, will be the born digital records that they get to work with when writing history.

The National Archives is very much on to this, archiving UK government websites and blogs.

One example is the blog post by Philip Barclay and Grace Mutandwa, ‘Legacy and Long-legged Birds’ (2008).

If blogs are said to be the first draft of history, then Philip and Grace’s blog led to something more or less substantial, depending on your view: Philip Barclay’s book ‘Zimbabwe: Years of Hope and Despair’ (2010). This content was born digital but ended up on paper: a hybrid of sorts.

Link rot

Does it matter what format records of historical importance are retained on for historians and others to research? Should we care if records are contained on a blog, website, paper, microfilm, vellum or microfiche? For an archivist this matters a great deal. Even vellum has its fans.

As a film fan I can appreciate what can be lost as nitrate film stock degrades, and with it large parts of our cinema history. The nearest internet equivalent is link rot. If you click on a link and see ‘Error 404’ or ‘Page Not Found’, this is evidence of link rot. Andy Baio has written about efforts to preserve our digital history.

For many historians of the 18th, 19th and 20th centuries the paper file is highly valued. You open the file; it has a start and an end. The digital record is not always so clear-cut, and may have no obvious beginning or end. In the digital era, ‘the hard disk is the new paper archive’.

Historians writing a historical account of the Government Digital Service (GDS), or of the Foreign Office (FCO) from 1990, will be working with a hybrid of digital and paper sources.

The FCO relied for much of its existence on an army of registry clerks abroad to keep its institutional memory intact and transported back to its ‘archive’ in Hanslope Park. As more work was done on computers, the FCO managed information with its Aramis system in the early 1990s before moving to Firecrest in 2001.

What gives historians a more complete picture: records retained in digital formats or those on paper?

It is too early to say, but even with the long history of material kept on paper there have always been gaps. Those gaps will remain in the digital era. The question is how big those gaps will be.

Institutional memories

If a historian were to write an institutional history of GDS, the sources they could draw on within the public digital sphere would be blogs, tweets and the oral history of those behind GDS. The internal history of how we arrived at the website would be drawn from GitHub, Google Documents, and Basecamp and Slack conversations.

We know how an organisation like the FCO, with a long history of maintaining records, retains its institutional memory.

But how does a relatively new kid on the block like GDS manage information for future historians to research? What of all those Post-it notes in the margins? Will all those conversations on Slack be available twenty years from now?

This is a challenge for all organisations as they place information on services, many of them originating in America. It is a challenge not just for organisations but for all of us: Facebook, Google and Slack are businesses.

For us as individuals, how will Google and Facebook retain our own digital histories? Data is stored in ‘the cloud’, which in reality means huge data centres dotted around the world.

This online storage space seems infinite but it’s not. Storing information has a monetary and environmental cost. In 20 years' time will it still be in the business interests of Facebook to retain all those Facebook and Instagram posts?

What happens when a service closes down, as Google Picasa did recently? What happens to that information? In fairness, Google does a better job than most when it shuts down services. Google Takeout allows you to download your data from many Google services at any time.

Another positive is Twitter’s arrangement with the US Library of Congress to archive every tweet. However, the devil is in the detail: the deposited tweets do not show images or identify whether a tweet has come from a bot or a real person.

Incomplete picture

Twitter records will be important to historians writing the history of the first part of the 21st century. 

For historians researching Egypt’s revolution the Twitter hashtag #Jan25 is one source to draw on. 

However, if we turn to the 2013 protests in Gezi Park, Twitter was closed down by the Turkish government. This produces an incomplete record, another instance of historians getting only part of the picture.

In politics, recent U.S. presidential elections have played out on Twitter. This alone underscores the importance of the Twitter archive for historians. However, 2016’s presidential election will play out much more on Snapchat, an ephemeral social network where content is designed to disappear.

This presents challenges for historians. How do they meet those challenges? Some are learning to code so they can better manage the vast amount of material the digital age produces.

Does this mean historians will need to learn to code? Clearly some will, but most historians will instead need an appreciation of coding that allows them to have better conversations with the developers who will help them research history in the digital age.

Coding is important, but for historians of any century the ability to read a foreign language is equally, if not more, important.

Big organisations like government can strike enterprise deals with big tech to ensure continuous access to information that can be made available to historians a long time into the future.

Sensitivity reviews

For organisations working on documents in cloud services like Google Docs, transferring documents to The National Archives or making them public is just a matter of pressing publish. Then anyone, from the public to The National Archives, can just grab a URL.

Before that can happen, a team of sensitivity reviewers may need to trawl through the documents before they are released to the public.

These sensitivity reviewers will be used to reviewing paper documents. The 21st-century sensitivity reviewer may end up trawling through the many iterations of a document collaboratively drafted on Google Docs or Office 365.

MS Word has been the writing tool for civil servants since the 1990s, but with some parts of government ‘going Google’, Google Docs and Drive are becoming as familiar as the S/Drive.

The information is out there but it’s on different clouds held by different companies with different enterprise agreements.

History in the digital age presents challenges but also opportunities. The abundance of information on social networks is a treasure trove for social historians writing ‘history from below’. 

Big data tools will allow historians to research masses of information to detect trends in society. Digital transformation of government and big organisations will encompass better information and records management. 

Historians in the FCO, for example, have challenged themselves to put together a history purely from digital sources. The result was an account of the G8 Gleneagles Summit.

This blog was first published on Shane Dillon's personal website.