Thursday 28 November 2013

Preserving StatCan data



I’d like to take some liberties with this question and respond instead about the preservation of the StatCan data that I work with. I’ve worked with data for the last three years, and in that time my attention to preservation has grown with experience. Some of the early censuses (late 1800s and early 1900s) hold valuable information about Canada, but only some of these files are available to researchers because most have not yet been digitized. Manually entering this old data is a painstaking process.

Fast-forward 50 years and we encounter a similar problem: although the data from that period is machine-readable, the machines (and software) that could read it no longer exist. It takes an expert to reformat the data and syntax for use with contemporary technologies.

Last year, I worked with Dataverse, an online platform for research data. It relies in part on R (open-source statistical software) and claims to be able to automatically reformat data over time. Only time will tell to what extent Dataverse is able to keep data usable. In any case, its attention to preservation is progressive.
