January 6, 2015

U.S. Digital Preservation Collaborations from 1994-2014: State-by-State Overviews

In 2012, the Library of Congress’ National Digital Information Infrastructure and Preservation Program (NDIIPP) approached the Educopia Institute to identify and document collaborative digital preservation activities in the U.S. across relevant memory sectors (e.g., academic, government, archival, museum, nonprofit, and commercial). NDIIPP also asked Educopia to use the resulting “state profiles” to help identify future opportunities for collaborative activities. The Educopia Institute proposed this research and was funded through a Cooperative Agreement to conduct it as part of the Identifying Continuing Opportunities for National Collaboration (ICONC) project (September 2012-December 2013).

As Educopia scoped this research, the project team determined that the data it would gather might have multiple use cases within the memory community and beyond. The result was a well-constructed relational dataset containing information about collaborative digital preservation activities and the domestic organizational collaborators behind them. Activity data captured included start and end dates, descriptions, NDIIPP funding status, URLs, and founding or host institutions. Data about collaborating organizations included their general location, sector, and organizational focus. Most importantly, the relationship between activities and organizations was preserved, so that one can view both the activities in which a specific organization is involved and the organizations involved in a given activity.
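The two-way relationship described above can be illustrated with a minimal sketch. It assumes a three-table layout (activities, organizations, and a linking table) built with Python's standard sqlite3 module. The table names, column names, and placeholder rows are hypothetical and are not drawn from the ICONC dataset, whose actual schema is not described here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE activity (
            id            INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            start_year    INTEGER,
            end_year      INTEGER,   -- NULL while the activity is ongoing
            ndiipp_funded INTEGER    -- 0 or 1
        );
        CREATE TABLE organization (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            state  TEXT,
            sector TEXT
        );
        -- Linking table: one row per (activity, organization) pairing.
        CREATE TABLE participation (
            activity_id     INTEGER REFERENCES activity(id),
            organization_id INTEGER REFERENCES organization(id),
            PRIMARY KEY (activity_id, organization_id)
        );
    """)
    # Placeholder rows for illustration only; not real dataset records.
    conn.executemany(
        "INSERT INTO activity VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Placeholder Activity A", 2005, None, 1),
            (2, "Placeholder Activity B", 2010, 2013, 0),
        ],
    )
    conn.executemany(
        "INSERT INTO organization VALUES (?, ?, ?, ?)",
        [
            (1, "Placeholder Org X", "GA", "academic"),
            (2, "Placeholder Org Y", "DC", "government"),
            (3, "Placeholder Org Z", "CA", "museum"),
        ],
    )
    conn.executemany(
        "INSERT INTO participation VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 2), (2, 3)],
    )
    return conn


def activities_for(conn, org_name):
    """List the activities a given organization is involved in."""
    rows = conn.execute(
        """
        SELECT a.name
        FROM activity a
        JOIN participation p ON p.activity_id = a.id
        JOIN organization o ON o.id = p.organization_id
        WHERE o.name = ?
        ORDER BY a.name
        """,
        (org_name,),
    )
    return [row[0] for row in rows]


def organizations_for(conn, activity_name):
    """List the organizations involved in a given activity."""
    rows = conn.execute(
        """
        SELECT o.name
        FROM organization o
        JOIN participation p ON p.organization_id = o.id
        JOIN activity a ON a.id = p.activity_id
        WHERE a.name = ?
        ORDER BY o.name
        """,
        (activity_name,),
    )
    return [row[0] for row in rows]
```

Because each pairing is stored once in the linking table, the same data answers both questions: `activities_for(conn, "Placeholder Org Y")` walks from an organization to its activities, while `organizations_for(conn, "Placeholder Activity B")` walks in the opposite direction.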

Rather than producing a stand-alone report, the ICONC team has produced three deliverables. The first is an open (though still beta) relational dataset that contains 211 collaborative digital preservation activities and 1,856 related collaborating organizations and is intended to support a wide range of research questions. The second is an initial set of interactive, data-driven dashboards that allow NDIIPP and other groups to begin viewing custom-defined subsets of the data underlying this report as maps, bubble charts, and other graphical displays. The third is this narrative overview, which highlights research findings and activities underway in each of the 50 states and the District of Columbia, and recommends future directions based on initial analyses of the 3,298 records in this dataset.

Due to time and budget constraints, this dataset is only a pilot and should be thought of as a beta version. While the aim was to be exhaustive in capturing activities and collaborators, the sheer volume of information and the manual nature of the data mining mean that some items may not yet be reflected in this dataset. There are also two specific areas involving taxonomy classifications where the pilot dataset can be refined. First, to facilitate searching for collaborators from a particular industry sector (e.g., video production or news media), organizations were given preliminary sector codes. However, taxonomy development requires more time and community feedback than was available within the pilot period. Second, the initial basic taxonomy classifying the activities within the dataset[1] should also be revisited and refined by the broader community, adding information about the targeted users or beneficiaries of specific activities when known.