Research Forum Presentation: Building a Reference Implementation of a Preservation Environment

Authors
Reagan W. Moore, San Diego Supercomputer Center

Richard Marciano, San Diego Supercomputer Center

SAA Presentation
SAA 2008 Presentation

Abstract
The Transcontinental Persistent Archive Prototype (TPAP) project, sponsored by the National Archives and Records Administration (NARA), is researching the development of a reference implementation of a preservation environment. The approach uses the integrated Rule-Oriented Data System (iRODS) data grid to enforce and validate trustworthiness assessment criteria. We will present an analysis of the required components, discuss the current state of implementation, and identify the areas where substantial research is still needed. The project integrates technology from three sources: iRODS from the San Diego Supercomputer Center, the Producer-Archive Workflow Network (PAWN) from the University of Maryland, and micro-services for parsing documents from the SHAMAN project (Sustaining Heritage Access through Multivalent Archiving). The capabilities sustained within the preservation environment are based on the NARA Electronic Records Archives capabilities list, and the assessment criteria are based on the Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. For each identified capability, the required operations are encapsulated in micro-services that are executed at the storage location, under the control of rules that implement the management policies needed to enforce the TRAC criteria. Additional rules periodically query the system to verify compliance and automate recovery procedures when problems are found. The reference implementation therefore consists of four parts: the record management environment, the preservation management rules, the management processes that implement preservation services, and the rules that verify compliance with the assessment criteria.
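
The relationship between micro-services, management rules, and periodic compliance checks described in the abstract can be illustrated with a minimal sketch. This is not iRODS code and does not use the iRODS rule language or API. Every name in it (Record, Rule, register_checksum, repair_replicas, run_audit) is hypothetical, and replica integrity is used only as an assumed example of an assessment criterion. Storage locations are simulated with an in-memory dictionary.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    """A preserved record with copies at several (simulated) storage locations."""
    name: str
    replicas: Dict[str, bytes]   # storage location -> stored bytes
    checksum: str = ""           # registered when the record is ingested


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# --- Hypothetical micro-services: small operations applied to one record ---

def register_checksum(record: Record) -> None:
    """Record the checksum of the first replica as the reference value."""
    first_copy = next(iter(record.replicas.values()))
    record.checksum = sha256(first_copy)


def find_bad_replicas(record: Record) -> List[str]:
    """Return the locations whose content no longer matches the checksum."""
    return [loc for loc, data in record.replicas.items()
            if sha256(data) != record.checksum]


def repair_replicas(record: Record) -> List[str]:
    """Overwrite bad replicas from a good copy; return the repaired locations."""
    bad = find_bad_replicas(record)
    good = [d for d in record.replicas.values() if sha256(d) == record.checksum]
    if not good:
        return []                # no valid copy left to recover from
    for loc in bad:
        record.replicas[loc] = good[0]
    return bad


# --- Hypothetical rules: named policies that chain micro-services ---

@dataclass
class Rule:
    name: str
    actions: List[Callable[[Record], object]]

    def apply(self, record: Record) -> list:
        return [action(record) for action in self.actions]


INGEST_RULE = Rule("ingest", [register_checksum])
AUDIT_RULE = Rule("audit-integrity", [repair_replicas])


def run_audit(records: List[Record]) -> Dict[str, List[str]]:
    """One pass of the periodic compliance check: verify and recover."""
    return {r.name: AUDIT_RULE.apply(r)[0] for r in records}


if __name__ == "__main__":
    rec = Record("memo.txt", {"site-a": b"hello", "site-b": b"hello"})
    INGEST_RULE.apply(rec)
    rec.replicas["site-b"] = b"corrupted"    # simulate damage at one site
    print(run_audit([rec]))                  # reports which replicas were repaired
```

In this sketch the policy (which micro-services run, and in what order) lives in the Rule objects rather than in the micro-services themselves, so a policy can be changed without touching the operations it invokes. A real deployment would schedule the audit periodically and run the operations at the storage location; both are omitted here.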