As of 2017-12-06
Goals
- Successfully migrate master objects and required metadata to a preservation environment in the Master Objects Repository (MOR), from:
- Dark Archive (DA)
- J-Drive
- External Media (EM)
- Determine the appropriate disposition for the remaining content.
- http://go.osu.edu/MOM
Objectives and Deliverables
- Preparation for Migration
- De-duplication within the DA and in relation to J-Drive & EM
- Format Analysis
- Identification of Master Objects (MO) to be migrated
- Establishment of metadata requirements for MOs to be migrated
- Migration
- Prioritization
- Rights analysis
- Metadata identification and/or creation
- Migration testing
- Migration of MOs
- Post Migration Clean-up
- Disposal of derivatives
- Disposal of excess copies found in J-Drive and EM
- Determination of disposition for other non-MOs
Scope
Location | In Scope? |
Dark Archive | |
/archive/Archived | No |
/archive/Committees | No |
/archive/Dept/ARV | Yes |
/archive/Dept/ATH | Yes |
/archive/Dept/CGA | Yes |
/archive/Dept/CSS | No |
/archive/Dept/DI | Yes |
/archive/Dept/HIL | Yes |
/archive/Dept/KB | Yes, as it pertains to MO determinations in other collection folders |
/archive/Dept/MUS | Yes |
/archive/Dept/RAR | Yes |
/archive/Dept/SRI | No, this is just a mapping to /archive/Dept/KB |
/archive/Dept/TRI | Yes |
/archive/Dept/WIT | No |
/archive/Fedora | No |
/archive/FedoraMORdata | No |
/archive/lost+found | No |
J-Drive | Yes, as it pertains to MO determinations in DA folders |
External Media | Yes |
Stakeholders
Project Role | Who | OSUL Role |
Executive Sponsors | Lisa Carter | AD: Special Collections & Area Studies |
Executive Sponsors | Jennifer Vinopal | AD: Information Technology |
Process Owner/Manager | Dan Noonan | Digital Preservation Librarian |
Project Team | Application Development & Support (AD&S): Individuals identified on an as-needed basis | MOR interaction Subject Matter Experts (SME) |
Project Team | OSUL-IT Infrastructure (IS): | DA/MOR Analysis Reports SMEs |
Project Team | Archives & Special Collections: | Collections SMEs |
Project Team | Special Collections Description and Access: | Metadata & IMS/MOR processing SMEs |
Project Team | Preservation & Reformatting: | Preservation & Reformatting SMEs; Digital Imaging (DI) folder |
Project Team | Publishing and Repository Services (PRS): Maureen Walsh | Metadata & Knowledge Bank (KB) SME |
Project Team | Copyright Resource Center (CRC): Sandra Enimil | Rights SME |
Tasks
De-duplication of Dark Archive
- Hash sum de-duplication: Includes OSUL-IT Infrastructure Support’s (IS) creation of reports that identify files with duplicate hash sums
- Derivative de-duplication: The MOR is a repository for master objects, not derivatives; therefore we must be certain to migrate only the masters. This task includes IS’s creation of reports, provided in CSV format for analysis, that identify files with duplicate file names (e.g. 001.TIFF and 001.jpg); initial analysis by the Digital Resources Archivist (DRA); meetings between the DRA and appropriate members of the curatorial staff; and final analysis and determinations by appropriate curatorial staff regarding whether the files with duplicate names should be maintained or disposed of.
- External de-duplication: External Media and potential Master Objects on the J-Drive will need to be analyzed and compared with digital objects in the Dark Archive to determine which objects should be migrated to the MOR and which are to be discarded as duplicates.
The majority of this effort has been completed. As content is migrated to the K-Drive for processing into the Master Objects Repository, it will be double-checked for any residual duplicates.
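The hash sum and derivative reports described above could be approximated with a short script. This is a minimal sketch, not the process IS actually used: the choice of SHA-256, the function names, and the rule that "duplicate file names" means same folder and same base name with a different extension are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_hash_groups(root):
    """Group files under root by content hash; keep groups of 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


def duplicate_name_groups(root):
    """Group files sharing a folder and base name (e.g. 001.TIFF, 001.jpg).

    Assumption: base names are compared case-insensitively.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[(path.parent, path.stem.lower())].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Neither function decides which copy is the master. Both only produce candidate groups for the DRA and curatorial staff to review, which matches the division of labor described in the tasks above.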
Formats/Collections Analysis
The DRA, in collaboration with IS, will develop reports on the number of files by type and by collection. This analysis will identify the quantifiable scope of the project and will be used to assist in identifying migration priorities.
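A files-by-type-by-collection count can be sketched in a few lines. The example below assumes that each immediate subfolder of the scanned root (for instance, a folder under /archive/Dept) is one collection and that file type can be approximated by file extension; both are simplifications, and the function name is hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_by_collection_and_type(root):
    """Count files per (collection, extension) pair under root.

    Assumption: each immediate subfolder of root is one collection.
    """
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            relative = path.relative_to(root)
            # Files sitting directly in root have no collection folder.
            collection = relative.parts[0] if len(relative.parts) > 1 else "(root)"
            # Normalize case so .TIF and .tif are counted together.
            extension = path.suffix.lower() or "(none)"
            counts[(collection, extension)] += 1
    return counts
```

Extension counts are only a first pass. A format identification tool would be needed to confirm what each file actually contains.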
MOM Formats and Collections Scope
The DRA will develop a Collections Analysis Template that will allow curators to examine their collections while identifying key information to facilitate the migration and disposition of content.
Establishment of metadata requirements for MOs to be migrated
The Metadata Objects Work Group (MDOWG) has been working towards establishing minimum metadata guidelines for placing objects in the MOR. The current draft guidelines are for images and will need to be extended to include other format types and complex objects.
Identification of Master Objects (MO) to be migrated
Of the nearly 2,000,000 items in the Dark Archive (and digital objects stored on external media and potentially the J-Drive), not all are Master Objects that should be migrated to the MOR. Curators and Archivists will conduct this endeavor and will rely upon the outcomes of the "De-duplication of Dark Archive" efforts, the application of the definitions for "Master Objects" and the Formats/Collections Analysis efforts to determine which objects will be migrated.
Migration
Prioritization
An initial analysis for prioritization of collections to be migrated was developed based upon:
- completion of the de-duplication process
- the file formats that the MOR can ingest
- the format homogeneity of the collection
- metadata and rights readiness of the collection
- user access demands
Prioritization decisions were to be made by collection curators and archivists in consultation with the Strategic Digital Initiatives Work Group (SDIWG, which includes the Heads of Digital Initiatives, Digital Content Services, Preservation and Reformatting, and Application Development & Support), the Head of Special Collections Access & Description and the Digital Resources Archivist.
That system did not work. As of August 2017, the newly minted Digital Preservation Librarian was charged with developing a new prioritization metric that accounted for:
- Born digital vs. reformatted content
- File type
- Object complexity:
- single
- complex <10
- complex >10
- ordered complex
- Metadata readiness
- KB master
- Security/restriction constraints
- Special considerations
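The factors in this metric can be captured as one record per collection so that they can be sorted and filtered consistently. The sketch below is illustrative only: the class and field names are hypothetical, no weights or scoring are implied because the plan does not state any, and placing objects with exactly 10 components in the smaller tier is an assumption, since the plan defines only "<10" and ">10".

```python
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    SINGLE = "single"
    COMPLEX_SMALL = "complex <10"
    COMPLEX_LARGE = "complex >10"
    ORDERED_COMPLEX = "ordered complex"


def classify_complexity(component_count, ordered=False):
    """Map a component count to one of the plan's complexity tiers."""
    if component_count < 1:
        raise ValueError("an object needs at least one component")
    if component_count == 1:
        return Complexity.SINGLE
    if ordered:
        return Complexity.ORDERED_COMPLEX
    # Assumption: exactly 10 components falls in the smaller tier.
    if component_count <= 10:
        return Complexity.COMPLEX_SMALL
    return Complexity.COMPLEX_LARGE


@dataclass
class CollectionProfile:
    """One row of the prioritization metric (field names are hypothetical)."""
    name: str
    born_digital: bool
    file_type: str
    complexity: Complexity
    metadata_ready: bool
    kb_master: bool
    restricted: bool
    special_considerations: str = ""
```

For example, `classify_complexity(12)` returns `Complexity.COMPLEX_LARGE`, the tier that caused trouble in migration testing.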
Rights analysis
To ensure proper citation and public availability, content rights statements MUST accompany all content. Rights statements should include information regarding the rights holder, access permissions, and special processing instructions (e.g., watermarked or not). This is a collaborative effort with the curatorial staff, Special Collections Access & Description and the Copyright Resource Center (CRC).
Metadata identification and/or creation
NO ITEMS WILL BE MIGRATED WITHOUT METADATA. Based upon the minimum metadata guidelines established by the MDOWG, as well as desired elements for particular collections, metadata will be created (or appropriate existing sources will be identified) for all objects to be migrated. This is a collaborative effort with the MDOWG, SCA&D and curatorial staff, in consultation with AD&S.
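The rights and metadata requirements above amount to a gate that each object must pass before migration. A minimal sketch of such a check follows. The class names, the field names, and the example required fields are hypothetical; the actual minimum elements are whatever the MDOWG guidelines specify.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RightsStatement:
    rights_holder: str
    access_permissions: str
    special_processing: str = ""  # e.g. "watermarked"


@dataclass
class CandidateObject:
    identifier: str
    metadata: dict = field(default_factory=dict)
    rights: Optional[RightsStatement] = None


def migration_blockers(obj, required_fields):
    """Return the reasons an object is not ready; an empty list means ready."""
    blockers = []
    # A field counts as missing if it is absent or empty.
    missing = [name for name in required_fields if not obj.metadata.get(name)]
    if missing:
        blockers.append("missing metadata: " + ", ".join(missing))
    if obj.rights is None:
        blockers.append("missing rights statement")
    else:
        if not obj.rights.rights_holder:
            blockers.append("rights holder not recorded")
        if not obj.rights.access_permissions:
            blockers.append("access permissions not recorded")
    return blockers
```

Returning the list of reasons, instead of a simple yes or no, gives curatorial staff something concrete to act on for each object that is held back.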
Migration Testing
Individual and batch migration workflows were tested with migration-ready content prior to taking the process live in 2016 and 2017. While processes were refined, issues arose with complex objects that had more than 10 components. Calls through Active Fedora bogged the system down, eventually prompting a 503 timeout error. Therefore, that content is inaccessible not only to our patrons but also to curators. We did successfully test batch ingest, ingest of simple complex objects (including those with various file types), and ingest of complex objects that include public and suppressed files.
Migration of MOs
- Non-mediated content: images, documents, audio/visual objects that can be individually and batch loaded to MOR with little-
- Mediated content: images, documents, audio/visual objects that can be individually and batch loaded to MOR that require significant mediation from the curator. This is a collaborative effort with Digital Preservation (DP) and curatorial staff in consultation with Digital Initiatives (DI), Special Collections Access & Description (SCA&D) and Application Development & Support (AD&S).
- Disposition determinations of remaining content. Decisions will be made by the Digital Preservation Librarian, Head of Digital Initiatives, Head of Preservation and Reformatting, Head of Publications and Repository Services and appropriate curatorial staff.
Post Dark Archive Migration Clean-up
Time-frame: TBD
- Disposal of derivatives
- Disposal of excess copies found in J-Drive and External Media
- Determination of disposition for other non-Master Objects
Documentation
In addition to this project plan, documentation will be maintained in BuckeyeBox at: https://osu.box.com/MOM.