Master Objects Migration Project Plan

Project has been superseded by the Dark Archive Decommissioning Project.

Content was last updated as of 2019-07-16

Goals

  • Successfully migrate master objects and required metadata to a preservation environment in the Master Objects Repository (MOR), from:

    • Dark Archive (DA)

    • J-Drive

    • External Media (EM)

  • Determine the appropriate disposition for the remaining content.

  • http://go.osu.edu/MOM

Objectives and Deliverables

  • Preparation for Migration

    • De-duplication within the DA and in relation to J-Drive & EM

    • Format Analysis

    • Identification of Master Objects (MO) to be migrated

    • Establishment of metadata requirements for MOs to be migrated

  • Migration

    • Prioritization

    • Rights analysis

    • Metadata identification and/or creation

    • Migration testing

    • Migration of MOs

  • Post Migration Clean-up

    • Disposal of derivatives

    • Disposal of excess copies found in J-Drive and EM

    • Determination of disposition for other non-MOs

Scope

Location                  In Scope?
Dark Archive
/archive/Archived         No
/archive/Committees       No
/archive/Dept/ARV         Yes
/archive/Dept/ATH         Yes
/archive/Dept/CGA         Yes
/archive/Dept/CSS         No
/archive/Dept/DI          Yes
/archive/Dept/HIL         Yes
/archive/Dept/KB          Yes, as it pertains to MO determinations in other collection folders
/archive/Dept/MUS         Yes
/archive/Dept/RAR         Yes
/archive/Dept/SRI         No, this is just a mapping to /archive/Dept/KB
/archive/Dept/TRI         Yes
/archive/Dept/WIT         No
/archive/Fedora           No
/archive/FedoraMORdata    No
/archive/lost+found       No
J-Drive                   Yes, as it pertains to MO determinations in DA folders, but most likely secondary project post DA migrations.
External Media            Yes, as it pertains to MO determinations in DA folders, but most likely secondary project post DA migrations.
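
The Dark Archive scope decisions above could also be encoded so that analysis scripts skip out-of-scope folders. The following is only a sketch: the folder paths come from the scope listing, but the function name `scope_of`, the three return labels, and the rule that unlisted paths default to "no" are assumptions for illustration, not part of the project's tooling.

```python
from pathlib import PurePosixPath

# Dark Archive folders marked "Yes" in the scope listing.
IN_SCOPE = [
    "/archive/Dept/ARV", "/archive/Dept/ATH", "/archive/Dept/CGA",
    "/archive/Dept/DI", "/archive/Dept/HIL", "/archive/Dept/MUS",
    "/archive/Dept/RAR", "/archive/Dept/TRI",
]
# In scope only as it pertains to MO determinations in other collection folders.
CONDITIONAL = ["/archive/Dept/KB"]


def scope_of(path: str) -> str:
    """Return "yes", "conditional", or "no" for a Dark Archive path."""
    candidate = PurePosixPath(path)
    if any(candidate.is_relative_to(folder) for folder in IN_SCOPE):
        return "yes"
    if any(candidate.is_relative_to(folder) for folder in CONDITIONAL):
        return "conditional"
    # Assumption: anything not listed as in scope is treated as out of scope.
    return "no"
```

Because the comparison is made on whole path components, a sibling folder whose name merely starts with an in-scope name is not matched by accident. `PurePosixPath.is_relative_to` requires Python 3.9 or later.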

Stakeholders

Project Role: Executive Sponsors

  • Jennifer Vinopal (formerly Lisa Carter); OSUL Role: AD: Special Collections & Area Studies (Interim)

  • Jennifer Vinopal; OSUL Role: AD: Information Technology

Project Role: Process Owner/Manager

  • Dan Noonan; OSUL Role: Digital Preservation Librarian

Project Role: Project Team

Application Development & Support (AD&S): Individuals identified on an as-needed basis

OSUL Role: MOR interaction Subject Matter Experts (SMEs)

Digital Initiatives:

  • Terry Reese

  • Michelle Henley

OSUL Role: Metadata transformation & DC/MOR processing SMEs

OSUL-IT Infrastructure (IS):

  • Travis Julian

  • Eric Haskett

OSUL Role: DA/MOR Analysis Reports SMEs

Archives & Special Collections:

  • University Archives (ARV):

    • Tamar Chute

    • Michelle Drobik

  • Byrd Polar Archives (BPA): Laura Kissel

  • Billy Ireland Cartoon Library & Museum (CGA):

    • Jenny Robb

    • Susan Liberator

    • Marilyn Scott

  • Hilandar Library (HIL): Pasha Johnson

  • Ohio Congressional Archives (OCA): tbd

  • Rare Books (RAR):

    • Lisa Iacobellis

    • Eric Johnson

  • Theatre Research Institute (TRI):

    • Nena Couch

    • Beth Kattelman

  • Music (MUS): Alan Green

OSUL Role: Collections SMEs

Content & Access and Archival Description & Access:

  • Morag Boyd

  • Anna Klose Hrubes

  • Ariel Bacon

  • Cate Putirskis

OSUL Role: Metadata & DC/MOR processing SMEs

Preservation & Reformatting:

  • Miriam Centeno

  • Amy McCrory

OSUL Role: Preservation & Reformatting SMEs; Digital Imaging (DI) folder

Publishing and Repository Services (PRS): Maureen Walsh

OSUL Role: Metadata & Knowledge Bank (KB) SME

Copyright Resource Center (CRC): Sandra Enimil

OSUL Role: Rights SME

Tasks

De-duplication of Dark Archive

  • Hash sum de-duplication: OSUL-IT Infrastructure Support (IS) creates reports, provided in CSV format for analysis, that identify files with duplicate hash sums. The Digital Resources Archivist (DRA) performs the initial analysis and meets with the appropriate members of the curatorial staff, who carry out the final analysis and deletions.

  • Derivative de-duplication: The MOR is a repository for master objects, not derivatives; therefore, we need to be certain to migrate only the masters. IS creates reports, provided in CSV format for analysis, that identify files with duplicate file names (e.g., 001.TIFF and 001.jpg). The DRA performs the initial analysis and meets with the appropriate members of the curatorial staff, who carry out the final analysis and determine whether the files with duplicate names should be maintained or disposed of.

  • External de-duplication: External Media and potential Master Objects on the J-Drive will need to be analyzed and compared with digital objects in the Dark Archive to determine which objects should be migrated to the MOR and which are to be discarded as duplicates.
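
The two kinds of report described above (duplicate hash sums and duplicate file names) can be sketched as follows. This is a minimal illustration, not the reports IS actually produced: the plan does not say which hash algorithm or CSV layout was used, so SHA-256, the function names, and the two-column "group, path" layout are all assumptions.

```python
import csv
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()  # assumption: the source does not name the algorithm
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_hash_groups(root: Path) -> dict[str, list[Path]]:
    """Map each hash shared by two or more files under root to those files."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_sha256(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


def derivative_candidates(root: Path) -> dict[Path, list[Path]]:
    """Map folder/base-name to files that share it but differ in extension."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[path.parent / path.stem].append(path)
    return {
        base: paths
        for base, paths in groups.items()
        if len({p.suffix.lower() for p in paths}) > 1
    }


def write_report(groups: dict, report: Path) -> None:
    """Write one CSV row per file: the group key, then the file path."""
    with report.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["group", "path"])
        for key, paths in groups.items():
            for path in paths:
                writer.writerow([str(key), str(path)])
```

Neither function deletes anything. Both only list candidates, which matches the workflow above in which the final determinations rest with curatorial staff. The same hash comparison could be pointed at External Media or J-Drive folders to flag objects that already exist in the Dark Archive.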

The majority of this effort has been completed. As content is migrated to the K-Drive for processing into the Master Objects Repository, it will be double-checked for any residual duplicates.

Formats/Collections Analysis

The DRA, in collaboration with IS, will develop reports on the number of files by type and by collection. This analysis will quantify the scope of the project and will be used to assist in identifying migration priorities.
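
A count of files by type and by collection can be produced from a plain list of file paths. The sketch below is illustrative only. It assumes that paths are given relative to a department folder, that the first path component names the collection, and that the file extension stands in for the file type; none of these conventions is stated in the plan.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_collection_and_type(relative_paths):
    """Count files per (collection, lower-cased extension) pair."""
    counts = Counter()
    for raw in relative_paths:
        path = PurePosixPath(raw)
        # Assumption: the first folder in the path names the collection.
        collection = path.parts[0] if len(path.parts) > 1 else "(no collection)"
        extension = path.suffix.lower().lstrip(".") or "(none)"
        counts[(collection, extension)] += 1
    return counts
```

Extensions are only a rough proxy for format. A format identification tool would give more reliable results where extensions are missing or wrong.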

MOM Formats and Collections Scope

Master Objects Defined

The DRA will develop a Collections Analysis Template that will allow curators to examine their collections while identifying key information to facilitate the migration and disposition of content.

MOM Collection Analysis Tool

Establishment of metadata requirements for MOs to be migrated

The Metadata Objects Work Group (MDOWG) has been working towards establishing minimum metadata guidelines for placing objects in the MOR. The current draft guidelines are for images and will need to be extended to include other format types and complex objects.

MOM Metadata Requirements

Identification of Master Objects (MO) to be migrated

Of the nearly 2,000,000 items in the Dark Archive (and digital objects stored on external media and potentially the J-Drive), not all are Master Objects that should be migrated to the MOR. Curators and archivists will make this identification, relying upon the outcomes of the "De-duplication of Dark Archive" efforts, the application of the definitions for "Master Objects", and the Formats/Collections Analysis efforts to determine which objects will be migrated.



Migration

Prioritization

An initial analysis for prioritization of collections to be migrated was developed based upon:

  • completion of the de-duplication process

  • the file formats that the MOR can ingest

  • the format homogeneity of the collection

  • metadata and rights readiness of the collection

  • user access demands

Prioritization decisions were to be made by collection curators and archivists in consultation with the Strategic Digital Initiatives Work Group (SDIWG), which includes the Heads of Digital Initiatives, Digital Content Services, Preservation and Reformatting, and Application Development & Support; the Head of Special Collections Access & Description; and the Digital Resources Archivist.

That system did not work. As of August 2017, the Digital Preservation Librarian was charged with developing a new prioritization metric that accounted for:

  • Born-digital vs. reformatted content

  • File type

  • Object complexity:

    • single

    • complex <10

    • complex >10

    • ordered complex

  • Metadata readiness

  • KB master

  • Security/restriction constraints

  • Special considerations
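
The object-complexity categories in the list above can be written down as a small classifier. This is a hypothetical sketch: the plan does not say how an object with exactly 10 components is bucketed or whether "ordered complex" takes precedence over the size-based categories, so both choices below are assumptions, as are the names `Complexity` and `classify_complexity`.

```python
from enum import Enum


class Complexity(Enum):
    SINGLE = "single"
    COMPLEX_UNDER_10 = "complex <10"
    COMPLEX_OVER_10 = "complex >10"
    ORDERED_COMPLEX = "ordered complex"


def classify_complexity(component_count: int, ordered: bool = False) -> Complexity:
    """Place an object in one of the four complexity categories."""
    if component_count < 1:
        raise ValueError("an object needs at least one component")
    if component_count == 1:
        return Complexity.SINGLE
    if ordered:
        # Assumption: ordering matters more than size for multi-part objects.
        return Complexity.ORDERED_COMPLEX
    if component_count < 10:
        return Complexity.COMPLEX_UNDER_10
    # Assumption: exactly 10 components falls into the larger bucket.
    return Complexity.COMPLEX_OVER_10
```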

The results of that process can be found here:

Rights analysis

To ensure proper citation and public availability, content rights statements MUST accompany all content. Rights statements should include information regarding the rights holder, access permissions, and special processing instructions (e.g., watermarked or not). This is a collaborative effort with the curatorial staff, Special Collections Access & Description and the Copyright Resource Center (CRC).

Metadata identification and/or creation

NO ITEMS WILL BE MIGRATED WITHOUT METADATA. Based upon the minimum metadata guidelines established by the MDOWG, as well as desired elements for particular collections, metadata will be created (or appropriate existing sources will be identified) for all objects to be migrated. This is a collaborative effort with the MDOWG, SCA&D and curatorial staff, in consultation with AD&S.

Migration Testing

Individual and batch migration workflows were tested with migration-ready content prior to taking the process live in 2016 and 2017. While processes were refined, issues arose with complex objects that had more than 10 components. Calls through Active Fedora bogged the system down, eventually prompting a 503 timeout error. Therefore, that content is inaccessible not only to our patrons but also to curators. We did successfully test batch ingest, ingest of simple complex objects (including those with various file types), and ingest of complex objects that include public and suppressed files.

Migration of MOs

  • Non-mediated content: images, documents, and audio/visual objects that can be individually and batch loaded to the MOR with little-to-no mediation from the curator. This is a collaborative effort with Digital Preservation (DP) and curatorial staff in consultation with Digital Initiatives (DI), Special Collections Access & Description (SCA&D) and Application Development & Support (AD&S).

  • Mediated content: images, documents, and audio/visual objects that can be individually and batch loaded to the MOR but require significant mediation from the curator. This is a collaborative effort with Digital Preservation (DP) and curatorial staff in consultation with Digital Initiatives (DI), Special Collections Access & Description (SCA&D) and Application Development & Support (AD&S).

  • Disposition determinations of remaining content. Decisions will be made by the Digital Preservation Librarian, the Head of Digital Initiatives, the Head of Preservation and Reformatting, the Head of Publishing and Repository Services and appropriate curatorial staff.

Migration will be tracked through JIRA tickets. The JIRA Ticket Number can be found in the priority lists:

Below is an example of the MOM Kanban Board that will be used for tracking. In this example we see two existing projects that have been moved from "To Do" to "In Progress". Projects will appear in the "Pending" column when Digital Preservation is awaiting curatorial input or actual ingest completion prior to QC.

This example shows the detail of the "Ticket". Digital Preservation (DP) will track activity here. When a Ticket is moved from "To Do" to "In Progress", DP will loop in the curatorial staff so they can view the Ticket's information.

Post Dark Archive Migration Clean-up

Time-frame: TBD

  • Disposal of derivatives

  • Disposal of excess copies found in J-Drive and External Media

  • Determination of disposition for other non-Master Objects

Documentation

In addition to this project plan, documentation will be maintained in BuckeyeBox at: https://osu.box.com/MOM.
