1 Context
- 1.1 Background
- 1.2 Gaps Addressed
- 1.3 Linkage to Strategic Directions
2 Service
- 2.1 Stakeholders
  - 2.1.1 Service Owner & Providers
  - 2.1.2 Content Owners/Curators
  - 2.1.3 Content Processing
  - 2.1.4 Consultants
  - 2.1.5 Informed Parties
- 2.2 Description
  - 2.2.1 Components
  - 2.2.2 Content
  - 2.2.3 Process Workflow Overview
    - 2.2.3.1 Redacted Workflow
    - 2.2.3.2 Ingest Workflow Checklist
    - 2.2.3.3 Archivematica Evaluation

The Ohio State University Libraries' Gray Digital Preservation Repository (Gray Repo) service provides a path to preservation for born digital (or received as digital) content that has been accessioned, and is only intended to be minimally processed and/or is temporally restricted. As such, and in accordance with Distinctive Collections' accessioning policies and procedures, it is the default digital preservation repository. Further, it provides a preservation environment for some legacy digitized preservation files. It is a "dim archive" that allows for curatorial deposit and retrieval, but no direct patron access.

Go-link for this site: go.osu.edu/Gray-Repo-Wiki

Context

The Gray Repo is a "dim digital preservation archive" that provides no public access and only limited curatorial access to the University Libraries' digital objects stored within. This is in contrast to a "light archive," which provides public access, or a "dark archive," which allows only custodial access. The Gray Repo allows for curatorial deposit and retrieval, but no direct patron access. It is akin to a physical archival storage facility such as our Book Depository, where items are stored on shelves in an environmentally regulated and well-managed manner and are appropriately described in conformance with accepted standards, while the public and unvetted personnel are not allowed to wander the stacks.

Background

The Gray Repo emerged from an initial use case presented by the University Archives (Archives) for their annual collection development efforts. Because of its mandate to collect University records, the Archives accrues archival materials to existing (and sometimes new) collections every year. Whether the records are analog or born digital, they are typically so voluminous that the Archives' practice is to accession the records, update the finding aid, store them, and provide mediated access. There is minimal descriptive effort, and it is incumbent upon the researcher/patron to discover materials by examining the accession inventories and/or the records themselves. The Libraries' existing digital preservation platform, Digital Collections, which inherently requires item-level description, was not designed for the ingest and management of digital assets at an accession level; hence the need for a second type of digital preservation repository.

While this initial use case presents itself most definitively within the University Archives, the discovery phase of developing this service and service description revealed that the standard operating procedure for the accessioning and/or accrual of any archival or special collection is one of minimal processing. As such, the Gray Repo will be the University Libraries' default digital preservation repository for born digital content.

Two additional use cases arose from our initial discussions with stakeholders: one that has broader implications but is ultimately tied to the underlying and evolving nature of the Gray Repo, and another with targeted impact.

The former was posited by the Ohio Public Policy Archives, but could apply more widely, and concerns temporally restricted content. Typically the donation of congressional papers comes with a donor restriction as to when the content can become available for public research. Because almost all congressional papers donations are now mostly or completely digital, we need to provide a true digital preservation platform to secure these records while they await further processing. That "further processing" will happen within a time specified by the donor agreement, and may lead to a re-ingest of minimally processed records, akin to the University Archives' practice; or the records may be described in more detail and ingested into Digital Collections.

The latter regards legacy digitized content. Before the evolution of the Digital Collections platform, digitized University Libraries content was typically ingested into the Knowledge Bank (KB) to provide access. The files ingested are access files, not preservation files. The preservation files were stored on the Libraries' sFTP server known as "The Dark Archive." Unfortunately, it is neither a dark archive nor a digital preservation platform or environment. These KB-related files would be better preserved in the Gray Repo.

Gaps Addressed

The Gray Repo provides University Libraries a formalized path to preservation for:

- born digital content
- born digital content that is temporally restricted
- digitized preservation files for content that resides in other Libraries repositories, such as the Knowledge Bank

Linkage to Strategic Directions

- Empower Knowledge Creators
- Engage for Broader Impact
- Enrich the User Experience
- Model Excellence

https://library.osu.edu/strategic-directions

Service

Stakeholders

Service Owner & Providers
- Application Development & Operations (Provider)
- Digital Preservation (Owner & Provider)
- Infrastructure (Provider)

Content Owners/Curators
- Billy Ireland Cartoon Library & Museum
- Byrd Polar and Climate Research Center Archival Program
- Music & Dance Library
- Ohio Public Policy Archives
- Publishing & Repository Services
- Thompson Special Collections
- University Archives

Content Processing
- Archival Technical Services
- Billy Ireland Cartoon Library & Museum
- Digital Preservation
- Preservation & Digitization
- Thompson Special Collections
- University Archives

Consultants
- Copyright Services
- Cybersecurity

Informed Parties
- Bibliographic Initiatives
- Collection Development
- Electronic Resources
- Executive Committee
- Management Committee
- Metadata Initiatives
- Research Services
- Subject Liaisons

Description

Components

- Repo:
  - Fedora on Amazon Web Services (AWS)
- Ingest:
  - AWS ingest buckets
  - Bag-It
  - VPN
- Staging:
  - Digital Processing (K-drive)
  - One Drive
  - "Dark Archive" (to be emptied and decommissioned)
  - VPN (Libraries share drives when necessary)
  - sFTP (when necessary)
- Transfer to University Libraries:
  - One Drive
  - External drives
  - External Media
  - Donor cloud storage
- Forensics:
  - DROID (creates manifest with checksums and file characterizations)
  - Tesseract and ABBYY FineReader (to create OCR text for images and "dumb" PDFs)
  - Bulk Extractor (for Personally Identifiable Information (PII) identification)
- Management:
  - Finding Aid
  - Archivist ToolKit
  - PastPerfect
  - OPAC
- Help/Maintenance:
  - JIRA
  - Local Administrative Dashboard
  - Teams Channel file storage
  - Microsoft Lists
  - Plain text reader app
- Mediated Patron Access:
  - Archivist ToolKit
  - PastPerfect
  - OPAC
  - Secured Virtual Reading Room (sVRR)
  - One Drive
  - External drives
  - External Media

Content

- Initial considerations:
  - Is its University records retention permanent?
  - Do we have a Deed of Gift?
  - Have we accessioned it?
  - If it were analog, would we store it in the Book Depository or other closed stacks?
  - Are they preservation copies of other materials in the KB?
  - Is it something we own and have digitized, but for which we do not hold the rights (e.g. brittle books project)?
- Typical content:
  - University Records:
    - Born Digital records
    - Transferred digitized records
    - Preservation digitized files for third party platforms (e.g. Veridian)
  - Special Collections:
    - Born Digital collection objects
    - Preservation digitized files for Knowledge Bank content
  - Publications & Repository Services (P&RS):
    - Preservation digitized files for Knowledge Bank content
- Future potential content types:
  - Digitized University Libraries' content (ostensibly audio-visual objects) without clear rights
  - Web Archiving WARCs
  - P&RS other publishing platform content
  - Brittle Books at Internet Archive preservation digitized files
  - HathiTrust preservation digitized files
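The Components list mentions Bag-It packaging and a manifest with checksums. As an illustration only, the following minimal Python sketch shows what a checksum manifest of that general kind looks like: one line per file, in the BagIt manifest layout of checksum followed by relative path. The choice of SHA-256 and the function names `sha256_of` and `manifest_lines` are assumptions made for this example; they are not taken from the Gray Repo workflow, and this is not a description of how DROID or the repository's ingest tooling actually works.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_lines(payload_dir: Path) -> list[str]:
    """Build manifest lines of the form "<checksum>  <relative path>".

    Hypothetical helper for illustration; SHA-256 is an assumed algorithm.
    """
    lines = []
    # Sort so the manifest is stable from one run to the next.
    for path in sorted(p for p in payload_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(payload_dir).as_posix()
        lines.append(f"{sha256_of(path)}  {relative}")
    return lines
```

Recomputing the lines later and comparing them with the stored manifest is the basic fixity check that such manifests make possible.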
Process Workflow Overview
The following is a brief overview of the workflow process and its components. An updated workflow is to be published before June 30, 2025.

Gray Repo Workflow as of 2025.05.15

Redacted Workflow

Due to information security concerns, the complete Gray Digital Preservation Repository Workflow is available from the Digital Preservation Department upon request, for internal University Libraries use only; this redacted version, however, is publicly available.

Ingest Workflow Checklist

As part of the repository and workflow upgrades we developed in late 2024 and early 2025, we now have a workflow checklist of the high-level activities that need to be completed along the way. Because the workflow is now over 90 pages long, the checklist provides a quick summary of actions.

Archivematica Evaluation

From the fall of 2024 through the winter of 2025, we evaluated Archivematica as a potential workflow replacement tool. While our recommendation was not to implement Archivematica, you can read about our analysis and decision-making process in the Archivematica Proof of Concept Report.
