Docstore Initial Data Loading

Initial Data Loading (Bulk Ingest) in Docstore

Requirements

 

 

 

Specifications

Data handled in initial load

Bib MARC records, along with metadata such as createdBy, dateEntered, updatedBy, lastUpdated, fastAddFlag, supressFromPublic, status, statusUpdatedBy, statusUpdatedOn, staffOnlyFlag, and harvestable. This metadata is carried outside the bib record, in the request XML element /request/requestDocuments/ingestDocument/additionalAttributes.

Instance OLEML records (holdings and items), with metadata similar to that of the bib. Here the metadata is carried inside the holdings and item OLEML records, at /instanceCollection/instance/oleHoldings/extension/additionalAttributes and /instanceCollection/instance/items/item/extension/additionalAttributes.
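
To make the bib metadata path concrete, here is a minimal Python sketch that reads the additionalAttributes element of each ingestDocument. The sample request is hypothetical: this page names the element path but not its contents, so the assumption that each attribute appears as a child element named after the attribute is illustrative only, as is the absence of XML namespaces. The function name read_additional_attributes is also made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the element path comes from the specification.
# The child elements of additionalAttributes are an assumed layout.
SAMPLE_REQUEST = """\
<request>
  <requestDocuments>
    <ingestDocument>
      <additionalAttributes>
        <createdBy>loader</createdBy>
        <staffOnlyFlag>false</staffOnlyFlag>
      </additionalAttributes>
    </ingestDocument>
  </requestDocuments>
</request>
"""


def read_additional_attributes(request_xml):
    """Return one dict of attribute name -> text per ingestDocument."""
    root = ET.fromstring(request_xml)  # root element is <request>
    result = []
    for doc in root.findall("./requestDocuments/ingestDocument"):
        attrs = doc.find("additionalAttributes")
        if attrs is None:
            # Document carries no metadata block.
            result.append({})
        else:
            result.append({child.tag: (child.text or "").strip() for child in attrs})
    return result


bib_attributes = read_additional_attributes(SAMPLE_REQUEST)
```

The same approach would apply to the holdings and item paths, with the lookup starting from instanceCollection instead of request.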

Format of data files

Several files, each containing one OLE Docstore request with n ingestDocuments (n is configurable, e.g. 50000), each ingestDocument containing a bib MARC record and its additional attributes.

Several files, each containing one OLE Docstore request with n ingestDocuments (n is configurable, e.g. 50000), each ingestDocument containing an instance in OLEML format.

Each bib MARC record has a control field 001 whose value is the local identifier. This value is preserved in Docstore.

Each instance has a repeatable resourceIdentifier field, which holds a bib ID.
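
The split into files of n ingestDocuments each can be sketched as a simple batching step. This is an illustration only, not the actual loader: the batches function is a hypothetical helper, and plain integers stand in for the records.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # input exhausted
        yield batch


# 120,000 stand-in records with n = 50,000 give three request files,
# the last one only partly filled.
sizes = [len(batch) for batch in batches(range(120_000), 50_000)]
```

Each yielded batch would then be written out as one request file. Because the input is consumed lazily, the full record set never needs to be held in memory at once.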

Location of data files

 

Indexing of data

Currently, each batch of records is saved in Docstore and indexed in Solr in a single transaction.
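
The all-or-nothing behaviour of saving and indexing a batch together can be illustrated with a small in-memory sketch. It does not reflect how Docstore or Solr implement this: the two dicts stand in for the document store and the index, and index_entry and save_and_index are hypothetical names.

```python
def index_entry(record):
    """Build a stand-in index entry; fails if the record has no title."""
    if "title" not in record:
        raise ValueError("record %s has no title to index" % record["id"])
    return record["title"].lower()


def save_and_index(batch, store, index):
    """Save and index a batch as one unit.

    Work is staged on copies, so a failure on any record leaves both
    the store and the index exactly as they were before the call.
    """
    staged_store = dict(store)
    staged_index = dict(index)
    for record in batch:
        staged_store[record["id"]] = record
        staged_index[record["id"]] = index_entry(record)
    # Commit: reached only if every record was staged successfully.
    store.update(staged_store)
    index.update(staged_index)


store, index = {}, {}
save_and_index([{"id": "1", "title": "Alpha"}], store, index)
try:
    # The second record cannot be indexed, so the whole batch is discarded.
    save_and_index([{"id": "2", "title": "Beta"}, {"id": "3"}], store, index)
except ValueError:
    pass
```

After the failed second call, neither record "2" nor record "3" is present in the store or the index. That is the consequence of coupling save and index in one transaction: a single bad record rejects its entire batch.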

 

 

Operated as a Community Resource by the Open Library Foundation