OLE Bibliographic Documents - Docstore

https://jira.kuali.org/browse/OLE-2674 and related linked tasks

(placeholder for Bib technical documentation, with child pages for editors, maintenance documents, etc. See also: Search)

DATA

METADATA

Metadata is kept in the node properties. Currently, the Bib metadata has the following properties:

DateUpload

DateLastUpdated

Future work for metadata:

FastAddFlag

Public

DateEntered?

CreateBy?
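
As a rough illustration of how the two current properties might travel with a Bib node, here is a minimal Python sketch. The class name `BibNodeMetadata`, its methods, and the ISO 8601 string encoding are assumptions made for this example only; the property names `DateUpload` and `DateLastUpdated` are the only parts taken from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class BibNodeMetadata:
    """Hypothetical holder for the two properties listed on this page."""

    date_upload: datetime
    date_last_updated: datetime

    def touch(self, when: datetime) -> None:
        # An update changes only DateLastUpdated; DateUpload stays fixed.
        self.date_last_updated = when

    def as_node_properties(self) -> dict:
        # Property names follow the page; ISO 8601 strings are an assumption.
        return {
            "DateUpload": self.date_upload.isoformat(),
            "DateLastUpdated": self.date_last_updated.isoformat(),
        }


uploaded = datetime(2012, 5, 1, 9, 30, tzinfo=timezone.utc)
meta = BibNodeMetadata(date_upload=uploaded, date_last_updated=uploaded)
meta.touch(datetime(2012, 6, 15, 14, 0, tzinfo=timezone.utc))
props = meta.as_node_properties()
```

The proposed future properties (FastAddFlag, Public, and so on) would be further keys in the same dictionary if they are adopted.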

DATA FORMAT SUPPORT IN DOCSTORE

DocStore currently supports the following formats:

  • MARC/MARCXML (see the MARCXML documentation for more information)
    • Samples: MARC 21 sample records (Book, Serial, etc.)
    • Sample: Book Format

      Leader/00-23  *****nam##22*****#a#4500
      001           <control number>
      003           <control number identifier>
      005           19920331092212.7
      007/00-01     ta
      008/00-39     820305s1991####nyu###########001#0#eng##
      020           ##$a0845348116 :$c$29.95 (£19.50 U.K.)
      020           ##$a0845348205 (pbk.)
      040           ##$a[organization code]$c[organization code]
      050           14$aPN1992.8.S4$bT47 1991
      082           04$a791.45/75/0973$219
      100           1#$aTerrace, Vincent,$d1948-
      245           10$aFifty years of television :$ba guide to series and pilots, 1937-1988 /$cVincent Terrace.
      246           1#$a50 years of television
      260           ##$aNew York :$bCornwall Books,$cc1991.
      300           ##$a864 p. ;$c24 cm.
      500           ##$aIncludes index.
      650           #0$aTelevision pilot programs$zUnited States$vCatalogs.
      650           #0$aTelevision serials$zUnited States$vCatalogs.

  • Qualified and Unqualified Dublin Core (see Dublin Core Initiative)
    • Sample Dublin Core XML record:
      <metadata
        xmlns="http://example.org/myapp/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://example.org/myapp/ http://example.org/myapp/schema.xsd"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>
          UKOLN
        </dc:title>
        <dc:description>
          UKOLN is a national focus of expertise in digital information
          management. It provides policy, research and awareness services
          to the UK library, information and cultural heritage communities.
          UKOLN is based at the University of Bath.
        </dc:description>
        <dc:publisher>
          UKOLN, University of Bath
        </dc:publisher>
        <dc:identifier>
          http://www.ukoln.ac.uk/
        </dc:identifier>
      </metadata>
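
To show how a record like the Dublin Core sample above could be read, here is a minimal Python sketch using only the standard library. The function name `dc_fields` and the shortened copy of the sample are assumptions for illustration; this is not DocStore's own ingest code.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the sample record above.
SAMPLE = """<metadata
  xmlns="http://example.org/myapp/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>
    UKOLN
  </dc:title>
  <dc:publisher>
    UKOLN, University of Bath
  </dc:publisher>
  <dc:identifier>
    http://www.ukoln.ac.uk/
  </dc:identifier>
</metadata>"""


def dc_fields(xml_text: str) -> dict:
    """Return {local element name: [values]} for every dc: child of the root."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells qualified names as "{namespace}local".
        if child.tag.startswith("{" + DC_NS + "}"):
            local = child.tag.split("}", 1)[1]
            # Collapse the line breaks and indentation around the text.
            value = " ".join((child.text or "").split())
            fields.setdefault(local, []).append(value)
    return fields


record = dc_fields(SAMPLE)
print(record["title"])
```

Values are kept in lists because Dublin Core elements are repeatable, so a record may carry several `dc:identifier` or `dc:subject` entries.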
TEST DATA

FEATURES

See OLE DocumentStore for a complete description of architecture, linking, versioning, and hierarchies.

Linking

The Bib node keeps the links between bib records and items (at the instance level).

Business rules:

One bib record may link to multiple instance records.
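
The one-to-many rule can be pictured with a small in-memory index. This is only a sketch: the class `BibLinks`, its methods, and the identifier strings are hypothetical and do not describe how DocStore actually stores links on the node.

```python
from collections import defaultdict


class BibLinks:
    """Hypothetical in-memory index of bib-to-instance links."""

    def __init__(self) -> None:
        self._instances_by_bib = defaultdict(list)

    def link(self, bib_id: str, instance_id: str) -> None:
        # One bib may carry many instance links; skip exact duplicates.
        if instance_id not in self._instances_by_bib[bib_id]:
            self._instances_by_bib[bib_id].append(instance_id)

    def instances_for(self, bib_id: str) -> list:
        # Return a copy so callers cannot mutate the index.
        return list(self._instances_by_bib.get(bib_id, []))


links = BibLinks()
links.link("bib-1", "instance-1")
links.link("bib-1", "instance-2")
links.link("bib-1", "instance-1")  # duplicate, ignored
linked = links.instances_for("bib-1")
```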

Versioning

Versioning for Bib is currently turned off.

Future work:
  • Bib overlay
  • Bound-withs
  • Instances
  • Links and handling for Name, Title and Subject Headings within Bibliographic Documents, i.e., Authority Records and Authority Control: OLE 1.5

INGEST

BULK INGEST

  • (HTC to document processes for loading legacy data)

INGEST THROUGH WEB INTERFACE

  • Is this different from the Staff or Auto loads below? Is it for SysAdmin large loads (with indexing rules) for routine maintenance?

STAFF LOAD

  • See: OLE Ingest (for Acquisitions, load/transform vendor EOCR files)

EDITOR

MAINTENANCE DOC

For more information on searching Bib records, see the Search Technical Documentation.

Schema (MarcXML XSD)

<?xml version="1.0"?>
<xsd:schema targetNamespace="http://www.loc.gov/MARC21/slim" xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1.1" xml:lang="en">
  <xsd:annotation>
    <xsd:documentation>
      MARCXML: The MARC 21 XML Schema
      Prepared by Corey Keith

      May 21, 2002 - Version 1.0 - Initial Release

      **********************************************
      Changes.

      August 4, 2003 - Version 1.1 -
      Removed import of xml namespace and the use of xml:space="preserve" attributes on the leader and controlfields.
      Whitespace preservation in these subfields is accomplished by the use of xsd:whiteSpace value="preserve"

      May 21, 2009 - Version 1.2 -
      in subfieldcodeDataType the pattern
      "[\da-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"
      changed to:
      "[\dA-Za-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"
      i.e "A-Z" added after "[\d" before "a-z" to allow upper case. This change is for consistency with the documentation.

      ************************************************************
      This schema supports XML markup of MARC21 records as specified in the MARC documentation (see www.loc.gov). It allows tags with
      alphabetics and subfield codes that are symbols, neither of which are as yet used in the MARC 21 communications formats, but are
      allowed by MARC 21 for local data. The schema accommodates all types of MARC 21 records: bibliographic, holdings, bibliographic
      with embedded holdings, authority, classification, and community information.
    </xsd:documentation>
  </xsd:annotation>
  <xsd:element name="record" type="recordType" nillable="true" id="record.e">
    <xsd:annotation>
      <xsd:documentation>record is a top level container element for all of the field elements which compose the record</xsd:documentation>
    </xsd:annotation>
  </xsd:element>
  <xsd:element name="collection" type="collectionType" nillable="true" id="collection.e">
    <xsd:annotation>
      <xsd:documentation>collection is a top level container element for 0 or many records</xsd:documentation>
    </xsd:annotation>
  </xsd:element>
  <xsd:complexType name="collectionType" id="collection.ct">
    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element ref="record"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="idDataType" use="optional"/>
  </xsd:complexType>
  <xsd:complexType name="recordType" id="record.ct">
    <xsd:sequence minOccurs="0">
      <xsd:element name="leader" type="leaderFieldType"/>
      <xsd:element name="controlfield" type="controlFieldType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="datafield" type="dataFieldType" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="type" type="recordTypeType" use="optional"/>
    <xsd:attribute name="id" type="idDataType" use="optional"/>
  </xsd:complexType>
  <xsd:simpleType name="recordTypeType" id="type.st">
    <xsd:restriction base="xsd:NMTOKEN">
      <xsd:enumeration value="Bibliographic"/>
      <xsd:enumeration value="Authority"/>
      <xsd:enumeration value="Holdings"/>
      <xsd:enumeration value="Classification"/>
      <xsd:enumeration value="Community"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="leaderFieldType" id="leader.ct">
    <xsd:annotation>
      <xsd:documentation>MARC21 Leader, 24 bytes</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleContent>
      <xsd:extension base="leaderDataType">
        <xsd:attribute name="id" type="idDataType" use="optional"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
  <xsd:simpleType name="leaderDataType" id="leader.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
      <xsd:pattern value="[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}(2| )(2| )[\d ]{5}[\dA-Za-z ]{3}(4500|    )"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="controlFieldType" id="controlfield.ct">
    <xsd:annotation>
      <xsd:documentation>MARC21 Fields 001-009</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleContent>
      <xsd:extension base="controlDataType">
        <xsd:attribute name="id" type="idDataType" use="optional"/>
        <xsd:attribute name="tag" type="controltagDataType" use="required"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
  <xsd:simpleType name="controlDataType" id="controlfield.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="controltagDataType" id="controltag.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
      <xsd:pattern value="00[1-9A-Za-z]{1}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="dataFieldType" id="datafield.ct">
    <xsd:annotation>
      <xsd:documentation>MARC21 Variable Data Fields 010-999</xsd:documentation>
    </xsd:annotation>
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="subfield" type="subfieldatafieldType"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="idDataType" use="optional"/>
    <xsd:attribute name="tag" type="tagDataType" use="required"/>
    <xsd:attribute name="ind1" type="indicatorDataType" use="required"/>
    <xsd:attribute name="ind2" type="indicatorDataType" use="required"/>
  </xsd:complexType>
  <xsd:simpleType name="tagDataType" id="tag.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
      <xsd:pattern value="(0([1-9A-Z][0-9A-Z])|0([1-9a-z][0-9a-z]))|(([1-9A-Z][0-9A-Z]{2})|([1-9a-z][0-9a-z]{2}))"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="indicatorDataType" id="ind.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
      <xsd:pattern value="[\da-z ]{1}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="subfieldatafieldType" id="subfield.ct">
    <xsd:simpleContent>
      <xsd:extension base="subfieldDataType">
        <xsd:attribute name="id" type="idDataType" use="optional"/>
        <xsd:attribute name="code" type="subfieldcodeDataType" use="required"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
  <xsd:simpleType name="subfieldDataType" id="subfield.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="subfieldcodeDataType" id="code.st">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="preserve"/>
      <xsd:pattern value="[\dA-Za-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"/>
      <!-- "A-Z" added after "\d" May 21, 2009 -->
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="idDataType" id="id.st">
    <xsd:restriction base="xsd:ID"/>
  </xsd:simpleType>
</xsd:schema>
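
To make the schema's constraints concrete, the following Python sketch builds a small MARCXML record and checks the leader, tags, and indicators against the schema's patterns before serializing. It is an illustration only, not DocStore's validation code. The patterns are written as they appear in the published Library of Congress MARC21slim schema (an assumption where this page's copy is hard to read), the helper names `build_record` and `_q` are hypothetical, and the leader value is a made-up placeholder because the book sample uses asterisks for system-generated positions.

```python
import re
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("", MARC_NS)  # serialize with a default namespace

# Patterns as published in the LoC MARC21slim schema (assumed here).
# XSD patterns match the whole value, hence re.fullmatch below.
LEADER = re.compile(
    r"[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}(2| )(2| )"
    r"[\d ]{5}[\dA-Za-z ]{3}(4500|    )"
)
CONTROL_TAG = re.compile(r"00[1-9A-Za-z]{1}")
DATA_TAG = re.compile(
    r"(0([1-9A-Z][0-9A-Z])|0([1-9a-z][0-9a-z]))"
    r"|(([1-9A-Z][0-9A-Z]{2})|([1-9a-z][0-9a-z]{2}))"
)
INDICATOR = re.compile(r"[\da-z ]{1}")


def _q(name):
    # ElementTree spells namespaced names as "{uri}local".
    return "{%s}%s" % (MARC_NS, name)


def build_record(leader, controlfields, datafields):
    """Build a <record>; raise ValueError on values the patterns reject.

    controlfields: [(tag, value)]
    datafields:    [(tag, ind1, ind2, [(code, value), ...])]
    """
    if not LEADER.fullmatch(leader):
        raise ValueError("bad leader: %r" % leader)
    record = ET.Element(_q("record"))
    ET.SubElement(record, _q("leader")).text = leader
    for tag, value in controlfields:
        if not CONTROL_TAG.fullmatch(tag):
            raise ValueError("bad control tag: %r" % tag)
        ET.SubElement(record, _q("controlfield"), {"tag": tag}).text = value
    for tag, ind1, ind2, subfields in datafields:
        if not DATA_TAG.fullmatch(tag):
            raise ValueError("bad data tag: %r" % tag)
        if not (INDICATOR.fullmatch(ind1) and INDICATOR.fullmatch(ind2)):
            raise ValueError("bad indicator on %s" % tag)
        field = ET.SubElement(
            record, _q("datafield"), {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, value in subfields:
            ET.SubElement(field, _q("subfield"), {"code": code}).text = value
    return record


record = build_record(
    "00000nam  2200000 a 4500",  # placeholder leader, 24 characters
    [("005", "19920331092212.7")],
    [("245", "1", "0", [
        ("a", "Fifty years of television :"),
        ("b", "a guide to series and pilots, 1937-1988 /"),
        ("c", "Vincent Terrace."),
    ])],
)
xml_text = ET.tostring(record, encoding="unicode")
```

In the mnemonic sample, `#` stands for a blank, so an indicator pair written `1#` would be passed here as `"1"` and `" "`. The sketch does not check subfield codes or element order; a full check would validate against the XSD itself.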

Dublin Core Schema

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://purl.org/dc/elements/1.1/" targetNamespace="http://purl.org/dc/elements/1.1/" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:annotation>
    <xs:documentation xml:lang="en">
      DCMES 1.1 XML Schema
      XML Schema for http://purl.org/dc/elements/1.1/ namespace

      Created 2008-02-11

      Created by

      Tim Cole (t-cole3@uiuc.edu)
      Tom Habing (thabing@uiuc.edu)
      Jane Hunter (jane@dstc.edu.au)
      Pete Johnston (p.johnston@ukoln.ac.uk),
      Carl Lagoze (lagoze@cs.cornell.edu)

      This schema declares XML elements for the 15 DC elements from the
      http://purl.org/dc/elements/1.1/ namespace.

      It defines a complexType SimpleLiteral which permits mixed content
      and makes the xml:lang attribute available. It disallows child elements by
      use of minOcccurs/maxOccurs.

      However, this complexType does permit the derivation of other complexTypes
      which would permit child elements.

      All elements are declared as substitutable for the abstract element any,
      which means that the default type for all elements is dc:SimpleLiteral.
    </xs:documentation>
  </xs:annotation>
  <xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2001/03/xml.xsd"></xs:import>
  <xs:complexType name="SimpleLiteral">
    <xs:annotation>
      <xs:documentation xml:lang="en">
        This is the default type for all of the DC elements.
        It permits text content only with optional xml:lang attribute.
        Text is allowed because mixed="true", but sub-elements
        are disallowed because minOccurs="0" and maxOccurs="0"
        are on the xs:any tag.

        This complexType allows for restriction or extension permitting
        child elements.
      </xs:documentation>
    </xs:annotation>
    <xs:complexContent mixed="true">
      <xs:restriction base="xs:anyType">
        <xs:sequence>
          <xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
        </xs:sequence>
        <xs:attribute ref="xml:lang" use="optional"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="any" type="SimpleLiteral" abstract="true"/>
  <xs:element name="title" substitutionGroup="any"/>
  <xs:element name="creator" substitutionGroup="any"/>
  <xs:element name="subject" substitutionGroup="any"/>
  <xs:element name="description" substitutionGroup="any"/>
  <xs:element name="publisher" substitutionGroup="any"/>
  <xs:element name="contributor" substitutionGroup="any"/>
  <xs:element name="date" substitutionGroup="any"/>
  <xs:element name="type" substitutionGroup="any"/>
  <xs:element name="format" substitutionGroup="any"/>
  <xs:element name="identifier" substitutionGroup="any"/>
  <xs:element name="source" substitutionGroup="any"/>
  <xs:element name="language" substitutionGroup="any"/>
  <xs:element name="relation" substitutionGroup="any"/>
  <xs:element name="coverage" substitutionGroup="any"/>
  <xs:element name="rights" substitutionGroup="any"/>
  <xs:group name="elementsGroup">
    <xs:annotation>
      <xs:documentation xml:lang="en">
        This group is included as a convenience for schema authors
        who need to refer to all the elements in the
        http://purl.org/dc/elements/1.1/ namespace.
      </xs:documentation>
    </xs:annotation>
    <xs:sequence>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="any"/>
      </xs:choice>
    </xs:sequence>
  </xs:group>
  <xs:complexType name="elementContainer">
    <xs:annotation>
      <xs:documentation xml:lang="en">
        This complexType is included as a convenience for schema authors who need to define a root
        or container element for all of the DC elements.
      </xs:documentation>
    </xs:annotation>
    <xs:choice>
      <xs:group ref="elementsGroup"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
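
Because the schema declares exactly 15 elements in the `http://purl.org/dc/elements/1.1/` namespace, a quick sanity check on incoming records is to flag any `dc:` element outside that set. The Python sketch below does only that; the function name `unknown_dc_elements` and the test record are assumptions for illustration, and this is a name check rather than full XSD validation.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 element names declared by the schema above.
DC_ELEMENTS = frozenset([
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
])


def unknown_dc_elements(xml_text: str) -> list:
    """List dc:-namespaced element names that the schema does not declare."""
    prefix = "{" + DC_NS + "}"
    unknown = []
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(prefix):
            local = element.tag[len(prefix):]
            if local not in DC_ELEMENTS:
                unknown.append(local)
    return unknown


# Hypothetical record: dc:author is not one of the 15 DC elements.
candidate = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>UKOLN</dc:title>
  <dc:author>Example Author</dc:author>
</metadata>"""

problems = unknown_dc_elements(candidate)
```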
