OLE Bibliographic Documents - Docstore
https://jira.kuali.org/browse/OLE-2674 and related linked tasks
(placeholder for Bib technical documentation, with child pages for editors, maintenance documents, etc. See also: Search)
DATA
META DATA
Metadata is kept in the node properties. The Bib node currently has the following metadata properties:
DateUpload
DateLastUpdated
Future work for metadata:
FastAddFlag
Public
DateEntered?
CreateBy?
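As a rough illustration of how the two current properties might behave, the sketch below models a Bib node as a plain dictionary. The helper names `new_bib_node` and `update_bib_node` are hypothetical, and the behaviour (DateUpload fixed at creation, DateLastUpdated refreshed on each change) is an assumption inferred from the property names, not a description of the DocStore code.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for a Bib node: a plain dict of node properties.
def new_bib_node(content, now=None):
    """Create a node and stamp both metadata properties at upload time."""
    now = now or datetime.now(timezone.utc)
    return {
        "content": content,
        "DateUpload": now,
        "DateLastUpdated": now,
    }

def update_bib_node(node, content, now=None):
    """Replace the content; only DateLastUpdated changes (assumed behaviour)."""
    node["content"] = content
    node["DateLastUpdated"] = now or datetime.now(timezone.utc)
    return node
```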
DATA FORMAT SUPPORT IN DOCSTORE
DocStore currently supports the following formats:
- MARC/MARCXML (here is more MARCXML information)
- Samples: MARC 21 sample Book, Serial, etc records
- Sample: Book Format
Leader/00-23  *****nam##22*****#a#4500
001           <control number>
003           <control number identifier>
005           19920331092212.7
007/00-01     ta
008/00-39     820305s1991####nyu###########001#0#eng##
020           ##$a0845348116 :$c$29.95 (£19.50 U.K.)
020           ##$a0845348205 (pbk.)
040           ##$a[organization code]$c[organization code]
050           14$aPN1992.8.S4$bT47 1991
082           04$a791.45/75/0973$219
100           1#$aTerrace, Vincent,$d1948-
245           10$aFifty years of television :$ba guide to series and pilots, 1937-1988 /$cVincent Terrace.
246           1#$a50 years of television
260           ##$aNew York :$bCornwall Books,$cc1991.
300           ##$a864 p. ;$c24 cm.
500           ##$aIncludes index.
650           #0$aTelevision pilot programs$zUnited States$vCatalogs.
650           #0$aTelevision serials$zUnited States$vCatalogs.
- Qualified and Unqualified Dublin Core (see Dublin Core Initiative)
- Sample Dublin Core XML record:
<metadata
xmlns="http://example.org/myapp/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://example.org/myapp/ http://example.org/myapp/schema.xsd"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>
UKOLN
</dc:title>
<dc:description>
UKOLN is a national focus of expertise in digital information
management. It provides policy, research and awareness services
to the UK library, information and cultural heritage communities.
UKOLN is based at the University of Bath.
</dc:description>
<dc:publisher>
UKOLN, University of Bath
</dc:publisher>
<dc:identifier>
http://www.ukoln.ac.uk/
</dc:identifier>
</metadata>
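To show how a record like the Dublin Core sample above could be read programmatically, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `dc_fields` is hypothetical, the sample is a shortened copy of the record above, and nothing here reflects how DocStore itself parses Dublin Core.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the sample record (description and schemaLocation omitted).
SAMPLE = """\
<metadata xmlns="http://example.org/myapp/"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>
    UKOLN
  </dc:title>
  <dc:publisher>
    UKOLN, University of Bath
  </dc:publisher>
  <dc:identifier>
    http://www.ukoln.ac.uk/
  </dc:identifier>
</metadata>
"""

def dc_fields(xml_text):
    """Return {local element name: [values]} for every dc: child of the root."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    fields = {}
    for child in root:
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            # Collapse the whitespace that pretty-printing introduces.
            value = " ".join((child.text or "").split())
            fields.setdefault(name, []).append(value)
    return fields

fields = dc_fields(SAMPLE)
print(fields["title"])  # ['UKOLN']
```

Values are collected in lists because Dublin Core elements are repeatable.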
TEST DATA
- Where is test data?
- See: OLE Inventory Test Data - All document types
- The page above lists where record sets can be found in the Cloud
- Future Needs and follow up?
FEATURES
See: OLE DocumentStore for a complete description of architecture, linking, versioning, hierarchies
Linking
The Bib node keeps the linkages between bib records and items (at the instance level).
Business rules:
One bib record may link to multiple instance records.
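The one-to-many rule can be pictured with a small in-memory model. This is only a sketch: the class `LinkIndex`, its methods, and the identifier strings are hypothetical and are not taken from the DocStore implementation, which keeps these linkages in the Bib node.

```python
from collections import defaultdict

class LinkIndex:
    """Hypothetical in-memory model of bib-to-instance links."""

    def __init__(self):
        self._instances_by_bib = defaultdict(list)

    def link(self, bib_id, instance_id):
        """Attach an instance to a bib record; duplicate links are ignored."""
        linked = self._instances_by_bib[bib_id]
        if instance_id not in linked:
            linked.append(instance_id)

    def instances_for(self, bib_id):
        """Return the instance ids linked to a bib record, in link order."""
        return list(self._instances_by_bib.get(bib_id, []))

index = LinkIndex()
index.link("bib-1", "instance-1")
index.link("bib-1", "instance-2")
print(index.instances_for("bib-1"))  # ['instance-1', 'instance-2']
```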
Versioning
Versioning for Bib is currently turned off.
Future work:
- Bib overlay
- Bound-withs
- Instances
- Links and handling for Name, Title, and Subject Headings within Bibliographic Documents, i.e. Authority Records and Authority Control: OLE 1.5
INGEST
BULK INGEST
- (HTC to document processes for loading legacy data)
INGEST THROUGH WEB INTERFACE
- Is this different from the Staff or Auto loads below? Is it for SysAdmin large loads (with indexing rules) for routine maintenance?
STAFF LOAD
- See: OLE Ingest (for Acquisitions, load/transform vendor EOCR files)
EDITOR
MAINTENANCE DOC
SEARCH
For more search information on Bib records, see the Search Technical Documentation.
Schema (MarcXML XSD)
<?xml version="1.0"?>
<xsd:schema targetNamespace="http://www.loc.gov/MARC21/slim" xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1.1" xml:lang="en">
<xsd:annotation>
<xsd:documentation>
MARCXML: The MARC 21 XML Schema
Prepared by Corey Keith

May 21, 2002 - Version 1.0 - Initial Release

**********************************************
Changes.

August 4, 2003 - Version 1.1 -
Removed import of xml namespace and the use of xml:space="preserve" attributes on the leader and controlfields.
Whitespace preservation in these subfields is accomplished by the use of xsd:whiteSpace value="preserve"

May 21, 2009 - Version 1.2 -
in subfieldcodeDataType the pattern
"[\da-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"
changed to:
"[\dA-Za-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"
i.e "A-Z" added after "[\d" before "a-z" to allow upper case. This change is for consistency with the documentation.

************************************************************
This schema supports XML markup of MARC21 records as specified in the MARC documentation (see www.loc.gov). It allows tags with
alphabetics and subfield codes that are symbols, neither of which are as yet used in the MARC 21 communications formats, but are
allowed by MARC 21 for local data. The schema accommodates all types of MARC 21 records: bibliographic, holdings, bibliographic
with embedded holdings, authority, classification, and community information.
</xsd:documentation>
</xsd:annotation>
<xsd:element name="record" type="recordType" nillable="true" id="record.e">
<xsd:annotation>
<xsd:documentation>record is a top level container element for all of the field elements which compose the record</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element name="collection" type="collectionType" nillable="true" id="collection.e">
<xsd:annotation>
<xsd:documentation>collection is a top level container element for 0 or many records</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:complexType name="collectionType" id="collection.ct">
<xsd:sequence minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="record"/>
</xsd:sequence>
<xsd:attribute name="id" type="idDataType" use="optional"/>
</xsd:complexType>
<xsd:complexType name="recordType" id="record.ct">
<xsd:sequence minOccurs="0">
<xsd:element name="leader" type="leaderFieldType"/>
<xsd:element name="controlfield" type="controlFieldType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="datafield" type="dataFieldType" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="type" type="recordTypeType" use="optional"/>
<xsd:attribute name="id" type="idDataType" use="optional"/>
</xsd:complexType>
<xsd:simpleType name="recordTypeType" id="type.st">
<xsd:restriction base="xsd:NMTOKEN">
<xsd:enumeration value="Bibliographic"/>
<xsd:enumeration value="Authority"/>
<xsd:enumeration value="Holdings"/>
<xsd:enumeration value="Classification"/>
<xsd:enumeration value="Community"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="leaderFieldType" id="leader.ct">
<xsd:annotation>
<xsd:documentation>MARC21 Leader, 24 bytes</xsd:documentation>
</xsd:annotation>
<xsd:simpleContent>
<xsd:extension base="leaderDataType">
<xsd:attribute name="id" type="idDataType" use="optional"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
<xsd:simpleType name="leaderDataType" id="leader.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
<xsd:pattern value="[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}(2| )(2| )[\d ]{5}[\dA-Za-z ]{3}(4500|    )"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="controlFieldType" id="controlfield.ct">
<xsd:annotation>
<xsd:documentation>MARC21 Fields 001-009</xsd:documentation>
</xsd:annotation>
<xsd:simpleContent>
<xsd:extension base="controlDataType">
<xsd:attribute name="id" type="idDataType" use="optional"/>
<xsd:attribute name="tag" type="controltagDataType" use="required"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
<xsd:simpleType name="controlDataType" id="controlfield.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="controltagDataType" id="controltag.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
<xsd:pattern value="00[1-9A-Za-z]{1}"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="dataFieldType" id="datafield.ct">
<xsd:annotation>
<xsd:documentation>MARC21 Variable Data Fields 010-999</xsd:documentation>
</xsd:annotation>
<xsd:sequence maxOccurs="unbounded">
<xsd:element name="subfield" type="subfieldatafieldType"/>
</xsd:sequence>
<xsd:attribute name="id" type="idDataType" use="optional"/>
<xsd:attribute name="tag" type="tagDataType" use="required"/>
<xsd:attribute name="ind1" type="indicatorDataType" use="required"/>
<xsd:attribute name="ind2" type="indicatorDataType" use="required"/>
</xsd:complexType>
<xsd:simpleType name="tagDataType" id="tag.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
<xsd:pattern value="(0([1-9A-Z][0-9A-Z])|0([1-9a-z][0-9a-z]))|(([1-9A-Z][0-9A-Z]{2})|([1-9a-z][0-9a-z]{2}))"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="indicatorDataType" id="ind.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
<xsd:pattern value="[\da-z ]{1}"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="subfieldatafieldType" id="subfield.ct">
<xsd:simpleContent>
<xsd:extension base="subfieldDataType">
<xsd:attribute name="id" type="idDataType" use="optional"/>
<xsd:attribute name="code" type="subfieldcodeDataType" use="required"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
<xsd:simpleType name="subfieldDataType" id="subfield.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="subfieldcodeDataType" id="code.st">
<xsd:restriction base="xsd:string">
<xsd:whiteSpace value="preserve"/>
<xsd:pattern value="[\dA-Za-z!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?{}_^`~\[\]\\]{1}"/>
<!-- "A-Z" added after "\d" May 21, 2009 -->
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="idDataType" id="id.st">
<xsd:restriction base="xsd:ID"/>
</xsd:simpleType>
</xsd:schema>
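The sketch below shows one way the leader, tag, and indicator constraints from this schema could be applied when building a MARCXML record for two fields of the sample Book record. It is illustrative only: the function `build_record` is hypothetical, the regular expressions are written out in full with their repetition counts as a transcription of `leaderDataType`, `tagDataType`, and `indicatorDataType`, and the leader value assumes that the sample's `*` placeholders stand for system-supplied digits (zeros here) and `#` for a blank.

```python
import re
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Patterns transcribed from leaderDataType, tagDataType and indicatorDataType.
LEADER_RE = re.compile(
    r"[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}"
    r"(2| )(2| )[\d ]{5}[\dA-Za-z ]{3}(4500|    )"
)
TAG_RE = re.compile(
    r"(0([1-9A-Z][0-9A-Z])|0([1-9a-z][0-9a-z]))"
    r"|(([1-9A-Z][0-9A-Z]{2})|([1-9a-z][0-9a-z]{2}))"
)
INDICATOR_RE = re.compile(r"[\da-z ]{1}")

def build_record(leader, datafields):
    """Build a MARCXML record element.

    datafields is a list of (tag, ind1, ind2, [(code, value), ...]) tuples.
    """
    if not LEADER_RE.fullmatch(leader):
        raise ValueError("leader does not match the schema pattern")
    record = ET.Element(f"{{{MARC_NS}}}record")
    ET.SubElement(record, f"{{{MARC_NS}}}leader").text = leader
    for tag, ind1, ind2, subfields in datafields:
        if not TAG_RE.fullmatch(tag):
            raise ValueError(f"bad datafield tag: {tag!r}")
        if not (INDICATOR_RE.fullmatch(ind1) and INDICATOR_RE.fullmatch(ind2)):
            raise ValueError(f"bad indicators on field {tag}")
        field = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        for code, value in subfields:
            sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
            sub.text = value
    return record

# Sample leader with '*' replaced by 0 and '#' by a blank (assumption).
record = build_record(
    "00000nam  2200000 a 4500",
    [
        ("100", "1", " ", [("a", "Terrace, Vincent,"), ("d", "1948-")]),
        ("245", "1", "0", [("a", "Fifty years of television :"),
                           ("c", "Vincent Terrace.")]),
    ],
)
print(ET.tostring(record, encoding="unicode"))
```

Control field tags such as 001 are deliberately rejected by `TAG_RE`, because the schema validates them with the separate `controltagDataType` pattern.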
Dublin Core Schema
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://purl.org/dc/elements/1.1/" targetNamespace="http://purl.org/dc/elements/1.1/" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:annotation>
<xs:documentation xml:lang="en">
DCMES 1.1 XML Schema XML Schema for http://purl.org/dc/elements/1.1/ namespace Created 2008-02-11 Created by Tim Cole (t-cole3@uiuc.edu) Tom Habing (thabing@uiuc.edu) Jane Hunter (jane@dstc.edu.au) Pete Johnston (p.johnston@ukoln.ac.uk), Carl Lagoze (lagoze@cs.cornell.edu) This schema declares XML elements for the 15 DC elements from the http://purl.org/dc/elements/1.1/ namespace. It defines a complexType SimpleLiteral which permits mixed content and makes the xml:lang attribute available. It disallows child elements by use of minOcccurs/maxOccurs. However, this complexType does permit the derivation of other complexTypes which would permit child elements. All elements are declared as substitutable for the abstract element any, which means that the default type for all elements is dc:SimpleLiteral.
</xs:documentation>
</xs:annotation>
<xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2001/03/xml.xsd"></xs:import>
<xs:complexType name="SimpleLiteral">
<xs:annotation>
<xs:documentation xml:lang="en">
This is the default type for all of the DC elements. It permits text content only with optional xml:lang attribute. Text is allowed because mixed="true", but sub-elements are disallowed because minOccurs="0" and maxOccurs="0" are on the xs:any tag. This complexType allows for restriction or extension permitting child elements.
</xs:documentation>
</xs:annotation>
<xs:complexContent mixed="true">
<xs:restriction base="xs:anyType">
<xs:sequence>
<xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
</xs:sequence>
<xs:attribute ref="xml:lang" use="optional"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
<xs:element name="any" type="SimpleLiteral" abstract="true"/>
<xs:element name="title" substitutionGroup="any"/>
<xs:element name="creator" substitutionGroup="any"/>
<xs:element name="subject" substitutionGroup="any"/>
<xs:element name="description" substitutionGroup="any"/>
<xs:element name="publisher" substitutionGroup="any"/>
<xs:element name="contributor" substitutionGroup="any"/>
<xs:element name="date" substitutionGroup="any"/>
<xs:element name="type" substitutionGroup="any"/>
<xs:element name="format" substitutionGroup="any"/>
<xs:element name="identifier" substitutionGroup="any"/>
<xs:element name="source" substitutionGroup="any"/>
<xs:element name="language" substitutionGroup="any"/>
<xs:element name="relation" substitutionGroup="any"/>
<xs:element name="coverage" substitutionGroup="any"/>
<xs:element name="rights" substitutionGroup="any"/>
<xs:group name="elementsGroup">
<xs:annotation>
<xs:documentation xml:lang="en">
This group is included as a convenience for schema authors who need to refer to all the elements in the http://purl.org/dc/elements/1.1/ namespace.
</xs:documentation>
</xs:annotation>
<xs:sequence>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element ref="any"/>
</xs:choice>
</xs:sequence>
</xs:group>
<xs:complexType name="elementContainer">
<xs:annotation>
<xs:documentation xml:lang="en">
This complexType is included as a convenience for schema authors who need to define a root or container element for all of the DC elements.
</xs:documentation>
</xs:annotation>
<xs:choice>
<xs:group ref="elementsGroup"/>
</xs:choice>
</xs:complexType>
</xs:schema>
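The effect of `SimpleLiteral` (text content allowed, child elements disallowed) and the fixed set of 15 element names can be approximated with a lightweight check. This is a sketch under stated assumptions, not schema validation: the function `simple_literal_problems` is hypothetical, and it inspects only element names and the absence of children.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def simple_literal_problems(container):
    """List problems found in the dc: children of a container element."""
    problems = []
    prefix = "{" + DC_NS + "}"
    for child in container:
        if not child.tag.startswith(prefix):
            continue  # elements from other namespaces are not checked here
        name = child.tag[len(prefix):]
        if name not in DC_ELEMENTS:
            problems.append(f"unknown DC element: {name}")
        if len(child) > 0:
            problems.append(f"{name} has child elements")
    return problems

good = ET.fromstring(
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>UKOLN</dc:title></metadata>"
)
bad = ET.fromstring(
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title><b>UKOLN</b></dc:title><dc:author>x</dc:author></metadata>"
)
print(simple_literal_problems(good))  # []
print(simple_literal_problems(bad))
```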
Operated as a Community Resource by the Open Library Foundation