h1. 1. Introduction

Document Store (Docstore) is a document repository that supports operations such as ingest, check-in, check-out and delete for library records such as Bibliographic and Instance (Holdings and Items) records. Most records are in XML, but the Document Store is format agnostic: it stores content as is, without any type conversion. Stored data is also indexed for efficient search and retrieval. Although the Document Store is an independent system with a basic UI for the supported operations, most interaction happens from within OLE code, for example ingesting new records, editing existing records, and searching and retrieving records.

Docstore functionality is mostly used by other processes such as OLE.

However, for demonstrating and testing Docstore, a screen is provided with separate tabs for the different Docstore functions and for information about Docstore.

Please refer to the Docstore application deployed on the Dev server:

[http://docstore.dev.ole.kuali.org/oledocstore/]





h1. 2. Operations


h2. 2.1 Summary

Shows a summary of the node count for each category, type and format.

h2. 2.2 Node Count

Shows the node count at each level of the Docstore for each category, type and format.

h2. 2.3 Ingest

Stores documents in the document store. The input is a request XML that follows a standard schema; the service returns a response XML listing the UUIDs of the ingested documents.

h3. 2.3.1 Sending the request

URL: [http://localhost:9080/oledocstore/document]

Method: POST

Parameters:

* docAction=ingestContent
* stringContent=<request.xml as described in the next section>

h3. 2.3.2 Request XML

Information about the ingest operation to be performed.

{noformat}
<request>
    <user>ole-khuntley</user>
    <operation>batchIngest</operation>
    <requestDocuments>
        <ingestDocument id="1" category="work" type="bibliographic" format="marc">
            <content>
                See section "Sample Input XML for Ingest".
            </content>
        </ingestDocument>
    </requestDocuments>
</request>
{noformat}
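
The request above can be assembled and wrapped in a POST programmatically. The following is a minimal Python sketch, not the OLE client code: the helper names {{build_ingest_request}} and {{build_post}} are hypothetical, and sending the two parameters as a form-encoded body is an assumption about how the service reads them. The record is XML-escaped here rather than wrapped in CDATA as in the appendix sample; an XML parser reads both forms as the same text.

```python
from urllib.parse import urlencode
from urllib.request import Request
from xml.sax.saxutils import escape, quoteattr

# URL taken from section 2.3.1; adjust host and port for your deployment.
DOCSTORE_URL = "http://localhost:9080/oledocstore/document"


def build_ingest_request(user, documents):
    """Return the request XML for a batchIngest operation.

    Each item of `documents` is a dict with the keys
    id, category, type, format and content.
    """
    lines = [
        "<request>",
        f"    <user>{escape(user)}</user>",
        "    <operation>batchIngest</operation>",
        "    <requestDocuments>",
    ]
    for doc in documents:
        attrs = " ".join(
            f"{name}={quoteattr(doc[name])}"
            for name in ("id", "category", "type", "format")
        )
        lines.append(f"        <ingestDocument {attrs}>")
        # escape() replaces &, < and > so the record travels as plain text.
        lines.append(f"            <content>{escape(doc['content'])}</content>")
        lines.append("        </ingestDocument>")
    lines += ["    </requestDocuments>", "</request>"]
    return "\n".join(lines)


def build_post(doc_action, request_xml, url=DOCSTORE_URL):
    """Build, but do not send, the POST request (form encoding is assumed)."""
    body = urlencode({"docAction": doc_action, "stringContent": request_xml})
    return Request(
        url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request_xml = build_ingest_request(
    "ole-khuntley",
    [{"id": "1", "category": "work", "type": "bibliographic",
      "format": "marc", "content": "<collection/>"}],
)
post = build_post("ingestContent", request_xml)
# urllib.request.urlopen(post) would send it to a running Docstore.
```

The same {{build_post}} helper would serve check-in by passing {{checkIn}} as the action, since section 2.5.1 uses the same URL and parameter names.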

h3. 2.3.3 Receiving the response


Read the HTTP response body and interpret it as described in the next section.

h3. 2.3.4 Response XML

Information about the response from the service for the given request.

{noformat}
<response>
    <documents>
        <document id="1" category="work" type="bibliographic" format="marc">
            <uuid>8675a422-b6ad-440e-bc0d-9f0dc1526ed2</uuid>
        </document>
    </documents>
    <user>ole-khuntley</user>
    <operation>batchIngest</operation>
    <status>Success</status>
    <message>Documents ingested</message>
</response>
{noformat}

Each ingested document is assigned a UUID, which is returned in the response.
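
A client will usually want the status and the UUIDs out of this response. The sketch below shows one way to read them with the Python standard library. The function name {{parse_ingest_response}} is hypothetical, and only the "Success" status shown above is known from this page, so any other value is simply treated as a failure.

```python
import xml.etree.ElementTree as ET


def parse_ingest_response(response_xml):
    """Return (status, message, {request id: uuid}) from a response document."""
    root = ET.fromstring(response_xml.strip())
    uuids = {
        doc.get("id"): (doc.findtext("uuid") or "").strip()
        for doc in root.findall("documents/document")
    }
    return root.findtext("status"), root.findtext("message"), uuids


SAMPLE_RESPONSE = """
<response>
    <documents>
        <document id="1" category="work" type="bibliographic" format="marc">
            <uuid>8675a422-b6ad-440e-bc0d-9f0dc1526ed2</uuid>
        </document>
    </documents>
    <user>ole-khuntley</user>
    <operation>batchIngest</operation>
    <status>Success</status>
    <message>Documents ingested</message>
</response>
"""

status, message, uuids = parse_ingest_response(SAMPLE_RESPONSE)
if status != "Success":
    # Other status values are not documented here, so fail loudly.
    raise RuntimeError(f"Ingest failed: {message}")
print(uuids)  # {'1': '8675a422-b6ad-440e-bc0d-9f0dc1526ed2'}
```

The dictionary maps the "id" given in the request to the UUID assigned by Docstore, which is the value needed later for check-in, check-out and delete.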


h2. 2.4 Get UUIDs

Returns a sample of UUIDs of already ingested documents for any {category, type, format} combination.

This is useful mainly for demo purposes.

h2. 2.5 Check-in


Check-in modifies the content and metadata (additional attributes) of a document identified by its UUID (Universally Unique Identifier).


h3. 2.5.1 Sending the request

URL: [http://localhost:9080/oledocstore/document]

Method: POST

Parameters:

* docAction=checkIn
* stringContent=<request.xml as described in the next section>


h3. 2.5.2 Request XML

Information about the check-in operation to be performed.

{noformat}
<request>
    <user>ole-khuntley</user>
    <operation>checkIn</operation>
    <requestDocuments>
        <ingestDocument id="5325d77a-8221-4fda-a78f-6d2f96e0b059" category="work"
                        type="bibliographic" format="marc">
            <content>
                See section "Sample Input XML for Check In".
            </content>
        </ingestDocument>
    </requestDocuments>
</request>
{noformat}

h3. 2.5.3 Receiving the response

Read the HTTP response body and interpret it as described in the next section.

h3. 2.5.4 Response XML

Information about the response from the service for the given request.

{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<OLEDocstore-call>
    <request>
        <command>Check-in</command>
        <params/>
    </request>
    <response>
        <status>Success</status>
        <message>Successfully checked in </message>
    </response>
</OLEDocstore-call>
{noformat}
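
Note that the check-in response uses a different envelope ({{OLEDocstore-call}}) from the ingest response, so it needs its own reader. A small sketch follows; the function name {{parse_check_in_response}} is hypothetical and only the "Success" status is known from this page.

```python
import xml.etree.ElementTree as ET


def parse_check_in_response(response_xml):
    """Return the command, status and message of an OLEDocstore-call document."""
    # Strip first: an XML declaration must be the very first thing in the text.
    root = ET.fromstring(response_xml.strip().encode("utf-8"))
    return {
        "command": root.findtext("request/command"),
        "status": root.findtext("response/status"),
        "message": (root.findtext("response/message") or "").strip(),
    }


SAMPLE_CHECK_IN_RESPONSE = """
<?xml version="1.0" encoding="UTF-8"?>
<OLEDocstore-call>
    <request>
        <command>Check-in</command>
        <params/>
    </request>
    <response>
        <status>Success</status>
        <message>Successfully checked in </message>
    </response>
</OLEDocstore-call>
"""

result = parse_check_in_response(SAMPLE_CHECK_IN_RESPONSE)
print(result["status"], "-", result["message"])
```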

h2. 2.6 Check-out

This operation retrieves the content of a document given its UUID.

h3. 2.6.1 Sending the request

URL: [http://localhost:9080/oledocstore/document]

Method: POST

Parameters:

* docAction=checkOut
* uuid=<uuid of the document to be retrieved>

h3. 2.6.2 Receiving the response

Read the HTTP response body and interpret it as described in the next section.

h3. 2.6.3 Response XML

The response body is the content of the document with the given UUID.
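
Because check-out takes only an action and a UUID, it is worth validating the UUID before calling the service. The sketch below builds the request without sending it. The function name {{build_check_out_request}} is hypothetical, and form encoding of the parameters is an assumption.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

# URL taken from section 2.6.1; adjust host and port for your deployment.
DOCSTORE_URL = "http://localhost:9080/oledocstore/document"


def build_check_out_request(document_uuid, url=DOCSTORE_URL):
    """Build, but do not send, a check-out POST for one document."""
    # uuid.UUID raises ValueError for malformed input and gives the
    # canonical lower-case, hyphenated form.
    canonical = str(uuid.UUID(document_uuid))
    body = urlencode({"docAction": "checkOut", "uuid": canonical})
    return Request(
        url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


check_out = build_check_out_request("8675a422-b6ad-440e-bc0d-9f0dc1526ed2")
# With a running Docstore, the document content would be read with:
#     with urllib.request.urlopen(check_out) as reply:
#         content = reply.read()
```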


h2. 2.7 Delete

Deletes a record from the Document Store based on the given UUID.

h3. 2.7.1 Sending the request

URL: [http://localhost:9080/oledocstore/document]

Method: POST

Parameters:

* docAction=delete or deleteWithLinkedDocs
* requestContent=<request.xml as described in the next section>

h3. 2.7.2 Request XML

Information about the delete operation to be performed. The "id" attribute of <ingestDocument> must be the UUID of a previously ingested document.

{noformat}
<request>
    <user>ole-khuntley</user>
    <operation>delete</operation>
    <requestDocuments>
        <ingestDocument id="715e92f0-b3ab-4263-96d9-58183a23e6d5">
            <linkedIngestDocuments></linkedIngestDocuments>
        </ingestDocument>
    </requestDocuments>
</request>
{noformat}
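
Delete differs from ingest and check-in in that the XML travels in the {{requestContent}} parameter rather than {{stringContent}}. The sketch below builds both parameters for one or more UUIDs. The function name {{build_delete_params}} is hypothetical, and keeping {{<operation>delete</operation>}} for both actions is an assumption, since this page only shows the plain delete request.

```python
import xml.etree.ElementTree as ET

# The two docAction values listed in section 2.7.1.
DELETE_ACTIONS = ("delete", "deleteWithLinkedDocs")


def build_delete_params(user, uuids, doc_action="delete"):
    """Return the POST parameters for deleting the documents in `uuids`."""
    if doc_action not in DELETE_ACTIONS:
        raise ValueError(f"unsupported docAction: {doc_action}")
    request = ET.Element("request")
    ET.SubElement(request, "user").text = user
    # Assumption: the operation element stays "delete" for both actions.
    ET.SubElement(request, "operation").text = "delete"
    documents = ET.SubElement(request, "requestDocuments")
    for doc_uuid in uuids:
        doc = ET.SubElement(documents, "ingestDocument", id=doc_uuid)
        ET.SubElement(doc, "linkedIngestDocuments")  # empty, as in the sample
    return {
        "docAction": doc_action,
        "requestContent": ET.tostring(request, encoding="unicode"),
    }


params = build_delete_params(
    "ole-khuntley", ["715e92f0-b3ab-4263-96d9-58183a23e6d5"]
)
print(params["requestContent"])
```

The returned dictionary can be passed to {{urllib.parse.urlencode}} to produce the POST body.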

h3. 2.7.3 Receiving the response

Read the HTTP response body and interpret it as described in the next section.

h3. 2.7.4 Response XML

Information about the response from the service for the given request.

{noformat}
<response>
    <documents>
        <document id="715e92f0-b3ab-4263-96d9-58183a23e6d5"></document>
    </documents>
    <user>ole-khuntley</user>
    <operation>delete</operation>
    <status>Success</status>
</response>
{noformat}

h2. 2.8 BagIt Requests

When the document content is in a non-text (binary) format such as PDF or DOC, as in the case of License Agreement documents, it is difficult to send it to Docstore through a web page. It is even more difficult when a request has to deal with more than one such document.

The BagIt packaging standard is therefore used to bundle such requests along with the files in these formats.

The request can be for ingest, check-in, check-out or delete.

This functionality can be demonstrated using the "BagIt Requests" tab in the [http://docstore.dev.ole.kuali.org/oledocstore/] screen.

*To submit a request of this type, follow these steps:*

1. Create a folder (e.g. /opt/docstore/upload/bagItRequests/ingest) {link to an attachment of zipped bagItRequests folder with ingest, checkin, checkout, delete folders}. Make sure this folder has write permission for all users.

2. Create the request.xml and copy it, together with the corresponding binary files, into the folder.

3. Enter the full path of this folder in the text box for "BagIt Requests Directory".

4. Click the Submit button.

*How is a BagIt request processed?*

The utility code for handling BagIt requests creates a 'Bag' (as per the BagIt standard) out of the "BagIt Requests Directory".

The content of the bag is sent to Docstore via an HTTP connection.

Docstore unbags the received content into a temp folder and uses the request.xml to process the files in the temp folder, if any.

Docstore creates a response.xml file to record the outcome of the processed request and copies it, along with any files that are part of the response (in the case of check-out), to a temp folder.

Then a 'Bag' is created out of the temp folder and sent back to the client.

The utility code receives the content from Docstore and unbags it to a temp folder (e.g. /opt/docstore/upload/bagItRequests/ingest/response).

The response.xml, along with the temp folder name, is returned to the browser.
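
To make the "create a Bag" step concrete, the sketch below lays out a directory as a BagIt bag using only the Python standard library. It is not the OLE utility code: the function name {{make_bag}} is hypothetical, the layout follows BagIt version 1.0 with SHA-256 checksums (the version and algorithm used by Docstore are not stated on this page), and packaging the bag for the HTTP transfer is left out.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def make_bag(request_dir, bag_dir):
    """Copy the files in request_dir into a new BagIt bag at bag_dir."""
    request_dir, bag_dir = Path(request_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    # The payload (request.xml plus binary files) lives under data/.
    shutil.copytree(request_dir, data_dir)
    # Bag declaration: the BagIt version and the encoding of the tag files.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Payload manifest: one "<checksum>  <path relative to bag>" line per file.
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir


# A throwaway directory stands in for the "BagIt Requests Directory".
work_dir = Path(tempfile.mkdtemp())
request_dir = work_dir / "ingest"
request_dir.mkdir()
(request_dir / "request.xml").write_text("<request/>", encoding="utf-8")

bag = make_bag(request_dir, work_dir / "bag")
print(sorted(p.relative_to(bag).as_posix() for p in bag.rglob("*")))
```

The receiving side can recompute each checksum in the manifest after unbagging to confirm that the request.xml and the binary files arrived intact.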

h2. 2.9 Appendix


h3. 2.9.1 Sample Input XML for Ingest

{noformat}
 <request>
    <user>ole-khuntley</user>
    <operation>batchIngest</operation>
    <requestDocuments>
        <ingestDocument id="1" category="work" type="bibliographic" format="marc">
            <content><![CDATA[
<collection xmlns="http://www.loc.gov/MARC21/slim">
    <record>
        <leader>01142cam 2200301 a 4500</leader>
        <controlfield tag="001">92005291</controlfield>
        <controlfield tag="003">DLC</controlfield>
        <controlfield tag="005">19930521155141.9</controlfield>
        <controlfield tag="008">920219s1993 caua j 000 0 eng</controlfield>
        <datafield tag="010" ind1=" " ind2=" ">
            <subfield code="a">92005291</subfield>
        </datafield>
        <datafield tag="020" ind1=" " ind2=" ">
            <subfield code="a">0152038655 :</subfield>
            <subfield code="c">$15.95</subfield>
        </datafield>
        <datafield tag="040" ind1=" " ind2=" ">
            <subfield code="a">DLC</subfield>
            <subfield code="c">DLC</subfield>
            <subfield code="d">DLC</subfield>
        </datafield>
        <datafield tag="042" ind1=" " ind2=" ">
            <subfield code="a">lcac</subfield>
        </datafield>
        <datafield tag="050" ind1="0" ind2="0">
            <subfield code="a">PS3537.A618</subfield>
            <subfield code="b">A88 1993</subfield>
        </datafield>
        <datafield tag="082" ind1="0" ind2="0">
            <subfield code="a">811/.52</subfield>
            <subfield code="2">20</subfield>
        </datafield>
        <datafield tag="100" ind1="1" ind2=" ">
            <subfield code="a">Sandburg, Carl,</subfield>
            <subfield code="d">1878-1967.</subfield>
        </datafield>
        <datafield tag="245" ind1="1" ind2="0">
            <subfield code="a">Arithmetic /</subfield>
            <subfield code="c">
                Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.
            </subfield>
        </datafield>
        <datafield tag="250" ind1=" " ind2=" ">
            <subfield code="a">1st ed.</subfield>
        </datafield>
        <datafield tag="260" ind1=" " ind2=" ">
            <subfield code="a">San Diego :</subfield>
            <subfield code="b">Harcourt Brace Jovanovich,</subfield>
            <subfield code="c">c1993.</subfield>
        </datafield>
        <datafield tag="300" ind1=" " ind2=" ">
            <subfield code="a">1 v. (unpaged) :</subfield>
            <subfield code="b">ill. (some col.) ;</subfield>
            <subfield code="c">26 cm.</subfield>
        </datafield>
        <datafield tag="500" ind1=" " ind2=" ">
            <subfield code="a">One Mylar sheet included in pocket.</subfield>
        </datafield>
        <datafield tag="520" ind1=" " ind2=" ">
            <subfield code="a">
                A poem about numbers and their characteristics. Features anamorphic, or distorted,
                drawings which can be
                restored to normal by viewing from a particular angle or by viewing the image's
                reflection in the
                provided Mylar cone.
            </subfield>
        </datafield>
        <datafield tag="650" ind1=" " ind2="0">
            <subfield code="a">Arithmetic</subfield>
            <subfield code="x">Juvenile poetry.</subfield>
        </datafield>
        <datafield tag="650" ind1=" " ind2="0">
            <subfield code="a">Children's poetry, American.</subfield>
        </datafield>
        <datafield tag="650" ind1=" " ind2="1">
            <subfield code="a">Arithmetic</subfield>
            <subfield code="x">Poetry.</subfield>
        </datafield>
        <datafield tag="650" ind1=" " ind2="1">
            <subfield code="a">American poetry.</subfield>
        </datafield>
        <datafield tag="650" ind1=" " ind2="1">
            <subfield code="a">Visual perception.</subfield>
        </datafield>
        <datafield tag="700" ind1="1" ind2=" ">
            <subfield code="a">Rand, Ted,</subfield>
            <subfield code="e">ill.</subfield>
        </datafield>
    </record>
</collection>
                ]]>
            </content>
        </ingestDocument>
    </requestDocuments>
</request>

{noformat}

h3. 2.9.2 Sample Input XML for Check In

The "id" attribute of <ingestDocument> must be the UUID of a previously ingested document; replace the placeholder value "1" in the sample below with that UUID.


{noformat}
 <request>
    <user>ole-khuntley</user>
    <operation>checkIn</operation>
    <requestDocuments>
        <ingestDocument id="1" category="work" type="bibliographic" format="marc">
            <content><![CDATA[
<collection xmlns="http://www.loc.gov/MARC21/slim">
    <record>
        <leader>01142cam 2200301 a 4500</leader>
        <controlfield tag="001">92005291</controlfield>
        <controlfield tag="003">DLC</controlfield>
        <controlfield tag="005">19930521155141.9</controlfield>
        <controlfield tag="008">920219s1993 caua j 000 0 eng</controlfield>
        <datafield tag="010" ind1=" " ind2=" ">
            <subfield code="a">92005291</subfield>
        </datafield>
        <datafield tag="020" ind1=" " ind2=" ">
            <subfield code="a">0152038655 :</subfield>
            <subfield code="c">$15.95</subfield>
        </datafield>
        <datafield tag="040" ind1=" " ind2=" ">
            <subfield code="a">DLC</subfield>
            <subfield code="c">DLC</subfield>
            <subfield code="d">DLC</subfield>
        </datafield>
        <datafield tag="042" ind1=" " ind2=" ">
            <subfield code="a">lcac</subfield>
        </datafield>
        <datafield tag="050" ind1="0" ind2="0">
            <subfield code="a">PS3537.A618</subfield>
            <subfield code="b">A88 1993</subfield>
        </datafield>
        <datafield tag="082" ind1="0" ind2="0">
            <subfield code="a">811/.52</subfield>
            <subfield code="2">20</subfield>
        </datafield>
        <datafield tag="100" ind1="1" ind2=" ">
            <subfield code="a">Sandburg, Carl,</subfield>
            <subfield code="d">1878-1967.</subfield>
        </datafield>
        <datafield tag="245" ind1="1" ind2="0">
            <subfield code="a">Arithmetic /</subfield>
            <subfield code="c">
                Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.
            </subfield>
        </datafield>
        <datafield tag="250" ind1=" " ind2=" ">
            <subfield code="a">1st ed.</subfield>
        </datafield>

    </record>
</collection>
                ]]>
            </content>
        </ingestDocument>
    </requestDocuments>
</request>

{noformat}

h1. 3. Search

This functionality allows documents to be searched for by keywords or phrases. Searches can be restricted by category, type, format and search fields.

h2. 3.1 Quick Search

Select Doc Category : Work

Doc Type : Bibliographic

Doc Format : ALL

Searching with the default conditions (clicking the Search button without specifying any conditions) returns all the records in the search result page.

Select Doc Category : Work

Doc Type : Bibliographic

Doc Format : MARC

Type one or more keywords in the text box.

The system shows records with any field matching one or more of the keywords.

h2. 3.2 Advanced Search

Select Doc Category : Work

Doc Type : Bibliographic

Doc Format : MARC

The drop-down for search fields is populated based on the category selected above.

The user specifies a search condition:

* Selects a field.
* Enters one or more keywords.
* Specifies whether the keywords should be searched for as "All of these", "Any of these" or "As a phrase".
** "All of these" - Any record whose selected field contains all the entered keywords is included in the search results.
** "Any of these" - Any record whose selected field contains at least one of the entered keywords is included in the search results.
** "As a phrase" - Any record whose selected field contains all the entered keywords in the same order is included in the search results.

The user adds another condition:

* Chooses whether to apply this condition in addition to the previous one ("AND"), as an alternative to the previous one ("OR"), or as an exclusion ("NOT").
** "AND" - the conditions before and after this operator should both be satisfied.
** "OR" - at least one of the conditions before and after this operator should be satisfied.
** "NOT" - the condition after this operator should not be satisfied.

The user repeats the previous step as many times as needed using the ADD and DELETE links.

* \[+\]ADD : click this link to add fields for a new search condition.
* \[-\]Delete : click this link to delete the last search condition.

The search is performed based on the conditions entered by the user.
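
The keyword modes and operators of the advanced search can be illustrated with a small in-memory matcher. This is only a sketch of the semantics described above, not how Docstore evaluates searches: the function names and the "title" and "author" field names are hypothetical, conditions are applied strictly left to right, "NOT" is read as "and not", and "As a phrase" is taken to require adjacent keywords.

```python
import re


def tokens(text):
    """Lower-case word tokens; this simple tokenisation is an assumption."""
    return re.findall(r"\w+", text.lower())


def field_matches(field_text, keywords, mode):
    """Check one field against the keywords using one of the three modes."""
    words, terms = tokens(field_text), tokens(keywords)
    if mode == "All of these":
        return all(term in words for term in terms)
    if mode == "Any of these":
        return any(term in words for term in terms)
    if mode == "As a phrase":
        # Assumes the keywords must be adjacent as well as in order.
        span = len(terms)
        return any(words[i:i + span] == terms
                   for i in range(len(words) - span + 1))
    raise ValueError(f"unknown mode: {mode}")


def record_matches(record, conditions):
    """Apply conditions left to right.

    Each condition is (operator, field, keywords, mode); the operator of the
    first condition is ignored.
    """
    result = None
    for operator, field, keywords, mode in conditions:
        hit = field_matches(record.get(field, ""), keywords, mode)
        if result is None:
            result = hit
        elif operator == "AND":
            result = result and hit
        elif operator == "OR":
            result = result or hit
        elif operator == "NOT":
            result = result and not hit
        else:
            raise ValueError(f"unknown operator: {operator}")
    return bool(result)


record = {"title": "Arithmetic", "author": "Sandburg, Carl"}
conditions = [
    (None, "title", "arithmetic", "All of these"),
    ("NOT", "author", "rand ted", "Any of these"),
]
print(record_matches(record, conditions))  # True
```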

For more information, refer to the "Search" section in the document: [https://wiki.kuali.org/display/OLE/OLE+Search+Technical+Documentation]