Info: https://jira.kuali.org/browse/OLE-1144 (OLE Search Executive; see also linked tasks and sub-tasks)
DocStore Search
1. Indexed Data
1.1 Searchable fields for all document categories, types and formats
Field Name | Work-Bib-MARC | Work-Bib-DublinQ | Work-Bib-DublinUnQ | Work-Instance-OLEML | Work-Holdings-OLEML | Work-Item-OLEML |
---|---|---|---|---|---|---|
Title | Yes | Yes | Yes | No | No | No |
Author | Yes | Yes | Yes | No | No | No |
Subject | Yes | Yes | Yes | No | No | No |
Description | Yes | Yes | Yes | No | No | No |
Date of Publication | Yes | Yes | Yes | No | No | No |
Format | Yes | Yes | Yes | No | No | No |
Language | Yes | Yes | Yes | No | No | No |
Publisher | Yes | Yes | Yes | No | No | No |
ISSN/ISBN/other (last for dc identifier) | Yes | Yes | Yes | No | No | No |
Genre (marc genre/dc type) | Yes | Yes | Yes | No | No | No |
Edition | Yes | No | No | No | No | No |
Barcode | Yes | No | No | No | No | Yes |
Location | Yes | No | No | No | No | No |
Source | No | No | No | Yes | No | No |
Record Type | No | No | No | No | Yes | No |
Encoding Level | No | No | No | No | Yes | No |
Receipt Status | No | No | No | No | Yes | No |
Acquisition Method | No | No | No | No | Yes | No |
Policy Type | No | No | No | No | Yes | No |
Copies Reported | No | No | No | No | Yes | No |
Item Type | No | No | No | No | No | Yes |
Location Status | No | No | No | No | No | Yes |
Shelving Scheme | No | No | No | No | No | Yes |
Shelving Order | No | No | No | No | No | Yes |
Address | No | No | No | No | No | Yes |
Copy Number | No | No | No | No | No | Yes |
Volume Number | No | No | No | No | No | Yes |
1.2 Facet fields for all document categories, types and formats
Facet Field | Work-Bib-MARC | Work-Bib-DublinQ | Work-Bib-DublinUnQ | Work-Instance-OLEML | Work-Holdings-OLEML | Work-Item-OLEML |
---|---|---|---|---|---|---|
Subject | Yes | Yes | Yes | No | No | No |
Author | Yes | Yes | Yes | No | No | No |
Format | Yes | Yes | Yes | No | No | No |
Language | Yes | Yes | Yes | No | No | No |
Publication Date | Yes | Yes | Yes | No | No | No |
Genre | Yes | Yes | Yes | No | No | No |
1.3 Field definitions for Work-Bib-MARC documents
Field | Data fields for search (MV- indicates multi-valued) | Data fields for short display | Data fields for detailed display | Data fields for Facet |
---|---|---|---|---|
ISSN | 022 - a,z (MV) | first value | all values | same as search field |
ISBN | 020 - a,z (MV) | first value | all values | same as search field |
Author/Creator | For each 100, 110: every subfield except $6 (gives us 2 values for every tag). Also every subfield except $t for 111, 700, 710, 711, 800, 810, 811, 400, 410, 411 (MV) | first non-empty value of 100$a or 110$a etc | all values | same as short display value |
Title | 245 - all subfields except c and 6. Also 130, 240, 246, 247, 440, 490, 730, 740, 773, 774, 780, 785, 830, 840 (MV) | 245$a and 245$b | all values | |
Place of Publication | 260 - a (MV) | first value | all values | same as search field |
Description | 505 - a (MV) | first value | all values | same as search field |
Subject | 600, 610, 611, 630, 650, 651, 653, 69X: every subf exc. $6 across these tags (MV) | first non-empty value of 600$a, 610$a etc | all values | same as short display value |
Date of Publication | <marc:controlfield tag="008">[Date 1 in the 7-10 positions]</marc:controlfield> LR: Can also include 260 $c. (260 $c is the same as the value in the control field. Use it if the control field does not have a pub date value.) (MV) | first value | all values | same as search field |
Edition | 250 - a,b (MV) | first value | all values | same as search field |
Form/Genre | 655 - a, v (MV) | first value | all values | same as search field |
Language | <marc:controlfield tag="008">[language code in the 35-37 positions]</marc:controlfield> LR: Add 546 $a (MV) | all values | all values | same as search field |
Format | 856 - q | first value | all values | same as search field |
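The two rows above that read fixed positions of the 008 control field (Date of Publication and Language) can be sketched as below. This is a minimal illustration, not OLE's indexing code: the function names are hypothetical, and two details are assumptions not stated in the table, namely that a Date 1 value that is not four digits counts as "no pub date value", and that the fallback takes the first four-digit year found in 260 $c.

```python
import re


def publication_date(field_008, subfields_260c=()):
    """Return Date 1 from 008/07-10, else the first year found in 260 $c.

    Assumption: a Date 1 that is not four digits (blank, 'uuuu', '19uu')
    is treated as missing, which triggers the 260 $c fallback.
    """
    date1 = field_008[7:11]
    if re.fullmatch(r"\d{4}", date1):
        return date1
    for value in subfields_260c:
        match = re.search(r"\d{4}", value)
        if match:
            return match.group(0)
    return None


def language_code(field_008):
    """Return the language code from 008/35-37, or None if it is blank."""
    code = field_008[35:38]
    return code if code.strip() else None


# Hypothetical 40-character 008 values built position by position.
with_date = "850423" + "s" + "1940" + "    " + "nyu" + " " * 17 + "eng" + " d"
without_date = "850423" + "n" + "    " + "    " + "nyu" + " " * 17 + "   " + " d"

print(publication_date(with_date), language_code(with_date))
print(publication_date(without_date, ["c1987."]), language_code(without_date))
```

The 546 $a language note that LR asks to add is free text, so it would be indexed as an additional value alongside the 008 code rather than parsed by position.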
1.4 Format field definitions for Work-Bib-MARC documents
Label | Marc Fields | Comments |
---|---|---|
Manuscript | Has any holdings with "manuscripts" in location_name (gets only this value) | LR: MARC XML does not have location_name, so this is irrelevant to the IU data that OLE has for November. Manuscript could be determined by the Leader 06/07. 06 values a, f, t equal manuscripts on their own. 07 values c and d seem to imply manuscript/archival collections/series. We should check with the SMEs on this one. |
Microformat | Has 245 $h containing "micro" OR has any holdings with "micro" in location_name OR call_number starts "micro" (gets only this value) | LR: the 245 $h "micro" will work for the IU OLE MARCXML we have, but the remaining text is specific to UPenn. |
Archive | Has any holdings with "archive" in location_name (gets only this value) | LR: This is specific to UPenn. We may need to talk to IU about whether they include Archive descriptions in their MARC records and how they designate them as such. |
Thesis/Dissertation | bib_format is 'tm' AND has a 502 field | LR: UPenn's bib_format seems to be a combination of the data values found in the 06/07 Leader fields. For example, t in the 06 is Manuscript and m in the 07 is Monograph/Item, and together they equal a Thesis/Dissertation. |
Conference/Event | Has a 111 or 711 field [LR: Include 611 or 811] | |
Book | bib_format is 'aa', 'am' or 'ac' or 'tm'; exclude $h [micro*] and $k [kit] | LR: the 2 characters are from the Leader 06/07 the inclusions are 245 subfields |
Sound recording | bib_format is 'im' or 'jm' or 'jc' or 'jd' or 'js' | LR: the 2 characters are from the Leader 06/07 |
Musical score | bib_format is cm, dm, ca, cb, cd or cs | LR: the 2 characters are from the Leader 06/07 |
Map/Atlas | bib_format is 'e*' or 'fm' | LR: the 2 characters are from the Leader 06/07 |
Video | bib_format is 'gm' AND 007/0 = v | LR: the 2 characters are from the Leader 06/07 |
Projected graphic | bib_format is 'gm' AND 007/0 = g | LR: 007 is a controlled field that indicates the format/physical description at general level and then associated subfields are more specific. |
Journal/Periodical | bib_format is 'as' or 'gs' | LR: 007 is a controlled field that indicates the format/physical description at general level and then associated subfields are more specific. |
Image | bib_format is 'km' | LR: the 2 characters are from the Leader 06/07 |
Datafile | bib_format is 'mm' | LR: the 2 characters are from the Leader 06/07 |
Newspaper | bib_format is 'as' AND (008/21 = 'n' OR 008/22 = 'e') | LR: the 2 characters are from the Leader 06/07. The 008 controlled field in those 2 positions provides the "form". |
3D object | bib_format is 'r*' | LR: the single character maps to the 06 position in the leader. |
Database/Website | bib_format is '*i' | LR: the single character maps to the 06 position in the leader. |
Government document | bib_format is NOT c*, d*, i*, j* AND (008/28 = f, i, o AND 260$b not 'press') | LR: the single character maps to the 06 position in the leader. 008 is a fixed length controlled field and 260 $b is a type of publication. |
Other | any bib_format not caught above | LR: Presumably relates to other 06/07 Leader data values not represented. |
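As a rough illustration of how the MARC-only rules in this table could be evaluated, the sketch below derives bib_format from Leader positions 06/07 and returns the matching labels. It is not the UPenn or OLE implementation. The function and parameter names are hypothetical, and the sketch makes several assumptions: labels may co-occur unless the table says "gets only this value"; the holdings-based rules (Manuscript, Archive), Conference/Event, and Government document are left out; and `*` in a pattern matches any single leader character.

```python
import fnmatch

# (label, bib_format patterns) for rules that depend only on Leader 06/07.
LEADER_RULES = [
    ("Book", ["aa", "am", "ac", "tm"]),
    ("Sound recording", ["im", "jm", "jc", "jd", "js"]),
    ("Musical score", ["cm", "dm", "ca", "cb", "cd", "cs"]),
    ("Map/Atlas", ["e*", "fm"]),
    ("Journal/Periodical", ["as", "gs"]),
    ("Image", ["km"]),
    ("Datafile", ["mm"]),
    ("3D object", ["r*"]),
    ("Database/Website", ["*i"]),
]


def classify_formats(leader, h245="", k245="", has_502=False,
                     field_007="", field_008=""):
    """Return the format labels for one bib record (subset of the table)."""
    fmt = leader[6:8]  # bib_format: Leader 06 (type) + Leader 07 (level)

    # "gets only this value": a micro* 245 $h short-circuits everything else.
    if "micro" in h245.lower():
        return ["Microformat"]

    labels = []
    if fmt == "tm" and has_502:
        labels.append("Thesis/Dissertation")
    for label, patterns in LEADER_RULES:
        if any(fnmatch.fnmatchcase(fmt, p) for p in patterns):
            if label == "Book" and "kit" in k245.lower():
                continue  # Book excludes 245 $k [kit]
            labels.append(label)
    if fmt == "gm" and field_007[:1] == "v":
        labels.append("Video")
    if fmt == "gm" and field_007[:1] == "g":
        labels.append("Projected graphic")
    if fmt == "as" and (field_008[21:22] == "n" or field_008[22:23] == "e"):
        labels.append("Newspaper")
    return labels or ["Other"]


# Hypothetical leaders: only positions 06/07 matter here.
print(classify_formats("00000nam a2200000 a 4500"))
print(classify_formats("00000ntm a2200000 a 4500", has_502=True))
print(classify_formats("00000nas a2200000 a 4500",
                       field_008=" " * 21 + "n" + " " * 18))
```

Whether a record should carry several labels or only the first match is a question for the SMEs; switching to first-match-wins would only require returning at the first appended label.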
1.5 Field definitions for Work-Bib-DublinCore documents
Field | DC-UnQ fields for Search | DC-Q fields for Search | Data fields for short display | Data fields for detailed display | Data fields for Facet |
---|---|---|---|---|---|
Author | <dc:creator> (MV) | <dcvalue element="contributor" qualifier="author"> | first value | all values | same as search field |
Description | <dc:description> (MV) | Per Bob P.: Do not show Abstract description. | first value | all values | same as search field |
Language | <dc:language> (MV) | <dcvalue element="language" qualifier="iso">en_US</dcvalue> | first value | all values | same as search field |
Subject | <dc:subject> (MV) | <dcvalue element="subject" qualifier="none"> | first value | all values | same as search field |
Title | <dc:title> | <dcvalue element="title" qualifier="none"> | first value | all values | same as search field |
Type | <dc:type> (MV) | <dcvalue element="type" qualifier="none"> | first value | all values | same as search field |
Date of Publication | <dc:date> | <dcvalue element="date" qualifier="issued"> | first value | all values | same as search field |
Format | <dc:format> (MV) | <dcvalue element="type" (This is covered in a separate field. So do not include it in Format) | first value | all values | same as search field |
Publisher | <dc:publisher> (MV) | <dcvalue element="publisher" | first value | all values | same as search field |
ISBN/ISSN/other | <dc:identifier>(ISSN)0198-9669</dc:identifier> (MV) | <dcvalue element="identifier" qualifier="isbn">0-918006-48-1</dcvalue> | first value | all values | same as search field |
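The unqualified Dublin Core column of this table can be read as a simple element-to-field mapping. The sketch below shows one way to apply it with Python's standard XML parser. It is illustrative only: the function name, the output dictionary shape, and the `<record>` wrapper element in the sample are assumptions, and only the `dc:` element names come from the table.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Index field -> unqualified DC element, following the table above.
FIELD_TO_ELEMENT = {
    "Author": "creator",
    "Description": "description",
    "Language": "language",
    "Subject": "subject",
    "Title": "title",
    "Type": "type",
    "Date of Publication": "date",
    "Format": "format",
    "Publisher": "publisher",
    "ISBN/ISSN/other": "identifier",
}


def index_unqualified_dc(xml_text):
    """Collect search, short-display, and detailed-display values per field."""
    root = ET.fromstring(xml_text)
    fields = {}
    for field, element in FIELD_TO_ELEMENT.items():
        values = [
            node.text.strip()
            for node in root.iter(f"{{{DC_NS}}}{element}")
            if node.text and node.text.strip()
        ]
        if values:
            fields[field] = {
                "search": values,             # all values are searchable
                "short_display": values[0],   # "first value"
                "detailed_display": values,   # "all values"
            }
    return fields


# Hypothetical record; the wrapper element name is an assumption.
sample = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>A Sample Title</dc:title>"
    "<dc:creator>Doe, Jane</dc:creator>"
    "<dc:creator>Roe, Richard</dc:creator>"
    "<dc:identifier>(ISSN)0198-9669</dc:identifier>"
    "</record>"
)
indexed = index_unqualified_dc(sample)
print(indexed["Author"]["short_display"])
print(indexed["ISBN/ISSN/other"]["search"])
```

The qualified form would need a second mapping keyed on the `element` and `qualifier` attributes of `<dcvalue>`, which is left out here because some qualifier values in the table are incomplete.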
2. Search and Display
3. NISO Standard for Sort
Transactional Search
OLE coding to date for Acquisitions functions has used KNS Lookups, DocSearch (Detailed Search, Superuser Search), and named or session-based searches...
<insert more info on framework>
Notes/Not yet implemented:
- Authority records: linkages, search, NACO standards
- Call Number Browse (coming in OLE 0.8)
- Linked PO or Circ record from Item, and Order/Circ status (coming in OLE 0.8)
- Search filters: Location, Format, TBA
- External Linked Data: Authority, or other stores
- Saved DocStore Searches (or user preferences)
- Checkin, Checkout from Search
- Rice/KNS upgrades (future): search facets and other enhancements for transactional search
- Non-Roman Characters (e.g., Chinese, Russian, etc.)