Indexing Strategy

This page documents the output of https://openlibraryfoundation.atlassian.net/browse/DCB-53

Overall

Our ideal target is a system where member systems push bib and item updates into the hub in a reactive manner.

The current situation, however, suggests that short-loop polling is the only viable option, particularly for Polaris and Sierra. Push-based updates could be developed for FOLIO, but the present APIs support short-loop polling, which can be implemented without introducing development dependencies.

There is a question as to the relative benefits of harvesting bibs and items in a unified operation versus more granular, separate bib and item harvest operations. These decisions will need to be taken on a per-source basis, but internally ReShare-DCB should be capable of accepting atomic updates at the level of items or bibs.
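As a rough illustration of what accepting atomic updates could look like, the sketch below models bib and item updates as independent records keyed by source system and source record id, so that either kind can arrive on its own. All class, field, and method names are hypothetical and are not taken from the ReShare-DCB codebase; the in-memory store is only a stand-in for real hub storage.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Tuple


@dataclass(frozen=True)
class BibUpdate:
    source: str      # hypothetical source system code
    bib_id: str      # record id as known to the source system
    updated: datetime
    title: str


@dataclass(frozen=True)
class ItemUpdate:
    source: str
    item_id: str
    bib_id: str      # parent bib in the same source system
    updated: datetime
    status: str


@dataclass
class HubStore:
    """In-memory stand-in for hub storage (an assumption for this sketch)."""
    bibs: Dict[Tuple[str, str], BibUpdate] = field(default_factory=dict)
    items: Dict[Tuple[str, str], ItemUpdate] = field(default_factory=dict)

    def upsert_bib(self, update: BibUpdate) -> bool:
        return self._upsert(self.bibs, (update.source, update.bib_id), update)

    def upsert_item(self, update: ItemUpdate) -> bool:
        return self._upsert(self.items, (update.source, update.item_id), update)

    @staticmethod
    def _upsert(table, key, update) -> bool:
        # Ignore stale or duplicate deliveries so overlapping polls are harmless.
        current = table.get(key)
        if current is not None and current.updated >= update.updated:
            return False
        table[key] = update
        return True
```

Because each upsert is idempotent and keyed per record, the same store can be fed by a unified bibs-plus-items harvest or by separate harvests without changing the hub side.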

FOLIO

PHASE#1 : Short term

Discussions (TA, II, SD) suggest that the mod-search API is the most likely candidate for short-term harvesting of all inventory items. OAI suffers from performance issues and record visibility issues (relating to SRS). Mod-search reflects inventory changes in near real time, supports Instance and Item collections, and supports metadata.createdDate and metadata.updatedDate. An initial inspection suggests updatedDate is always populated, making it a good candidate for a harvesting cursor.
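The idea of using updatedDate as a harvesting cursor can be sketched as follows. This is a minimal sketch under stated assumptions: fetch_page is a hypothetical callable standing in for whatever client ends up calling mod-search, and its signature, the offset paging, and the ascending sort order are assumptions rather than the documented API. Records are assumed to be dicts carrying metadata.updatedDate as an ISO-8601 string with a numeric UTC offset.

```python
from datetime import datetime, timezone
from typing import Callable, List, Tuple

# Hypothetical client call: fetch_page(since, offset, limit) returns up to
# `limit` records with updatedDate >= since, sorted ascending by updatedDate.
FetchPage = Callable[[datetime, int, int], List[dict]]


def updated_date(record: dict) -> datetime:
    # Assumes an ISO-8601 timestamp such as "2023-01-02T10:00:00.000+00:00".
    return datetime.fromisoformat(record["metadata"]["updatedDate"])


def harvest_since(
    fetch_page: FetchPage, cursor: datetime, page_size: int = 100
) -> Tuple[List[dict], datetime]:
    """Collect every record changed at or after `cursor`; return them with the new cursor."""
    records: List[dict] = []
    new_cursor = cursor
    offset = 0
    while True:
        page = fetch_page(cursor, offset, page_size)
        records.extend(page)
        for record in page:
            # Advance the cursor to the latest timestamp actually seen.
            new_cursor = max(new_cursor, updated_date(record))
        if len(page) < page_size:
            break
        offset += page_size
    return records, new_cursor
```

The lower bound is inclusive on purpose: records sharing the cursor timestamp are fetched again on the next poll rather than risk being skipped, which relies on the hub treating repeated deliveries as harmless.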

FOLIO has no granular way to prevent expansion of items when harvesting instances, but a change to an item may not update the date stamp on its parent instance, so an instance harvest driven by updatedDate could miss item changes. It will therefore be necessary to harvest both.

Exploratory work should be undertaken to discover whether item data should be ignored in a bib harvest or eagerly consumed.

PHASE#2 : Target

It would be ideal if, as part of the consortial FOLIO work, the same functions which keep the central FOLIO index in sync could be used to keep the shared index in sync, effectively calling the bib and item update endpoints. Discussions between TA/II/VB should explore this in more detail ASAP.

Sierra

Sierra offers the /bibs and /items API endpoints. A bib can have its items expanded or not through a granular API parameter, and both endpoints support an updatedDate parameter for efficient paging/resumption. These endpoints, with their updatedDate parameter, are the selected strategy for Sierra record sync.
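One way to use updatedDate for resumption is to walk forward in fixed time windows and remember the end of the last completed window. The sketch below only builds query parameters and makes no HTTP calls. The bracketed range syntax for updatedDate and the limit and offset parameter names are assumptions to verify against the API documentation for the Sierra version in use; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterator, Tuple


def windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Split the span from start to end into consecutive windows no longer than step."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def sierra_params(
    lo: datetime, hi: datetime, limit: int = 2000, offset: int = 0
) -> Dict[str, str]:
    """Build query parameters for one window (parameter shapes are assumed)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    # Timezone-aware datetimes are expected; both bounds are rendered in UTC.
    lo_utc = lo.astimezone(timezone.utc).strftime(fmt)
    hi_utc = hi.astimezone(timezone.utc).strftime(fmt)
    return {
        "updatedDate": f"[{lo_utc},{hi_utc}]",
        "limit": str(limit),
        "offset": str(offset),
    }
```

If the range turns out to be inclusive at both ends, adjacent windows share their boundary instant, so a record updated exactly on a boundary may be returned twice. That is acceptable as long as the hub applies updates idempotently.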

Polaris

The Polaris API (see "Polaris API - Overview") is less accessible than the Sierra API, but it does offer an explicit sync API for third-party discovery systems, documented under "Synch Methods for 3rd Party Discovery Interfaces". The Polaris Synch_GetUpdatedBibs and Synch_GetUpdatedItems methods are expected to be analogous to our bib and item feeds in FOLIO and Sierra, and are the currently chosen strategy for extracting records from Polaris.
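Since all three sources end up exposing an updated-bibs feed and an updated-items feed, the hub side could hide the differences behind one small interface. The sketch below is illustrative only: SourceAdapter, InMemoryAdapter, and poll_once are hypothetical names, no real Polaris, Sierra, or FOLIO calls are modelled, and records are assumed to be dicts with an "updated" datetime.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Protocol


class SourceAdapter(Protocol):
    """Hypothetical hub-side interface, one implementation per member system type."""

    def updated_bibs(self, since: datetime) -> Iterable[dict]: ...

    def updated_items(self, since: datetime) -> Iterable[dict]: ...


class InMemoryAdapter:
    """Stand-in adapter backed by lists, used in place of a real source client."""

    def __init__(self, bibs: List[dict], items: List[dict]) -> None:
        self._bibs = bibs
        self._items = items

    def updated_bibs(self, since: datetime) -> Iterable[dict]:
        return [b for b in self._bibs if b["updated"] >= since]

    def updated_items(self, since: datetime) -> Iterable[dict]:
        return [i for i in self._items if i["updated"] >= since]


def poll_once(adapter: SourceAdapter, since: datetime) -> Dict[str, List[dict]]:
    # Bibs and items are read as two independent feeds from the same cursor.
    return {
        "bibs": list(adapter.updated_bibs(since)),
        "items": list(adapter.updated_items(since)),
    }
```

A Polaris adapter would wrap the two Synch methods behind these calls, while Sierra and FOLIO adapters would wrap their own endpoints, leaving the polling loop unchanged.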

 

Operated as a Community Resource by the Open Library Foundation