Logical Data Model (Analysis - To inform Design during construction and provide common terminology)

This model is ONE expression of the domain concepts uncovered in the MOBIUS discovery phase. It may suggest an approach to take in construction, but it is not intended to be a speculative design. In particular, functional and non-functional requirements can cause the logical design model to differ from the relations presented above. This should be read as “one person's understanding of the domain”; it is not “the truth”. It is, however, useful to discuss, challenge and correct assumptions IN THIS PAGE, so that we develop a shared consensus.

NOTE: Missing item >- location relationship - coming in next iteration of model.

Local Server is terminology borrowed from the existing systems and is largely synonymous with the ReShare concept of “HostLMS” - a “Local Server” is a physical installation of a library system which itself can host many logical library instances - referred to as “Agencies” (below).

Agency is terminology borrowed primarily from NCIP and in this case refers to a specific institution - i.e. the legal entity under which a library participates in a network. This could be an academic institution, or a local authority. In less precise language, Agency is similar to “Library” and “Institution” but is preferred because it is unlikely to be confused with “Branch”, “Site”, “Campus” or other physical locations.

API Credentials A common and sensible pattern is to separate system-level accounts from patron accounts. API Credentials model the keys, certificates and user+password pairs which library systems use to enable interoperability. This is a separate concern from patron authentication.

Grant is used to enable a single API Credential to relate to several Agencies. For example - we have an API Key and a Secret for the current kc-towers cluster (Local Server). That Local Server hosts several institutions - e.g. Rockhurst, Some other Library, Some other other library. That single key may be usable to retrieve records from any agency hosted on the local server. In other deployments, perhaps API credentials are needed at the level of the individual agency.
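The relationship between API Credentials, Grants and Agencies can be sketched as below. This is a minimal illustration only, not a proposed design: the class and field names (`key_id`, `credentials_for`) and the agency codes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalServer:
    code: str


@dataclass(frozen=True)
class Agency:
    code: str
    local_server: LocalServer


@dataclass(frozen=True)
class ApiCredential:
    # Hypothetical field: an identifier for the key. The secret itself
    # would be held elsewhere and is deliberately not modelled here.
    key_id: str


@dataclass(frozen=True)
class Grant:
    """Join entity: one credential may be granted to many agencies."""
    credential: ApiCredential
    agency: Agency


def credentials_for(agency, grants):
    """Return every credential usable for the given agency."""
    return [g.credential for g in grants if g.agency == agency]


# One key shared by several agencies hosted on the same Local Server.
kc_towers = LocalServer("kc-towers")
rockhurst = Agency("rockhurst", kc_towers)
other = Agency("some-other-library", kc_towers)
shared_key = ApiCredential("kc-towers-key")
grants = [Grant(shared_key, rockhurst), Grant(shared_key, other)]
```

A deployment that needs credentials per agency would simply hold one Grant per credential, so the same structure covers both cases described above.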

Consortium A consortium is a logical grouping of Agencies bound by a set of agreements and rules. Membership is agreed between both the Agency and the Consortium. Examples of Consortia include “MOBIUS”, “PALCI”, “IPLC”, “PROSPECTOR”.

Consortium Membership Tracks the membership of agencies in consortia. N.B. Agencies can be members of multiple consortia - i.e. a Library can be a member of both MOBIUS and PROSPECTOR. Libraries are added to and removed from consortia, and these relationships are considered dynamic. In this logical model there is no representation of consortium-to-consortium relationships outside of membership. It is conceivable that a consortium could also be considered an agency, which might suggest that Consortium and Agency should be collapsed into a single “Party” entity. This is a decision for logical design.
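One way to picture dynamic membership is as a join record with a validity period. The sketch below assumes that memberships carry `joined` and `left` dates; those fields, the agency code and the dates are hypothetical and are not part of the model described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ConsortiumMembership:
    agency: str
    consortium: str
    joined: date                    # assumed field
    left: Optional[date] = None     # assumed field; None means still a member

    def active_on(self, day: date) -> bool:
        return self.joined <= day and (self.left is None or day < self.left)


def consortia_of(agency, memberships, day):
    """Names of the consortia the agency belongs to on the given day."""
    return sorted(
        m.consortium
        for m in memberships
        if m.agency == agency and m.active_on(day)
    )


# Illustrative data only: one agency in two consortia, one membership ended.
memberships = [
    ConsortiumMembership("example-library", "MOBIUS", date(2020, 1, 1)),
    ConsortiumMembership(
        "example-library", "PROSPECTOR", date(2021, 1, 1), date(2023, 1, 1)
    ),
]
```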

Record Encoding is a reference point for the semantics of an encoded record (“MARC21”, “MARCXML”, etc.), which allows us to describe a record stream in sufficient detail to onboard records.

Exchange Record is the array of bytes representing a single bibliographic record.

Canonical Bib is the normalized / canonicalized form of the exchange record, rendering all incoming records in a single format we can process. The expectation is that key fields will be extracted in order to aid fast clustering after on-boarding the record. It is likely that the canonical bib will also carry materialized versions of any match keys needed for different clustering processes.
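As a rough illustration of “materialized match keys”, the sketch below builds a canonical bib from fields that are assumed to have been extracted from the exchange record already. The `title-author` key name and the normalization rules are invented for the example; real match keys would be defined by the clustering strategy.

```python
import re
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Lower-case, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


@dataclass
class CanonicalBib:
    source_record_id: str
    title: str
    author: str
    # Match keys are computed once at on-boarding and stored with the bib.
    match_keys: dict = field(default_factory=dict)


def canonicalize(source_record_id: str, title: str, author: str) -> CanonicalBib:
    bib = CanonicalBib(source_record_id, title, author)
    # Hypothetical match key: normalized title and author joined by "|".
    bib.match_keys["title-author"] = f"{normalize(title)}|{normalize(author)}"
    return bib


bib = canonicalize("rec-1", "Brain of the Firm.", "Beer, Stafford")
```

Storing the key on the record means clustering can compare short strings rather than re-parsing the exchange record each time.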

Explicit Correlation is the mechanism whereby a consortium staff member or other authorized individual can make an explicit statement that two records are or are not a match, as a way to override or provide advice to the clustering algorithm. Alternate names considered included “Errata” - but since these correlations may be preemptive the chosen name feels more appropriate. In discovery, a number of experiments have been undertaken with machine learning for clustering - this entity is the natural home for training data as it is gathered, but also the home for the MOBIUS expressed need to be able to state explicit matches and non-matches. An explicit correlation is a relationship between two canonical records.
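A minimal sketch of how an explicit correlation might override the clustering algorithm's own verdict for a pair of canonical records. The function name `decide_match` and the record identifiers are hypothetical; how correlations are actually consulted is a design decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplicitCorrelation:
    bib_a: str
    bib_b: str
    is_match: bool  # True: the records match; False: they do not


def decide_match(bib_a, bib_b, algorithm_says, correlations):
    """Prefer a staff-entered correlation over the algorithm's verdict."""
    pair = frozenset((bib_a, bib_b))
    for c in correlations:
        # The relationship is unordered, so compare as a set of two ids.
        if frozenset((c.bib_a, c.bib_b)) == pair:
            return c.is_match
    return algorithm_says


# Staff have stated that bib-1 and bib-2 are NOT the same thing.
correlations = [ExplicitCorrelation("bib-1", "bib-2", False)]
```

The same rows could later be exported as labelled pairs for training, which is the second use described above.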

Cluster Record expresses a canonical bibliographic record which groups together records from member agencies using some criteria. E.G. The cluster record for “Brain of the Firm” collects together all instance records over all members. Different cluster definitions may make different choices about Work or Instance clustering but for the purposes of MOBIUS, the “Large Print” edition of “Brain of the Firm” is a different cluster record to the standard print edition.

Cluster Definition allows for the parallel definition of many clustering strategies. In discovery, the approach of dynamically clustering using views was evaluated and discounted on performance grounds. The model described allows for clusters to be materialized. Although a design concern, it is anticipated that clusters will be logically partitioned (either using table partitioning or, preferably, multiple dynamic tables). Today, MOBIUS uses match keys in a similar way to goldrush as an INITIAL BLOCKING mechanism and then applies post-blocking rules to refine the clustering. This is in line with most modern approaches to record clustering and is the approach we will follow. The Cluster Definition will necessarily be tied to code releases, as clustering is an inherently computational process. Cluster membership must be materialized (and dynamic view-based clustering will not work for us here) because we need to capture the final state of that computation - it’s too complex to do in real time.
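The two-stage approach (initial blocking on a match key, then post-blocking rules) can be sketched as follows. The single rule used here, that the format must agree, is an assumption chosen to mirror the “Large Print” example; it is not the actual MOBIUS rule set, and the record layout is invented.

```python
from collections import defaultdict


def block_by_match_key(bibs):
    """Stage 1: group candidate records that share a match key."""
    blocks = defaultdict(list)
    for bib in bibs:
        blocks[bib["match_key"]].append(bib)
    return blocks


def same_cluster(a, b):
    """Stage 2: hypothetical post-blocking rule, formats must agree."""
    return a["format"] == b["format"]


def cluster(bibs):
    """Return materialized clusters as lists of records."""
    clusters = []
    for block in block_by_match_key(bibs).values():
        refined = []  # clusters found inside this block
        for bib in block:
            for members in refined:
                if same_cluster(members[0], bib):
                    members.append(bib)
                    break
            else:
                refined.append([bib])
        clusters.extend(refined)
    return clusters


bibs = [
    {"id": "a", "match_key": "brain of the firm", "format": "print"},
    {"id": "b", "match_key": "brain of the firm", "format": "print"},
    {"id": "c", "match_key": "brain of the firm", "format": "large print"},
    {"id": "d", "match_key": "another title", "format": "print"},
]
cluster_ids = [[bib["id"] for bib in members] for members in cluster(bibs)]
```

Blocking keeps the expensive pairwise rules confined to small groups, and the returned lists are what would be written out as materialized cluster membership.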

Agency Bib Expresses the presence of a canonical bib record at a given agency. Record sharing and copy cataloging approaches mean that the same bib record can be used to describe items at many agencies. This entity allows that expression.

Agency Bib may also be a convenient place to break out records from a consortial FOLIO which aggregate holdings over many institutions. The consortial bib becomes the canonical record and each library/agency has an Agency Bib which links to that record and provides a mount point for the library holdings - thus re-constituting the original federated structure.

Item The purpose of a bib record ultimately is to describe an item findable by a patron. Each Agency Bib collects together a number of item records. Item records are used to collect together location, materialized circulation status and other data. It is essential for performance reasons that the system is able to uniquely identify item records as they appear in local systems, in order to be able to cache/update circ data in real time.
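To illustrate why a stable item identity matters, the sketch below keys cached circulation status on the pair of local server and local item id. The key shape, the status strings and the class names are assumptions for the example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemKey:
    """Identifies an item as it appears in its local system."""
    local_server: str
    local_item_id: str


class CircStatusCache:
    """In-memory stand-in for materialized circulation status."""

    def __init__(self):
        self._status = {}

    def update(self, key: ItemKey, status: str) -> None:
        self._status[key] = status

    def get(self, key: ItemKey) -> str:
        # "UNKNOWN" is a hypothetical default for items never seen.
        return self._status.get(key, "UNKNOWN")


cache = CircStatusCache()
item = ItemKey("kc-towers", "i1000001")
cache.update(item, "AVAILABLE")
cache.update(item, "ON_HOLD")  # a later event replaces the cached value
```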

Patron represents the borrowers at institutions. No attempt is made to model patron memberships cross library - should we need to do that in the future, this entity will likely become the join table and need to be renamed Agency-Patron. For now we do not need to model this case.

Patron Request A patron places a request for a cluster record - not an identified item. The system will calculate the best item under the cluster record using its routing algorithm. Patron Request represents the act of a patron requesting an item.

Supplier Request An explicit acknowledgement of, and variance from, the existing ReShare-ILL model. We explicitly model the act of asking a supplying library to lend an item. In MOBIUS the first-to-fill ratio suggests that in 90+% of cases, there will be a 1:1 correlation of patron requests to supplier requests. However, when a supplier is unable to supply, multiple supplier requests can arise for a patron request.
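The one-to-many relationship between a Patron Request and its Supplier Requests can be sketched as below. The status values, the `try_next` method and the shape of the candidate list are hypothetical; the routing algorithm itself is out of scope and is represented only by an already-ordered list of candidates.

```python
from dataclasses import dataclass, field


@dataclass
class SupplierRequest:
    supplying_agency: str
    item_id: str
    status: str = "PENDING"  # hypothetical state names


@dataclass
class PatronRequest:
    patron_id: str
    cluster_record_id: str  # the patron asks for a cluster, not an item
    supplier_requests: list = field(default_factory=list)

    def try_next(self, candidates):
        """Ask the next untried supplier.

        candidates: (agency, item_id) pairs, assumed to be already
        ordered by the routing algorithm.
        """
        tried = {(s.supplying_agency, s.item_id) for s in self.supplier_requests}
        for agency, item_id in candidates:
            if (agency, item_id) not in tried:
                request = SupplierRequest(agency, item_id)
                self.supplier_requests.append(request)
                return request
        return None  # every candidate has been asked


req = PatronRequest("patron-1", "cluster-1")
candidates = [("agency-a", "item-1"), ("agency-b", "item-2")]
first = req.try_next(candidates)
first.status = "UNFILLED"          # first supplier cannot supply
second = req.try_next(candidates)  # so a second supplier request arises
```

In the common first-to-fill case only `first` would ever exist, giving the 1:1 correlation noted above.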

Representative Sequence#1

The following sequence represents the simplest requesting case, where a patron of a “Passive” agency is requesting an item from an “Active” agency. The Passive agency here is SIERRA or INN-Reach accessed via their standard API, which necessitates polling; the INN-Reach API would have provided more reactive notifications had that been a viable alternative. The Active agency is FOLIO via its INN-Reach API, OR some other ReShare-DCB module installed at the FOLIO node to create a reactive event stream. The sequence covers early lifecycle events of requesting bib updates, and then the path which leads to circ events in both lending and borrowing systems. Details are necessarily generic as the actual codes and states vary between implementations. What is important is that our internal API reflects the core “HOLD”, “ACCEPT”, “RETURN” type events and maps these canonical internal concepts as cleanly as possible into host LMS implementations.
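A minimal sketch of mapping implementation-specific codes onto the canonical internal events. Only the three event names come from the text above; the host LMS labels and every local code in the table are invented placeholders, since the actual codes and states vary between implementations.

```python
from enum import Enum


class CircEvent(Enum):
    HOLD = "HOLD"
    ACCEPT = "ACCEPT"
    RETURN = "RETURN"


# Hypothetical local codes per host LMS type, for illustration only.
LOCAL_CODES = {
    "example-polling-lms": {
        "h": CircEvent.HOLD,
        "a": CircEvent.ACCEPT,
        "r": CircEvent.RETURN,
    },
    "example-reactive-lms": {
        "hold.created": CircEvent.HOLD,
        "item.accepted": CircEvent.ACCEPT,
        "item.returned": CircEvent.RETURN,
    },
}


def to_canonical(host_lms: str, local_code: str) -> CircEvent:
    """Translate a local code into the canonical internal event."""
    codes = LOCAL_CODES.get(host_lms, {})
    if local_code not in codes:
        raise ValueError(
            f"no canonical mapping for {local_code!r} from {host_lms!r}"
        )
    return codes[local_code]
```

Whether a code arrives by polling a Passive agency or from an Active agency's event stream, the rest of the system would only see `CircEvent` values.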
