Introduction
Testing of v.2 includes ingest and clustering in the staging environment.
Ingest tests
Counts
Testing of ingest includes count comparisons between suppressed instances stored in the host LMS and instances for the host LMS stored in the DCB. The three FOLIO tenants are Truman University, the University of Missouri, and Calvary University.
This folder contains the record comparisons for all three FOLIO tenants, but start with the DCB v.2 testing – staging doc for context and for the SQL used to generate the counts.
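As a rough illustration of the record comparison, the sketch below finds instance UUIDs that appear on only one side. It is not the SQL used for the counts; the function name `compare_ids` and the sample IDs are hypothetical, and it assumes both ID lists fit in memory.

```python
def compare_ids(folio_ids, dcb_ids):
    """Return (only_in_dcb, only_in_folio) as sorted lists of UUID strings.

    Both arguments are iterables of instance UUIDs, e.g. exported from
    the FOLIO and DCB count queries (the export step is assumed, not shown).
    """
    folio = set(folio_ids)
    dcb = set(dcb_ids)
    only_in_dcb = sorted(dcb - folio)    # reported in DCB, missing from FOLIO
    only_in_folio = sorted(folio - dcb)  # reported in FOLIO, missing from DCB
    return only_in_dcb, only_in_folio


# Placeholder IDs for illustration only, not real instance UUIDs.
only_in_dcb, only_in_folio = compare_ids(["a", "b"], ["b", "c", "d"])
print(only_in_dcb)    # ['c', 'd']
print(only_in_folio)  # ['a']
```

Records that land in `only_in_dcb` are the ones worth checking by hand in the Inventory application.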
CRUD
Testing of ingest also covered whether creation, update, and deletion actions, as well as suppression and unsuppression actions, in FOLIO resulted in the expected behavior in DCB. You can find the results here.
Analysis of counts
I reviewed Truman University instances that were found in DCB but not in FOLIO, and vice versa, and made the following observations:
Suppressed FOLIO instances are making their way into the union catalog. Examples:
DCB instance UUID | FOLIO instance screenshot |
---|---|
8458ee57-bc66-58ee-9d65-6a131e285e1d | |
0086d48c-e441-5934-a433-5a55ea044505 | |
For Truman University, the reports show 1.089% more records in DCB than in FOLIO. That said, all 11 of the records checked were located in Truman’s FOLIO using the Inventory application. It’s unclear why the SQL reports these records as being in DCB but not in FOLIO.
Bib UUID | In FOLIO? | Suppressed? | Shared? |
---|---|---|---|
00057725-4edf-5424-b51a-b39298c8c755 | Y | N | Y |
0007847a-3f9f-5b9d-a1ef-b4e6e0ecdeb8 | Y | N | Y |
0086d48c-e441-5934-a433-5a55ea044505 | Y | Y | Y |
00945e6a-868e-555d-846b-e33ff8921da8 | Y | N | Y |
00aa1e28-0984-518b-9b2b-67d63b04c77a | Y | N | Y |
01f5bff0-ae8f-5277-9301-9498561daf7b | Y | N | Y |
0324db74-e93a-5190-b0b7-cbac8d44901d | Y | N | Y |
0a0433f9-e53b-5e4b-a494-906fd5322c4a | Y | N | Y |
15e253e6-ca43-4d1a-be26-868241407714 | Y | N | Y |
8458ee57-bc66-58ee-9d65-6a131e285e1d | Y | Y | Y |
9f515f8e-421d-5640-9e4d-85818d68e6f6 | Y | N | Y |
I suspect the query that shows more records in DCB than in FOLIO is wrong. FOLIO’s Inventory app reports 635,589 unsuppressed instances, versus 630,731 via the FOLIO query and 637,600 via the DCB query.
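To make the Truman comparison easier to re-check, here is a minimal sketch of the arithmetic behind the percentage. The counts are the ones quoted above; the helper name `pct_more` and the dictionary keys are illustrative only.

```python
def pct_more(dcb_count, folio_count):
    """Percentage by which the DCB count exceeds the FOLIO count."""
    return (dcb_count - folio_count) / folio_count * 100


# Truman University counts as quoted in the text above.
truman = {
    "inventory_app": 635_589,  # unsuppressed instances per the Inventory app
    "folio_query": 630_731,    # unsuppressed instances per the FOLIO query
    "dcb_query": 637_600,      # instances per the DCB query
}

# DCB query versus FOLIO query: the figure used in the report.
print(round(pct_more(truman["dcb_query"], truman["folio_query"]), 3))  # 1.089

# Absolute gaps against the Inventory app, for comparison.
print(truman["dcb_query"] - truman["inventory_app"])    # 2011
print(truman["inventory_app"] - truman["folio_query"])  # 4858
```

The FOLIO query comes in 4,858 below the Inventory app, which is larger than the gap between DCB and the Inventory app, so part of the reported excess may come from the FOLIO query itself.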
For the University of Missouri, DCB reports 4.38% more records than FOLIO. I reviewed 10 records and here is what I found:
Bib UUID | In FOLIO? | Suppressed? |
---|---|---|
00006cb6-e2ef-5b61-9a3a-bfc17da6e27c | Y | Y |
0000749a-0131-4d26-a10f-1c16c1fd70b9 | Y | Y |
001c1a4b-cf1f-5571-9ceb-c9c5fde33832 | Y | Y |
00801e4e-69c3-5aab-9daf-7ed63c78039e | Y | Y |
01b64b59-3c06-507f-b5a9-9dc1a5518ea1 | Y | Y |
020b7914-84c6-5f01-bbd4-68a4bdea9cf5 | Y | Y |
05c3d1ba-2acf-59f5-8d2c-0a43af4b60a3 | Y | Y |
0be07767-233f-5a97-bbb5-d28e5185a153 | Y | Y |
18f4b9d6-104e-52c6-9f81-cc3d870f2950 | Y | Y |
2012e7fc-e4a8-5277-b846-60f13c5e65ca | Y | Y |
Again, the numbers do not add up. The Inventory application reports 4,585,040 unsuppressed instances, while the FOLIO query returns 4,584,880, a difference of 160 instances. The DCB query returns 4,785,864. If we assume that suppressed FOLIO instances are being added to DCB and subtract the 260,429 suppressed FOLIO instances, we get 4,525,435, which is about 60,000 instances short of the FOLIO counts.
For Calvary University the discrepancy is significantly smaller (+0.09%). I reviewed 7 records, all of which are in FOLIO and all of which are suppressed.
Bib UUID | In FOLIO? | Suppressed? | Shared? |
---|---|---|---|
18d22573-7b32-5362-9588-cdddc514e9cf | Y | Y | Y |
1e2a83ba-fae8-5510-8232-5d720fffa053 | Y | Y | Y |
37302bca-678e-53fe-8fa8-6669e276a39c | Y | Y | Y |
47590cfc-be64-5550-b263-4b8c06cf55cb | Y | Y | Y |
48709e16-042d-5f2a-ad06-050dd2493ec9 | Y | Y | Y |
4d74b545-6a69-522e-b527-009c10a18c1a | Y | Y | Y |
e0e63628-abe4-50a5-9957-310b7d37ac4e | Y | Y | Y |
The FOLIO query returns 29,434 unsuppressed instances. The Inventory app reports 29,423 unsuppressed instances and 35 suppressed instances. The DCB query returns 29,461 instances. Subtracting the 35 suppressed FOLIO instances gives 29,426, very close to the 29,423 unsuppressed instances that the Inventory app reports.
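The "suppressed instances are leaking into DCB" hypothesis can be checked the same way for each tenant where a suppressed count is available. The sketch below uses only the figures quoted above for the University of Missouri and Calvary University; the function name `residual_after_suppressed` is illustrative.

```python
def residual_after_suppressed(dcb_count, suppressed, inventory_unsuppressed):
    """DCB count minus suppressed FOLIO instances, compared with the
    unsuppressed count from the Inventory app.

    A result near zero is consistent with suppressed instances accounting
    for the excess in DCB; a large negative result means DCB is still short.
    """
    return (dcb_count - suppressed) - inventory_unsuppressed


# University of Missouri: 4,785,864 in DCB, 260,429 suppressed, 4,585,040 unsuppressed.
missouri = residual_after_suppressed(4_785_864, 260_429, 4_585_040)
print(missouri)  # -59605

# Calvary University: 29,461 in DCB, 35 suppressed, 29,423 unsuppressed.
calvary = residual_after_suppressed(29_461, 35, 29_423)
print(calvary)  # 3
```

Calvary lands within 3 instances of the Inventory app, while Missouri is about 60,000 short, so suppression alone does not appear to explain the Missouri numbers.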
Carole Godfrey shared an OAI document that may be helpful in providing insight into the discrepancies. The documentation did not reveal any insights for me but it would be good to have a second pair of eyes take a gander.
Cluster tests
This Jira filter is a temporary substitute for a list of clustering use cases. These are the reports from customers in production. The next step is to find the corresponding records in staging and report the results. Not all reported problems are expected to be found in staging, since not all reports involve libraries that are part of the staging environment.
Cluster instance records
Production instance (v1) | Staging instance (v2) | Expected behavior | Jira report |
---|---|---|---|
When we meet again / Kristin Harmel | When we meet again / Caroline Beecham When we meet again / Kristin Harmel | Separating Beecham from Harmel instances is the expected behavior. | |
The Theology of St. Cyril of Alexandria: A Critical Appreciation Seeking the Truth of Change in the Church : Reception, Communion and the Ordination of Women (Calvary) | Expected behavior, separation based on differences in ISBN, gold rush key, and absence of other identifiers. | ||
Eyewitness Horse (CALS) | Expected behavior. Note that there are many different works (9?) clustered in production, so while I could keep peeling back the layers, what is most important is that this change will have a big positive impact on discovery, requesting, and RTAC performance. | ||
fdsaljfdjk; | Expected behavior but the algorithm should change to accommodate differences in edition. The screenshot to the left of this column shows a couple of identifiers that could be used to differentiate by edition. This is no silver bullet but it’ll be a big step in the right direction. |