ARCHITECTURE
DOCSTORE ARCHITECTURE
Please read the OLE DocStore wiki for a more detailed description of the architecture design and of how data is modeled and organized in DocStore. Currently, DocStore hosts the bib, instance, and license/agreement data.
COMMENTS ON DOCSTORE ARCHITECTURE
JOHN PILLANS thought we should not use JackRabbit for storing Bib and Instance data, since we cannot fully use the features that JackRabbit provides and JCR performs poorly here. He suggested that we put Bib and Instance data into a database (in a blob field) instead of DocStore. The architecture would be much simpler, with much faster performance.
WE WOULD LIKE John to provide more detailed information about the IU library system architecture and the performance evaluation of the database and Solr.
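To make the suggestion concrete, here is a minimal sketch of what storing Bib and Instance records in a database blob field could look like. It is only an illustration under assumptions: the table name `record`, its columns, and the helper functions are hypothetical, and SQLite (from the Python standard library) stands in for whatever database would actually be used.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) record store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record ("
        " id TEXT PRIMARY KEY,"        # record identifier
        " doc_type TEXT NOT NULL,"     # e.g. 'bib' or 'instance'
        " content BLOB NOT NULL)"      # raw record content kept as a blob
    )
    return conn


def put_record(conn, record_id, doc_type, content):
    """Insert a record, or overwrite the one stored under the same id."""
    conn.execute(
        "INSERT OR REPLACE INTO record (id, doc_type, content) VALUES (?, ?, ?)",
        (record_id, doc_type, content),
    )
    conn.commit()


def get_record(conn, record_id):
    """Return the stored bytes for record_id, or None if it is not present."""
    row = conn.execute(
        "SELECT content FROM record WHERE id = ?", (record_id,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

In this sketch the database only stores and returns the record content by identifier; searching would still be left to Solr.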
COMMENTS ON UUID FOR BIB AND INSTANCE
JOHN PILLANS: The library would like to keep a 16-digit identifier for new Bib and Instance records, not just a UUID. So newly ingested records should have this 16-digit identifier generated automatically.
More comments on the identifier
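As an illustration of automatic generation, the sketch below produces 16-digit identifiers from a zero-padded sequential counter. The comment above does not say how the identifier should be derived, so the sequential scheme and the names `ID_WIDTH` and `make_id_generator` are assumptions. A real implementation would more likely draw the next value from a database sequence so that identifiers stay unique across restarts.

```python
import itertools

ID_WIDTH = 16  # number of digits requested for the identifier


def make_id_generator(start=1):
    """Return a function that yields the next 16-digit identifier on each call.

    Assumption: identifiers are sequential and zero-padded. The in-memory
    counter is only for illustration and is not persistent.
    """
    counter = itertools.count(start)

    def next_id():
        text = str(next(counter)).zfill(ID_WIDTH)
        if len(text) > ID_WIDTH:
            raise OverflowError("identifier space of 16 digits is exhausted")
        return text

    return next_id
```

For example, the first two calls on a fresh generator return `0000000000000001` and `0000000000000002`.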
PERFORMANCE ISSUES
SLOW INGEST PERFORMANCE ON INSTANCE
...
CURRENT INGEST PERFORMANCE & REQUIREMENT
INGEST PERFORMANCE MINIMUM REQUIREMENT
From John Pillans: Ingesting about 20 million legacy records (including bib, instance..) needs to finish in one week!
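To put this requirement in terms of sustained throughput, the short calculation below converts it to records per second. It assumes the ingest runs around the clock for the full seven days; the function name `required_rate` is only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(records, days):
    """Average records per second needed to ingest `records` within `days`."""
    return records / (days * SECONDS_PER_DAY)


rate = required_rate(20_000_000, 7)
print(f"{rate:.1f} records/second")  # prints "33.1 records/second"
```

Any downtime or restart during the week raises the rate that must be sustained for the rest of it.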
CURRENT PERFORMANCE FOR INGESTING BIB DATA
...
Ingesting 10 million instance records may take 42 days of processing time!
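The 42-day estimate can be restated per batch of 1000 records and per week. This is a back-of-the-envelope sketch that assumes the ingest rate stays constant over the whole run; the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_batch(total_records, total_days, batch_size=1000):
    """Average seconds spent on one batch, given the total run time."""
    return total_days * SECONDS_PER_DAY * batch_size / total_records


def records_in_window(total_records, total_days, window_days):
    """Records ingested in `window_days` at the same constant rate."""
    return total_records * window_days / total_days


print(seconds_per_batch(10_000_000, 42))            # about 362.88 seconds per 1000 records
print(round(records_in_window(10_000_000, 42, 7)))  # 1666667 records in one week
```

At this rate, roughly six minutes go into every 1000 instance records, and one week covers only about 1.7 million of them.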
Detailed time breakdown for ingesting 1000 records:
...
For a more detailed time breakdown, please read the spreadsheet.