DocStore Architecture Evaluation

Requirements

Requirement for OLE DocStore/Discovery system based on IU's current
library system:
Store up to 25-50M records, but more close to 25M (currently IU library
has less than 25M records)
Incremental loading: 10K records
Concurrent reads: 500
Concurrent writes: 20 per minute (or 1 per 1-3 second)

Search Samples: 

This search sample file is that John collects for the testing, which will give an idea for the normal searches in library. 

Testing use case:

Normally we test non-staff user searching on our systems since that accounts for about 80% to 90% of the load.  That is out of scope here since OLE does not provide a non-staff search.  So based on our highest load days, approximately 10,000 records are added are updated on our system.  Over an 8 hour day, that is about .5 per second.  I would round up to 1 create/update per second.  Reads are higher so I would place at 5 per second.  This is from the docstore.  For the Solr index, I would say the searching would be heavier IF some of the display data is stored in the index.  This is a question for the Peri and Nianli.  I would say the Solr index would be hit at the about 10 searches/second at peak rate.  In the middle of all this, a batch load of 10,000 records would not be unusual.

So summarizing

Create/Update Rate on Docstore:  1-2 transactions/second <- this would also update the solr indexes

Read Rate on Docstore: 5 transactions/second

Read Rate on Solr Indexes:  10 transactions/second.  If the display data is coming from Docstore then Docstore read = Solr Read i.e. 10 transactions/second

Batch load of 10,000 records in the midst of testing.

Evaluation Plan

1) Build and install the app
2) Use debuggers and profilers to analyze the running app (locally)
3) Build profiling scripts to push the app to the failure point (locally)
4) Run the profiling scripts against the test server and develop an app profile
5) Check the code in the areas where performance issues are identified
6) Write up overall recommendations around setup, architecture, performance, and code practices
7) If time remains, add aspects in the code to do method profiling and generate a report of method performance under load

Problems found by Unicon

DocStore Shutdown Issues:

These are all memory cleanup issues. If the app is shutdown or
redeployed without shutting down the server then these will cause the
server to eventually run out of memory and crash.
SEVERE: The web application [/oledocstore] registered the JDBC driver
[org.apache.derby.jdbc.AutoloadedDriver] but failed to unregister it
when the web application was stopped. To prevent a memory leak, the
JDBC Driver has been forcibly unregistered.
May 21, 2012 2:54:56 PM org.apache.catalina.loader.WebappClassLoader
clearReferencesThreads
SEVERE: The web application [/oledocstore] appears to have started a
thread named [Timer-3] but has failed to stop it. This is very likely
to create a memory leak.
May 21, 2012 2:54:56 PM org.apache.catalina.loader.WebappClassLoader
clearReferencesThreads
SEVERE: The web application [/oledocstore] appears to have started a
thread named [DynamicPooledExecutor] but has failed to stop it. This
is very likely to create a memory leak.
May 21, 2012 2:54:56 PM org.apache.catalina.loader.WebappClassLoader
clearThreadLocalMap
SEVERE: The web application [/oledocstore] created a ThreadLocal with
key of type [org.apache.derby.iapi.sql.dictionary.TableDescriptor$1]
(value [org.apache.derby.iapi.sql.dictionary.TableDescriptor$1@5dbdedf8])
and a value of type [java.util.WeakHashMap] (value [{=null, =null,
={1, 2, 3}, =null, =null, =null, ={1, 2, 3}, =null, =null}]) but
failed to remove it when the web application was stopped. This is very
likely to create a memory leak.
May 21, 2012 2:54:56 PM org.apache.catalina.loader.WebappClassLoader
clearThreadLocalMap
SEVERE: The web application [/oledocstore] created a ThreadLocal with
key of type [java.lang.ThreadLocal] (value
[java.lang.ThreadLocal@605264dc]) and a value of type
[org.apache.derby.iapi.services.context.ContextManager] (value
[org.apache.derby.iapi.services.context.ContextManager@3499c4cc]) but
failed to remove it when the web application was stopped. This is very
likely to create a memory leak.

Warnings

WARN  org.apache.jackrabbit.core.query.lucene.SearchIndex.setTextFilterClasses():Line
2087: The textFilterClasses configuration parameter has been
deprecated, and the configured value will be ignored:

Strange dir in tomcat_home (or whatever dir that tomcat is started from)

It looks like someone is trying to create a windows directory path
(\OLE-SOLR\Indexes). I don't think this is causing a problem but it
indicates some bad code exists.
AZMBP:tomcat6 azeckoski$ ls -al
total 136
drwxr-xr-x  15 azeckoski  admin    510 21 May 14:50 .
drwxr-xr-x   7 azeckoski  admin    238 21 May 09:47 ..
-rw-r-r-@  1 azeckoski  admin  37951 20 Feb 17:23 LICENSE
-rw-r-r-@  1 azeckoski  admin    558 20 Feb 17:23 NOTICE
-rw-r-r-@  1 azeckoski  admin   8680 20 Feb 17:23 RELEASE-NOTES
-rw-r-r-@  1 azeckoski  admin   6670 20 Feb 17:23 RUNNING.txt
drwxr-xr-x   6 azeckoski  admin    204 21 May 14:50 \OLE-SOLR\Indexes\auth
drwxr-xr-x@ 25 azeckoski  admin    850 21 May 10:00 bin
drwxr-xr-x@  9 azeckoski  admin    306 20 Feb 17:23 conf
-rw-r-r-   1 azeckoski  admin   1565 21 May 14:54 derby.log
drwxr-xr-x@ 18 azeckoski  admin    612 20 Feb 17:23 lib
drwxr-xr-x@ 11 azeckoski  admin    374 21 May 14:50 logs
drwxr-xr-x@  4 azeckoski  admin    136 21 May 14:50 temp
drwxr-xr-x@  6 azeckoski  admin    204 21 May 14:26 webapps
drwxr-xr-x@  3 azeckoski  admin    102 21 May 14:50 work
AZMBP:tomcat6 azeckoski$ find "\OLE-SOLR\Indexes\auth"
\OLE-SOLR\Indexes\auth
\OLE-SOLR\Indexes\auth/index
\OLE-SOLR\Indexes\auth/index/segments.gen
\OLE-SOLR\Indexes\auth/index/segments_1
\OLE-SOLR\Indexes\auth/spellchecker1
\OLE-SOLR\Indexes\auth/spellchecker1/segments.gen
\OLE-SOLR\Indexes\auth/spellchecker1/segments_1
\OLE-SOLR\Indexes\auth/spellchecker2
\OLE-SOLR\Indexes\auth/spellchecker2/segments.gen
\OLE-SOLR\Indexes\auth/spellchecker2/segments_1
\OLE-SOLR\Indexes\auth/spellcheckerFile
\OLE-SOLR\Indexes\auth/spellcheckerFile/segments.gen
\OLE-SOLR\Indexes\auth/spellcheckerFile/segments_1

println in log ("in aadvancedSearch.jsp")

Looks like someone left in a println statement by accident. Should use
log.debug instead.
May 21, 2012 2:50:51 PM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=0/12  config=null
May 21, 2012 2:50:51 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 6749 ms
in aadvancedSearch.jsp
in aadvancedSearch.jsp
May 21, 2012 2:54:54 PM org.apache.coyote.http11.Http11Protocol pause
INFO: Pausing Coyote HTTP/1.1 on http-8080

Session closing issue:

2012-05-21 15:36:13,886 [Finalizer] WARN
org.apache.jackrabbit.core.SessionImpl.finalize():Line 1326: Unclosed
session detected. The session was opened here:
java.lang.Exception: Stack Trace at org.apache.jackrabbit.core.SessionImpl.<init>(SessionImpl.java:222) at org.apache.jackrabbit.core.SessionImpl.<init>(SessionImpl.java:239) at org.apache.jackrabbit.core.XASessionImpl.<init>(XASessionImpl.java:99) at org.apache.jackrabbit.core.RepositoryImpl.createSessionInstance(RepositoryImpl.java:1573) at org.apache.jackrabbit.core.RepositoryImpl.createSession(RepositoryImpl.java:923) at org.apache.jackrabbit.core.RepositoryImpl.login(RepositoryImpl.java:1461) at org.apache.jackrabbit.core.TransientRepository.login(TransientRepository.java:381) at org.apache.jackrabbit.commons.AbstractRepository.login(AbstractRepository.java:123) at org.kuali.ole.RepositoryManager.getSession(RepositoryManager.java:71) at org.kuali.ole.RepositoryBrowser.generateNodeCount(RepositoryBrowser.java:488) at org.apache.jsp.nodeCount_jsp._jspService(nodeCount_jsp.java:96) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388) at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313) at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:185) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:159) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:680)
2012-05-21 15:36:13,888 [Finalizer] INFO
org.apache.jackrabbit.core.TransientRepository.loggedOut():Line 424:
Session closed

HTC has patched the code at https://svn.kuali.org/repos/ole/tags/OLE-0.6-1/DocumentStore. There are 2 files have changed, RepositoryManager and RepositoryBrowser.

Operated as a Community Resource by the Open Library Foundation