Word documents to XML via upCast

Following Rice's example, I used upCast (on a Windows machine) to convert Word documents to DocBook XML.  The Foundation has a license for the software.  These are the steps I took for the conversion:

  1. Once upCast is installed: Resources > templates > Word to DocBook > open template.
    1. Documentation for upCast: http://upcast.de/iloop/assets/content/products/upcast/765b1744/doc/manual/html/index.html#N2004D
    2. Documentation is also in the Word to DocBook folder
  2. Change the Catalog (under Pipeline Settings) to ${pipeline:PipelineBase}/resources/schema/catalog
  3. Strip the title page and TOC from the Word documents. Large documents kept failing or timing out, so I had to break them in two.
  4. Choose the file (even though it says rtf to DocBook v 5.0, it seemed to convert .docx files just fine).
  5. I didn't select any options
  6. Table Model: CALS
  7. DocBook structure: book > chapter > section
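
A quick way to sanity-check the converted output against the book > chapter > section structure chosen above is sketched below. It is only a sketch under assumptions: the sample XML is made up for illustration (it is not real upCast output), the function names are hypothetical, and it assumes the output is well-formed XML with a book root.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def max_section_depth(element, depth=0):
    """Return the deepest level of nested <section> elements under element."""
    deepest = depth
    for child in element:
        if local_name(child.tag) == "section":
            deepest = max(deepest, max_section_depth(child, depth + 1))
        else:
            deepest = max(deepest, max_section_depth(child, depth))
    return deepest


def summarize(xml_text):
    """Report the root element, chapter count, and section nesting depth."""
    root = ET.fromstring(xml_text)
    chapters = [c for c in root if local_name(c.tag) == "chapter"]
    return {
        "root": local_name(root.tag),
        "chapters": len(chapters),
        "max_section_depth": max_section_depth(root),
    }


# Made-up sample standing in for a converted file (assumption, not upCast output).
SAMPLE = """<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Sample</title>
  <chapter><title>Intro</title>
    <section><title>A</title>
      <section><title>B</title><para>text</para></section>
    </section>
  </chapter>
</book>"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For a real file, the text of the converted document could be read from disk and passed to `summarize` in place of `SAMPLE`.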

DocBook Notes

Once converted, these are the items I needed to do:

  • Fix hierarchy:
    1. Chapters are Intro, Standard Docs (overview), Maintenance Docs (overview), Appendix
    2. Section X.X would be each individual standard doc or maintenance doc or item in the appendix.
    3. Section X.X.X are sub-titles for each document: Getting Started, Process Overview, Document Layout.
    4. Section X.X.X.X are lower-level titles for each: Tabs, Business Rules and Routing.
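
The target hierarchy can be pictured as nested DocBook elements. The sketch below builds a skeleton with Python's standard library. The chapter and sub-title names come from the list above; the book title, the "Example Standard Doc" section, and the placement of Tabs and Business Rules and Routing under Document Layout are placeholders assumed purely for illustration.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"  # DocBook 5.0 namespace
ET.register_namespace("", DOCBOOK_NS)  # serialize without a prefix


def q(tag):
    """Return the namespace-qualified form of a DocBook tag name."""
    return f"{{{DOCBOOK_NS}}}{tag}"


def add(parent, tag, title):
    """Append a titled <chapter> or <section> to parent and return it."""
    node = ET.SubElement(parent, q(tag))
    ET.SubElement(node, q("title")).text = title
    return node


def build_skeleton():
    book = ET.Element(q("book"), {"version": "5.0"})
    ET.SubElement(book, q("title")).text = "Placeholder book title"

    # Chapters: Intro, Standard Docs, Maintenance Docs, Appendix
    chapters = {name: add(book, "chapter", name)
                for name in ("Intro", "Standard Docs",
                             "Maintenance Docs", "Appendix")}

    # Section X.X: one individual document (hypothetical name)
    doc = add(chapters["Standard Docs"], "section", "Example Standard Doc")

    # Section X.X.X: sub-titles for the document
    add(doc, "section", "Getting Started")
    add(doc, "section", "Process Overview")
    layout = add(doc, "section", "Document Layout")

    # Section X.X.X.X: lower-level titles (placement is an assumption)
    add(layout, "section", "Tabs")
    add(layout, "section", "Business Rules and Routing")
    return book


if __name__ == "__main__":
    print(ET.tostring(build_skeleton(), encoding="unicode"))
```

Nesting `section` inside `section` keeps the depth implicit in the structure, which is one way to represent the X.X through X.X.X.X levels without numbered section elements.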

...

UPDATE: We are able to use the Rice style sheet. Jeff and Peri set up the documentation to be built through Maven, very similar to Rice.

Availability

Can we follow this format: DocBook Environment Setup? - Jeff Caddel has experience with checking in and working with DocBook XML in SVN.  He helped Peri hook the DocBook output Maven plugin into OLE.