Word documents to XML via upCast
Following Rice's example, I used upCast to convert Word documents to DocBook XML (on a Windows machine).
- Once upCast is installed, go to Resources > templates > Word to DocBook and open the template.
- Documentation for upCast: http://upcast.de/iloop/assets/content/products/upcast/765b1744/doc/manual/html/index.html#N2004D
- The documentation is also available in the Word to DocBook folder.
- Change the Catalog (under Pipeline Settings) to:
${pipeline:PipelineBase}/resources/schema/catalog
- Strip the title page and TOC from the Word documents. Large documents kept failing or timing out, so I had to split them in two.
- Choose the file (even though the template says "rtf to DocBook v 5.0", it seemed to convert .docx files just fine).
- I didn't select any options
- Table Model: CALS
- DocBook structure: book > chapter > section
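A quick way to confirm that a converted file really follows book > chapter > section is to parse it and look at the top levels. The following is only a minimal sketch using Python's standard library. The function name check_structure is made up for illustration, and it assumes the upCast output is well-formed XML. It ignores namespaces so that it does not depend on which DocBook version the template emits.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def check_structure(xml_text):
    """Return a list of problems; an empty list means book > chapter > section holds.

    Hypothetical helper, not part of upCast.
    """
    root = ET.fromstring(xml_text)
    problems = []

    root_name = local_name(root.tag)
    if root_name != "book":
        problems.append(f"root element is <{root_name}>, expected <book>")

    # Only direct children count: chapters must sit right under the root.
    chapters = [child for child in root if local_name(child.tag) == "chapter"]
    if not chapters:
        problems.append("no <chapter> elements directly under the root")

    for index, chapter in enumerate(chapters, start=1):
        if not any(local_name(child.tag) == "section" for child in chapter):
            problems.append(f"chapter {index} has no <section> children")

    return problems
```

To use it, read the converted file into a string and pass it in, for example check_structure(open(path, encoding="utf-8").read()), where path is wherever the output was saved. A chapter with no sections is reported as a problem here, which may be too strict for a short chapter such as an introduction.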
DocBook Notes
Chapters are Intro, Standard Docs (overview), Maintenance Docs (overview), and Appendix.
Each Section X.X is an individual standard doc, maintenance doc, or item in the appendix.
Sections X.X.X are the sub-titles within each document: Getting Started, Process Overview, Document Layout, Business Rules and Routing (all are on the same level in the hierarchy).
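To make the hierarchy above concrete, here is a rough sketch that builds an outline-only DocBook skeleton with Python's standard library. Several things are assumptions: the DocBook 5 namespace and version attribute, the function names, and the document titles marked as placeholders. The sub-titles shown are just a few of the ones listed above. The result has titles only and is not validated against the DocBook schema.

```python
import xml.etree.ElementTree as ET

# Assumption: the output targets DocBook 5, which uses this namespace.
DOCBOOK_NS = "http://docbook.org/ns/docbook"

# chapter title -> {document title -> [sub-titles]}
# The document titles are placeholders, not real documents.
EXAMPLE_OUTLINE = {
    "Intro": {},
    "Standard Docs": {
        "Example Standard Doc (placeholder)": [
            "Getting Started",
            "Process Overview",
            "Document Layout",
        ],
    },
    "Maintenance Docs": {
        "Example Maintenance Doc (placeholder)": ["Getting Started"],
    },
    "Appendix": {},
}


def titled(parent, tag, title):
    """Append a namespaced element with a <title> child and return it."""
    element = ET.SubElement(parent, f"{{{DOCBOOK_NS}}}{tag}")
    ET.SubElement(element, f"{{{DOCBOOK_NS}}}title").text = title
    return element


def build_skeleton(outline):
    """Build book > chapter > section (X.X) > section (X.X.X) from an outline."""
    book = ET.Element(f"{{{DOCBOOK_NS}}}book", {"version": "5.0"})
    for chapter_title, documents in outline.items():
        chapter = titled(book, "chapter", chapter_title)
        for document_title, subtitles in documents.items():
            document_section = titled(chapter, "section", document_title)
            # Sub-titles are siblings: all on the same level of the hierarchy.
            for subtitle in subtitles:
                titled(document_section, "section", subtitle)
    return book


if __name__ == "__main__":
    # Serialize with DocBook as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", DOCBOOK_NS)
    print(ET.tostring(build_skeleton(EXAMPLE_OUTLINE), encoding="unicode"))
```

Nesting a section inside a section is how DocBook expresses the X.X and X.X.X levels, so there is no separate element name per depth.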