ODT/XML first aid


If you work with tex4ht to convert LaTeX to OpenDocument (for subsequent Word conversion in NeoOffice, say), you may find yourself wanting to doctor an .odt file. At least I did; sometimes tex4ht outputs an odt with problems or syntax errors. But here’s the nice thing, if you need some quick odt first aid: as I learned from this article by Maarten Wisse, an odt is really just a zip archive.

> unzip test.odt
Archive:  test.odt
   creating: META-INF/
  inflating: META-INF/manifest.xml
   creating: Pictures/
  inflating: content.xml
  inflating: meta.xml
  inflating: settings.xml
  inflating: styles.xml
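
If you'd rather peek inside without unpacking anything, Python's standard zipfile module can read the archive directly. Here's a minimal sketch; the function names list_members and read_member are made up for illustration, and it assumes the member you read is UTF-8 text.

```python
import zipfile


def list_members(odt_path):
    """Return (name, uncompressed size in bytes) for every entry in the archive."""
    with zipfile.ZipFile(odt_path) as odt:
        return [(info.filename, info.file_size) for info in odt.infolist()]


def read_member(odt_path, name="content.xml"):
    """Return one member as text without unpacking the archive (assumes UTF-8)."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read(name).decode("utf-8")
```

Something like list_members("test.odt") should give you the same names the unzip listing shows, and read_member("test.odt") hands you content.xml as a string to search through.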

Mirabile dictu, those xml files are pretty easy to read. All the good stuff is in content.xml and styles.xml. You can burrow into these files with a text editor, modifying style parameters or the way tex4ht has tried to tag your content. And when you’re done, zip the edited file back into the archive:

> zip test.odt content.xml
updating: content.xml (deflated 81%)
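
The same repacking step can be scripted if you find yourself doing it a lot. Python's zipfile module has no way to replace a member in place, so this sketch copies every entry into a fresh archive, swaps in the edited text for one of them, and then moves the new archive over the old one. The name replace_member is hypothetical, and the sketch assumes the edited file is UTF-8 text.

```python
import os
import tempfile
import zipfile


def replace_member(odt_path, name, new_text):
    """Rewrite odt_path so that member `name` contains new_text.

    All other entries are copied unchanged, in their original order and
    with their original compression method.
    """
    folder = os.path.dirname(os.path.abspath(odt_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".odt", dir=folder)
    os.close(fd)
    try:
        with zipfile.ZipFile(odt_path) as src, zipfile.ZipFile(tmp_path, "w") as dst:
            if name not in src.namelist():
                raise KeyError(f"{name} is not in {odt_path}")
            for info in src.infolist():
                if info.filename == name:
                    data = new_text.encode("utf-8")
                else:
                    data = src.read(info.filename)
                # Passing the original ZipInfo keeps the entry's name,
                # timestamp and compression method.
                dst.writestr(info, data)
        # Only replace the original once the new archive is complete.
        os.replace(tmp_path, odt_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

After editing content.xml by hand you could call replace_member("test.odt", "content.xml", open("content.xml", encoding="utf-8").read()). For a one-off fix, the zip command above is less trouble.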

That’s all! If NeoOffice gives you an error when it tries to open a generated odt, it will tell you the line number of the syntax problem, and you can just fix it by hand.
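
You can also hunt for the syntax problem before opening the file at all. This is a rough sketch using Python's built-in XML parser; find_xml_error is a made-up name, and I'm assuming (not promising) that the line the parser reports lands close to the one NeoOffice would complain about.

```python
import xml.etree.ElementTree as ET
import zipfile


def find_xml_error(odt_path, name="content.xml"):
    """Return None if the member is well-formed XML, else (line, column, message)."""
    with zipfile.ZipFile(odt_path) as odt:
        data = odt.read(name)
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # The parser reports a 1-based line and a 0-based column.
        line, column = err.position
        return line, column, str(err)
    return None
```

Calling find_xml_error("test.odt") returns None when content.xml parses cleanly. It only reports the first error it hits, so after a fix it's worth running again.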

All right, I know, kludge city. But very, very useful in a pinch.