And if that title doesn’t grab you, I don’t know what will.
After some hacking I improved my earlier tangled pandoc-tex4ht-rinse-repeat process for converting a mostly-markdown-and-some-LaTeX syllabus with biblatex citations into HTML by taking advantage of pandoc’s scriptability. This required some minor Haskell flailing, which is fun in its way.1
Anyway, I had struggled with tex4ht’s eccentric conversion of a biblatex bibliography into a definition list with empty <dt> tags. The following Haskell script (which I’ve uploaded as a GitHub gist) deals with this issue:
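The gist isn’t embedded here, so what follows is only a minimal sketch of the idea, not the script itself. It assumes the pandoc 2.x (or later) library API and assumes that pandoc’s HTML reader turns tex4ht’s output into a DefinitionList whose terms are all empty; the name fixBlock and the choice to rewrite such a list as a plain bullet list are illustrative assumptions.

```haskell
module Main where

import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Walk (walk)

-- Assumption: a definition list in which every term (the <dt> part) is
-- empty is really a bibliography, so turn it into an ordinary bullet list
-- whose items are the definitions (the <dd> parts).
fixBlock :: Block -> Block
fixBlock (DefinitionList items)
  | not (null items) && all (null . fst) items =
      BulletList (concatMap snd items)
fixBlock b = b

-- Read HTML on stdin, rewrite the pandoc AST, write HTML on stdout.
main :: IO ()
main = do
  input <- TIO.getContents
  output <- runIOorExplode $ do
    doc <- readHtml def input
    writeHtml5String def (walk fixBlock doc)
  TIO.putStrLn output
```

A sketch like this, saved under a hypothetical name such as html_clean.hs, should build with ghc --make html_clean.hs provided the pandoc library is installed. Definition lists with real terms pass through untouched, since the guard only fires when every term is empty.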
I compiled this as a standalone program, html_clean, so that I can run:
pdflatex syllabus-web.tex
biber syllabus-web
pdflatex syllabus-web.tex
htlatex syllabus-web.tex syllabus.cfg " -cunihtf -utf8" "-cvalidate"
html_clean < syllabus-web.html > syllabus.html
The details of what syllabus-web.tex consists of are in the earlier post. It’s still pretty kludgy: I didn’t figure out how to stop tex4ht from garbling \begin{enumerate}[1.], so I am still stripping that out with sed to produce the source files that are included in syllabus-web.tex. But maybe the Haskell code will be a useful starting point for others working with pandoc on similar tasks. Since I can now actually co-generate a syllabus PDF and a website automatically from the same source, I’m content for the moment, until I need “fun” again.
1. To be precise, whereas programming normally feels like playing with Legos, programming in Haskell feels more like trying to do a math problem set, with ghc in the role of problem-set grader. So: “fun” for certain values of “fun.” Note that MacFarlane’s pandoc scripting documentation includes (I am not joking) exercises.