Easy Lecture Slides Made Difficult with Pandoc and Beamer

TeX, DH, kludgetastic

Markdown for slides is a great idea. Don’t painfully lay out each slide, click by click: just write down an outline and let a program generate the slides for you. The almighty Pandoc does this very well. You write plain text, and a single pandoc command produces slides in one of a number of nice HTML formats (like reveal.js or slidy), ready to be uploaded, or in PDF.

I switched over to this way of doing things last year. It was a relief to stop fighting with Keynote,1 which is nice as presentation programs go but still exhausting: it constantly makes wrong guesses about formatting; putting text where you want is a struggle, and inevitably imprecise. And, worst of all, adding notes for myself to slides is pretty hopeless. I always ended up having to create two parallel versions of each class: the slides, and my own notes. That involved a lot of redundant work. By contrast, once I started writing my slides and notes together, I got my materials for class together quite a bit faster. Whether the temptation to further procrastination/tinkering was really healthy is a separate issue.

Anyway, markdown for slides with pandoc is pretty simple and easy-to-use, but because I wanted some features that the HTML formats do not easily support, I was tempted into going with pandoc’s capacity to generate LaTeX slides using the beamer package. Getting what I want out of that setup is what makes easy slides difficult. (If you want easy slides made easy, consider Ben Schmidt’s suggestions.) Here’s what I do these days. All the bits and pieces for doing this can be found in my repository of TeX stuff on github, in the lecture-slides directory. [Edit, May 18, 2016: I now use a slightly tidier set-up, available in a new repository.] For a sample of the kind of results I get, see the slides from my Early Twentieth-Century Fiction course this semester, for example a lecture on Djuna Barnes.

Basic pandoc-markdown for slides gets you most of the way. I was late to learning about . . . for pauses. I make frequent use of the oddball syntax for incremental lists, which is to put the list in a block quote. What more did I want? I wanted:

  1. notes to myself, output in a format I could use when I was teaching, which meant: with some indication of the corresponding slide;

  2. control over the slide formatting, including the typeface;

  3. flexibility when I wanted to lay out images and text on the same slide;

  4. a “presenter interface” like the one in Keynote, with a preview of the next slide and a clock;

  5. the ability to stage more elaborate “builds” than just the incremental reveal: I wanted to hide and reveal various elements of a slide;

  6. and no clicking—an automated process for going from the source notes to any and all outputs I needed for class with one command.

So with a little help from beamer’s exhaustingly comprehensive documentation and the internet, here are the solutions I found…

Notes

There’s no way to add notes to yourself in pure markdown.2 Here is the first of many places I take advantage of pandoc’s willingness to pass raw LaTeX code through unchanged when it generates LaTeX from markdown. Beamer has a \note{} command. Notes do not normally appear on slides, but if you pass an option to the beamer package, beamer can either shrink down your slide and stick the notes next to it, or—and this is what I prefer—interleave note pages and slide pages, which you then print 2-on-1.

So “all” you need is two separate preambles, one for your regular slides and one for your notes page. Then you stick those preambles onto the front of your generated latex using pandoc’s -H option. (You can see where this is heading: I’ll get to Makefiles and the “build process” below.) The preambles I actually use are on github (preamble-notes.tex, preamble-slides.tex); they incorporate some other stuff I’ll talk about shortly. The notes-page preamble has

\setbeameroption{show notes}
\setbeamertemplate{note page}[plain]

Then I actually just use my printer options to put four pages on one, which gives me two slides and two notes pages per physical page. I guess you could use pdfnup too but I haven’t bothered to sort that out yet. [Edit 12/30/2014: Compulsions win. Latest Makefile automates this with pdfjam.] The “plain” layout gets rid of beamer’s very noisy default notes page, which has headers, footers, and a miniature of the slide on it.3

One little fiddly thing, however: if you have slides without any notes, there will be no generated note page. That’s a problem, since then the 2-up or 4-up printing will be out of whack. To force a blank notes page, you need this gem from tex.stackexchange.com:

\makeatletter
\def\beamer@framenotesbegin{%
  \gdef\beamer@noteitems{}%
  \gdef\beamer@notes{{}}%
}
\makeatother
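To see why a missing note page matters for printing, here is a small Python sketch of the page order. It is only a model of the interleaving described above; the function names and the four-slide deck are made up for illustration.

```python
def page_sequence(has_notes, force_blank_notes):
    """Return the interleaved page order for a deck.

    has_notes[i] says whether slide i+1 has a note in the source.
    """
    pages = []
    for number, noted in enumerate(has_notes, start=1):
        pages.append(f"slide {number}")
        # Without the fix, a slide with no note produces no note page at all.
        if noted or force_blank_notes:
            pages.append(f"note {number}")
    return pages


def sheets(pages, per_sheet=4):
    """Group pages into printed sheets of per_sheet pages each."""
    return [pages[i:i + per_sheet] for i in range(0, len(pages), per_sheet)]


# Hypothetical deck: the second slide has no note.
deck = [True, False, True, True]

for force in (False, True):
    print("blank note pages forced:", force)
    for sheet in sheets(page_sequence(deck, force)):
        print("  ", sheet)
```

With the blank pages forced, every sheet alternates slide, note, slide, note. Without them, everything after the note-less slide shifts by one position.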

Formatting

Beamer slides are rather notorious for their instantly-recognizable blah-ness. Adjusting beamer formatting requires more fun with your LaTeX preamble. (There are also beamer themes out there; though the default set is not un-blah, I’ve got an eye on mtheme and may experiment with it later.) In the meantime, I have opted for an instantly-recognizable Keynote blah-ness instead: white Gill Sans text on a black background. For this one needs the font installed, of course, and one needs to use xelatex. Makefiles ahoy! See below. But first the preamble magic.

First the basic color:

\setbeamercolor{normal text}{fg=white,bg=black!90}

That’s the funny xcolor mixing syntax for a 90% black.

Just to get one more color in the mix, I went with a cautious blue for headers. After some fiddling and googling, I came up with this:

\setbeamercolor*{structure}{fg=blue!33!white}
\setbeamercolor{alerted text}{use=structure,fg=structure.fg}
\setbeamercolor*{palette primary}{use=structure,fg=structure.fg}
\setbeamercolor*{palette secondary}{use=structure,fg=structure.fg!95!black}
\setbeamercolor*{palette tertiary}{use=structure,fg=structure.fg!90!black}
\setbeamercolor*{palette quaternary}{use=structure,fg=structure.fg!95!black,bg=black!80}
\setbeamercolor*{framesubtitle}{fg=white}
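If the xcolor expressions above look opaque, this Python sketch works out roughly what they mean in RGB. It assumes the usual reading of color!percent!other as a linear blend, with white as the default second color; the helper name mix is mine, not xcolor's.

```python
def mix(color, percent, other=(1.0, 1.0, 1.0)):
    """Blend percent% of color with the remainder of other (white by default)."""
    share = percent / 100.0
    return tuple(share * c + (1.0 - share) * o for c, o in zip(color, other))


BLACK = (0.0, 0.0, 0.0)
WHITE = (1.0, 1.0, 1.0)
BLUE = (0.0, 0.0, 1.0)

background = mix(BLACK, 90)            # black!90
structure = mix(BLUE, 33, WHITE)       # blue!33!white
secondary = mix(structure, 95, BLACK)  # structure.fg!95!black

for name, rgb in [("background", background),
                  ("structure", structure),
                  ("palette secondary", secondary)]:
    print(name, tuple(round(v, 3) for v in rgb))
```

So black!90 is a very dark gray rather than pure black, and blue!33!white is a pale blue that the palette entries then darken slightly.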

Now the font:

\usefonttheme{professionalfonts}
\setmainfont{Gill Sans}
\setsansfont{Gill Sans}

Beamer childishly sets block quotes in italics. If you’re teaching literature that’s ridiculous.

\setbeamerfont{quote}{shape=\upshape}

Actually I often end up not using block quotes for passages of text on slides: I just write a paragraph and take up the full width of the slide.

Pandoc makes the reasonable decision to set text under second level headers as beamer “blocks.” That’s fine, except that I wanted justified text on my slides, and this needs a little more tweaking to get justification to happen within blocks.

\usepackage{ragged2e}
\justifying
\addtobeamertemplate{block begin}{}{\justifying}

Layout

Graphical layout is of course the weak point of any system where you write the slides and generate output, and markdown is particularly minimalist about this. But I knew an opportunity to indulge the LOGO-phile within when I saw one. Again I exploited pandoc’s tolerance for mixing in LaTeX with markdown. There’s a nice package, textpos, which lets you put anything at fixed coordinates on the page. Even better, textpos will set up a “grid system” for you, so you can decide on a layout grid and lay things out in grid units instead of inches or pixels. (This is particularly useful for screen projection, of course, where things get kind of confusing. Beamer “helps” by generating slides that are, in fact, 128 mm x 96 mm.) To set up textpos, you need this in your preamble:

\usepackage[overlay,absolute]{textpos}
\TPGrid[10 mm,8 mm]{9}{8}

That’s a 9 x 8 grid with 10 mm and 8 mm margins, which has seemed okay for my purposes. Of course you can fiddle. What this gets you is a LaTeX environment (with a weird syntax) for putting things anywhere on the slide, and x and y units for the grid, which are lengths called \TPHorizModule and \TPVertModule.

\begin{textblock}{4}(0,1)
\includegraphics[width=3.75\TPHorizModule]{media/hurston-van-vechten.pdf}
\end{textblock}

\begin{textblock}{5}(4,1)
\footnotesize
1891 b.\ Alabama \\
1919 Howard University \\
1924 first publication \\
1925 Barnard; studies anthro.\ under Boas
...
\end{textblock}

The first block is at grid coordinates (0,1) (origin at top left, positive y goes down the page), and is 4 horizontal grid units wide. Inside, I have included an image and set its width to 3.75 horizontal units. The second block is next to the first block, at (4,1), and takes up the rest of the width of the grid (5 units). Inside is some text. Notice that this has to be straight LaTeX: Pandoc passes everything inside a LaTeX environment through untouched, so markdown syntax in there is not converted.
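For a sense of what those grid units come to in millimetres, here is a back-of-the-envelope sketch in Python. It assumes that textpos divides the page width and height left inside the margins evenly among the grid cells, and that the grid origin sits at the margin; that is my reading of \TPGrid with its optional arguments, so check it against the textpos documentation before relying on it. The function name is hypothetical.

```python
PAPER_W_MM, PAPER_H_MM = 128.0, 96.0   # beamer's default page size
MARGIN_X_MM, MARGIN_Y_MM = 10.0, 8.0   # the optional arguments to \TPGrid
COLS, ROWS = 9, 8                      # the grid from the preamble

# Assumed: what is left inside the margins is divided evenly.
H_MODULE = (PAPER_W_MM - 2 * MARGIN_X_MM) / COLS   # \TPHorizModule
V_MODULE = (PAPER_H_MM - 2 * MARGIN_Y_MM) / ROWS   # \TPVertModule


def block_box(width_units, x, y):
    """Return (left, top, width) in mm for a textblock of width_units at (x, y)."""
    left = MARGIN_X_MM + x * H_MODULE
    top = MARGIN_Y_MM + y * V_MODULE
    return left, top, width_units * H_MODULE


print("modules:", H_MODULE, "x", V_MODULE, "mm")
print("image block:", block_box(4, 0, 1))
print("text block: ", block_box(5, 4, 1))
print("image width:", 3.75 * H_MODULE, "mm")
```

Under these assumptions the modules come out at 12 mm by 10 mm, and the text block ends exactly at the right-hand margin.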

The presenter interface

Using beamer means my slides are output in PDF. That’s quite nice for printing and for distributing the slides to students, but presenting slides from Preview is limiting. One thing Keynote has over Preview is its “presentation mode,” in which your computer shows the slide you’re projecting, the next slide, and a clock. I spent some time fiddling with beamer’s built-in workaround for this, the show notes on second screen option. I couldn’t get it to work, nor could I find a PDF viewer that gave me a presenter clock and didn’t crash.

Enter a rather miraculous piece of software, Melissa O’Neill’s PDF to Keynote. This does what it says on the tin. O’Neill somehow reverse-engineered the Keynote format; her program turns a PDF into a Keynote presentation (by embedding each page of the PDF into a keynote slide). Now I can use Keynote’s nice presentation mode as my “viewer.”4

Fancy incremental revealing/hiding

It’s often helpful to gradually uncover slides, and sometimes helpful to gradually re-cover parts of them too. This is the “build” feature, which Keynote does pretty well. Markdown slides allow you to incrementally reveal things. Beamer supports much more elaborate possibilities, thanks to its “overlay specifications.” In beamer LaTeX, for example:

\only<1,3>{This text appears on the first and third versions of the slide, but not the second.}

This uses beamer's highlighting command to \alert<2>{draw attention here}, but only on the second slide.

\note<2>{

Notes can also have overlay specs.

}
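As a rough model of how such specifications are read, here is a Python sketch that expands a purely numeric spec into the slide numbers it selects. It handles lists and open or closed ranges only (beamer's + increments are not modelled), and the function name is my own.

```python
def slides_in(spec, total):
    """Return the 1-based slide numbers selected by a spec such as '1,3' or '2-'."""
    selected = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            start = int(first) if first else 1     # '-3' starts at slide 1
            stop = int(last) if last else total    # '2-' runs to the last slide
            selected.update(range(start, stop + 1))
        else:
            selected.add(int(part))
    return {n for n in selected if 1 <= n <= total}


TOTAL = 3  # the example above needs three slides

elements = {
    r"\only<1,3>": "1,3",   # text shown on these slides
    r"\alert<2>": "2",      # text highlighted on these slides
    r"\note<2>": "2",       # note attached to these slides
}

for name, spec in elements.items():
    active = slides_in(spec, TOTAL)
    row = ["x" if n in active else "." for n in range(1, TOTAL + 1)]
    print(f"{name:12} {' '.join(row)}")
```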

(Beamer then generates three slides with the appropriate features hidden, shown, or highlighted.) Once I read up in the beamer manual on overlay specifications, I thought I was all set, but unfortunately here we run up against a problem with Pandoc. Pandoc does not recognize beamer’s weird syntax (those <1,3>):

> pandoc -t beamer

This text appears on all versions.
\only<1,3>{This text appears on the first and third
versions of the slide, but not the second.}

yields

\begin{frame}

This text appears on all versions.
\only\textless{}1,3\textgreater{}\{This text appears on the first and
third versions of the slide, but not the second.\}

\end{frame}

Rats! Pandoc ate my overlay! Now we are in trouble. We need to modify pandoc. “Fortunately,” Pandoc is very extensible and scriptable. There are two ways to go: Pandoc’s Haskell API, or the pandocfilters python module. The latter, though less flexible, is easier to use, at least for my insufficiently-enlightened mind. All I wanted was to ensure that beamer overlay specifications passed through to LaTeX from markdown. pandocfilters lets you pass in a filter function to a tree-walking routine; the filter gets to operate on each element of the parse tree.

Unfortunately the Pandoc markdown parser already splits the overlay spec <1,3> away from what it recognizes as the Raw LaTeX command \only. The “correct” solution would be to filter whole Blocks for occurrences of \only followed by <. I went for something lazier. If a LaTeX command is followed immediately by text in braces, the markdown parser understands all that text as raw LaTeX. In fact it does this for commands followed by any number of consecutive braced runs of text. So if we introduce the alternate overlay specification syntax

\only{<1,3>}{text}

then this will be passed whole to any filter function, which can then convert it to the overlay syntax beamer understands, \only<1,3>{text}. The resulting filter looks like this:

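from pandocfilters import toJSONFilter, RawInline
import re

# Group 1: the command name. Group 2: the braced overlay spec, e.g. {<1,3>}.
# Group 3: the remaining braced argument(s); DOTALL lets them span lines.
ov_pat = re.compile(r'^(\\\w+)(\{<[0-9\-+,]+>\})(\{.*)$', flags=re.DOTALL)

def overlay_filter(key, value, fmt, meta):
    # Only raw inline TeX is of interest. Returning None (the implicit
    # default) leaves every other element unchanged.
    if key == 'RawInline' and value[0] == 'tex':
        m = ov_pat.match(value[1])
        if m:
            c = m.group(1)
            # Strip the outer braces from the overlay spec.
            c += re.sub(r'^\{|\}$', "", m.group(2))
            c += m.group(3)
            return RawInline("tex", c)

if __name__ == "__main__":
    toJSONFilter(overlay_filter)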

Code on github. This is specified (once more) in the command line call to pandoc using the --filter option.
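If you want to poke at the conversion without running pandoc, here is a self-contained sketch of the same idea. It does not import pandocfilters; instead it works on a hand-written list of inline elements in the shape pandoc's JSON uses for raw inlines. The function names and the sample paragraph are mine.

```python
import re

# \cmd{<spec>}{...}: command name, overlay spec without its braces, the rest.
OVERLAY = re.compile(r'^(\\\w+)\{(<[0-9\-+,]+>)\}(\{.*)$', flags=re.DOTALL)


def to_beamer_overlay(tex):
    r"""Rewrite \cmd{<spec>}{...} as \cmd<spec>{...}; return None if no match."""
    m = OVERLAY.match(tex)
    if m is None:
        return None
    return m.group(1) + m.group(2) + m.group(3)


def rewrite_inlines(inlines):
    """Apply the conversion to a list of pandoc-style inline elements."""
    out = []
    for el in inlines:
        if el["t"] == "RawInline" and el["c"][0] == "tex":
            converted = to_beamer_overlay(el["c"][1])
            if converted is not None:
                el = {"t": "RawInline", "c": ["tex", converted]}
        out.append(el)
    return out


# Hand-written sample, not actual pandoc output.
paragraph = [
    {"t": "Str", "c": "Always."},
    {"t": "Space"},
    {"t": "RawInline", "c": ["tex", r"\only{<1,3>}{Sometimes.}"]},
    {"t": "RawInline", "c": ["tex", r"\emph{untouched}"]},
]

for el in rewrite_inlines(paragraph):
    print(el)
```

Raw TeX without a braced overlay spec, like the \emph above, passes through unchanged.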

What could be simpler! Frankly, if you’re not writing python scripts the morning before you lecture on Joyce because you want your slides to come out just so and you’ll be damned if you type the same text twice over when you could automate it…you’re doing it wrong.

Automation

Now we have the following production process:

  1. Write slide-note file in markdown/LaTeX combo.
  2. Generate beamer LaTeX for slides.
  3. Generate PDF of slides.
  4. Generate Keynote file of slides for projection.
  5. Generate beamer LaTeX for notes.
  6. Generate PDF of notes.

Only the first step should require intervention by hand. For the rest, the noble Makefile comes to our aid. In order to control the insane profusion of intermediate files this generates, I corral the slide outputs in a separate directory from the notes. I’ve used two Makefiles in the setup on github. You could go for one big Makefile instead, but you’d have to be careful about TeX’s tendency to spew all its files into the working directory.

macros.tex
notes/
    Makefile
    classnotes.md
    preamble-notes.tex
slides/
    Makefile
    preamble-slides.tex

I’ve also stuck that python script, overlay_filter, in my PATH.

macros.tex holds LaTeX preamble stuff that is shared between notes and slides: this includes the textpos grid setup. It is \input in both preamble-notes.tex and preamble-slides.tex, which have the specific commands for the slides or the notes. classnotes.md holds my actual work prepping the class, if I ever actually get to that instead of messing around with my computer.

To make the slides, the Makefile looks like this:

notedir := ../notes

notes := $(wildcard $(notedir)/*.md)
slides_tex := $(patsubst $(notedir)/%.md,%.tex,$(notes))
slides_pdf := $(patsubst %.tex,%.pdf,$(slides_tex)) 
slides_key := $(patsubst %.pdf,%.key,$(slides_pdf)) 

$(slides_tex): %.tex: $(notedir)/%.md preamble-slides.tex
	pandoc $< \
	    -t beamer \
	    --slide-level 1 \
	    -H preamble-slides.tex \
	    --latex-engine xelatex \
	    --filter overlay_filter \
	    -o $@

$(slides_pdf): %.pdf: %.tex
	latexmk -xelatex $(basename $<)

That’s the pandoc command, invoking the overlay_filter and including the preamble. Then we hand off the process of calling xelatex and biber (if necessary) to latexmk.
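For anyone who doesn't read make fluently, here is the same bookkeeping spelled out as a Python sketch: how the target names are derived from a notes file, and what the pandoc call amounts to once make has filled in $< and $@. Nothing is executed, and the function names are hypothetical. The options are copied from the rule above.

```python
from pathlib import Path


def slide_targets(note_path):
    """Mirror the patsubst rules: ../notes/NAME.md -> NAME.tex, NAME.pdf, NAME.key."""
    stem = Path(note_path).stem
    return {ext: f"{stem}.{ext}" for ext in ("tex", "pdf", "key")}


def pandoc_slides_command(note_path, preamble="preamble-slides.tex"):
    """Build the argument list of the pandoc call in the rule (not run here)."""
    return [
        "pandoc", str(note_path),
        "-t", "beamer",
        "--slide-level", "1",
        "-H", preamble,
        # pandoc 2.0 and later call this option --pdf-engine.
        "--latex-engine", "xelatex",
        "--filter", "overlay_filter",
        "-o", slide_targets(note_path)["tex"],
    ]


notes = "../notes/classnotes.md"
print(slide_targets(notes))
print(" ".join(pandoc_slides_command(notes)))
```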

For notes, the relevant piece is very similar:

$(notes_tex): %.tex: %.md preamble-notes.tex
	pandoc $< \
	    -t beamer \
	    -H preamble-notes.tex \
	    -V fontsize=8pt \
	    --filter overlay_filter \
	    --latex-engine xelatex \
	    -o $@

But notice the small font size. 8 pt. seems tiny, but it is rescaled when you print on letter paper (even when you print 4-up as I do). I invoke this with make classnotes.pdf.
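A quick estimate of that rescaling, as a Python sketch. It assumes beamer's default 128 mm x 96 mm page, US letter paper, and a printer that scales each page to fill its cell with no margins or gutters, so real printouts will come out a little smaller. The function name is made up.

```python
SLIDE_W_MM, SLIDE_H_MM = 128.0, 96.0      # beamer's default page
LETTER_W_MM, LETTER_H_MM = 215.9, 279.4   # US letter, portrait


def printed_size(font_pt, cols, rows, landscape):
    """Font size after fitting one page into each cell of a cols x rows layout."""
    if landscape:
        sheet_w, sheet_h = LETTER_H_MM, LETTER_W_MM
    else:
        sheet_w, sheet_h = LETTER_W_MM, LETTER_H_MM
    cell_w, cell_h = sheet_w / cols, sheet_h / rows
    # The page is scaled uniformly until it touches the cell on one axis.
    scale = min(cell_w / SLIDE_W_MM, cell_h / SLIDE_H_MM)
    return font_pt * scale


print("4-up, landscape:", round(printed_size(8, 2, 2, True), 2), "pt")
print("2-up, portrait: ", round(printed_size(8, 1, 2, False), 2), "pt")
```

On those assumptions, 8 pt text prints at a bit under 9 pt when four pages share a sheet, and at over 11 pt when two do.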

Well, it works for me, anyway.

[Edit, May 18, 2016: Reiterating: I now use a slightly tidier set-up, available in a separate github repository.]


  1. I was using the pre-downgrade version of iWork, before Apple turned it into a maimed cloudthing.
  2. As the Pandoc docs will tell you, reveal.js supports notes, though the requirement to run reveal.js’s server locally through node suggests that this feature moves things into the “easy made difficult” category too.
  3. I’ve found it useful to also put the slide number on the note, however; see preamble-notes.tex for how to do that.
  4. To achieve full automation overkill, I actually use AppleScript, called via osascript, to eliminate all pointing and clicking. See my slides Makefile on github.