Topic Modeling and the Sociology of Literature

October 16, 2014. Workshop at the Penn Digital Humanities Forum.

This workshop introduces probabilistic topic modeling for humanists, focusing on applications in literary studies. Using my own work on the history of literary study as an example, I’ll give an informal introduction to the algorithm, survey the nuts-and-bolts technical choices involved in modeling, and discuss the challenges of interpreting the algorithm’s output. Strange and novel as this technique may seem, I’ll argue that it is surprisingly well suited to investigating some of literary studies’ central questions about the relation between literary history and social phenomena, and to rediscovering the methodological concerns the humanities share with the social sciences.

Slides from the workshop
Source code for the slides
(further source: dfrtopics and dfr-browser)