Andrew Goldstone

I am an Assistant Professor in the Department of English at Rutgers University, New Brunswick. I study and teach twentieth-century literature in English. My research interests include modernism and non-modernism in English and French, the sociology of literature, literary theory, the history of genre fiction, South Asian literature in English, and the digital humanities, especially computational text analysis. I also have a long-standing interest in digital systems for document preparation and typesetting, especially LaTeX.

My book, Fictions of Autonomy: Modernism from Wilde to de Man (2013), is published by Oxford University Press. For information about the book and ordering links, see my webpage for the book.

What Might Have Been

Pierre Bourdieu, pioneer of quantitative literary analysis:

Pour vérifier la correspondance entre l’espace des positions et l’espace des prises de position, nous avons recensé 537 textes de 510 auteurs publiés par les éditeurs retenus dans notre étude, qui ont été traduits en français entre juillet 1995 et juillet 1996 et retenu, pour chacun des titres, les variables suivantes : genre (roman, nouvelle, récit, conte), éditeur d’origine et d’arrivée, langue d’origine (pour l’anglais on a distingué entre «anglais» et «américain»), nom du traducteur, nom et sexe de l’auteur, année de parution de l’édition originale, de la traduction française (1995 ou 1996), jugements de la critique, prix, nombre de pages, nombre total d’auteurs étrangers publiées par l’éditeur concerné, nombre d’auteurs ayant la même langue d’origine nationale. L’immensité des recherches nécessaires pour le mener à bien nous a conduits à abandonner ce projet.

In order to verify the correspondence between the space of positions and the space of position-takings, we took a census [survey?] of 537 texts by 510 authors translated into French between July 1995 and July 1996 which were published by the publishers in our study, considering the following variables: genre (roman, nouvelle, récit, conte [novel, short story, (non-fiction) narrative, (fantastic) tale]), original publisher and French publisher, original language (for English we distinguished between “English” and “American”), name of translator, name and sex of the author, year of publication of the original edition, of the French translation (1995 or 1996), critics’ judgments, prizes, number of pages, total number of foreign authors published by the publisher in question, number of authors having the same original national language. The immensity of the research necessary to carry this project off led us to abandon it.

(“Une révolution conservatrice dans l’édition,” Actes de la recherche en sciences sociales 126, no. 1 [1999]: 3–28; here 18n31.)

Nothing beside remains….

(thanks to @rania_tn on twitter for help translating genre terms; I alone am responsible for errors in translation above)

dfrtopics, hold the dfr

It’s gratifying, and a little frightening, when someone else uses your own code. Jonathan Goodwin has built on my dfrtopics and dfr-browser code-blobs to produce a fascinating visualization of topics in fiction 1920–1922, derived by modeling the genre-specific word frequencies data set from HathiTrust. He’s given a nice description of his process as well. In the process, Jonathan revealed some unnecessarily restrictive assumptions built into my code. He solved the problem by modifying my code: all praise to him! But then I felt bad and wanted to make it possible for others to go further without having to dig into the Area X that is my code. So I made a few adjustments to my new version of dfrtopics. Here are some notes on using the updated version to cope with the issues Jonathan found in processing and modeling the Hathi data.

Topic modeling: a software update

I have spent a lot of time experimenting with and exploring topic models of text. Aside from an article, some blog posts, and a bunch of strongly held opinions, that time also produced quite a few lines of computer code for handling topic models from MALLET. I started out with a big file of R functions, then escalated to a folder full of R functions. The organization got ever more byzantine, even more so when I collaborated with others. Finally I bit the bullet and, following the gospel of Wickham, converted my pile o’ scripts into an R package, called dfrtopics because I was making models of data from JSTOR’s Data for Research. There it has sat, on github, accumulating bits now and then, plus some function documentation written in a fit of compulsion, but really not in a form that anyone but I could use (and, as time went on, becoming hard for me to use too). My website has had a note promising a tutorial demonstration of how to use my package for nigh-on two years, but no demonstration demonstrated itself.

The package was hard to use and document because of the messy and ad hoc way I represented pieces of the topic model. A hierarchical model is not easy to wrap your mind around, and different questions require different slices of the model to answer. And all the mess of code passing around random collections of data frames, lists, and who knows what else seemed like fertile ground for errors and glitches, even when the whole thing seemed more or less to do what I wanted most of the time.

So: in a questionable expenditure of energy, I’ve spent a few days applying some polish to the package. Herewith dfrtopics version 0.2. Install it from github with devtools::install_github("agoldst/dfrtopics").

Three things are new from a potential user’s perspective—and my hope is that the idea of a potential user is slightly less far-fetched than before. First, there is now an introductory tutorial in the form of a package vignette. Second, the whole package has been rewritten around the idea that a topic model is an object, stored in a single R variable. Third, I have tried to make everything as modular as possible, so that the usefulness of the package is not restricted to MALLET models of DfR data. If you have other textual data in wordcount-form—oh, let’s say, 180,000 18th-through-early-20th-century volumes—you might write up some variant file-loading code if necessary, then use the package functions to model those texts with MALLET. And if you have wordcount data that you want to analyze in R in some other way than with MALLET, there are still, I hope, some useful things here for converting those wordcounts into data frames or term-document matrices.
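To make the last point concrete, here is a minimal sketch of what turning wordcounts into a term-document matrix involves. It is written in plain Python rather than R, and it is not the package's code or API: the sample rows and the function name `term_document_matrix` are hypothetical, chosen only to illustrate the shape of the conversion.

```python
from collections import Counter

# Hypothetical long-format wordcounts: one (document id, word, count) row
# per distinct word per document, as in a word-frequency data set.
rows = [
    ("doc1", "modernism", 3),
    ("doc1", "autonomy", 1),
    ("doc2", "modernism", 1),
    ("doc2", "translation", 4),
]


def term_document_matrix(rows):
    """Return (terms, docs, matrix); matrix[i][j] is the count of terms[i] in docs[j]."""
    docs = sorted({doc for doc, _, _ in rows})
    terms = sorted({word for _, word, _ in rows})
    counts = Counter()
    for doc, word, n in rows:
        counts[(word, doc)] += n  # sum any repeated (word, doc) rows
    # A Counter returns 0 for pairs that never occurred, which fills the empty cells.
    matrix = [[counts[(term, doc)] for doc in docs] for term in terms]
    return terms, docs, matrix


terms, docs, tdm = term_document_matrix(rows)
```

For a corpus of any real size the matrix would be stored in a sparse format, since most words are absent from most documents; the dense list of lists here is only for readability.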

Sign Up for Our MSA Seminar!

I am co-organizing a seminar on The Production of Modernist Disciplinarity at this November’s MSA in Boston with Jonathan Goodwin (University of Louisiana, Lafayette). As a supplement to the very terse description you can find on the site, here is a slightly longer one:

The Production of Modernist Disciplinarity

Recent modernist studies are characterized by the aspiration to expand geographically, chronologically, and across media. Yet the core of “modernism” as an aesthetic, a period, and a canon of writers and artists has proven strikingly stable. This seminar invites new approaches to studying modernist disciplinarity itself, including (but not limited to) quantitative approaches, like citation analysis and probabilistic modeling, that have recently been brought to bear on disciplinary history in the humanities. How can we understand (or change) the forces that either stabilize or disrupt scholarship on modernism?

This was meant to underline that though Jonathan and I are both interested in quantitative methods for understanding disciplinary continuity and change, this is not an exclusively quantitative or “digital” seminar. All approaches are welcome. Please don’t hesitate to write me (contact information at right) or Jonathan if you have any questions.

Conference registration is open. Seminar registration officially closes tomorrow, September 1. The conference early-registration discount ends September 15.

Despite any rumors you may have heard to the contrary, our seminar will not be a Zimbardo-style experiment on the production of disciplinarity.

Doing without Texts: Sapiro on Translation

If we want to do sociology of literature, let’s get away from texts for a bit.

One of the most promising things about the current interest in quantitative methods for literary study is that it offers us some alternatives to reading as a method. There are important questions about literature—above all, about literature as a social and historical system—that cannot be answered with the tools of the expert textual interpreter. Such questions are better answered, I believe, in closer collaboration with our disciplinary kindred in the social sciences.

Thanks to a new special issue of Cultural Sociology focused on literature, we have a chance to look at some concrete examples of what sociological approaches to literary problems currently look like. In this post I’ll discuss the excellent essay there by Gisèle Sapiro, “Translation and Symbolic Capital in the Era of Globalization: French Literature in the United States.” Sapiro, whose work I have been following for a while, works in the tradition of the sociology of fields, and has written on both twentieth-century literary history and contemporary world literature. This latest essay is particularly fun to think with, because I have my greedy paws on some of the same data Sapiro uses, so it will be possible to look in some detail at the sort of evidence and the sort of analysis her approach entails—and, perhaps, to think how to extend it.