Read, calculate and observe

2017/04/03


The Volkswagen Foundation is funding the Digital Literary Studies research group

Professor Thomas Weitin has been at Technische Universität Darmstadt since 2016. His “Reading at Scale” project examines the best way for people and computers to work together in the analysis of literary texts. His partner in the project, Professor Ulrik Brandes (Universität Konstanz), is an expert in algorithmics and network analysis.

Professor Thomas Weitin heads the Darmstadt LitLab at TU Darmstadt. There the text corpora for the project “Reading at Scale” are prepared and digitally analysed. Image: Katrin Binner

The two academics have developed a straightforward concept: if we accept that human reading and computer-aided methods each have their own strengths, namely the analysis of rich detail and large-scale data analysis respectively, then a mix of these methods should be better for analyses at an intermediate level than either method on its own. Literary texts lend themselves to analyses at different levels of resolution, from the character level in an individual work to whole bodies of literature, and literary studies and literary history have traditionally addressed many research questions at the intermediate level.

This is also the focus of the “Reading at Scale” project. The starting point is the historical collection of 86 novellas published under the title “Der deutsche Novellenschatz” by the editors Paul Heyse and Hermann Kurz (24 volumes, 1871-1876). Medium in size, the collection of novellas is still within the reach of an individual reader, yet is sufficiently large for promising statistical analyses.

The Darmstadt LitLab

The text corpora for the project are prepared and digitally analysed in the Darmstadt LitLab, under the supervision of Thomas Weitin. The aim is to tap into all the novella compilations of the nineteenth century. Contemporary collections of other genres, such as crime, are to be included for comparison. Thanks to funding of around €450,000 over three years, three early career researchers are analysing the body of text, using the historical subject matter to address key questions for today’s digital age.

Nineteenth-century anthologies emerged under the influence of literary mass production and dramatically increasing competition for readers’ attention as a resource. With this in mind, the Darmstadt LitLab is quantifying the emergence of individual features of style and genre, and is using eye tracking and the measurement of physiological functions to analyse cognition-oriented reception and find out how literature directs attention.

In the Konstanz Algorithmics research group headed by Ulrik Brandes, the data produced is analysed using network analysis, allowing the research team to study the position of the individual text within a network of relationships in a wider context. The researchers expect that working with texts and their data will help them to gain a better understanding of the individualism effect of modern media-oriented societies.

Three questions for… Thomas Weitin

Professor Dr. Thomas Weitin, Institut für Sprach- und Literaturwissenschaft. Image: Katrin Binner

Professor Weitin, why did you call your project “Reading at Scale”?

Our approach tries to overcome the unproductive animosity between humanities scholars who work traditionally and those who work digitally, which in literary studies has hardened into two concepts: “close” and “distant” reading. Then there is the “scalable reading” idea, which suggests that you can easily switch back and forth between human reading and the computer analysis of large quantities of text. This seems too optimistic to us, and is inconsistent with our experience.

We believe that in text analysis, you always have to decide on a certain scale, on a level of abstraction. Each choice yields certain insights, which always come at a price. For example, if I represent texts only by their word frequencies, that is a strong abstraction with an obvious advantage: a large number of texts can be compared on this basis. But this process also comes at a high cost, as I lose almost all the context of the words and the text.
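The trade-off described here can be illustrated with a minimal sketch. It is not the project's actual pipeline; the helper names `word_frequencies` and `cosine_similarity` and the three sample sentences are assumptions made purely for illustration. Each text is reduced to its word counts, which makes any two texts comparable, but two sentences with opposite meanings become indistinguishable once word order is discarded.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Reduce a text to its word counts (hypothetical helper).

    Lower-cases the text and counts word tokens; word order and
    all surrounding context are discarded.
    """
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(freq_a, freq_b):
    """Compare two word-frequency profiles (hypothetical helper).

    Returns a value between 0.0 (no shared words) and 1.0
    (identical relative word frequencies).
    """
    shared = freq_a.keys() & freq_b.keys()
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample sentences, not taken from the corpus.
text_a = "The dog bites the man"
text_b = "The man bites the dog"
text_c = "A storm breaks over the village"

freq_a = word_frequencies(text_a)
freq_b = word_frequencies(text_b)
freq_c = word_frequencies(text_c)

# Advantage: any pair of texts can be compared on the same basis.
print(cosine_similarity(freq_a, freq_c))

# Cost: texts a and b say opposite things, yet their profiles are equal.
print(freq_a == freq_b)
```

A reader sees the difference between the first two sentences immediately, while the frequency profile cannot; conversely, the profile scales to far more texts than a single reader could compare.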

So how can traditional reading be combined with digital analysis?

Fortunately, there is no single ideal way. First we looked at how others had tried to do this, and established that colleagues who had successfully used specific quantitative methods in philology always knew a great deal about the texts and the corpora they were analysing. They were intimately acquainted with the history of their subject matter.

This helped us a lot with our own approach. In interdisciplinary, collaborative work with medium-sized quantities of text, we were easily able to bring together representations of text at different levels of abstraction, so that they complement one another meaningfully. You could say that both parties are always finding new reasons to make us think. We are not in the business of simply collecting data and hoping that at some point, a good question will occur to someone. The fact is, you can get immensely bored in digital infrastructures.

Whom is your research meant to benefit?

I learned from my own studies that humanities scholars should ignore this question. And to be honest, it is not an easy question to answer either. Like any other work, research is initially only useful to the one carrying it out. After all, I make my living from it. But I do believe that it is worth humanities scholars keeping in mind the value of their results to society as an intellectual question, rather than as a way of legitimising their work. I think our project will help to reduce the glaring gap between text-analytical and data-analytical competence, which I consider to be a major problem in society.
