
Manfred Thaller (University of Cologne): Drowning in Sources - What to Do with a Few Miles of Archival Material?

Wednesday, December 2, 2020, 5:30 pm – 7:10 pm

Zoom availability:

Meeting ID: 995 5971 9298
Passcode: medspub



The quality of “Optical Character Recognition” (OCR) - the automatic conversion of printed text into digital data that can be searched and processed on computers - has advanced far enough that, within a few years, essentially the complete holdings of printed material in libraries will clearly be accessible in that form. This and similar developments have led to a discussion of the “big unread”, particularly in literary disciplines: that is, the question of how far the concentration on a rather narrow canon of literature has distorted our view of the past.

More recently, under the heading of “handwritten text recognition” (HTR), similar possibilities have started to arise for manuscript material, specifically including medieval texts. While these technologies probably lag behind OCR for print by twenty years or so, even straightforward mass digitization has obvious practical consequences, some of which affect the practical work with sources in ways that may morph into methodological effects even now. If we assume that HTR will move along as OCR did, we are facing much more fundamental epistemic consequences in the short and medium term.



Manfred Thaller, born 1950, holds a PhD in History from the University of Graz, Austria, and a PostDoc in empirical Sociology from the Institute for Advanced Studies, Vienna, 1978. He worked for twenty years at the Max-Planck-Institut for History in Göttingen, developing a general concept of applied computer science in the Humanities, implemented as a database system named κλειω. From 1995 he held a professorship in that field at the University of Bergen, before moving in 2000 to the University of Cologne, Germany, as Professor of «Historisch Kulturwissenschaftliche Informationsverarbeitung» (Humanities Computer Science). There he worked mainly on digital archives, digital libraries and digital preservation.

Retired since 2015, he works on the question of how far some of the basic assumptions of Computer Science are irreconcilable with the characteristics of information as contained in historical sources. He argues that optimal support for historical research would require an information technology based upon a conceptual stack different from the one we currently take for granted. At the moment he is looking at the possibility of implementing hermeneutic reasoning as a computational model. Occasionally some tentative results can be found at