Results from rough data? The large-scale study of early modern historiography with multi-dimensional register analysis

Authors

  • Aatu Liimatta
  • Yann Ryan
  • Tanja Säily
  • Mikko Tolonen

DOI:

https://doi.org/10.5617/dhnbpub.10668

Abstract

Multi-dimensional register analysis is a methodology for extracting functional dimensions from a set of texts. These dimensions describe functional differences among the texts, which can stem from situational constraints on the production of a text or from differences in the author’s intent and communicative purpose. While the methodology has seen considerable use in contemporary linguistics, it has been used less in historical linguistics, and even less in history, even though the ability to differentiate between textual functions in historical data would be extremely useful and interesting to historians. In this paper, we perform a pilot study of multi-dimensional register analysis on a subset of texts from Eighteenth Century Collections Online (ECCO). In particular, our goal is to find out whether this kind of analysis is possible in the first place, or whether the low quality of the ECCO data produced by optical character recognition (OCR) hinders it too much for it to be useful. To do this, we first perform the analysis on ECCO data and then compare the results with those obtained by running the same analysis on the same set of texts from ECCO-TCP, a manually cleaned subset of ECCO data. Our results show that the results from the ECCO analysis are not only interpretable but also highly similar to the results from ECCO-TCP. Multi-dimensional register analysis appears to be a very promising and robust method that can work well even with low-quality data.
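
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not specify the feature set or the factor-extraction settings, so the sketch uses synthetic feature counts, substitutes principal component analysis (via SVD) for factor analysis, and simulates OCR noise as random perturbation. The function names (`standardize`, `dimension_loadings`, `congruence`) are hypothetical. Similarity between dimensions is measured here with Tucker's congruence coefficient, which is one common choice and an assumption on our part.

```python
import numpy as np


def standardize(counts):
    """Z-score each feature column (rows = texts, columns = features)."""
    counts = np.asarray(counts, dtype=float)
    std = counts.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (counts - counts.mean(axis=0)) / std


def dimension_loadings(counts, n_dims):
    """Return an (n_features, n_dims) loading matrix.

    Assumption: PCA via SVD stands in for the factor analysis
    normally used in multi-dimensional register analysis.
    """
    z = standardize(counts)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return vt[:n_dims].T


def congruence(a, b):
    """Absolute Tucker congruence between two loading vectors (0 to 1)."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


# Synthetic stand-in data: 200 texts, 6 linguistic features
# driven by 2 latent functional dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = rng.normal(size=(2, 6))
clean = latent @ weights + 0.1 * rng.normal(size=(200, 6))  # "ECCO-TCP"-like
noisy = clean + 0.3 * rng.normal(size=(200, 6))             # "OCR"-like

clean_load = dimension_loadings(clean, 2)
noisy_load = dimension_loadings(noisy, 2)

# One similarity score per dimension; values near 1 mean the noisy data
# recovers nearly the same dimension as the clean data.
scores = [congruence(clean_load[:, k], noisy_load[:, k]) for k in range(2)]
print(scores)
```

With real data, each row would hold normalized frequencies of linguistic features for one text, and the loadings of each dimension would still need to be interpreted functionally by a researcher.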

Published

2023-10-10

How to Cite

Liimatta, Aatu, Yann Ryan, Tanja Säily, and Mikko Tolonen. 2023. “Results from Rough Data? The Large-Scale Study of Early Modern Historiography With Multi-Dimensional Register Analysis”. Digital Humanities in the Nordic and Baltic Countries Publications 5 (1). Oslo, Norway: 297-312. https://doi.org/10.5617/dhnbpub.10668.