Coping with Variation in the Icelandic Diachronic Treebank

Authors

  • Eiríkur Rögnvaldsson, University of Iceland
  • Anton Karl Ingason, University of Iceland
  • Einar Freyr Sigurðsson, University of Iceland

DOI:

https://doi.org/10.5617/osla.104

Abstract

We present an overview of an ongoing project that aims to develop methods for building a treebank of Icelandic. The treebank will contain both written and spoken language and will also have a diachronic dimension. Since Icelandic is what has been called a less-resourced language in computational linguistics and language technology, it is essential to use the limited resources available as economically and efficiently as possible. We emphasize the importance of open-source software and the interplay between linguistic knowledge and technological skills. We describe the workflow in the construction of the treebank and show how the different software tools work together towards the final representation. Finally, we show how the treebank can be used to study some well-known phenomena in Icelandic syntax.

Author Biographies

Eiríkur Rögnvaldsson, University of Iceland

Professor, Faculty of Icelandic and Comparative Cultural Studies

Anton Karl Ingason, University of Iceland

Master's student, Faculty of Icelandic and Comparative Cultural Studies

Einar Freyr Sigurðsson, University of Iceland

Master's student, Faculty of Icelandic and Comparative Cultural Studies

Published

2011-06-17