Linguateca's infrastructure for Portuguese and how it allows the detailed study of language varieties


Diana Santos, SINTEF ICT



In this paper I briefly present Linguateca, an infrastructure project for Portuguese that is over ten years old, and show how it offers several ways to study grammatical and semantic differences between varieties of the language. After a short history of Portuguese corpus linguistics that presents the main projects in the area, I discuss in some detail the AC/DC project and the so-called AC/DC cluster (which encompasses other related corpus projects sharing the same core). Emphasizing its potential for language variation studies, the paper also (i) describes the integration of CONDIVport as an impetus for new capabilities, and (ii) sketches the functionalities newly added to AC/DC.