Historical Dialect Dictionaries and Their Corpora as Data Basis for Language Variation – a Multimedia Tour of the WBÖ and its Research Platform LIÖ
2022-04-13


Historical language data remains one of the greatest challenges for empirical research on language variation in earlier language periods, especially where spoken language is concerned. Researchers either consult pre-existing, primarily written sources, which may not be well suited to the research question at hand, or rely on rare sound recordings, which cover only the last 120 years.

One type of language data has been used much less frequently for the study of language variation: the extensive corpora that dialect lexicographers compiled as the basis for their dictionary work. On the one hand, this is surprising, given that most dictionaries contain not only information about word meanings and grammar but also address language variation in many ways, for example regional distribution, style level, language contact, language change and pragmatics. On the other hand, it is understandable, because the corpora underlying the completed dictionary articles are often available only as millions of handwritten paper slips.

The WBÖ (Dictionary of Bavarian Dialects in Austria) is one of the very few dictionaries of German dialects that is published online and provides open access to its digitized XML/TEI corpus of 3.6 million paper slips. The WBÖ is a long-term project at the Austrian Academy of Sciences, founded in the 1910s. It is dedicated to the comprehensive documentation and lexicographic analysis of the various base and regional dialects of (historical) Austria. From the letter F onwards, the WBÖ dictionary articles are available online via LIÖ – a web-based research and information platform on the lexicon of German in Austria: https://lioe.dioe.at. LIÖ also links the WBÖ dictionary articles directly to their underlying database entries, which are stored in a BaseX database and made available via Elasticsearch on LIÖ. With the LIÖ mapmaking tool, the selected XML/TEI entries can even be visualised geographically.

This multimedia presentation demonstrates the research process with the WBÖ data published via LIÖ and discusses the value of the WBÖ corpus for the study of language variation, using two case studies. The first is an example of phonetic variation: it shows the regional distribution of the Bavarian and Bavarian-Alemannic reflexes of MHG ei /aɪ/ that are documented in the WBÖ data, e.g. the variants /ɔɐ/ and /a:/. In this case, the columns of the WBÖ database that contain the lemma and the transcriptions can be searched without a complex query syntax. The second example is morphological: it presents the various adjectives that can be derived from the base form Farbe 'colour' in the Bavarian and Bavarian-Alemannic dialects (farbig/färbig, farben/färben, farbert/färblert, farb/färb 'coloured'). The regional distribution of these adjectives can also be visualised via the LIÖ mapmaking tool.
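
As a rough illustration of the first case study, the following Python sketch tallies the variants /ɔɐ/ and /a:/ by region across a handful of transcription records. It does not use the LIÖ interface or the actual WBÖ database schema: the field names (lemma, transcription, region), the region labels and the sample records are all invented placeholders, and the substring matching is only one simple way such a search might be approximated on exported data.

```python
from collections import Counter, defaultdict

# Invented sample records. Field names, region labels and values are
# placeholders for illustration, not actual WBÖ columns or entries.
RECORDS = [
    {"lemma": "breit", "transcription": "brɔɐt", "region": "Region A"},
    {"lemma": "breit", "transcription": "bra:t", "region": "Region B"},
    {"lemma": "heiß", "transcription": "hɔɐs", "region": "Region A"},
    {"lemma": "heiß", "transcription": "ha:s", "region": "Region B"},
    {"lemma": "heiß", "transcription": "hɔɐs", "region": "Region B"},
]

# The two reflexes of MHG ei named in the text.
VARIANTS = ("ɔɐ", "a:")


def variant_counts_by_region(records, variants=VARIANTS):
    """Count, per region, how many transcriptions contain each variant.

    Each record is assigned to the first variant found in its
    transcription; records matching no variant are ignored.
    """
    counts = defaultdict(Counter)
    for record in records:
        for variant in variants:
            if variant in record["transcription"]:
                counts[record["region"]][variant] += 1
                break
    return {region: dict(counter) for region, counter in counts.items()}


if __name__ == "__main__":
    for region, counts in sorted(variant_counts_by_region(RECORDS).items()):
        print(region, counts)
```

A per-region tally of this kind is the sort of table that a mapping tool could then display geographically. With real data, the region labels would have to come from the localisation information attached to each entry.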

After these examples have been presented, interested audience members can explore the WBÖ data themselves and use the LIÖ tools to search for and visualise language variation in Austria and South Tyrol.


