Unifying Heterogeneous Dialect Dictionaries Structures for Scientific and Laic Usage whilst Providing FAIR Data Principles
2022-04-12, 17:00–17:30 (Europe/Vienna), Room 5

https://univienna.zoom.us/j/68933130633


The Bavarian Dictionary (BWB), the Franconian Dictionary (WBF), and the Dialectological Informational System of Bavaria-Swabia (DIBS) are quite heterogeneous projects with respect to their workflows, goals, data formats, and data structures. However, all three deal with dialectal language information in very similar categories, e.g. lemmas, meanings, and attestations. Yet each data source has to be searched separately, and each application presents its results in its own differently structured layout. Since the applications were not designed for easy access by lay users, each project lacks an online presentation of its data that is suitable for the general public.
The BWB recently started writing articles in XML and retroconverting the existing ones from Word to XML, whereas the Franconian Dictionary does not create articles by hand, and the DIBS does not yet have a publication platform. The latter two projects use Excel and a relational SQL database application for data input. The BWB also uses an SQL-based system to excerpt relevant information from the primary research data, but this system is not directly linked to the XML articles. In addition, one significant difference in the data that hampers comparability and interoperability across the three dictionary projects is the inconsistent spelling of place names: official name vs. common or short name, typographical errors, and differences caused by the local government reorganization of Bavaria (mainly mergers and changes of rural district).
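The place-name problem described above can be illustrated with a minimal Python sketch. It is not the projects' actual method: the gazetteer, its IDs, the listed variants, and the similarity cutoff are all hypothetical. The idea is to map every spelling (official name, short name, or typo) to one canonical identifier, first by exact lookup on a normalized key and then by fuzzy matching.

```python
import difflib
import unicodedata

# Hypothetical gazetteer: canonical ID -> official name plus known variants.
# IDs and variant lists are illustrative assumptions, not project data.
GAZETTEER = {
    "place-001": {
        "official": "Rothenburg ob der Tauber",
        "variants": ["Rothenburg", "Rothenburg o.d.T."],
    },
    "place-002": {
        "official": "Neustadt an der Aisch",
        "variants": ["Neustadt a.d.Aisch", "Neustadt/Aisch"],
    },
}


def normalize(name: str) -> str:
    """Casefold, replace punctuation with spaces, and collapse whitespace."""
    name = unicodedata.normalize("NFC", name).casefold()
    kept = [ch if ch.isalnum() else " " for ch in name]
    return " ".join("".join(kept).split())


def build_index(gazetteer: dict) -> dict:
    """Map every normalized official name and variant to its canonical ID."""
    index = {}
    for place_id, entry in gazetteer.items():
        for name in [entry["official"], *entry["variants"]]:
            index[normalize(name)] = place_id
    return index


def resolve(name: str, index: dict, cutoff: float = 0.85):
    """Return the canonical ID for a spelling, or None if nothing is close."""
    key = normalize(name)
    if key in index:  # exact hit on official name or known variant
        return index[key]
    # Fall back to fuzzy matching to catch typographical errors.
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


INDEX = build_index(GAZETTEER)
```

With this index, `resolve("Rothenburg o.d.T.", INDEX)` and the misspelled `resolve("Rotenburg ob der Tauber", INDEX)` would both be expected to map to the same ID. Changes caused by the reorganization (mergers, changed districts) cannot be caught by string similarity and would need explicit variant entries.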

Resolving these differences, and thus not merely joining but actually unifying the three data sources, is the goal of the language information system named “Bavarian Dialects Online”. The paper will therefore illustrate the heterogeneous foundations of all three projects as well as the process of homogenising them into one system with a single data format and structure and a single layout for publishing results online. To this end, it will also explain the advantages of a self-designed XML schema over the use of standard or even modified TEI. Subsequently, the reasons for developing new software instead of using an existing research infrastructure will be discussed. The paper will also briefly show the automatic compilation of WBF articles and the resolution of problematic place names. German stemming problems in XML databases and alternative solutions for performing high-speed searches over all the information will be discussed as well. Finally, the very pragmatic solutions for meeting the FAIR data principles, which allow researchers and lay users to easily find, access, interoperate with, and reuse the data, will be illustrated in detail. The example of “Bavarian Dialects Online” thus underlines how simple it can be to provide linguistic material in a FAIR manner.
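As a rough illustration of what homogenising into one data format and structure can mean in practice, the following Python sketch maps an XML article and a tabular row onto one shared record type. It is a sketch under stated assumptions, not the system's actual schema: the element names (`lemma`, `meaning`, `quote`), the column names (`Lemma`, `Bedeutung`, `Beleg`), and the `Entry` type are hypothetical, and the sample values are placeholders.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical unified record shared by all three sources."""
    source: str
    lemma: str
    meanings: list[str] = field(default_factory=list)
    attestations: list[str] = field(default_factory=list)


def _texts(root: ET.Element, tag: str) -> list[str]:
    """Collect the stripped text of all non-empty direct children named `tag`."""
    return [el.text.strip() for el in root.findall(tag) if el.text and el.text.strip()]


def from_xml_article(xml_text: str, source: str = "BWB") -> Entry:
    """Build an Entry from an XML article (element names are assumptions)."""
    root = ET.fromstring(xml_text)
    return Entry(
        source=source,
        lemma=root.findtext("lemma", default="").strip(),
        meanings=_texts(root, "meaning"),
        attestations=_texts(root, "quote"),
    )


def from_table_row(row: dict[str, str], source: str) -> Entry:
    """Build an Entry from a spreadsheet/SQL row (column names are assumptions)."""
    def split(cell: str) -> list[str]:
        # Multiple values in one cell are assumed to be separated by ';'.
        return [part.strip() for part in cell.split(";") if part.strip()]

    return Entry(
        source=source,
        lemma=row.get("Lemma", "").strip(),
        meanings=split(row.get("Bedeutung", "")),
        attestations=split(row.get("Beleg", "")),
    )


# Placeholder inputs: the same entry in two different source shapes.
xml_entry = from_xml_article(
    "<article><lemma> Beispiel </lemma><meaning>m1</meaning>"
    "<meaning>m2</meaning><quote>q1</quote></article>"
)
row_entry = from_table_row(
    {"Lemma": "Beispiel", "Bedeutung": "m1; m2", "Beleg": "q1"}, source="WBF"
)
```

Once both inputs are reduced to the same record type, a single search index and a single presentation layer can serve all sources; only the `source` field still distinguishes them.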


References

Funk, Edith & Schwarz, Brigitte (2018–): DIBS Digital. Munich: Bavarian Academy of Sciences and Humanities. https://dibs.badw.de (30.10.2020)

Funk, Edith & Raaf, Manuel & Schwarz, Brigitte & Welsch, Ursula (2020, in print): „Dialektologisches Informationssystem von Bayerisch-Schwaben (DIBS)“. In: Lenz, Andrea & Stöckle, Philipp (eds.): Germanistische Dialektlexikographie im 21. Jahrhundert. Stuttgart: Steiner (Zeitschrift für Dialektologie und Linguistik, Beiheft).

Funk, Edith & Schamberger-Hirt, Andrea (2016–): BWB Digital. Munich: Bavarian Academy of Sciences and Humanities. https://bwb.badw.de/en (30.10.2020)

Klepsch, Alfred & König, Almut (2016–): WBF Digital. Munich: Bavarian Academy of Sciences and Humanities. https://wbf.badw.de/en (30.10.2020)

König, Almut & Raaf, Manuel (2020, in print): „Das Fränkische Wörterbuch (WBF)“. In: Lenz, Andrea & Stöckle, Philipp (eds.): Germanistische Dialektlexikographie im 21. Jahrhundert. Stuttgart: Steiner (Zeitschrift für Dialektologie und Linguistik, Beiheft).

Raaf, Manuel (2019): Bayerns Dialekte Online. Poster at the 14. Bayerisch-Österreichische Dialektologentagung. Salzburg. https://tiny.badw.de/rSJibA (30.10.2020)

Raaf, Manuel & Schnabel, Michael & Schwarz, Daniel (2020, in print): „Bayerisches Wörterbuch (BWB)“. In: Lenz, Andrea & Stöckle, Philipp (eds.): Germanistische Dialektlexikographie im 21. Jahrhundert. Stuttgart: Steiner (Zeitschrift für Dialektologie und Linguistik, Beiheft).

Linguist and Software Developer