Extracting dialect-specific features from dialect classifiers


Panel Affiliation

Embracing Variability in Natural Language Processing

References

Dirk Hovy and Christoph Purschke. 2018. Capturing regional variation with distributed place representations and geographic retrofitting. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4383–4394.

Christoph Purschke and Dirk Hovy. 2019. Lörres, Möppes, and the Swiss. (Re)Discovering regional patterns in anonymous social media data. Journal of Linguistic Geography, 7(2), Article 2.

Roy Xie, Orevaoghene Ahia, Yulia Tsvetkov, and Antonios Anastasopoulos. 2024. Extracting lexical features from dialects via interpretable dialect classifiers. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).

Abstract

Neural architectures have become the de facto standard for a wide range of NLP tasks and applications, such as machine translation, sentiment analysis, and language and variety classification. However, the factors that influence the decisions of these models remain opaque, which has earned them the label of black-box approaches. This opacity (a) hinders our ability to evaluate the reasoning behind their classifications and (b) limits the applicability of these models in certain contexts, for example in forensic linguistic casework, where transparency and explainability are key to the admissibility of linguistic evidence.

Previous work on explainability has thus focused on understanding how a model arrives at its prediction. In this vein, we employ a method proposed by Xie et al. (2024) to extract dialect-specific features from dialect classifiers. They train BERT-based binary dialect classifiers and use a post-hoc leave-one-word-out approach to identify the lexical items that contribute most to the predicted probability for a sentence. These words can be assumed to be the most characteristic of a particular dialect.
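The leave-one-word-out idea can be illustrated with a minimal sketch. It is not Xie et al.'s implementation: the function name `leave_one_word_out`, the whitespace tokenisation, and the toy classifier `toy_predict_proba` with its marker words are assumptions made purely for illustration. In practice, `predict_proba` would wrap a fine-tuned BERT-based dialect classifier.

```python
from typing import Callable, List, Sequence, Tuple


def leave_one_word_out(
    sentence: str,
    predict_proba: Callable[[str], Sequence[float]],
    target_class: int,
) -> List[Tuple[str, float]]:
    """Rank words by how much removing them lowers the target-class probability."""
    words = sentence.split()  # assumption: simple whitespace tokenisation
    base = predict_proba(sentence)[target_class]
    scores = []
    for i, word in enumerate(words):
        # Re-score the sentence with the i-th word left out.
        reduced = " ".join(words[:i] + words[i + 1:])
        drop = base - predict_proba(reduced)[target_class]
        scores.append((word, drop))
    # Largest drop first: these words contribute most to the prediction.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for a trained classifier (illustration only):
# the probability of class 1 grows with the number of marker words present.
TOY_MARKERS = {"grüezi", "chli"}


def toy_predict_proba(text: str) -> List[float]:
    hits = sum(1 for w in text.split() if w in TOY_MARKERS)
    p_dialect = hits / (hits + 1)
    return [1.0 - p_dialect, p_dialect]


if __name__ == "__main__":
    ranking = leave_one_word_out("grüezi das isch chli kalt", toy_predict_proba, 1)
    for word, drop in ranking:
        print(f"{word}\t{drop:.3f}")
```

Because the scoring function takes a class index, the same sketch applies unchanged to classifiers with more than two classes, for example by scoring against the predicted class of each sentence.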

First, we test this approach on a corpus collected from the platform Jodel with data from Austria, Germany and Switzerland (Hovy & Purschke, 2018; Purschke & Hovy, 2019). We extend the original method, which is based on binary classification, to multiclass classification and extract dialect-specific features for varying numbers of classes. Going beyond Xie et al.’s original analysis, we also examine incorrect classifications and the features relevant to these decisions to better understand the model's classifications. We compare our findings with the “prototypical words” of the different regions identified by Purschke & Hovy (2019).

Second, we focus on the Swiss part of the corpus. We hypothesise that the presence of any Swiss German word in a Jodel post is enough for the message to be classified as Swiss, which would do little to advance our understanding of the model's classifications. We therefore test the capability of Xie et al.’s approach to extract dialect-specific features in a more challenging setting, particularly one with non-standardized spelling. Crucially, this extends the work to a situation where the most salient dialect features are known not to be at the lexical level.

Finally, we investigate the extent to which the base model (in terms of language coverage, tokenization choices and pre-training data) influences dialect classification performance and the type of dialect features extracted.