Variation in particles used in Yes/No questions in the Ukrainian language and the factors impacting it
2022-04-12, 15:00–15:30 (Europe/Vienna), Room 4

https://univienna.zoom.us/j/66752109715


In Ukrainian, as in other Slavic languages, Yes/No interrogatives may be formed by intonation alone, with no change in word order. However, interrogatives often contain various particles that are used specifically in this type of question.

A wide range of particles can be found in Ukrainian questions. Some question particles have only semantic functions and are used only in Yes/No questions, such as невже and хіба, which introduce particular rhetorical semantics. Other particles also have grammatical functions, such as чи (cf. English whether), which may be used in non-interrogative clauses as a disjunctive conjunction or a complementation marker.

The factors impacting the use of the “grammaticalized” particles, that is, чи or або, are not clear.

This work therefore aims to identify the factors that influence the use of these particles. The main data source is the General Regionally Annotated Corpus of Ukrainian (GRAC), from which questions in fiction texts were extracted. Texts were selected according to author attributes: only authors from four distinct periods (by date of birth) and four particular sub-dialects (belonging to the four main dialectal regions of Ukraine: North, South, West and East) were included.

The research is performed in a variationist framework, similar to the one used by Elsig (2009, 34) in his research on the French interrogative system. According to this approach, it is vital to “circumscribe” the variable in order to work only with identical contexts where all features in question can be used. Contexts lacking the feature, in our case questions without any particle, should also be included in the analysis.
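The circumscription step can be illustrated with a minimal Python sketch. It is not the procedure used in the study: it assumes, purely for illustration, that the particle is the first word of the question, and the function names, the period label "P1" and the example sentences are invented. The point it shows is that particle-less questions are counted as a "zero" variant, so that each particle's share is computed over all comparable contexts.

```python
from collections import Counter, defaultdict

# The two "grammaticalized" particles named in the abstract. Treating only
# these as variants, and only in question-initial position, is an assumption.
PARTICLES = ("чи", "або")


def variant_of(question: str) -> str:
    """Return the question-initial particle, or 'zero' if there is none."""
    tokens = question.lower().strip(" ?!.").split()
    first = tokens[0].strip(",") if tokens else ""
    return first if first in PARTICLES else "zero"


def variant_rates(rows):
    """Compute the share of each variant per (region, period) cell.

    rows: iterable of (region, period, question) tuples.
    """
    counts = defaultdict(Counter)
    for region, period, question in rows:
        counts[(region, period)][variant_of(question)] += 1
    return {
        cell: {variant: n / sum(c.values()) for variant, n in c.items()}
        for cell, c in counts.items()
    }


# Toy rows, invented for illustration only (not corpus data).
sample = [
    ("West", "P1", "Чи ти прийдеш?"),
    ("West", "P1", "Ти прийдеш?"),
    ("East", "P1", "Ти прийдеш?"),
    ("East", "P1", "Або ти не знаєш?"),
]
rates = variant_rates(sample)
print(rates)
```

Counting the zero variant explicitly keeps the denominator equal to the number of contexts in which a particle could have occurred, rather than the number in which one did.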

The questions were therefore separated into two main groups. The first includes “true” questions, that is, as Kobozeva (1988) puts it, questions intended to be answered. The second group is much more heterogeneous and includes all interrogatives used with no intention of being answered, for example, rhetorical questions or requests in the form of a question.

Our work focuses mainly on the “true” Yes/No questions, so the two groups had to be separated. The initial separation is performed manually on a small subset of GRAC data (4,200 Yes/No questions), based mainly on semantics and the surrounding context. As a result, particles are sometimes found in atypical contexts: for example, the particle чи is used in rhetorical questions and, vice versa, the “rhetorical” хіба is found in non-rhetorical contexts.

The resulting dataset, with questions divided into several groups, is then used to train a machine learning model that performs the same classification on a larger dataset from GRAC (more than 100,000 questions). Different classification approaches are also discussed and evaluated, mainly bag-of-words and n-gram models, as well as a fine-tuned model using manually set impacting factors.
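A bag-of-words and n-gram classifier of the kind mentioned above might look roughly like the following sketch. The abstract does not say which learning algorithm is used, so the multinomial Naive Bayes model here is only a stand-in; the class and function names, the labels "true" and "rhetorical", and the training sentences are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def features(question: str):
    """Bag-of-words plus word-bigram features for one question."""
    tokens = question.lower().replace("?", " ").replace(",", " ").split()
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, questions, labels):
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for question, label in zip(questions, labels):
            for feat in features(question):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, question):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the class
            for feat in features(question):
                if feat in self.vocab:  # features unseen in training are skipped
                    score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training set, invented for illustration only (not corpus data).
train = [
    ("Чи ти прийдеш завтра?", "true"),
    ("Чи він удома?", "true"),
    ("Ти бачив його?", "true"),
    ("Хіба я знаю?", "rhetorical"),
    ("Невже це правда?", "rhetorical"),
    ("Хіба це можливо?", "rhetorical"),
]
model = NaiveBayes().fit(*zip(*train))
print(model.predict("Чи вона прийде?"))
print(model.predict("Хіба він знає?"))
```

A purely lexical model like this leans heavily on the particle itself, which is exactly where the manually annotated atypical cases (чи in rhetorical questions, хіба in non-rhetorical ones) would be expected to cause errors; that is one motivation for comparing it with models that use additional factors.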

At this stage of the research, our data allow us to hypothesize that there is distinct territorial and temporal variation in the use of Yes/No question particles, with clear differences between western and eastern regions that become less apparent over time.


References

Elsig, Martin. 2009. Grammatical Variation across Space and Time: The French interrogative system. Amsterdam: John Benjamins Publishing Company.

Kobozeva, Irina (Кобозева, И. М.). 1988. О первичных и вторичных функциях вопросительных предложений (‘On primary and secondary functions of interrogative sentences’). Текст в речевой деятельности. Moscow. 39–46.

Shvedova, Maria, Ruprecht von Waldenfels, Sergiy Yarygin, Andriy Rysin, Vasyl Starko, Michał Woźniak, Mikhail Kruk et al. 2017–2020. GRAC: General Regionally Annotated Corpus of Ukrainian. Electronic resource: Kyiv, Lviv, Jena. Available at uacorpus.org.

In 2015 Ilia Uchitel graduated from the Department of General Linguistics at Saint Petersburg State University, majoring in Yiddish and Slavic studies. In 2018 he completed his master's degree at the School of Linguistics of the Higher School of Economics in Moscow.

From 2014 to 2018 he also worked as a school and university teacher of English.

Currently, Ilia Uchitel is employed as a research assistant at the University of Jena.

His interests include Yiddish and Ukrainian dialectology, corpus linguistics, and the digitization of historical newspapers.