East Slavic indefinite pronouns: a corpus-based approach
Abstract
The paper focuses on the development and functional distribution of indefinite pronouns in Old East Slavic, taking into account different sources, genres and registers. All examples in the dataset were taken from the historical modules of the Russian National Corpus. They were tagged for type of indefinite marker, source (including originality and date), type of reference of the indefinite marker, semantics, type of discourse, and the degree of formality of the context (formal or informal). We then applied both descriptive and inferential statistical methods, including Random Forest analysis and multinomial logistic regression. Our analysis enabled us to identify the primary and secondary predictors of the choice of a particular indefinite marker and to trace the functional distribution of indefinite markers according to these factors.