Computational Morphological Analysis of Yorùbá Language Words

Author(s):  
Safiriyu Ijiyemi Eludiora ◽  
O R Ayemonisan

Nigeria's official languages are English, Yorùbá, Igbo and Hausa. The focus of the study reported in this paper is to develop a learning tool that can assist learners in learning the Yorùbá language using its alphabet. The study is critical to the Yorùbá language because of its endangerment. There is a need to introduce different learning tools that can mitigate its extinction. A Yorùbá word perfect system was developed to assist people in learning the Yorùbá language. English and Yorùbá word formation was investigated using a computational morphological approach (word formation). The theoretical framework used finite state automata (FSA) to realise the different ways of combining consonants and vowels to form words. Two- to five-letter words were considered. The system was designed and implemented using UML tools and the Python programming language. The system teaches users how words are formed and how many syllables each word has. The user need not know how to tone-mark a word before using the system. Any word typed is analysed according to its number of syllables. This approach produces representatives of all parts of speech (POS) of the two languages. It produces corpora for the two languages.
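As a rough illustration of the finite-state idea in this abstract, the sketch below counts syllables in a Yorùbá word by walking a small consonant/vowel automaton. It is not the authors' system. It assumes every syllable is V or CV, treats the digraph "gb" as one consonant, treats an "n" that follows a vowel and is not itself followed by a vowel as a nasalisation marker, and ignores syllabic nasals. The function names `strip_marks` and `count_syllables` are hypothetical.

```python
import unicodedata

VOWELS = set("aeiou")


def strip_marks(word):
    """Lower-case the word and drop tone marks and under-dots (combining marks)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def count_syllables(word):
    """Count (C)V syllables with a two-state automaton; simplifying assumptions apply."""
    letters = strip_marks(word)
    state = "START"  # other states: "C" (after a consonant), "V" (after a vowel)
    count = 0
    i = 0
    while i < len(letters):
        ch = letters[i]
        if ch in VOWELS:
            # Every vowel closes one syllable (V or CV).
            count += 1
            state = "V"
            i += 1
        elif ch.isalpha():
            next_is_vowel = i + 1 < len(letters) and letters[i + 1] in VOWELS
            if ch == "n" and state == "V" and not next_is_vowel:
                # Assumed nasalisation marker: belongs to the previous vowel.
                i += 1
                continue
            if state == "C":
                raise ValueError("consonant cluster not accepted: " + word)
            # The digraph "gb" counts as a single consonant.
            i += 2 if letters.startswith("gb", i) else 1
            state = "C"
        else:
            raise ValueError("unexpected character in: " + word)
    if state == "C":
        raise ValueError("word ends in a consonant: " + word)
    return count


if __name__ == "__main__":
    for w in ["Yorùbá", "ilé", "àgbà", "ọmọ"]:
        print(w, count_syllables(w))
```

Because tone marks are stripped before analysis, the same count is returned whether or not the user types them, which matches the abstract's statement that tone marking is not required.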

Author(s):  
Ryan Cotterell ◽  
Hinrich Schütze

Much like sentences are composed of words, words themselves are composed of smaller units. For example, the English word questionably can be analyzed as question+able+ly. However, this structural decomposition of the word does not directly give us a semantic representation of the word’s meaning. Since morphology obeys the principle of compositionality, the semantics of the word can be systematically derived from the meaning of its parts. In this work, we propose a novel probabilistic model of word formation that captures both the analysis of a word w into its constituent segments and the synthesis of the meaning of w from the meanings of those segments. Our model jointly learns to segment words into morphemes and compose distributional semantic vectors of those morphemes. We experiment with the model on English CELEX data and German DErivBase (Zeller et al., 2013) data. We show that jointly modeling semantics increases both segmentation accuracy and morpheme F1 by between 3% and 5%. Additionally, we investigate different models of vector composition, showing that recurrent neural networks yield an improvement over simple additive models. Finally, we study the degree to which the representations correspond to a linguist’s notion of morphological productivity.
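The "simple additive" baseline mentioned in this abstract can be pictured with a toy example. The sketch below is only an illustration of additive composition, not the authors' probabilistic model: the segmentation is supplied by hand, and the three-dimensional vectors in `MORPHEME_VECTORS` are made-up values chosen for readability.

```python
import math

# Hypothetical toy vectors; real distributional vectors would be learned from data.
MORPHEME_VECTORS = {
    "question": [0.9, 0.1, 0.0],
    "able": [0.1, 0.8, 0.1],
    "ly": [0.0, 0.2, 0.9],
}


def compose_additive(segments, vectors):
    """Compose a word vector as the element-wise sum of its morpheme vectors."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for seg in segments:
        for i, value in enumerate(vectors[seg]):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    word_vec = compose_additive(["question", "able", "ly"], MORPHEME_VECTORS)
    print(word_vec)
    print(cosine(word_vec, MORPHEME_VECTORS["question"]))
```

Addition ignores the order of the segments, which is one reason an order-sensitive composition function such as a recurrent network could plausibly do better, as the abstract reports.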


2021 ◽  
pp. 3-29
Author(s):  
Kateryna Horodensʹka

This study investigates the development of grammatical theory at the Institute of the Ukrainian Language over the last 30 years. The paper summarizes achievements in grammar theory that stem from applying the functional (i.e., semantic and grammatical) approach developed by I.R. Vykhovanetsʹ to differentiating units into word and nonword classes and to distinguishing the morphological categories of the major word classes. This facilitated the establishment of the theoretical basis of functional and categorical morphology. The author analyses studies in formal grammatical, semantic, functional, categorical, derivational, and anthropocentric syntax that attest to the multidimensional growth of syntactic theory, as well as the main aspects of Ukrainian word formation on the basis of semantic and categorical syntax and a formant- and stem-based derivatology. Some of the latest multi-pronged processes in word formation reflect the dynamics of word formation rules, the replenishment of word formation resources, and the development of the word-formation system of Standard Ukrainian. Solving a complex set of theoretical issues in modern Ukrainian word formation provided the methodological foundations for the recent normative description of the word-formation system of Standard Ukrainian. The article addresses issues in applied grammar studies that are determined by the globalization and democratization of Ukrainian society and by the needs of modern Ukrainian language practice. Particular importance is attached to grammatical prescriptive norms in professional use, the actualization of case forms from the passive vocabulary, and the dynamics of morphological and syntactic norms in various functional and stylistic dimensions of Standard Ukrainian. Keywords: functional grammar, functional morphology, functional syntax, categorical grammar, categorical syntax, categorical word formation, classification of parts of speech, morphological categories.


Author(s):  
Ольга Вячеславовна Орлова ◽  
Юаньин Ван

Introduction. This research studies xenonyms-interpretatives in the network discourse of the Russian-speaking diaspora of China as a representative of the Russian linguistic culture oriented towards Chinese culture. Xenonyms-interpretatives have full-fledged lexical correspondences in the recipient language, are not ethno-culturally marked in the matrix language, and serve the diaspora community in perceiving, understanding and interpreting a foreign linguistic and socio-cultural reality. Aim and objectives. Based on an examination of the xenonym мафань, one of the most common Sinicisms in the analyzed discourse, the study traces the mechanisms of linguistic and discursive adaptation and the peculiarities of how a xenonym-interpretative functions in the public Internet communication of the Russian-speaking community in China. Material and methods. The material of the research comprises more than 500 texts of network communication of the Russian-speaking diaspora of China that include the xenonym-interpretative мафань. Alongside the methodological tools of interlinguoculturology, borrowing theory and lexical semantics, the study uses methods of contextual and discourse analysis and of corpus and quantitative linguistics. Results and discussion. When the xenonym-interpretative мафань enters the network diaspora discourse and adapts to it, including graphically, it undergoes substantial semantic, derivational, stylistic and morphological-syntactic transformations, demonstrating at all these levels an extremely high degree of mobility and variability. The analysis of written variants showed that spelling according to the normative system for transcribing Chinese names in Cyrillic dominates, alongside a significant number of uncodified and hieroglyphic spellings. When the xenonym is adapted to the morphological and syntactic structure of the Russian language, significant grammatical variability and cases of inclusion in occasional word formation are also observed. The analysis of dictionary data, together with corpus and other data on the use of the lexeme in an authentic speech environment, revealed a large number of translation equivalents that differ in part of speech and semantics, as well as a bookish and etiquette stylistic colouring in Chinese that is not preserved when the word enters informal Russian-language communication. Analysis of metalinguistic reflection on the xenonym, and of the totality of contexts of its use in the hypertext of the diaspora discourse, revealed a semantic and stylistic lacuna in the Russian language, which the xenonym мафань fills as a universal concept denoting an unpleasant, discomforting activity or situation. Conclusion. As a result of the analysis, an outline of a methodology for the comprehensive analysis of a xenonym-interpretative in network diaspora discourse has been compiled; it still needs further refinement.


2021 ◽  
Vol 15 (2) ◽  
pp. 202-213
Author(s):  
Antonina Petrovna Guskova

Transposition has recently become the subject of many research papers as a complicated and sophisticated language phenomenon, and its definition has been broadened. The issue of transposition and the degrees of transitivity is among the most controversial and difficult ones in both Hungarian and Russian linguistics. It may be investigated on different language levels: lexical, syntactic, morphological, and the level of word formation. Taking into account the mobility of part-of-speech boundaries in the compared languages, we attempt to find the cause of words transitioning from one lexico-grammatical class into another, investigate transposition as a natural phenomenon in both Hungarian and Russian, differentiate transition within the system of parts of speech from other language phenomena, and resolve some contentious issues regarding parts of speech, for example ‘noun-adjective’ relations. Although there is extensive literature on nominalization in Russian linguistics and some work in Hungarian linguistics, certain aspects are not comprehensively covered, for example the different types of transition from other parts of speech into nouns, a thorough semantic and thematic categorization of substantivized words, the characteristics of their functioning in texts of different functional styles, and the principles of their lexicographic treatment. In this article we compare the process of substantivation within the system of parts of speech in two languages of very different structure, Hungarian and Russian. A comprehensive and comparative study of the transition of other parts of speech into nouns allows us to investigate the structure of each language more deeply and to reveal typological similarities and differences between them. These languages have not been explored in this way before, which gives the research its scientific novelty.
For the first time we define the main conditions of a systematic process of transposition in Hungarian and Russian and reveal both specific and universal opportunities for transition in the compared languages. We use comparative analysis to study semantic models of substantivized words, distinguish different types of transition into nouns, and describe their structural and stylistic features. Thus, the topic of the research is the grammatical, semantic, structural and stylistic features of substantivized words in Hungarian and Russian. The objective of the study is to discover the linguistic nature of the substantivation of adjectives, verbs and verbal formations, numerals and pronouns, and to find the specific and universal features caused by the typological differences between the two languages. To achieve this goal we need to solve the following problems: determine the place of substantivation in the system of word formation in Hungarian and Russian; establish how far substantivation and conversion, as productive means of word formation, coincide in Russian and Hungarian; distinguish and compare semantic models of substantivized words; compare models of usual and occasional substantivation and determine their productivity; and study their structure, that is, show the peculiarities of the grammatical structure of substantivized words in Hungarian and Russian, identify the similarities and differences between them, and find adequate models. The research is based on data from dictionaries of Russian and Hungarian, examples from fictional texts, live speech and, not least, idioms.
The theoretical importance lies in the following: 1) the research develops the theory of transitivity, as we study transposition in two languages of different structures using comparative analysis of substantivized words and taking into account grammatical, semantic and functional aspects; 2) using material from two languages of different structures, we discover the main conditions of systematic transposition and distinguish its universal and specific features; 3) for the first time the problem of transposition is studied from a theoretical point of view on the basis of Russian and Hungarian (using the example of the transition of other parts of speech into nouns); 4) we develop a methodology for a comprehensive approach to studying substantivation in Hungarian and Russian, which can be used when describing this phenomenon in other languages of different structures.


AITI ◽  
2020 ◽  
Vol 17 (1) ◽  
pp. 42-55
Author(s):  
Radius Tanone ◽  
Arnold B Emmanuel

Bank XYZ is a bank in Kupang City, East Nusa Tenggara Province, that operates several ATMs placed at various merchant locations. Both customers and non-customers use these ATMs to conduct transactions. Because of where they are placed, some machines are not used optimally for transactions, which wastes machine resources and leads to a condition called Not Operational Transaction (NOP). Given data consisting of several numeric independent variables, the task is to classify the dependent variable, NOP. A machine learning approach using the logistic regression method is the proposed solution for this classification. The research steps comprise collecting data, analysing it with machine learning in the Python programming language, and writing the report. The machine learning approach yielded a prediction value of 0.507 for the classification. This means that in the future Bank XYZ can classify NOP conditions based on the behaviour of customers and non-customers transacting at its ATMs.
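To make the classification step in this abstract concrete, here is a minimal logistic regression sketch trained with batch gradient descent, using only the standard library. It is not the authors' code or data: the two features (a scaled transaction count and a scaled idle time), the six training rows, and the function names `fit_logistic` and `predict` are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss with respect to z
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (NOP) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Made-up rows: [scaled daily transactions, scaled idle hours]; label 1 means NOP.
X_TRAIN = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.7], [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
Y_TRAIN = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    w, b = fit_logistic(X_TRAIN, Y_TRAIN)
    print(predict(w, b, [0.05, 0.95]), predict(w, b, [0.95, 0.05]))
```

On real ATM data the features would need scaling and the model would be evaluated on held-out rows rather than on the training set.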


2021 ◽  
Vol 12 (4) ◽  
Author(s):  
H Sydoruk ◽  

Abstract. The article summarizes the theoretical experience of studying the structural and semantic characteristics of scientific and technical terms, and of agricultural terms in particular. It substantiates the fact that the agrarian sphere is experiencing a rapid emergence of innovative terms that quickly pass through all stages of structural-semantic assimilation. A significant percentage of terms are formed by rethinking their previous meanings, which leads to the emergence of so-called semantic terms. Analysis of the structural and semantic features of agricultural terms showed that they are formed through derivation, terminologization of common vocabulary and assimilation of borrowings. The paper also analyzes examples of the functioning of English-language multicomponent clusters and the peculiarities of their translation. The purpose of the research is to clarify the structural and semantic features of agricultural terminology, to identify and describe the processes by which one- and multi-component agricultural terms are formed, and to determine the main word-forming types of agricultural terminology. Results of research. Agrarian terminology, which is a set of words and stable phrases corresponding to the concepts functioning in this area (objects, phenomena and actions), is a mobile and flexible part of common vocabulary and therefore requires systematization. The main carriers of these concepts are nouns, together with a certain number of verbs, and adjectives, which mostly serve as definitions in multicomponent clusters and are not independent. Word-forming means are mainly Latin-Greek morphemes, word stems and phrases. 
Depending on the participation of language in term formation, the terms of the agricultural sector are divided into three types: simple terms or word-terms; complex terms; and terms-phrases or multicomponent clusters. Agrarian terminology uses such ways of word formation as lexical-semantic, lexical-syntactic, morphological, abbreviation, and morphological-syntactic. Many commonly used lexical items acquire meanings inherent in the agricultural sector, becoming terms with a narrower meaning. Modern scientific literature on agricultural topics contains terms that have two or more meanings in this area, which only the context can disambiguate. In translation, such ambiguity causes blurring of meaning, vagueness, substitution of terms and ultimately distortion of the content as a whole. It is important to consider the lexical and grammatical environment of the term. The context makes it possible to address the following translation issues: a) whether the word is used in its common or its special, narrow meaning; b) which of several meanings of a polysemous term applies in a particular context. From agrarian vocabulary, terminological units move into colloquial language, sometimes becoming jargon or terminoids that function in limited areas of engineering and technology, forming a layer of stylized new scientific and technical terms. Professional slang is usually short, expressive and to some extent stylistically colored. The most successful items become fixed in the terminology system over time and receive official recognition in a certain subject area, being recorded in dictionaries, and some pass into other areas and into the common language. 
The specifics of how terminological combinations function require appropriate methods of translation, among which there are a few main ones: literal translation of lexical units by means of calque; replacement of parts of speech; explanatory translation of terms; and translation with word order changes, primarily in attributive groups. Conclusions. Agro-biological terminology tends towards multiple translation variants, polysemy and homonymy. In order to overcome the difficulties in translating professional texts, it is necessary to work more actively with special vocabulary and to study thoroughly the issues of ambiguity, synonyms and antonyms, word formation and methods of translation. The above considerations open opportunities for the practical application of these methods of translating agro-biological terminology, and for a further creative search for the correct perception of professionally oriented texts.


2021 ◽  
pp. 142-161
Author(s):  
Ilze Auziņa ◽  
Kristīne Levāne-Petrova ◽  
Roberts Darģis ◽  
Kristīne Pokratniece ◽  
...  

The Latvian Language Learners Corpus (LaVA), developed at the Institute of Mathematics and Computer Science, University of Latvia, includes more than 1000 texts written by foreign learners of Latvian who are studying at Latvian higher education institutions in their first or second semester and have reached the A1 (possibly A2) Latvian language proficiency level. The size of the corpus is more than 180 000 words. The morphologically annotated texts have been checked manually, and the language learners' errors have been annotated manually. In addition, each text is accompanied by information about its author (metadata): gender, age, native language, and knowledge of other languages. When analysing the data, this information can be used to determine how the learner's mother tongue and language skills in general affect the acquisition of Latvian. Users of the corpus can analyse the data both on the LaVA website (see http://lava.korpuss.lv/search) and in the SketchEngine tool, where quantitative and qualitative analysis of the data can be performed. The quantitative approach makes it possible to identify tendencies in the use of a word, word form, or construction and to determine the frequency of mistakes made by language learners. In addition, the objectivity of the research is ensured by looking at the learner data from different aspects and performing repeated analysis. For example, a statistical analysis of the nouns used in learners' texts shows that declension 4 nouns are used most often. Next in frequency are declension 1, 5 and 2 nouns, while declension 3 and 6 nouns and indeclinable nouns are used very rarely. Qualitative analysis reveals certain features of morphology and word formation, including aspects of syntax, based on empirical data. It is possible to analyse qualitatively the erroneous use of nouns, verbs, or other parts of speech, trying to understand what rules determine it. 
Examples include using non-reflexive verbs instead of reflexive verbs, using infinitives instead of finite forms (person forms), and using a suffix that does not fit the noun paradigm. Exercises and tests are generated on the basis of LaVA data analysis, including learner error analysis. The exercises are intended to help the language learner strengthen linguistic competence in Latvian, for example in the use of verb forms in the indicative mood, in both indefinite and perfect tense forms. Exercise creation consists of three stages: (1) analysis of LaVA errors and identification of typical errors; (2) collection of sample sentences from various corpora of the Latvian language (for example, LVK2018 and Saeima) containing the word forms and constructions in which language learners most often make mistakes in LaVA texts; (3) generation of different exercises using the selected sample sentences.
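The quantitative analysis described in this abstract, such as ranking noun declensions by frequency, amounts to counting annotation values over the corpus. The sketch below shows that counting step on a handful of made-up tokens. The tuple layout `(word form, part of speech, declension)` and the function name `declension_frequencies` are assumptions for illustration and do not reflect the actual LaVA annotation format.

```python
from collections import Counter

# Hypothetical annotated tokens: (word form, part of speech, declension or None).
TOKENS = [
    ("māja", "noun", 4),
    ("galds", "noun", 1),
    ("upe", "noun", 5),
    ("skola", "noun", 4),
    ("iet", "verb", None),
    ("brālis", "noun", 2),
    ("grāmata", "noun", 4),
]


def declension_frequencies(tokens):
    """Return (declension, count) pairs for nouns, most frequent first."""
    counts = Counter(
        decl for _, pos, decl in tokens if pos == "noun" and decl is not None
    )
    return counts.most_common()


if __name__ == "__main__":
    for declension, count in declension_frequencies(TOKENS):
        print("declension", declension, "count", count)
```

The same pattern applies to error annotations: replacing the declension field with an error tag would give the frequency of each error type, which is the input to the first exercise-creation stage.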


2020 ◽  
Vol 25 (2) ◽  
pp. 319-332
Author(s):  
Camiel Hamans

Abstract. Against linguistic prudishness. About -gate and other libfixes. This study aims to discuss libfixing as a non-morphemic process of word formation. Libfixes are ‘liberated’ elements that originate from the reanalysis of existing words, usually opaque forms or blends. A well-known example of a libfix is -gate from Watergate, whose borrowing and spread in Dutch has been discussed by Hüning (2000). Other examples discussed include English -cation as in mancation, Franken- as in Frankenfood, and Dutch -naise as in yogonaise and -talië as in Kapitalië. This contribution shows how widespread the process of libfixing is. Moreover, it is claimed that libfixing operates systematically and can therefore be a subject of morphological analysis and theory. In addition, the analysis shows that it is irrelevant whether a new formation is formed consciously or is the result of an unconscious productive process. What counts is whether the neologism is acceptable as a word in the language in question. The examples discussed in this article come from English and Dutch.

