word order
Recently Published Documents


TOTAL DOCUMENTS

3331
(FIVE YEARS 726)

H-INDEX

58
(FIVE YEARS 4)

Author(s):  
Pragya Katyayan ◽  
Nisheeth Joshi

Hindi is the third most-spoken language in the world (615 million speakers) and has the fourth-highest number of native speakers (341 million). It is an inflectionally rich language with relatively free word order and an immense vocabulary. Despite its global prominence, very few Natural Language Processing (NLP) applications and tools have been developed to support it computationally. Moreover, most of the existing ones are not efficient enough because they lack semantic information (or contextual knowledge). Hindi grammar is based on Paninian grammar and derives most of its rules from it. Paninian grammar strongly emphasizes the role of karaka theory in free-word-order languages. In this article, we present an application that extracts all possible karakas from simple Hindi sentences with an accuracy of 84.2% and an F1 score of 88.5%. We consider features such as part-of-speech tags, post-position markers (vibhaktis), semantic tags for nouns, and syntactic structure to capture the context in word windows of different sizes within a sentence. With the help of these features, we built a rule-based inference engine to extract karakas from a sentence. The application takes as input a text file of clean (punctuation-free) simple Hindi sentences and writes the karaka-tagged sentences to a separate text file.
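
To make the idea of a rule-based karaka lookup concrete, here is a minimal sketch. It is not the authors' inference engine: the romanized markers, the small marker-to-role table, the `tag_karakas` helper, and the tag names `NOUN` and `PSP` are all assumptions chosen for illustration. It uses only part-of-speech tags and the post-position marker that directly follows a noun, and it returns candidate roles rather than a single decision.

```python
# Hypothetical, simplified table from romanized Hindi post-position markers
# (vibhaktis) to candidate karaka roles. Illustrative only; this is NOT the
# rule set described in the abstract above.
VIBHAKTI_TO_KARAKA = {
    "ne": ["karta"],                # agent
    "ko": ["karma", "sampradan"],   # object or recipient
    "se": ["karan", "apadan"],      # instrument or source
    "mein": ["adhikaran"],          # location
    "par": ["adhikaran"],           # location
}


def tag_karakas(tokens):
    """Return (noun, candidate_roles) for each noun directly followed by a known marker.

    `tokens` is a list of (word, pos) pairs; the tags "NOUN" and "PSP"
    (post-position) are assumed names for this sketch.
    """
    tagged = []
    for (word, pos), (next_word, next_pos) in zip(tokens, tokens[1:]):
        if pos == "NOUN" and next_pos == "PSP" and next_word in VIBHAKTI_TO_KARAKA:
            tagged.append((word, VIBHAKTI_TO_KARAKA[next_word]))
    return tagged


# "ram ne chaku se phal kata" (Ram cut the fruit with a knife), romanized.
sentence = [
    ("ram", "NOUN"), ("ne", "PSP"),
    ("chaku", "NOUN"), ("se", "PSP"),
    ("phal", "NOUN"), ("kata", "VERB"),
]

for noun, roles in tag_karakas(sentence):
    print(noun, roles)
```

An ambiguous marker such as "se" yields more than one candidate here; resolving it would need the extra features the abstract mentions, such as semantic tags for nouns and a wider word window.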


2022 ◽  
pp. 171-206
Author(s):  
Tanya Karoli Christensen ◽  
Torben Juel Jensen

2022 ◽  
Vol 12 ◽  
Author(s):  
Ying Alice Xu ◽  
Letitia R. Naigles ◽  
Yi Esther Su

This study explores the emergence and productivity of word order usage in Mandarin-speaking typically-developing (TD) children and children with autism spectrum disorder (ASD), and examines how this emergence relates to frequency of use in caregiver input. Forty-two caregiver-child dyads participated in video-recorded 30-min semi-structured play sessions. Eleven children with ASD were matched with 10 20-month-old TD children and another 11 children with ASD were matched with 10 26-month-old TD children, on expressive language. We report four major findings: (1) Preschool Mandarin-speaking children with ASD produced word order structures with pervasive ellipsis at similar rates to language-matched TD children, but also displayed differences from TD children in their usage of SVt and VtO frames; (2) Grammatical productivity was observed in both TD children and children with ASD; moreover, children with ASD with higher expressive language produced less stereotyped language; (3) Both TD children and children with ASD heard a range of word orders in their caregivers’ input, with TD children’s input greater in amount and complexity; however, caregivers of both groups also showed no age/language-related changes in word order usage; (4) Few word-order-specific correlations emerged between caregivers and their children; however, strong correlations were observed for mean length of utterances (MLU) for both groups: Caregivers who produced longer/more complex utterances had children who did the same. Taken together, it seems that despite their pragmatic deficits, the early grammatical knowledge of word order in Mandarin-exposed children with ASD is well preserved and in general follows the typical developmental pattern. Moreover, caregiver input is broadly rather than finely tuned to the linguistic development of TD children and children with ASD, and plays a more important role in children’s general syntactic development than in specific word order acquisition. 
Thus, early word order usage in preschool Mandarin-speaking TD children and children with ASD may be influenced by both caregiver input and child abilities.


2022 ◽  
Vol 8 (1) ◽  
pp. 1
Author(s):  
Antonija Šarić

In this paper, the author attempts to identify the most common errors that occur in the interlanguage of students at the Faculty of Food Technology when formulating indirect questions in English. According to Processability Theory (PT), language is acquired in a predictable way, in six stages, the last stage being the acquisition of word order in subordinate clauses, i.e. cancelling inversion. Since interlanguage is a dynamic language system that retains some features of the first language or overgeneralizes second language rules in speech or writing, the origin of errors can be found in the mother tongue or in the misapplication of rules when acquiring a second language. Although PT is not concerned with the errors made by second language learners, this paper will try to identify the origin of errors that appear in the students' interlanguage and the acquisition of the last stage, i.e. the word order in subordinate clauses. In that way, it will be determined whether the errors (inter- or intralingual) made by the students prevent them from acquiring the last stage of PT.


2021 ◽  
Vol 26 (4) ◽  
pp. 964-974
Author(s):  
Yunju Nam ◽  
Hyenyung Chung ◽  
Youngjoo Kim

Objectives: Although Korean is a typical free word order language, when the same logical meaning is realized in several sentences with different word orders, the preference for a given word order and the processability of the sentence may vary. In this study, we used eye-tracking technology to investigate the word order preference for the instrumental adjunct and the argument (direct object) in Korean, both when the two components were short and when one of them was lengthened. Additionally, the underlying cognitive mechanisms of the word order preference were discussed. Methods: Thirty-five college students were asked to read 24 sentences, presented under conditions in which either both the adjunct and the argument were short or one of them was lengthened, while their gaze was tracked. Results: When both components were short, no preferred word order was confirmed. However, when one of the two components was lengthened, the canonical word order effect of putting the instrumental adjunct before the object argument and the LbS (Long before Short) effect of placing the lengthened component before the short one were confirmed. Conclusion: The word order preference seems to reflect the strategy of keeping essential components close to the verb and minimizing the efficiency of integrated processing between critical components such as the head of an argument. However, the preference may vary depending on the burden of sentence processing or the level of the cognitive capacity of the processor. The timing at which word order preference is reflected may also vary depending on how strong the effect of the canonical order between two components is.


2021 ◽  
Vol 8 (2) ◽  
pp. 189-208
Author(s):  
Darsita Suparno ◽  
Ulil Abshar ◽  
Mulyadi Mulyadi ◽  
Santje Iroth

This paper studies the language and translation of terms related to Covid-19. The background of this research is that many new vocabulary items related to Covid-19 are borrowed terms. This study attempts to answer the following questions: a) What kinds of collocations are related to the term Covid-19 pandemic in English, Arabic, and Indonesian? b) What is the pattern of word order in collocations formed with the term Covid-19 pandemic in English, Arabic, and Indonesian? This study addresses the emergence of new English, Arabic, and Indonesian collocations related to Covid-19 using H. Men’s collocation theory. Data were collected from online news outlets that cover Covid-19, namely Republika, BBC, and al-Jazeera. The study used corpus linguistics to analyze collocations, concordances, and syntactic models. The Covid-19 domain was chosen because the Indonesian terms in this domain include many loanwords. The data consisted of basic words and compound terms. The findings cover several aspects, including the patterns of collocation, borrowing, and the collocational terms for the coronavirus concept.


2021 ◽  
Vol 36 (2) ◽  
pp. 298-335
Author(s):  
Hugo C. Cardoso

The Indo-Portuguese creole languages that formed along the former Malabar Coast of southwestern India, currently seriously endangered, are arguably the oldest of all Asian-Portuguese creoles. Recent documentation efforts in Cannanore and the Cochin area have revealed a language that is strikingly similar to its substrate/adstrate Malayalam in several fundamental domains of grammar, often contradicting previous records from the late 19th century and the input of its main lexifier, Portuguese. In this article, this is shown by comparing Malabar Indo-Portuguese with both Malayalam and Portuguese with respect to features in the domains of word order (head-final syntax and harmonic syntactic patterns) and case-marking (the distribution of the oblique case). Based on older records and certain synchronic linguistic features of the Malabar Creoles, this article proposes that the observed isomorphism between modern Malabar Indo-Portuguese and Malayalam has to be explained as the product of either a gradual process of convergence, or the resolution of historical competition between Dravidian-like and Portuguese-like features.


2021 ◽  
Vol 13 (1) ◽  
pp. 51-65
Author(s):  
Adriana Mezeg

This article first gives an overview of the different uses of French apposition and then focuses on nominal appositions, a kind of supplementive clause introduced by a nominal group (NG) without an article. Only translations of initial nominal appositions are examined, i.e. those which are placed at the beginning of the sentence and where the content of the initial structure is expressed by an apposition or NG as the subject. In this context, word order and the use of commas are discussed, which are often of importance for Slovenian language users. Based on the FraSloK corpus, the following conclusions can be drawn: (a) sentence-initial position is maintained much more often in novels than in newspaper articles; (b) the expression of the content of initial structures with an apposition and an NG, which functions as a subject, is fairly evenly represented in more than half of the cases from newspaper articles, while in novels the subject function is prominent; (c) apart from the change in sentence position, Slovenian apposition corresponds to the source structure, and when its content is expressed by an NG with subject function, there are changes at different levels compared to French; (d) the (non-)use of the comma cannot be satisfactorily justified on the basis of the present corpus, but the examples suggest that it is based on translators’ personal choices and also depends on the possibilities of expression in the target language. Suggestions have already been made to change the rules and usage examples, which are not tenable in our cases, and would require further consideration.


2021 ◽  
Author(s):  
Do Tran Anh Duc ◽  
Pham Hoai Phu Thinh ◽  
Vu Tran Duy ◽  
Luong An Vinh
