language text
Recently Published Documents



2022 ◽  
Vol 40 (1) ◽  
pp. 1-43
Author(s):  
Ruqing Zhang ◽  
Jiafeng Guo ◽  
Lu Chen ◽  
Yixing Fan ◽  
Xueqi Cheng

Question generation is an important yet challenging problem in Artificial Intelligence (AI), which aims to generate natural and relevant questions from various input formats, e.g., natural language text, structured databases, knowledge bases, and images. In this article, we focus on question generation from natural language text, which has received tremendous interest in recent years due to its widespread applications, such as data augmentation for question answering systems. During the past decades, many different question generation models have been proposed, from traditional rule-based methods to advanced neural network-based methods. Since a large variety of research works have been proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey we try to provide a more comprehensive taxonomy of question generation tasks from three different perspectives, i.e., the types of the input context text, the target answer, and the generated question. We take a deep look into existing models from different dimensions to analyze their underlying ideas, major design principles, and training strategies. We compare these models through benchmark tasks to obtain an empirical understanding of the existing techniques. Moreover, we discuss what is missing in the current literature and what the promising and desired future directions are.


Author(s):  
Kristin Stock ◽  
Christopher B. Jones ◽  
Shaun Russell ◽  
Mansi Radke ◽  
Prarthana Das ◽  
...  

Author(s):  
Natalya V. Ufimtseva ◽  
Olga V. Balyasnikova

The article presents the results of a study dedicated to native speakers' sites of memory associated with key images of the Russian national culture. The investigation was inspired by the work of French historians, Les lieux de mémoire (1984), whose ideas Yuri Nikolayevich Karaulov applied to the Russian Associative Dictionary (RAD). The study was initiated with the hypothesis elaborated by Yu. N. Karaulov that the Russian national memory could be studied through associative dictionaries. This provision is based on the linguistic personality concept formulated by Yu. N. Karaulov: a personality that is expressed in a language / text and can be reconstructed on the basis of linguistic means. The texts that a language personality produces reflect the peculiarities of a person's vision of the environment (worldview). The hypothesis is tested on associative fields of the toponym Moscow and the lexemes war and Sunday using the data of several associative dictionaries compiled from 1988 to the present, i.e., the Russian Associative Dictionary, with Yu. N. Karaulov among its authors, as well as a number of later dictionaries developed on the basis of mass associative experiments carried out in the regions of Russia. The content and structural analyses of the associative fields of the stimuli Moscow, war, and Sunday show that the associative material largely reflects the discursive space of the language personality and its functioning in texts that reproduce these sites of memory in a precedent form. The latter, however, can be found as various types of reactions (predications) of a non-stereotyped nature. Therefore, the sought-for data exist in different guises, obviously depending on the historical time and the discursive experience of native speakers of a language/culture, as well as on the region of their residence. This study confirms the psycholinguistic concept of meaning (including the associative one) as a sociocultural phenomenon.


2021 ◽  
Author(s):  
Baosheng Yin ◽  
Yifei Sun

As an important part of information extraction, relationship extraction aims to extract the relationships between given entities from natural language text. Building on the pre-trained model R-BERT, this paper proposes an entity relationship extraction method that integrates the entity dependency path with the pre-trained model. The method generates a dependency parse tree by dependency parsing, obtains the dependency path of the given entity pair, and uses this path to exclude information such as modifier chunks and useless entities in sentences. The model achieves a good F1 score on the SemEval-2010 Task 8 dataset. Experiments on this dataset show that dependency parsing can provide context information for models and improve performance.
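The dependency-path step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dependency_path`, the example sentence, and its hand-written head indices are all assumptions made for illustration, and a real system would obtain the parse from a dependency parser and feed the result to R-BERT. The sketch only shows how the shortest path between two entity tokens in a dependency tree leaves out modifiers that sit off that path.

```python
from collections import deque


def dependency_path(heads, start, end):
    """Return token indices on the shortest tree path from start to end.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    # Treat the dependency tree as an undirected graph.
    neighbors = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].add(head)
            neighbors[head].add(child)

    # Breadth-first search, remembering how each token was reached.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            break
        for nxt in sorted(neighbors[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)

    if end not in previous:
        return []

    # Walk back from the end token to the start token.
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


# Assumed example sentence with a hand-written (not parser-produced) tree.
tokens = ["The", "loud", "engine", "of", "the", "old", "car", "failed"]
heads = [2, 2, 7, 2, 6, 6, 3, -1]

# Entity pair: "engine" (index 2) and "car" (index 6).
path = dependency_path(heads, 2, 6)
path_tokens = [tokens[i] for i in path]
print(path_tokens)
```

In this toy sentence the path keeps only "engine", "of", and "car", so the determiners and the modifiers "loud" and "old" fall away, which is the kind of pruning the abstract attributes to the entity dependency path.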


Ethnicities ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1093-1112
Author(s):  
Dominic O’Sullivan ◽  
Heather Came ◽  
Tim McCreanor ◽  
Jacquie Kidd

The New Zealand state developed from a treaty between the British Crown and hapū (sub-tribes) in 1840. The te Reo (Māori language) text and the English version of the agreement are fundamentally different. Breaches of this treaty and tension over how the political relationship between Māori and the Crown should proceed are ongoing. In 2019, the Cabinet Office issued a Circular instructing bureaucratic advisers on the questions they should address when providing advice to ministers on the agreement’s contemporary application. In this article, we use Critical Tiriti Analysis (CTA) – an analytical framework applied to public policies – to suggest additional and alternative questions to inform bureaucratic advice. The article defines CTA in detail and shows how using it in this way could protect Māori rights to tino rangatiratanga (a sovereignty and authority that is not subservient to others) and substantive engagement, as citizens, in the formation of public policy. This article’s central argument is that the Circular reflects an important evolution in government policy thought. However, in showing how the Circular privileges the English version (the Treaty of Waitangi) over the Māori text (Te Tiriti o Waitangi), the article demonstrates how Māori political authority remains subservient to the Crown in ways that Te Tiriti did not intend. Using the care and protection of Māori children as a conceptual illustration, we show that, despite the significant evolution in government thought that the Circular represents, these rights are not fully protected by it. This is significant because it was Te Tiriti, with its protection of extant Māori authority and sovereignty, that was signed by all but 39 of the more than 500 chiefs who agreed to the British Crown establishing government over their own people, but who did not agree to the colonial relationship which may be read into the English version.


Author(s):  
Hakob Avchyan

Talyshi, classified as belonging to the Northwestern Iranian group of languages, includes a wide range of dialects, which, based on phonological, grammatical and lexical factors, are traditionally divided into three main clusters: Northern, Central and Southern (see Bazin 1980; Stilo 1981; Yarshater 1996). Anbarāni is a Northern Talyshi dialect spoken in Anbarān district (baxš) of Namin sub-provincial district (šahrestān) (Ardabil province) of Iran, in the city of Anbarān and adjacent villages. A relatively large number of studies deal with the Northern Talyshi dialects spoken in the Republic of Azerbaijan. For example, “The Talyshi Texts” by Professor of Iranian Studies Boris Miller includes short stories, anecdotes, conversations, and poetic verses (see Miller 1930) written down in these dialects. Unlike the Northern Talyshi dialects spoken in the Republic of Azerbaijan, the Anbarāni dialect is little known and less studied; therefore, the number of texts published in this dialect is quite limited. The aim of this paper is to broaden the scope of knowledge on Anbarāni by introducing a short story of Mullah Nasreddin and analyzing a number of morphological, syntactical and lexical dialectal features based on the text.


2021 ◽  
Vol 1 (2) ◽  
pp. 65-86
Author(s):  
İmren Gökce Vaz de Carvalho

The study of forms of address in translation is a type of register analysis that provides an interesting insight into the way specific linguistic patterns are transferred from one language to another. This article explores how the forms of address are rendered in the Turkish translation of A Jangada de Pedra (1986) by the Portuguese author José Saramago. Paratextual and textual analyses demonstrate that this work has been translated into Turkish through the English translation of the book, and that the English translation has influenced the choices of the Turkish translator. The findings of the study seem to support the hypothesis that using a mediating language/text that lacks forms of address similar to those of the ultimate source and target languages/texts can cause shifts in tenor, which result in a different reading of interpersonal relationships between fictional characters in the target text.


2021 ◽  
Author(s):  
Johannes Lindén ◽  
Tingting Zhang ◽  
Stefan Forsström ◽  
Patrik Österberg

Information extraction is the task of extracting metadata from text. The research in this article proposes a new information extraction algorithm called GenerateIE. The proposed algorithm identifies pairs of entities and relations described in a piece of text. The extracted metadata is useful in many areas, but within this research the focus is on using it in news-media contexts to provide the gist of written articles for analytics and paraphrasing of news information. Compared with existing state-of-the-art algorithms, GenerateIE has two benefits. Firstly, GenerateIE provides the co-referenced word as the entity instead of using he, she, it, etc., which is more beneficial for knowledge graphs. Secondly, GenerateIE can be applied to multiple languages without changing the algorithm itself, apart from the underlying natural language text parsing. Furthermore, the performance of GenerateIE is not significantly better than that of state-of-the-art algorithms, but it offers competitive results.
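The first benefit named in this abstract, reporting the co-referenced word rather than a pronoun as the entity, can be pictured with a toy sketch. It is not the GenerateIE algorithm: the names `resolve_mention` and `resolve_triples`, the pronoun list, the example triples, and the precomputed `antecedents` mapping are all assumptions for illustration, and a real pipeline would obtain antecedents from a coreference resolver.

```python
# Small assumed pronoun list; a real system would rely on a coreference model.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_mention(mention, antecedents):
    """Replace a pronoun with its antecedent when one is known."""
    key = mention.lower()
    if key in PRONOUNS:
        return antecedents.get(key, mention)
    return mention


def resolve_triples(triples, antecedents):
    """Apply pronoun replacement to the subject and object of each triple."""
    return [
        (resolve_mention(subj, antecedents), relation, resolve_mention(obj, antecedents))
        for subj, relation, obj in triples
    ]


# Hypothetical (entity, relation, entity) triples extracted from two sentences.
triples = [
    ("Alice", "joined", "the newsroom"),
    ("She", "interviewed", "the mayor"),
]
# Hypothetical coreference result: "she" refers back to "Alice".
antecedents = {"she": "Alice"}

resolved = resolve_triples(triples, antecedents)
print(resolved)
```

With the pronoun replaced, both triples attach to the same node in a knowledge graph instead of creating a separate, uninformative "She" node.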


2021 ◽  
Vol 32 (4) ◽  
pp. 305-326
Author(s):  
Vadim V. Dementyev ◽  

The article deals with heart-moving stories about kitties published on the Internet by the Moscow adoption center for cats “Murkosha”, which are studied in connection with the dominants of the volunteer discourse. The article focuses on speech genre and narrative characteristics of these stories. It is shown that their specific language, text, and other features are secondary to the main goal – to influence the reader, to induce him either to take the cat directly or to help the adoption center financially. Accordingly, the means that make it possible to enhance the impact come to the fore: expression, different types of direct appeal to moral imperatives, playing with values; at the syntax level – an abundance of direct and indirect directives. The author shows some speech-genre connections of the heart-moving stories about kitties with other genres of volunteer and non-volunteer (advertising) discourse: they are united by the role of indirect communication in text-building and the (usually not directly named) practical illocutionary goal; but they demonstrate a different attitude to the values that the author of the heart-moving stories about kitties operates with (among them the main one is the increase in the amount of love and goodness in the world). The cat is presented in a humanized form (the method of personification), most often as a child; hence, and not only from “advertising” intentionality, there is a lot of indirect communication, including metaphors and pastiche of various “human” genres. Hence also the many “children’s” genres and the imitation of the features of children’s speech and of the speech of adults in communication with children. The method of personification determines the greatest variety of expressive means when describing the “most metaphorical” components of a cat’s image (frame actants): the “character” of the cat and the “communication” (friendship, love) of the cat and the owner.
The article analyzes illocutionary types of heart-moving stories about kitties, identified by the authors themselves and marked with smilies (emergency message for help; the story of a cat that entered the adoption center and is ready to be handed over to the future owner; a message about the need for especially careful treatment of a cat with physical or psychological problems; an adoption center that found a family; “letter from home” from new owners) and narrative types (narrative with partially expressed authorship; mixed (authorship) narratives; pastiche of the “dialogue” of a cat with an adoption center employee; “narrative” on behalf of a cat, etc.). A separate micro-study deals with the frame structure of the heart-moving stories, where the following actants / slots are distinguished: a cat (external data, diseases, and other physical and / or psychological problems, “psychology” of a cat); a past owner of a cat, a new or future owner of a cat; street, street life, homelessness, dangers; adoption center for cats “Murkosha” and its staff. The use and distribution of linguistic means (primarily expressive: metaphors, especially conceptual metaphors, definitions, including appositions, epithets, etc.) are characterized by frame actants and slots. In particular, it is shown that the image of a cat is formed by three meaningful dominants (concepts): love-friendship (as an indissoluble unity) (hence the image of communication happiness), orphanhood and doing good. The last of these corresponds to the dominant of the volunteer discourse. Of the two remaining, orphanhood is well combined with it (the targeting of doing good is emphasized), but love-friendship rather contradicts it (it is more likely mutually beneficial cooperation, even exchange, than disinterested service) and can probably be explained by the aim of reaching a heterogeneous audience, where the motives to take a cat from a shelter can also be different.


Author(s):  
Regīna Kvašīte ◽  
Kazimiers Župerka

The aim of the research is to find out what words are used in Lithuanian and Latvian to name the rural population. The study was performed by applying descriptive, comparative and quantitative methods. The novelty of the article is the presentation of the Lithuanian language material in Latvian, as well as the analysis of the Latvian language material and the comparison of the meanings and use of Lithuanian and Latvian words. The study is sociolinguistic, not normative; therefore, not only systematic but also contextual, situational synonymy is important. Dictionaries and texts of literary and common languages, synonyms, slang and jargon, as well as the corpus of the contemporary Lithuanian language (Dabartinės lietuvių kalbos tekstynas) and the Latvian language text corpus (Latviešu valodas tekstu korpuss), are the main sources. The Lithuanian word kaimietis (‘a villager’), which has long been a neutral name for a rural resident or a person born in a village, has both neutral and stylistically connoted synonyms. The most common synonyms are sodietis (‘a homestead peasant’) and valstietis (‘a peasant’). In this synonym sequence, valstietis is a distant synonym that includes the concepts “kaimo gyventojas” (‘a rural resident’) and “žemdirbys” (‘an agriculturalist’), thus linking the synonym sequence of kaimietis to the synonym sequence of ūkininkas (‘a farmer’) and laukininkas (‘a field peasant’). Recently, the word kaimietis (‘a villager’) has acquired a second – pejorative – meaning: “sakoma apie neišsilavinusį, prasto skonio ir pan. žmogų, kuris nebūtinai kilęs iš kaimo” (‘said of an uneducated person, a person of poor taste, and so on, who does not necessarily come from the countryside’). It is already recorded in the written dictionary of the common language, which indicates that the common connoted meaning in slang is codified.
The word kaimietis (‘a villager’), used in a pejorative sense, appears in the sequence of words that have a systemic or contextual pejorative meaning or are used contemptuously: prastuolis, prasčiokas, mužikas, runkelis. The name of the villager in Latvian – the word laucinieks (‘a villager’) – is stylistically neutral; its synonyms consist of the neutral words lauksaimnieks (‘a farmer’) and zemnieks (‘a peasant’). The word zemnieks, similarly to valstietis (‘a peasant’) in Lithuanian, is the dominant in the sequence of distant synonyms zemkopis (‘an agriculturalist’) and zemesrūķis [?]. The approach to the synonym sādžinieks (‘a homestead peasant’) is ambiguous: its definition in current dictionaries associates the word either with Latgale or Russia, although according to its origin, it is considered to be a borrowing from the Lithuanian language. The word lauķis [?], with the root lauk- (from the word for ‘field’), is used in a pejorative sense in Latvian (its shade is similar to the Lithuanian words prasčiokas (‘a hick’) and runkelis (‘a person as mindless as a beetroot’)), as are the slang word pāķis [?] and the barbarisms: the Slavism mužiks (‘a kern’) and the Germanism bauris [?] (in jargon, bauers). The material of Lithuanian and Latvian texts shows that in both Lithuanian and Latvian, words of different connotations are used synonymously in different contexts.

