Semantic Map Borrowing – Case Representation in Northeastern Romani Dialects

2016 ◽  
Vol 9 (2) ◽  
pp. 211-245 ◽  
Author(s):  
Anton Tenser

Recent studies in contact linguistics have emphasized the aspect of language-internal grammaticalization that is triggered by accommodation to an external (contact-language) model (e.g. Heine and Kuteva, 2005). This is based on the notion that speakers make use of the available resources in order to match them to those of the target language. A problematic issue is contact-induced change in the domain of case representation. Synthetic case markers are usually thought of as fully grammaticalized morphemes. If contact-induced grammaticalization is, as Heine and Kuteva suggest, much like monolingual grammaticalization, unidirectional, how do we treat instances of rearrangement of the semantic meaning and scope of case markers? I will discuss this problem by examining a sample of Romani dialects, belonging to the so-called Northeastern dialect group (see Matras, 2002). Relying on specific constructions, like Subject of Negative Existence, External Possession, Privative, Partitive etc., I will compare and contrast the Northeastern dialects with their respective contact languages (Russian and Polish). Using semantic maps, I will demonstrate how the Romani dialects in question restructure their case representation system to accommodate to the systems of the model languages, and will discuss what it is exactly that gets equated when two languages come into contact.

2019 ◽  
Vol 28 (3) ◽  
pp. 447-453 ◽  
Author(s):  
Sainik Kumar Mahata ◽  
Dipankar Das ◽  
Sivaji Bandyopadhyay

Machine translation (MT) is the automatic translation of a source language into a target language by a computer system. In the current paper, we propose an approach that uses recurrent neural networks (RNNs) over traditional statistical MT (SMT). We compare the performance of the SMT phrase table with that of the proposed RNN and, in turn, improve the quality of the MT output. This work was done as part of the shared task provided by MTIL2017. We constructed the traditional MT model using the Moses toolkit and additionally enriched the language model using external data sets. Thereafter, we ranked the phrase tables using an RNN encoder-decoder module originally created as part of the GroundHog project of the LISA lab.


Author(s):  
Farhad Bin Siddique ◽  
Dario Bertero ◽  
Pascale Fung

We propose a multilingual model to recognize Big Five personality traits from text data in four different languages: English, Spanish, Dutch and Italian. Our analysis shows that words having a similar semantic meaning in different languages do not necessarily correspond to the same personality traits. Therefore, we propose a personality alignment method, GlobalTrait, which has a mapping for each trait from the source language to the target language (English), such that words that correlate positively to each trait are close together in the multilingual vector space. Using these aligned embeddings for training, we can transfer personality-related training features from high-resource languages such as English to other low-resource languages, and get better multilingual results than with simple monolingual and unaligned multilingual embeddings. We achieve an average F-score increase (across all three languages except English) from 65 to 73.4 (+8.4) when comparing our monolingual model to our multilingual model using a CNN with personality-aligned embeddings. We also show relatively good performance in the regression tasks, and better classification results when evaluating our model on a separate Chinese dataset.


2010 ◽  
Vol 40 (2) ◽  
pp. 179-196
Author(s):  
Pinfan Zhu

Cross-cultural blunders caused by inappropriate use of language are a common problem in international professional communication. They cause misunderstanding, lead to business failures, and tend to be offensive at times. Such blunders may occur in business ads, slogans, product names, and instructions. Understanding their causes and finding solutions to them are of importance in international professional communication. By examining specific cases, the article analyzes the causes that lead to such blunders from a semantic perspective and concludes that indiscriminate use of the semantic meaning of a word, a lexical form, lexical sound, numbers, color words, and animal names of the target language is the major cause of cultural blunders in international professional communication. Along the way, the article also offers solutions to the problems identified.


Author(s):  
Felicity Meakins

Mixed languages are a rare category of contact language which has gone from being an oddity of contact linguistics to the subject of media excitement, at least for one mixed language—Light Warlpiri. They show considerable diversity in structure, social function, and historical origins; nonetheless, they all emerged in situations of bilingualism where a common language is already present. In this respect, they do not serve a communicative function, but rather are markers of an in-group identity. Mixed languages provide a unique opportunity to study the often observable birth, life, and death of languages both in terms of the sociohistorical context of language genesis and the structural evolution of language.


Author(s):  
Karunesh Kumar Arora ◽  
Shyam Sunder Agrawal

English and Hindi have significantly different word orders. English follows the subject-verb-object (SVO) order, while Hindi primarily follows the subject-object-verb (SOV) order. This difference poses challenges to modeling this pair of languages for translation. In phrase-based translation systems, word reordering is governed by the language model, the phrase table, and reordering models. Reordering in such systems is generally achieved during decoding by transposing words within a defined window. These systems can handle local reorderings, and while some phrase-level reorderings are carried out during the formation of phrases, they are weak in learning long-distance reorderings. To overcome this weakness, researchers have used reordering as a step in pre-processing to render the reordered source sentence closer to the target language in terms of word order. Such approaches focus on using parts-of-speech (POS) tag sequences and reordering the syntax tree by using grammatical rules, or through head finalization. This study shows that mere head finalization is not sufficient for the reordering of sentences in the English-Hindi language pair. It describes various grammatical constructs and presents a comparative evaluation of reorderings with the original and the head-finalized representations. The impact of the reordering on the quality of translation is measured through the BLEU score in phrase-based statistical systems and neural machine translation systems. A significant gain in BLEU score was noted for reorderings in different grammatical constructs.
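As a rough illustration of the head-finalization baseline mentioned above (the pre-ordering step the study reports as insufficient on its own for English-Hindi), the following minimal sketch moves the head child of every constituent to the end of that constituent. The `Node` class, the `head_index` annotation, and the toy parse tree are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A toy constituency-tree node (hypothetical structure, for illustration only)."""
    label: str
    word: Optional[str] = None                       # set for leaves only
    children: List["Node"] = field(default_factory=list)
    head_index: int = 0                              # which child is the head (assumed annotation)


def head_finalize(node: Node) -> List[str]:
    """Return the words of `node` with every head moved to the end of its constituent."""
    if node.word is not None:
        return [node.word]
    head = node.children[node.head_index]
    words: List[str] = []
    # Emit the non-head children first, in their original order...
    for i, child in enumerate(node.children):
        if i != node.head_index:
            words.extend(head_finalize(child))
    # ...then the head child last.
    words.extend(head_finalize(head))
    return words


# Toy SVO sentence: "John ate an apple"
tree = Node("S", head_index=1, children=[
    Node("NP", children=[Node("N", word="John")]),
    Node("VP", head_index=0, children=[
        Node("V", word="ate"),
        Node("NP", head_index=1, children=[
            Node("D", word="an"),
            Node("N", word="apple"),
        ]),
    ]),
])

reordered = head_finalize(tree)
print(" ".join(reordered))
```

On this toy tree the verb ends up clause-final, giving an SOV-like order. The sketch says nothing about the additional grammatical constructs the study evaluates.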


2003 ◽  
Vol 29 (2) ◽  
pp. 155-178 ◽  
Author(s):  
Sabine Ploux ◽  
Hyungsuk Ji

This article describes a spatial model for matching semantic values between two languages, French and English. Based on semantic similarity links, the model constructs a map that represents a word in the source language. Then the algorithm projects the map values onto a space in the target language. The new space abides by the semantic similarity links specific to the second language. Then the two maps are projected onto the same plane in order to detect overlapping values. For instructional purposes, the different steps are presented here using a few examples. The entire set of results is available at the following address: http://dico.isc.cnrs.fr.


2020 ◽  
Vol 8 ◽  
pp. 346-360
Author(s):  
Lei Yu ◽  
Laurent Sartran ◽  
Wojciech Stokowiec ◽  
Wang Ling ◽  
Lingpeng Kong ◽  
...  

We show that Bayes’ rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit because parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the posterior. The model’s independence assumption not only enables efficient use of available data, but it additionally admits a practical left-to-right beam-search algorithm for carrying out inference. Experiments show that our model benefits from using cross-sentence context in the language model, and it outperforms existing document translation approaches.
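One way to write down the formulation described above is sketched here. The notation is an assumption made for illustration, not the authors' own: $x = (x_1, \dots, x_n)$ denotes the sentences of the source document, $y = (y_1, \dots, y_n)$ the sentences of a candidate target document, and a one-to-one sentence alignment is assumed.

```latex
\[
p(y \mid x) \;\propto\;
\underbrace{p(y)}_{\text{document language model prior}}
\;\times\;
\underbrace{\prod_{i=1}^{n} p(x_i \mid y_i)}_{\text{reverse translation probability}}
\]
```

The product over sentences reflects the stated independence assumption in the target-to-source direction. The prior $p(y)$ is not factorized by sentence, which is how dependencies between the sentence translations arise in the posterior.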


2020 ◽  
Vol 36 (5) ◽  
Author(s):  
Doan Thuy Quynh

In this study, we compare and contrast the titles of English pop songs with those of their Vietnamese translated versions based on Newmark’s model. Sixty-five titles in the original and 65 titles in the target language were chosen for comparison, with the purpose of seeing how many titles are retained and how many are novel in terms of content. To reach its aim, the study used two main methods, comparison and description, both based on Newmark’s model. In addition, statistical and analytical methods were applied to examine the titles. The research findings showed that in general 51 titles are retained in the target language through 5 translation methods: literal translation, semantic translation, communicative translation, free translation and adaptation. Ten titles retain none of the characteristics of the original. The study also revealed that translators mainly used free translation to convert the titles from English into Vietnamese.


2015 ◽  
pp. 79-92
Author(s):  
Danuta Roszko

Semantic contrastive linguistics theory and dialectological studies
Theoretical contrastive studies (hereinafter referred to as TCS) emerged with a view to comparing and contrasting natural languages on the basis of a logical interlanguage. The idea of making the TCS guidelines available to science resulted in discontinuing the division into the original language and the target language when comparing and contrasting two (or more) languages and, at the same time, in ending the dependence of the resulting material (i.e. form indexes in the target language) on the formal structures in the original language. The essence of TCS lies in the interlanguage, which is used as the tertium comparationis in the studies. For more on this topic, see Koseska, Korytkowska, R. Roszko (2007). Until now, TCS have not been applied in dialectal studies. There are many reasons for this situation. First of all, dialectal studies usually concentrate on one code (i.e. only a single local dialect is described), whilst TCS provide a comparison and contrast between (at least two) languages. Moreover, research on the dialectal differentiation of a specific language (i.e. where at least two dialects or local dialects are described together) is based on demonstrating the shared and differentiating features on the level of (a) lexis, (b) morphology (most often narrowed to demonstrating differential morphological features) and (c) syntax (relatively most rarely). Thus, dialectal studies are essentially a description of the formal situation, whereas semantic aspects lie outside the area of researchers' interest. With this article, I am going to break the current patterns and show that dialectal studies can be conducted in accordance with the TCS guidelines.
The advantage of such dialectal studies is not only a different, new look at a specific local dialect, but also the possibility of an instant comparison and contrast between the local dialect and the standardized language or other local dialects (of one language or another) on the semantic level, providing the highest standard for the relations demonstrated (i.e. similarities and differences).

