Semantic Map Borrowing – Case Representation in Northeastern Romani Dialects

Recent studies in contact linguistics have emphasized the aspect of language-internal grammaticalization that is triggered by accommodation to an external (contact-language) model (e.g. Heine and Kuteva, 2005). This is based on the notion that speakers make use of the available resources in order to match them to those of the target language. A problematic issue is contact-induced change in the domain of case representation. Synthetic case markers are usually thought of as fully grammaticalized morphemes. If contact-induced grammaticalization is, as Heine and Kuteva suggest, much like monolingual grammaticalization, unidirectional, how do we treat instances of rearrangement of the semantic meaning and scope of case markers? I will discuss this problem by examining a sample of Romani dialects, belonging to the so-called Northeastern dialect group (see Matras, 2002). Relying on specific constructions, like Subject of Negative Existence, External Possession, Privative, Partitive etc., I will compare and contrast the Northeastern dialects with their respective contact languages (Russian and Polish). Using semantic maps, I will demonstrate how the Romani dialects in question restructure their case representation system to accommodate to the systems of the model languages, and will discuss what it is exactly that gets equated when two languages come into contact.

MTIL2017: Machine Translation Using Recurrent Neural Network on Statistical Machine Translation

Journal of Intelligent Systems ◽

10.1515/jisys-2018-0016 ◽

2019 ◽

Vol 28 (3) ◽

pp. 447-453 ◽

Cited By ~ 5

Author(s):

Sainik Kumar Mahata ◽

Dipankar Das ◽

Sivaji Bandyopadhyay

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Language Model ◽

Target Language ◽

Data Sets ◽

Shared Task ◽

Automatic Translation ◽

External Data ◽

Statistical Mt

Machine translation (MT) is the automatic translation of a source language into a target language by a computer system. In this paper, we propose using recurrent neural networks (RNNs) on top of traditional statistical MT (SMT). We compare the performance of the SMT phrase table with that of the proposed RNN and, in turn, improve the quality of the MT output. This work was done as part of the shared task of MTIL2017. We constructed the traditional MT model using the Moses toolkit and additionally enriched the language model using external data sets. We then ranked the phrase tables using an RNN encoder-decoder module originally created as part of the GroundHog project of the LISA lab.

GlobalTrait: Personality Alignment of Multilingual Word Embeddings

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017015 ◽

2019 ◽

Vol 33 ◽

pp. 7015-7022

Author(s):

Farhad Bin Siddique ◽

Dario Bertero ◽

Pascale Fung

Keyword(s):

Personality Traits ◽

Big Five ◽

Target Language ◽

Big Five Personality ◽

Word Embeddings ◽

Alignment Method ◽

Semantic Meaning ◽

Text Data ◽

Source Language ◽

High Resource

We propose a multilingual model to recognize Big Five Personality traits from text data in four different languages: English, Spanish, Dutch and Italian. Our analysis shows that words having a similar semantic meaning in different languages do not necessarily correspond to the same personality traits. Therefore, we propose a personality alignment method, GlobalTrait, which has a mapping for each trait from the source language to the target language (English), such that words that correlate positively with each trait are close together in the multilingual vector space. Using these aligned embeddings for training, we can transfer personality-related training features from high-resource languages such as English to other low-resource languages, and get better multilingual results than with simple monolingual and unaligned multilingual embeddings. We achieve an average F-score increase (across all three languages except English) from 65 to 73.4 (+8.4) when comparing our monolingual model to our multilingual model using a CNN with personality-aligned embeddings. We also show relatively good performance in the regression tasks, and better classification results when evaluating our model on a separate Chinese dataset.

Cross-Cultural Blunders in Professional Communication from a Semantic Perspective

Journal of Technical Writing and Communication ◽

10.2190/tw.40.2.e ◽

2010 ◽

Vol 40 (2) ◽

pp. 179-196

Author(s):

Pinfan Zhu

Keyword(s):

Cross Cultural ◽

Target Language ◽

Professional Communication ◽

Semantic Meaning ◽

Inappropriate Use ◽

Business Failures ◽

Animal Names ◽

The Way

Cross-cultural blunders caused by inappropriate use of language are a common problem in international professional communication. They cause misunderstanding, lead to business failures, and tend to be offensive at times. Such blunders may occur in business ads, slogans, product names, and instructions. Understanding their causes and finding solutions to them are important in international professional communication. By examining specific cases, the article analyzes the causes of such blunders from a semantic perspective and concludes that indiscriminate use of the semantic meaning of a word, a lexical form, lexical sound, numbers, color words, and animal names of the target language is the major cause of cultural blunders in international professional communication. Along the way, the article also offers solutions to the problems identified.

Mixed Languages

Oxford Research Encyclopedia of Linguistics ◽

10.1093/acrefore/9780199384655.013.151 ◽

2018 ◽

Cited By ~ 3

Author(s):

Felicity Meakins

Keyword(s):

Structural Evolution ◽

Common Language ◽

Evolution Of Language ◽

Contact Linguistics ◽

Life And Death ◽

Contact Language ◽

The Subject ◽

Rare Category ◽

Sociohistorical Context ◽

Mixed Languages

Mixed languages are a rare category of contact language which has gone from being an oddity of contact linguistics to the subject of media excitement, at least for one mixed language—Light Warlpiri. They show considerable diversity in structure, social function, and historical origins; nonetheless, they all emerged in situations of bilingualism where a common language is already present. In this respect, they do not serve a communicative function, but rather are markers of an in-group identity. Mixed languages provide a unique opportunity to study the often observable birth, life, and death of languages both in terms of the sociohistorical context of language genesis and the structural evolution of language.

Source-side Reordering to Improve Machine Translation between Languages with Distinct Word Orders

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3448252 ◽

2021 ◽

Vol 20 (4) ◽

pp. 1-18

Author(s):

Karunesh Kumar Arora ◽

Shyam Sunder Agrawal

Keyword(s):

Machine Translation ◽

Language Model ◽

Target Language ◽

Long Distance ◽

Parts Of Speech ◽

Source Sentence ◽

The Subject ◽

Statistical Systems ◽

Translation Systems ◽

The Impact

English and Hindi have significantly different word orders. English follows the subject-verb-object (SVO) order, while Hindi primarily follows the subject-object-verb (SOV) order. This difference poses challenges to modeling this pair of languages for translation. In phrase-based translation systems, word reordering is governed by the language model, the phrase table, and reordering models. Reordering in such systems is generally achieved during decoding by transposing words within a defined window. These systems can handle local reorderings, and while some phrase-level reorderings are carried out during the formation of phrases, they are weak in learning long-distance reorderings. To overcome this weakness, researchers have used reordering as a step in pre-processing to render the reordered source sentence closer to the target language in terms of word order. Such approaches focus on using parts-of-speech (POS) tag sequences and reordering the syntax tree by using grammatical rules, or through head finalization. This study shows that mere head finalization is not sufficient for the reordering of sentences in the English-Hindi language pair. It describes various grammatical constructs and presents a comparative evaluation of reorderings with the original and the head-finalized representations. The impact of the reordering on the quality of translation is measured through the BLEU score in phrase-based statistical systems and neural machine translation systems. A significant gain in BLEU score was noted for reorderings in different grammatical constructs.
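The idea of source-side reordering can be illustrated with a deliberately tiny sketch. It is not the authors' rule set: the role tags ("S", "V", "O") and the function name `reorder_svo_to_sov` are hypothetical, and a real system would work from POS tags or a parse tree rather than hand-assigned roles. The sketch only shows the basic move of pushing the verb to the end of a clause so that an English sentence is closer to Hindi word order before translation.

```python
from typing import List, Tuple

# (word, role) pairs; the role tags are a hypothetical simplification:
# "S" = subject, "V" = verb, "O" = object.
Token = Tuple[str, str]


def reorder_svo_to_sov(clause: List[Token]) -> List[str]:
    """Move verb tokens to the end of the clause, keeping other tokens in order."""
    non_verbs = [word for word, role in clause if role != "V"]
    verbs = [word for word, role in clause if role == "V"]
    return non_verbs + verbs


clause = [("Ram", "S"), ("eats", "V"), ("an", "O"), ("apple", "O")]
reordered = reorder_svo_to_sov(clause)
print(" ".join(reordered))  # Ram an apple eats
```

A single verb-final rule like this handles only the simplest clauses, which is consistent with the abstract's point that head finalization alone is not sufficient for English-Hindi.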

A Model for Matching Semantic Maps between Languages (French/English, English/French)

Computational Linguistics ◽

10.1162/089120103322145298 ◽

2003 ◽

Vol 29 (2) ◽

pp. 155-178 ◽

Cited By ~ 18

Author(s):

Sabine Ploux ◽

Hyungsuk Ji

Keyword(s):

Second Language ◽

Semantic Similarity ◽

Spatial Model ◽

Target Language ◽

Source Language ◽

Semantic Maps ◽

French And English ◽

New Space ◽

Semantic Values

This article describes a spatial model for matching semantic values between two languages, French and English. Based on semantic similarity links, the model constructs a map that represents a word in the source language. Then the algorithm projects the map values onto a space in the target language. The new space abides by the semantic similarity links specific to the second language. Then the two maps are projected onto the same plane in order to detect overlapping values. For instructional purposes, the different steps are presented here using a few examples. The entire set of results is available at the following address: http://dico.isc.cnrs.fr.

Better Document-Level Machine Translation with Bayes’ Rule

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00319 ◽

2020 ◽

Vol 8 ◽

pp. 346-360

Author(s):

Lei Yu ◽

Laurent Sartran ◽

Wojciech Stokowiec ◽

Wang Ling ◽

Lingpeng Kong ◽

...

Keyword(s):

Search Algorithm ◽

Language Model ◽

Target Language ◽

Test Time ◽

Bayes Rule ◽

Independence Assumption ◽

Source Language ◽

Proposed Model ◽

Reverse Translation ◽

Document Level

We show that Bayes’ rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit because parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the posterior. The model’s independence assumption not only enables efficient use of available data, but it additionally admits a practical left-to-right beam-search algorithm for carrying out inference. Experiments show that our model benefits from using cross-sentence context in the language model, and it outperforms existing document translation approaches.
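The factorization described in this abstract can be written out as a short formula. The notation is assumed here for illustration and does not come from the abstract: $x = (x_1, \dots, x_n)$ is the source document split into sentences, $y = (y_1, \dots, y_n)$ is a candidate translation, and $p_{\mathrm{LM}}$ is the document-level language model used as the prior.

```latex
p(y \mid x) \;\propto\; \underbrace{p_{\mathrm{LM}}(y)}_{\text{prior on the target document}}
\;\times\;
\underbrace{\prod_{i=1}^{n} p(x_i \mid y_i)}_{\text{sentence-level reverse translation}}
```

The product over sentences reflects the stated independence assumption in the target-to-source direction, while the prior $p_{\mathrm{LM}}(y)$ scores the whole document and is what couples the sentence translations in the posterior.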

COMPARATIVE STUDY OF TITLES IN ENGLISH POP SONGS AND THOSE IN THEIR TRANSLATED VERSIONS BASED ON NEWMARK’S MODEL

VNU Journal of Foreign Studies ◽

10.25073/2525-2445/vnufs.4611 ◽

2020 ◽

Vol 36 (5) ◽

Author(s):

Doan Thuy Quynh

Keyword(s):

Comparative Study ◽

Target Language ◽

Literal Translation ◽

Analysis Methods ◽

Semantic Translation ◽

Free Translation ◽

Research Findings ◽

Translation Methods ◽

Compare And Contrast ◽

Communicative Translation

In this study, we compare and contrast the titles of English pop songs with those of their Vietnamese translated versions based on Newmark’s model. 65 titles in the original and 65 titles in the target language were chosen for comparison, with the purpose of seeing how many titles are retained and how many are novel in terms of content. To reach its aim, the study used two main methods, comparison and description, both based on Newmark’s model. In addition, statistical and analysis methods were also applied to examine the titles. The research findings showed that, in general, 51 titles are retained in the target language through 5 translation methods: literal translation, semantic translation, communicative translation, free translation and adaptation. 10 titles do not carry any characteristics in comparison with the original. The study also revealed that translators mainly used free translation to convert the titles from English into Vietnamese.

Semantic contrastive linguistics theory and dialectological studies

Cognitive Studies | Études cognitives ◽

10.11649/cs.2012.006 ◽

2015 ◽

pp. 79-92

Author(s):

Danuta Roszko

Keyword(s):

Target Language ◽

Natural Languages ◽

Specific Language ◽

Semantic Level ◽

Original Language ◽

Local Dialect ◽

Contrastive Linguistics ◽

Similarities And Differences ◽

Compare And Contrast ◽

Tertium Comparationis

Theoretical contrastive studies (hereinafter referred to as TCS) emerged with a view to comparing and contrasting natural languages on the basis of a logical interlanguage. The idea of making the TCS guidelines available to science resulted in discontinuing the division into the original language and the target language when comparing and contrasting two (or more) languages and, at the same time, ending the dependence of the resulting material (i.e. form indexes in the target language) on the formal structures of the original language. The essence of TCS lies in the interlanguage, which is used as the tertium comparationis in the studies. For more on this topic, see Koseska, Korytkowska, R. Roszko (2007). Until now, TCS have not been applied in dialectal studies. There are many reasons for this state of affairs. First of all, dialectal studies usually concentrate on one code (i.e. only a single local dialect is described), whilst TCS compare and contrast at least two languages. Moreover, research on the dialectal differentiation of a specific language (i.e. where at least two dialects or local dialects are described together) is based on demonstrating shared and differentiating features at the level of (a) lexis, (b) morphology (most often narrowed to differential morphological features) and (c) syntax (relatively rarely). Thus, dialectal studies are essentially a description of formal structure, whereas semantic aspects lie outside researchers' area of interest. With this article, I intend to break the current pattern and prove that dialectal studies can be conducted in accordance with the TCS guidelines. The advantage of such dialectal studies is not only a new look at a specific local dialect, but also the possibility of an immediate comparison and contrast between the local dialect and the standardized language or other local dialects (of the same or another language) at the semantic level, which provides the highest standard for the relevances demonstrated (i.e. similarities and differences).

إشکالیة ترجمة الأداة "ف " کحرف عطف إلى اللغة الألمانیة وأثر ذلک على فهم النص الهدف دراسة لغویة تحلیلیة فی ضوء بعض ترجمات القرآن الکریم باللغة الألمانیة The Problematic Issue of the Translation of the Particle Fā from Arabic into German and its Impact on Understanding the Target Language An analytical linguistically Study in the light of some translations of Qur’an vers

مجلة قطاع الدراسات الإنسانیة ◽

10.21608/jsh.2020.99806 ◽

2020 ◽

Vol 25 (2) ◽

pp. 909-956

Author(s):

محمد أحمد أحمد قاسم

Keyword(s):

Target Language ◽

Problematic Issue
