Translation Divergences in Chinese–English Machine Translation: An Empirical Investigation

2017 ◽  
Vol 43 (3) ◽  
pp. 521-565 ◽  
Author(s):  
Dun Deng ◽  
Nianwen Xue

In this article, we conduct an empirical investigation of translation divergences between Chinese and English relying on a parallel treebank. To do this, we first devise a hierarchical alignment scheme where Chinese and English parse trees are aligned in a way that eliminates conflicts and redundancies between word alignments and syntactic parses to prevent the generation of spurious translation divergences. Using this Hierarchically Aligned Chinese–English Parallel Treebank (HACEPT), we are able to semi-automatically identify and categorize the translation divergences between the two languages and quantify each type of translation divergence. Our results show that the translation divergences are much broader than described in previous studies that are largely based on anecdotal evidence and linguistic knowledge. The distribution of the translation divergences also shows that some high-profile translation divergences that motivate previous research are actually very rare in our data, whereas other translation divergences that have previously received little attention actually exist in large quantities. We also show that HACEPT allows the extraction of syntax-based translation rules, most of which are expressive enough to capture the translation divergences, and point out that the syntactic annotation in existing treebanks is not optimal for extracting such translation rules. We also discuss the implications of our study for attempts to bridge translation divergences by devising shared semantic representations across languages. Our quantitative results lend further support to the observation that although it is possible to bridge some translation divergences with semantic representations, other translation divergences are open-ended, so building a semantic representation that captures all possible translation divergences may be impractical.

2010 ◽  
Vol 36 (3) ◽  
pp. 535-568 ◽  
Author(s):  
Deyi Xiong ◽  
Min Zhang ◽  
Aiti Aw ◽  
Haizhou Li

Linguistic knowledge plays an important role in phrase movement in statistical machine translation. To efficiently incorporate linguistic knowledge into phrase reordering, we propose a new approach: Linguistically Annotated Reordering (LAR). In LAR, we build hard hierarchical skeletons and inject soft linguistic knowledge from source parse trees to nodes of hard skeletons during translation. The experimental results on large-scale training data show that LAR is comparable to boundary word-based reordering (BWR) (Xiong, Liu, and Lin 2006), which is a very competitive lexicalized reordering approach. When combined with BWR, LAR provides complementary information for phrase reordering, which collectively improves the BLEU score significantly. To further understand the contribution of linguistic knowledge in LAR to phrase reordering, we introduce a syntax-based analysis method to automatically detect constituent movement in both reference and system translations, and summarize syntactic reordering patterns that are captured by reordering models. With the proposed analysis method, we conduct a comparative analysis that not only provides insight into how linguistic knowledge affects phrase movement but also reveals new challenges in phrase reordering.


2021 ◽  
Author(s):  
Mo Shahdloo ◽  
Emin Çelik ◽  
Burcu A Urgen ◽  
Jack L. Gallant ◽  
Tolga Çukur

Object and action perception in cluttered dynamic natural scenes relies on efficient allocation of limited brain resources to prioritize the attended targets over distractors. It has been suggested that during visual search for objects, distributed semantic representation of hundreds of object categories is warped to expand the representation of targets. Yet, little is known about whether and where in the brain visual search for action categories modulates semantic representations. To address this fundamental question, we studied human brain activity recorded via functional magnetic resonance imaging while subjects viewed natural movies and searched for either communication or locomotion actions. We find that attention directed to action categories elicits tuning shifts that warp semantic representations broadly across neocortex, and that these shifts interact with intrinsic selectivity of cortical voxels for target actions. These results suggest that attention serves to facilitate task performance during social interactions by dynamically shifting semantic selectivity towards target actions, and that tuning shifts are a general feature of conceptual representations in the brain.


2002 ◽  
Vol 45 (2) ◽  
pp. 332-346 ◽  
Author(s):  
Karla K. McGregor ◽  
Rena M. Friedman ◽  
Renée M. Reilly ◽  
Robyn M. Newman

Children's semantic representations and semantic naming errors were the focus of this study. In Experiment 1, 25 normally developing children (mean age=5 years 4 months) named, drew, and defined 20 age-appropriate objects. The results suggested that functional and physical properties are core aspects of object representations in the semantic lexicon and that these representations are often organized and accessed according to a taxonomic hierarchy. Results of a new procedure, comparative picture naming/picture drawing, suggested that the degree of knowledge in the semantic lexicon makes words more or less vulnerable to retrieval failure. Most semantic naming errors were associated with limited semantic knowledge, manifested as either lexical gaps or fragile representations. Comparison of definitions for correctly named and semantically misnamed objects provided converging evidence for this conclusion. In Experiment 2, involving 16 normally developing children (mean age=5 years 5 months), the comparative picture naming/picture drawing results were replicated with a stimulus set that allowed a priori matching of the visual complexity of items drawn from correct and semantic error pools. Discussion focuses on the dynamic nature of semantic representations and the relation between semantic representation and naming during a period of slow mapping. The value of comparative picture naming/picture drawing as a new method for exploring children's semantic representations is emphasized.


Author(s):  
Olivier Bonami

This paper proposes an HPSG account of the French tense and aspect system, focussing on the analysis of the passé simple (simple past) and imparfait (imperfective) tenses and their interaction with aspectually sensitive adjuncts. Starting from de Swart's (1998) analysis of the semantics of tense and aspect, I show that while the proposed semantic representations are appropriate, the analysis of implicit aspectual operators as coercion operators is inadequate. The proposed HPSG analysis relies on Minimal Recursion Semantics to relate standard syntactic structures with de Swart-style semantic representations. The analysis has two crucial features: first, it assumes that the semantic contribution of tense originates in the verb's semantic representation, despite the fact that tense can get wide scope over other semantic elements. Second, it allows the occurrence of implicit aspectual operators to be controlled by the verb's inflectional class, which accounts for their peculiar distribution.


1991 ◽  
Vol 39 ◽  
pp. 83-92
Author(s):  
Wander Lowie

This article deals with the influence of a learner's native language in the acquisition of L2 derivational morphology. Do learners acquire morphologically complex words like 'learnable' and 'explanation' as unanalysed units or do they acquire and store the stems ('learn'; 'explain') and generate/analyse these forms by using word formation rules? A linguistic model of the lexicon which enables both direct retrieval and generation/analysis on the basis of word formation rules is applied to second language acquisition by distinguishing formal and semantic/syntactic rules. Based on a comparative description of English and Dutch in these terms, a small-scale empirical investigation was carried out. The results of this research indicate that the primary determinant of the analysability of morphologically complex words is transparency, which is defined from the learner's point of view and is dependent on all prior linguistic knowledge of the learner, including L1. Furthermore, the research indicates that semantic/syntactic similarities between the languages have a greater facilitating effect than formal similarities.


2014 ◽  
Vol 7 (2) ◽  
pp. 239-264 ◽  
Author(s):  
JOSHUA C. FEDDER ◽  
LAURA WAGNER

Readers actively construct representational models of meaning when reading text, and they do so by drawing on a range of kinds of information, from the specific linguistic forms of the sentences to knowledge about how the world works (Ferretti, Kutas, & McRae, 2007; Madden & Zwaan, 2003). The present set of studies focused on how grammatical aspect is integrated into a situation model and how it is connected to other dimensions of model construction. In three experiments, participants were asked to complete sentences with a choice of grammatical aspect form (perfective or imperfective). The test sentences systematically varied four dimensions of the sentence that were linked to grammatical aspect in different ways: telicity and transitivity (both linked through their semantic representations), subject animacy (linked through an inference over semantic representations), and related location information (linked through an inference grounded in world knowledge). In addition, to examine the influence of discourse function (backgrounding vs. foregrounding) on aspectual choice, different construction types were varied across experiments – specifically a fronted locative construction and the presence of a generic narrative opener (Once upon a time). The results showed that aspectual choice depends on information linked to the semantic representation of grammatical aspect; however, in contrast to previous work (e.g., Ferretti et al., 2007), information grounded in world knowledge (location information) did not influence aspectual choice except when it was integrated in a specialized discourse construction.


2012 ◽  
Vol 4 (3) ◽  
pp. 223-242 ◽  
Author(s):  
Pamela Perniss ◽  
David Vinson ◽  
Frank Seifart ◽  
Gabriella Vigliocco

The question of whether different linguistic patterns differentially influence semantic and conceptual representations is of central interest in cognitive science. In this paper, we investigate whether the regular encoding of shape within a nominal classification system leads to an increased salience of shape in speakers' semantic representations by comparing English, (Amazonian) Spanish, and Bora, a shape-based classifier language spoken in the Amazonian regions of Colombia and Peru. Crucially, in displaying obligatory use, pervasiveness in grammar, high discourse frequency, and phonological variability of forms corresponding to particular shape features, the Bora classifier system differs in important ways from those in previous studies investigating effects of nominal classification, thereby allowing better control of factors that may have influenced previous findings. In addition, the inclusion of Spanish monolinguals living in the Bora village allowed control for the possibility that differences found between English and Bora speakers may be attributed to their very different living environments. We found that shape is more salient in the semantic representation of objects for speakers of Bora, which systematically encodes shape, than for speakers of English and Spanish, which do not. Our results are consistent with assumptions that semantic representations are shaped and modulated by our specific linguistic experiences.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Long H. B. Nguyen ◽  
Viet H. Pham ◽  
Dien Dinh

The Seq2Seq model and its variants (ConvSeq2Seq and Transformer) have emerged as a promising solution to the machine translation problem. However, these models focus only on exploiting knowledge from bilingual sentences, paying little attention to external linguistic knowledge sources such as semantic representations. Not only can semantic representations help preserve meaning, but they also mitigate the data sparsity problem. However, to date, semantic information remains rarely integrated into machine translation models. In this study, we examine the effect of abstract meaning representation (AMR) semantic graphs in different machine translation models. Experimental results on the IWSLT15 English-Vietnamese dataset have proven the efficiency of the proposed model, expanding the use of external language knowledge sources to significantly improve the performance of machine translation models, especially for low-resource language pairs.


Author(s):  
Jong-Bok Kim ◽  
Jaehyung Yang

The so-called floating quantifier constructions in languages like Korean display intriguing properties whose successful processing can prove the robustness of a parsing system. This paper shows that a constraint-based analysis, in particular one couched in the framework of HPSG, can offer an efficient way of analyzing these constructions together with proper semantic representations. It also shows how the analysis has been successfully implemented in the LKB (Linguistic Knowledge Building) system.

