word alignments
Recently Published Documents


Total documents: 49 (five years: 6)
H-index: 7 (five years: 1)

2021
Author(s): Janaki Sheth, Young-Suk Lee, Ramón Fernandez Astudillo, Tahira Naseem, Radu Florian, ...

2020, Vol 27 (3), pp. 531-552
Author(s): Chunpeng Ma, Akihiro Tamura, Masao Utiyama, Tiejun Zhao, Eiichiro Sumita

2020, Vol 34 (05), pp. 8886-8893
Author(s): Kai Song, Kun Wang, Heng Yu, Yue Zhang, Zhongqiang Huang, ...

We investigate the task of constraining neural machine translation (NMT) with pre-specified translations, which has practical significance for a number of research and industrial applications. Existing work imposes pre-specified translations as lexical constraints during decoding, relying on word alignments derived from target-to-source attention weights. However, multiple recent studies have found that word alignments derived from generic attention heads in the Transformer are unreliable. We address this problem by introducing a dedicated head in the multi-head Transformer architecture to capture external supervision signals. Results on five language pairs show that our method is highly effective in constraining NMT with pre-specified translations, consistently outperforming previous methods in translation quality.
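As a rough illustration of what "word alignments derived from target-to-source attention weights" can mean, the sketch below links each target position to the source position that receives its largest attention weight. This is a generic argmax heuristic, not the dedicated-head method of the abstract; the function name `alignments_from_attention`, the `threshold` parameter, and the toy attention matrix are all assumptions made for illustration.

```python
from typing import List, Tuple


def alignments_from_attention(
    attention: List[List[float]],
    threshold: float = 0.0,
) -> List[Tuple[int, int]]:
    """Extract (source_index, target_index) links from an attention matrix.

    attention[t][s] is the weight that target position t places on source
    position s. Each target position is linked to its highest-weighted
    source position, provided that weight reaches `threshold`.
    """
    links: List[Tuple[int, int]] = []
    for t, row in enumerate(attention):
        if not row:
            continue  # no source positions to attend to
        s = max(range(len(row)), key=row.__getitem__)  # argmax over source
        if row[s] >= threshold:
            links.append((s, t))
    return links


# Toy 3x3 attention matrix (hypothetical values, rows sum to 1).
attention = [
    [0.7, 0.2, 0.1],  # target 0 attends mostly to source 0
    [0.1, 0.1, 0.8],  # target 1 attends mostly to source 2
    [0.2, 0.6, 0.2],  # target 2 attends mostly to source 1
]
links = alignments_from_attention(attention)
print(links)
```

Raising `threshold` drops low-confidence links, which is one simple way to trade recall for precision when the attention head is noisy.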


Author(s): Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze

Author(s): Laritza Coello-Guilarte, Rosa María Ortega-Mendoza, Luis Villaseñor-Pineda, Manuel Montes-y-Gómez

2017, Vol 5, pp. 501-514
Author(s): Anahita Mansouri Bigvand, Te Bu, Anoop Sarkar

Current word alignment models do not distinguish between different types of alignment links. In this paper, we provide a new probabilistic model for word alignment in which word alignments are associated with linguistically motivated alignment types. We propose a novel task, the joint prediction of word alignments and alignment types, together with novel semi-supervised learning algorithms for this task. We also solve a sub-task of predicting the alignment type given an aligned word pair. In our experimental results, the generative models we introduce to model alignment types significantly outperform the models without alignment types.
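The sub-task mentioned above, predicting the alignment type for an already aligned word pair, can be pictured with a simple count-based baseline. The sketch below is not the generative model from the paper: it estimates p(type | source word, target word) from labeled counts and backs off to the overall type distribution for unseen pairs. The class name `AlignmentTypeBaseline`, the type labels `"semantic"` and `"function"`, and the toy examples are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (source_word, target_word, alignment_type); all values here are hypothetical.
Example = Tuple[str, str, str]


class AlignmentTypeBaseline:
    """Count-based estimate of p(type | source word, target word)."""

    def __init__(self, smoothing: float = 1.0) -> None:
        self.smoothing = smoothing  # weight given to the type prior
        self.pair_counts: Dict[Tuple[str, str], Counter] = {}
        self.type_counts: Counter = Counter()

    def fit(self, examples: Iterable[Example]) -> "AlignmentTypeBaseline":
        for src, tgt, link_type in examples:
            self.pair_counts.setdefault((src, tgt), Counter())[link_type] += 1
            self.type_counts[link_type] += 1
        return self

    def predict_proba(self, src: str, tgt: str) -> Dict[str, float]:
        counts = self.pair_counts.get((src, tgt), Counter())
        n_total = sum(self.type_counts.values())
        denom = sum(counts.values()) + self.smoothing
        # Pair-specific counts plus `smoothing` pseudo-counts spread
        # according to the overall type distribution (the back-off prior).
        return {
            t: (counts[t] + self.smoothing * c / n_total) / denom
            for t, c in self.type_counts.items()
        }

    def predict(self, src: str, tgt: str) -> str:
        probs = self.predict_proba(src, tgt)
        return max(probs, key=probs.get)


train = [
    ("livre", "book", "semantic"),
    ("livre", "book", "semantic"),
    ("chat", "cat", "semantic"),
    ("de", "of", "function"),
]
model = AlignmentTypeBaseline().fit(train)
print(model.predict("de", "of"))
print(model.predict("maison", "house"))  # unseen pair: falls back to the prior
```

A baseline like this ignores sentence context entirely, which is one reason a joint model over alignments and types, as described in the abstract, may be preferable.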


2017, Vol 43 (3), pp. 521-565
Author(s): Dun Deng, Nianwen Xue

In this article, we conduct an empirical investigation of translation divergences between Chinese and English relying on a parallel treebank. To do this, we first devise a hierarchical alignment scheme where Chinese and English parse trees are aligned in a way that eliminates conflicts and redundancies between word alignments and syntactic parses to prevent the generation of spurious translation divergences. Using this Hierarchically Aligned Chinese–English Parallel Treebank (HACEPT), we are able to semi-automatically identify and categorize the translation divergences between the two languages and quantify each type of translation divergence. Our results show that the translation divergences are much broader than described in previous studies that are largely based on anecdotal evidence and linguistic knowledge. The distribution of the translation divergences also shows that some high-profile translation divergences that motivated previous research are actually very rare in our data, whereas other translation divergences that have previously received little attention occur in large quantities. We further show that HACEPT allows the extraction of syntax-based translation rules, most of which are expressive enough to capture the translation divergences, and point out that the syntactic annotation in existing treebanks is not optimal for extracting such translation rules. Finally, we discuss the implications of our study for attempts to bridge translation divergences by devising shared semantic representations across languages. Our quantitative results lend further support to the observation that, although it is possible to bridge some translation divergences with semantic representations, other translation divergences are open-ended, so building a semantic representation that captures all possible translation divergences may be impractical.

