word alignment
Recently Published Documents


TOTAL DOCUMENTS

300
(FIVE YEARS 54)

H-INDEX

16
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Le Hoai Bao ◽  
Trinh Vu Minh Hung ◽  
Hoang Khuê ◽  
Le Thanh Tung

Author(s):  
Lieve Macken ◽  
Els Lefever

In this paper, we describe the current state of the art in Statistical Machine Translation (SMT) and reflect on how SMT handles meaning. Statistical Machine Translation is a corpus-based approach to MT: it derives the knowledge required to generate new translations from corpora. General-purpose SMT systems do not use any formal semantic representation. Instead, they directly extract translationally equivalent words or word sequences – expressions with the same meaning – from bilingual parallel corpora. All statistical translation models are based on the idea of word alignment, i.e., the automatic linking of corresponding words in parallel texts. The first-generation SMT systems were word-based. From a linguistic point of view, the major problem with word-based systems is that the meaning of a word is often ambiguous and is determined by its context. Current state-of-the-art SMT systems try to capture local contextual dependencies by using phrases instead of words as units of translation. To solve more complex ambiguity problems (where a broader text scope or even domain information is needed), a Word Sense Disambiguation (WSD) module is integrated into the Machine Translation environment.
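The word alignment idea underlying all statistical translation models can be illustrated with IBM Model 1, the simplest word-based model. The following sketch is not from the paper; the toy corpus and iteration count are illustrative. It estimates word translation probabilities t(f|e) with expectation-maximization:

```python
from collections import defaultdict

def ibm_model1(parallel_corpus, iterations=10):
    """Estimate word translation probabilities t(f|e) with EM (IBM Model 1)."""
    # Uniform initialization over the target vocabulary.
    f_vocab = {f for _, fs in parallel_corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # t[(f, e)] = P(f | e)
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        for es, fs in parallel_corpus:
            for f in fs:
                # Normalize over all source words that could have produced f.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / z
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: re-estimate translation probabilities from expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

# Toy parallel corpus: "the house" / "das haus", "the book" / "das buch"
corpus = [(["the", "house"], ["das", "haus"]),
          (["the", "book"], ["das", "buch"])]
t = ibm_model1(corpus)
# EM exploits co-occurrence: "das" appears with "the" in both pairs,
# so t[("das", "the")] grows while t[("haus", "the")] shrinks.
```

On this toy corpus the model learns that "das" aligns to "the" and "haus" to "house", which is exactly the automatic linking of corresponding words that the abstract describes.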


2021 ◽  
Vol 7 (5) ◽  
pp. 2000-2011
Author(s):  
Weijie Liu

Objectives: With the rapid development and application of the Internet, cross-border e-commerce transactions are growing explosively, and the demand for translation between languages is increasing accordingly. Methods: Starting from the perspective of intelligent machine translation between English and Chinese, and in view of the limitations of traditional machine translation, this paper uses an enhanced neural-network algorithm to solve the translation problem. The intelligent translation process is divided into two stages: encoding and decoding. Taking language type and word alignment into account, the input and output modules were constructed, the algorithm was optimized, and a recurrent neural network was used to build an RNN-embed intelligent English–Chinese translation model. Results: The model takes character-level English and Chinese input, and the network is then trained, addressing the difficulty of handling higher-level semantics when an enhanced neural network processes textual information in cross-border e-commerce transactions. Conclusion: Experiments show that the RNN-embed translation model based on the enhanced neural-network algorithm improves the quality of long-sentence translation compared with conventional machine translation.
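The two-stage encode/decode process the abstract describes can be sketched as a minimal character-level recurrent network. This is a generic illustration, not the paper's RNN-embed model; the parameters are randomly initialized rather than trained, and the sizes are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
H, V = 16, 64  # hidden size and character-vocabulary size (toy values)

# Randomly initialized parameters; a real model would learn these.
Wxh = rng.normal(0, 0.1, (H, V))   # input-to-hidden
Whh = rng.normal(0, 0.1, (H, H))   # hidden-to-hidden
Why = rng.normal(0, 0.1, (V, H))   # hidden-to-output

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

def encode(char_ids):
    """Stage 1: fold the source characters into one context vector."""
    h = np.zeros(H)
    for i in char_ids:
        h = np.tanh(Wxh @ one_hot(i) + Whh @ h)
    return h

def decode(h, steps):
    """Stage 2: greedily emit target character ids from the context vector."""
    out, i = [], 0  # 0 plays the role of a start-of-sequence id here
    for _ in range(steps):
        h = np.tanh(Wxh @ one_hot(i) + Whh @ h)
        i = int(np.argmax(Why @ h))  # pick the most probable next character
        out.append(i)
    return out

context = encode([3, 7, 1])        # encode a toy source string
target = decode(context, steps=4)  # decode four target character ids
```

The decoder conditions each output character on the encoder's final hidden state, which is the basic mechanism that lets such a model carry source-side information into the target sentence.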


Author(s):  
Luigi Procopio ◽  
Edoardo Barba ◽  
Federico Martelli ◽  
Roberto Navigli

Word Sense Disambiguation (WSD), i.e., the task of assigning senses to words in context, has seen a surge of interest with the advent of neural models and a considerable increase in performance, up to 80% F1 in English. However, when considering other languages, the availability of training data is limited, which hampers scaling WSD to many languages. To address this issue, we put forward MultiMirror, a sense projection approach for multilingual WSD based on a novel neural discriminative model for word alignment: given a pair of parallel sentences as input, our model -- trained with a small number of instances -- jointly aligns all source and target tokens with each other, surpassing its competitors across several language combinations. We demonstrate that projecting senses from English by leveraging the alignments produced by our model leads a simple mBERT-powered classifier to achieve a new state of the art on established WSD datasets in French, German, Italian, Spanish and Japanese. We release our software and all our datasets at https://github.com/SapienzaNLP/multimirror.
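Once alignments are available, the sense-projection step itself is simple: copy each source token's sense label to the target tokens it is aligned to. The sketch below is a generic illustration of that step, not MultiMirror's implementation; the sense-key format is made up for the example:

```python
def project_senses(src_senses, alignment):
    """Project sense labels from source tokens to aligned target tokens.

    src_senses: {src_index: sense_label} for sense-annotated source tokens
    alignment:  list of (src_index, tgt_index) word-alignment links
    """
    tgt_senses = {}
    for s, t in alignment:
        if s in src_senses:
            tgt_senses[t] = src_senses[s]
    return tgt_senses

# English "the bank" with "bank" annotated (hypothetical sense key),
# aligned to Italian "la banca".
src = {1: "bank%financial"}
links = [(0, 0), (1, 1)]
project_senses(src, links)  # → {1: "bank%financial"}
```

Because only annotated source tokens transfer a label, the quality of the projected training data depends directly on the quality of the alignments, which is why the paper focuses on the alignment model.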


2021 ◽  
Vol 11 (4) ◽  
pp. 1868
Author(s):  
Sari Dewi Budiwati ◽  
Al Hafiz Akbar Maulana Siagian ◽  
Tirana Noor Fatyanosa ◽  
Masayoshi Aritsugi

Phrase table combination in pivot approaches can be an effective method for dealing with low-resource language pairs. The common practice for generating phrase tables in pivot approaches is to use standard symmetrization, i.e., grow-diag-final-and. Although some researchers have found that non-standard symmetrization can improve bilingual evaluation understudy (BLEU) scores, it has not been commonly employed in pivot approaches. In this study, we propose a strategy that uses the non-standard symmetrization of word alignment in phrase table combination. The appropriate symmetrization is selected based on the highest BLEU scores in each direct translation of source–target, source–pivot, and pivot–target for Kazakh–English (Kk–En) and Japanese–Indonesian (Ja–Id). Our experiments show that our proposed strategy outperforms direct translation in Kk–En with absolute improvements of 0.35 (an 11.3% relative improvement) and 0.22 (a 6.4% relative improvement) BLEU points for 3-gram and 5-gram models, respectively. For 3-gram models in Ja–Id, the proposed strategy shows an absolute gain of up to 0.11 BLEU points (a 0.9% relative improvement) over direct translation. Our proposed strategy using a small phrase table obtains better BLEU scores than a strategy using a large phrase table. The size of the target monolingual corpus and the feature-function weight of the language model (LM) could also reduce perplexity scores.
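Symmetrization combines the two directional alignments (source-to-target and target-to-source) into one. The sketch below shows only the two simplest heuristics, intersection and union; grow-diag-final-and, the standard method the abstract refers to, additionally grows the intersection toward the union along diagonal neighbors and is not reproduced here:

```python
def symmetrize(src2tgt, tgt2src, method="intersection"):
    """Combine two directional word alignments into one symmetric alignment.

    src2tgt: set of (src_index, tgt_index) links from the forward aligner
    tgt2src: set of (tgt_index, src_index) links from the backward aligner
    """
    forward = set(src2tgt)
    backward = {(s, t) for (t, s) in tgt2src}  # flip to (src, tgt) order
    if method == "intersection":
        return forward & backward   # only links both directions agree on: high precision
    if method == "union":
        return forward | backward   # every link either direction proposes: high recall
    raise ValueError(f"unknown method: {method}")

# Toy directional alignments for a 3-token sentence pair.
f = {(0, 0), (1, 2), (2, 1)}
b = {(0, 0), (2, 1), (1, 1)}  # backward links are in (tgt, src) order
symmetrize(f, b, "intersection")  # → {(0, 0), (1, 2)}
```

The choice of heuristic changes which phrase pairs are extractable from the alignment, which is why selecting the symmetrization per language pair, as the paper proposes, can move BLEU scores.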


Author(s):  
Peng Yin ◽  
Zhou Shu ◽  
Yingjun Xia ◽  
Tianmei Shen ◽  
Xiao Guan ◽  
...  
Keyword(s):  
