English Machine Translation Model Based on Modern Intelligent Recognition Technology

English machine translation is a natural language processing research direction with significant scientific and practical value in the current artificial intelligence boom. The variability of language, the limited ability to express semantic information, and the scarcity of parallel corpus resources all limit the usefulness and adoption of English machine translation in practical applications. The self-attention mechanism has attracted considerable attention in English machine translation tasks because it is highly parallelizable, which reduces the model's training time, and because it can capture the semantic relevance of all words in the context. Unlike recurrent neural networks, however, the self-attention mechanism ignores the position and structure information between context words. To let the model use position information between words, English machine translation models based on the self-attention mechanism use sine and cosine position coding to represent the absolute position of each word. This method can reflect relative distance but does not provide directionality. A new English machine translation model is therefore proposed, based on a logarithmic position representation method and the self-attention mechanism. The model retains the distance and directional information between words as well as the efficiency of the self-attention mechanism. Experiments show that the nonstrict phrase extraction method can effectively extract phrase translation pairs from the n-best word alignment results and that the extraction constraint strategy can improve translation quality further. Compared with traditional phrase extraction based on a single alignment, nonstrict phrase extraction combined with n-best alignment results significantly improves translation quality.
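The point about sine and cosine position coding lacking directionality can be illustrated with a small sketch. It assumes the standard sinusoidal formulation used with self-attention models; the function names, the dimension of 16, and the base of 10000 are illustrative choices, not details taken from the abstract. The dot product between two position vectors depends only on the absolute offset, so a word three positions to the left looks the same as one three positions to the right.

```python
import math


def sinusoidal_encoding(position, d_model=16, base=10000.0):
    """Return the sine/cosine encoding vector for one absolute position.

    d_model and base are assumed values for illustration only.
    """
    vec = []
    for i in range(0, d_model, 2):
        freq = 1.0 / (base ** (i / d_model))
        vec.append(math.sin(position * freq))
        vec.append(math.cos(position * freq))
    return vec


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


center = sinusoidal_encoding(10)
left = sinusoidal_encoding(7)    # three positions before the center word
right = sinusoidal_encoding(13)  # three positions after the center word

# Each sin/cos pair contributes cos(freq * offset), which is even in the
# offset, so the two similarities agree up to floating-point error.
sim_left = dot(center, left)
sim_right = dot(center, right)
print(abs(sim_left - sim_right) < 1e-9)
```

Because the similarity is symmetric in the offset, the encoding alone cannot tell the model whether a neighbor precedes or follows the current word, which is the gap the abstract says the logarithmic position representation is meant to close.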


Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity

Mobile Information Systems, Vol 2021 (2021), pp. 1-7. DOI: 10.1155/2021/5744248

Author(s): Zhiwang Xu, Huibin Qin, Yongzhu Hua

Keyword(s): Neural Networks, Machine Translation, Semantic Features, Training Method, Chinese Translation, Neural Machine Translation, Translation Model, Parallel Corpus, Model Based, Translation Systems

In recent years, neural machine translation has become the mainstream approach in machine translation, but low-resource translation still faces the challenges of insufficient parallel corpora and sparse data. Existing machine translation models are usually trained on datasets segmented at word granularity. However, different segmentation granularities carry different grammatical and semantic features, so considering word granularity alone restricts the efficient training of neural machine translation systems. To address the data sparseness caused by the lack of Uyghur-Chinese parallel corpora and by complex Uyghur morphology, this paper proposes a multistrategy segmentation-granularity training method covering syllables, marked syllables, words, and syllable-word fusion. To address the shortcomings of traditional recurrent and convolutional neural networks, it builds a Transformer Uyghur-Chinese neural machine translation model based entirely on the multihead self-attention mechanism. Results on the CCMT2019 Uyghur-Chinese bilingual datasets show that the multiple-granularity training method performs significantly better than the translation systems trained on the other segmentation granularities, and that the Transformer model obtains a higher BLEU score than the Uyghur-Chinese translation model based on Self-Attention-RNN.
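To make the idea of segmentation granularity concrete, the following is a minimal sketch of how one sentence could be rendered at word, syllable, and marked-syllable granularity. Everything here is an assumption for illustration: the abstract does not define the marking scheme, so the `@@` continuation marker (borrowed from common subword tooling) is hypothetical, the input words are romanized placeholders that have been split into syllables by hand, and syllable-word fusion is left out because the abstract does not say how it is constructed.

```python
def segment(words, granularity):
    """Render pre-syllabified words at a chosen granularity.

    words: list of words, each given as a list of syllable strings.
    granularity: "word", "syllable", or "marked_syllable" (hypothetical names).
    """
    if granularity == "word":
        return ["".join(sylls) for sylls in words]
    if granularity == "syllable":
        return [s for sylls in words for s in sylls]
    if granularity == "marked_syllable":
        tokens = []
        for sylls in words:
            if not sylls:
                continue
            # Assumed scheme: "@@" marks a syllable that continues the word,
            # so word boundaries can be recovered after translation.
            tokens.extend(s + "@@" for s in sylls[:-1])
            tokens.append(sylls[-1])
        return tokens
    raise ValueError(f"unknown granularity: {granularity}")


# Hand-split placeholder input, not taken from the paper's corpus.
sentence = [["ki", "tab"], ["ba", "la", "lar"]]

word_tokens = segment(sentence, "word")
syllable_tokens = segment(sentence, "syllable")
marked_tokens = segment(sentence, "marked_syllable")
print(word_tokens, syllable_tokens, marked_tokens)
```

Smaller units shrink the vocabulary and reduce sparsity at the cost of longer sequences, which is one plausible reason for training on several granularities rather than on words alone.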


N-gram-based Machine Translation

Computational Linguistics, Vol 32 (4) (2006), pp. 527-549. DOI: 10.1162/coli.2006.32.4.527. Cited By ~ 71

Author(s): José B. Mariño, Rafael E. Banchs, Josep M. Crego, Adrià de Gispert, Patrik Lambert, ...

Keyword(s): Linear Combination, Machine Translation, State Of The Art, Statistical Machine Translation, European Parliament, The State, Translation Model, Model Based, N Gram, Log Linear

This article describes in detail an n-gram approach to statistical machine translation. This approach consists of a log-linear combination of a translation model based on n-grams of bilingual units, which are referred to as tuples, along with four specific feature functions. Translation performance, which happens to be in the state of the art, is demonstrated with Spanish-to-English and English-to-Spanish translations of the European Parliament Plenary Sessions (EPPS).
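A log-linear combination of this kind can be sketched in a few lines. This is a toy illustration under stated assumptions, not the article's system: the bigram order, the probability floor, the toy probabilities, the weights, and the single extra feature named `word_bonus` are all placeholders, and the abstract does not list the four feature functions it mentions. Each tuple pairs a source segment with its target segment, and the hypothesis score is a weighted sum of feature values.

```python
import math


def tuple_bigram_logprob(tuples, bigram_probs, floor=1e-6):
    """Log-probability of a tuple sequence under a bigram model of bilingual units.

    Unseen bigrams fall back to an assumed floor probability.
    """
    logp = 0.0
    prev = "<s>"  # sentence-start symbol
    for unit in tuples:
        logp += math.log(bigram_probs.get((prev, unit), floor))
        prev = unit
    return logp


def log_linear_score(features, weights):
    """Weighted sum of feature values: sum_i lambda_i * h_i."""
    return sum(weights[name] * value for name, value in features.items())


# Toy model: each tuple is (source segment, target segment).
bigram_probs = {
    ("<s>", ("la", "the")): 0.5,
    (("la", "the"), ("casa", "house")): 0.4,
}
hypothesis = [("la", "the"), ("casa", "house")]

features = {
    "tuple_ngram": tuple_bigram_logprob(hypothesis, bigram_probs),
    "word_bonus": float(len(hypothesis)),  # placeholder extra feature
}
weights = {"tuple_ngram": 1.0, "word_bonus": 0.1}  # assumed weights

score = log_linear_score(features, weights)
print(score)
```

In a real decoder the weights would be tuned on held-out data and the score would be used to rank competing hypotheses, but those steps are outside what the abstract describes.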
