A Novel Deep Learning Method for Obtaining Bilingual Corpus from Multilingual Website
Machine translation needs a large number of parallel sentence pairs to achieve good translation performance. However, the lack of parallel corpora severely limits machine translation for low-resource language pairs. We propose a novel method that combines continuous word embeddings with deep learning to obtain parallel sentences. Since parallel sentences are invaluable for low-resource language pairs, we introduce a cross-lingual semantic representation to induce bilingual signals. Our experiments show that the method achieves promising results for low-resource languages even without external resources. Finally, we construct a state-of-the-art machine translation system for a low-resource language pair.
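The general idea of scoring candidate sentence pairs in a shared cross-lingual embedding space can be illustrated with a minimal sketch. This is not the model described above: the deep learning component is replaced here by plain averaged word vectors and cosine similarity, and the embedding table, the function names (`sentence_vector`, `cosine`, `mine_pairs`), and the `threshold` value are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy word vectors in a shared cross-lingual space.
# Real systems would load trained embeddings; these values are made up.
EMBEDDINGS = {
    "cat": [1.0, 0.0, 0.0],
    "gato": [0.9, 0.1, 0.0],
    "sleeps": [0.0, 1.0, 0.0],
    "duerme": [0.1, 0.9, 0.0],
    "car": [0.0, 0.0, 1.0],
    "coche": [0.0, 0.1, 0.9],
}


def sentence_vector(tokens):
    """Average the vectors of known tokens; return [] if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_pairs(source_sentences, target_sentences, threshold=0.8):
    """For each source sentence, keep its best-scoring target sentence
    if the similarity reaches the (assumed) threshold."""
    pairs = []
    for src in source_sentences:
        src_vec = sentence_vector(src.split())
        best, best_score = None, -1.0
        for tgt in target_sentences:
            score = cosine(src_vec, sentence_vector(tgt.split()))
            if score > best_score:
                best, best_score = tgt, score
        if best is not None and best_score >= threshold:
            pairs.append((src, best, best_score))
    return pairs


if __name__ == "__main__":
    for src, tgt, score in mine_pairs(["cat sleeps", "car"], ["coche", "gato duerme"]):
        print(f"{src!r} <-> {tgt!r} (score {score:.3f})")
```

In this sketch the threshold trades precision against recall: a higher value keeps fewer but cleaner candidate pairs, which tends to matter when the mined corpus is small.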
2019, Vol 8 (2), pp. 01-11