Instant Translation Model Adaptation by Translating Unseen Words in Continuous Vector Space

Author(s): Shonosuke Ishiwatari, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

2015, Vol 23 (3), pp. 461-471
Author(s): Ruiji Fu, Jiang Guo, Bing Qin, Wanxiang Che, Haifeng Wang, ...

Author(s): Matthew Snover, Bonnie Dorr, Richard Schwartz

2015
Author(s): Marlies van der Wees, Arianna Bisazza, Christof Monz

2020, Vol 17 (4), pp. 529-538
Author(s): Mohammad Khademi, Mohammad Fakhredanesh, Seyed Hoseini

Traditional methods of summarization are no longer cost-effective or practical today. Extractive summarization automatically extracts the most important sentences from a text and generates a short, informative summary. In this work, we propose a novel unsupervised method to summarize Persian texts. The proposed method adopts a hybrid approach that clusters the concepts of the text using deep learning and traditional statistical methods. First, we produce a word embedding based on the Hamshahri2 corpus and a dictionary of word frequencies. Then the proposed algorithm extracts the keywords of the document, clusters its concepts, and finally ranks the sentences to produce the summary. We evaluated the proposed method on the Pasokh single-document corpus using the ROUGE evaluation measure. Without using any hand-crafted features, our method achieves better results than the state-of-the-art related work. Compared with the best supervised Persian methods, our unsupervised method achieves an overall improvement of 7.5% in ROUGE-2 recall.
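The extract-and-rank pipeline described above can be illustrated with a deliberately simplified sketch: it scores sentences by the frequency of their content words and keeps the top-ranked ones in document order. This is only a frequency-based stand-in for the paper's embedding-based clustering and ranking; the `summarize` function and the sample text are illustrative, not the authors' implementation.

```python
import math
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Crude extractive summarizer: rank sentences by the corpus
    frequency of their words (length-normalized), then return the
    top-ranked sentences in their original order."""
    # Split on sentence-final punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    freq = Counter(re.findall(r'\w+', text.lower()))

    def score(sentence):
        tokens = re.findall(r'\w+', sentence.lower())
        # Normalize by sqrt of length so long sentences don't dominate.
        return sum(freq[t] for t in tokens) / math.sqrt(len(tokens) or 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return [sentences[i] for i in keep]
```

In the paper's method, the frequency-based `score` would be replaced by ranking against embedding-derived concept clusters, but the overall extract-score-select shape is the same.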


2016 ◽  
Vol 22 (4) ◽  
pp. 501-516 ◽  
Author(s):  
REINHARD RAPP ◽  
SERGE SHAROFF ◽  
PIERRE ZWEIGENBAUM

Abstract: This paper highlights some of the recent developments in the field of machine translation using comparable corpora. We start by updating previous definitions of comparable corpora and then look at bilingual versions of continuous vector space models. Recently, neural networks have been used to obtain latent context representations with only a few dimensions, often called word embeddings. These promising new techniques can be applied not only to parallel but also to comparable corpora. Subsequent sections of the paper discuss work specifically targeting machine translation using comparable corpora, as well as work dealing with the extraction of parallel segments from comparable corpora. Finally, we give an overview of the design and the results of a recent shared task on measuring document comparability across languages.
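One common way to make monolingual word embeddings bilingual, in the spirit of the linear-mapping approaches the abstract alludes to, is to learn a matrix that maps source-language vectors onto their target-language counterparts from a small seed dictionary, then apply it to unseen words. The sketch below uses synthetic vectors (the embeddings and the "gold" mapping are made-up data, not from either paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: a hidden "gold" mapping and 20 seed-dictionary pairs
# of 4-dimensional source/target embeddings.
W_true = rng.normal(size=(4, 4))
X = rng.normal(size=(20, 4))   # source-language vectors (seed dictionary)
Y = X @ W_true.T               # their aligned target-language vectors

# Learn the mapping W minimizing ||X W - Y||_F via least squares.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Translate an unseen source-language vector into the target space;
# its nearest neighbors there would serve as translation candidates.
x_new = rng.normal(size=(4,))
y_pred = x_new @ W
```

Because the synthetic system is exactly linear, the learned `W` recovers the gold mapping; with real comparable-corpus embeddings the map is only approximate, which is why nearest-neighbor retrieval in the target space is used to propose translations.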

