Mobile Interface for Domain Specific Machine Translation Using Short Messaging Service

Author(s):  
Avinash J. Agrawal
Manoj B. Chandak

2016
Author(s):
Lilin Zhang
Zhen Weng
Wenyan Xiao
Jianyi Wan
Zhiming Chen
...  

Author(s):  
Josef Steinberger
Ralf Steinberger
Hristo Tanev
Vanni Zavarella
Marco Turchi

In this chapter, the authors discuss several pertinent aspects of an automatic system that generates summaries in multiple languages for sets of topic-related news articles (multilingual multi-document summarisation) gathered by news aggregation systems. The discussion follows a framework based on Latent Semantic Analysis (LSA), because LSA has been shown to be a high-performing method across many different languages. Starting from a sentence-extractive approach, the authors show how domain-specific aspects can be exploited and how a compression and paraphrasing method can be plugged in. They also discuss the challenging problem of evaluating summarisation in different languages; in particular, they describe two approaches: the first uses a parallel corpus and the second uses statistical machine translation.
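
The sentence-extractive core of such an LSA framework can be illustrated with a minimal sketch (not the authors' system; it assumes TF-IDF weighting and NumPy/scikit-learn as tooling): build a term-by-sentence matrix, decompose it with SVD, and take the highest-weighted, not-yet-selected sentence from each of the strongest latent topics.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def lsa_extractive_summary(sentences, num_sentences=2):
    # Term-by-sentence matrix: TF-IDF yields (n_sentences, n_terms), so transpose it
    tfidf = TfidfVectorizer().fit_transform(sentences)
    term_sentence = tfidf.T.toarray()

    # SVD: each row of vt is a latent topic, each column a sentence
    _, _, vt = np.linalg.svd(term_sentence, full_matrices=False)

    # For each of the strongest topics, keep its highest-weighted unseen sentence
    chosen = []
    for topic_row in vt:
        if len(chosen) >= num_sentences:
            break
        for idx in np.argsort(-np.abs(topic_row)):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return [sentences[i] for i in sorted(chosen)]

print(lsa_extractive_summary([
    "Heavy rain caused severe flooding along the river overnight.",
    "Emergency services evacuated residents from low-lying districts.",
    "Officials estimate the damage at several million euros.",
    "A local festival scheduled for the weekend was postponed.",
], num_sentences=2))
```

Domain-specific aspects and compression/paraphrasing would be layered on top of this selection step; the sketch only shows the language-independent extraction stage.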


2017
Vol 56 (05)
pp. 370-376
Author(s):
Roberto Pérez-Rodríguez
Luis E. Anido-Rifón
Marcos A. Mouriño-García

Summary
Objectives: The ability to efficiently review the existing literature is essential for the rapid progress of research. This paper describes a classifier of text documents represented as vectors in spaces of Wikipedia concepts, and analyses its suitability for classifying Spanish biomedical documents when only English documents are available for training. We propose the cross-language concept matching (CLCM) technique, which relies on Wikipedia interlanguage links to convert concept vectors from the Spanish to the English space.
Methods: The performance of the classifier is compared to several baselines: a classifier based on machine translation, a classifier that represents documents after performing Explicit Semantic Analysis (ESA), and a classifier that uses a domain-specific semantic annotator (MetaMap). The corpus used for the experiments (Cross-Language UVigoMED) was purpose-built for this study and is composed of 12,832 English and 2,184 Spanish MEDLINE abstracts.
Results: The performance of our approach is superior to every other state-of-the-art classifier in the benchmark, with performance increases of up to 124% over classical machine translation, 332% over MetaMap, and 60 times over the classifier based on ESA. The results are statistically significant, with p-values < 0.0001.
Conclusion: By using knowledge mined from Wikipedia to represent documents as vectors in a space of Wikipedia concepts and translating vectors between language-specific concept spaces, a cross-language classifier can be built, and it performs better than several state-of-the-art classifiers.
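
The CLCM step can be sketched in a few lines (a toy illustration under assumed inputs, not the paper's implementation or resources): documents are bag-of-concepts vectors over Wikipedia article identifiers, and a table of interlanguage links maps each Spanish concept to its English counterpart, so a classifier trained on English concept vectors can score translated Spanish vectors directly.

```python
from collections import Counter

# Toy interlanguage-link table (hypothetical entries, not the paper's data):
# Spanish Wikipedia concept -> English Wikipedia concept
INTERLANGUAGE_LINKS = {
    "es:Cáncer_de_pulmón": "en:Lung_cancer",
    "es:Quimioterapia": "en:Chemotherapy",
    "es:Tabaquismo": "en:Tobacco_smoking",
}

def to_english_space(spanish_vector):
    """Map a Spanish bag-of-concepts vector into the English concept space."""
    english_vector = Counter()
    for concept, weight in spanish_vector.items():
        english_concept = INTERLANGUAGE_LINKS.get(concept)
        if english_concept is not None:  # concepts without a link are dropped
            english_vector[english_concept] += weight
    return english_vector

# A Spanish abstract annotated with Wikipedia concepts and their frequencies
spanish_doc = Counter({"es:Cáncer_de_pulmón": 3, "es:Tabaquismo": 1})
print(to_english_space(spanish_doc))
# Counter({'en:Lung_cancer': 3, 'en:Tobacco_smoking': 1})
```

The resulting English-space vector can then be fed to any off-the-shelf classifier trained on English abstracts represented in the same concept space.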


2021
Author(s):
Hema Ala
Vandan Mujadia
Dipti Misra Sharma
...
