Automatic Text Simplification Using LSTM Encoder Decoder Model

Author(s):  
Om Prakash Jena ◽  
Alok Ranjan Tripathy ◽  
Sudhansu Sekhar Patra ◽  
Manas Ranjan Chowdhury ◽  
Rajesh Kumar Sahoo
Author(s):  
Horacio Saggion

Over the past decades, information has been made available to a broad audience thanks to the availability of texts on the Web. However, understanding the wealth of information contained in texts can pose difficulties for a number of people, including those with poor literacy, cognitive or linguistic impairment, or limited knowledge of the language of the text. Text simplification was initially conceived as a technology to simplify sentences so that they would be easier to process by natural-language processing components such as parsers. Nowadays, however, automatic text simplification is conceived as a technology to transform a text into an equivalent that is easier for a target user to read and understand. Text simplification concerns both the modification of the vocabulary of the text (lexical simplification) and the modification of the structure of the sentences (syntactic simplification). In this chapter, after briefly introducing the topic of text readability, we give an overview of past and recent methods to address these two problems. We also describe simplification applications and full systems, and outline language resources and evaluation approaches.


Author(s):  
Sandra M. Aluísio ◽  
Lucia Specia ◽  
Thiago A.S. Pardo ◽  
Erick G. Maziero ◽  
Renata P.M. Fortes

2020 ◽  
Author(s):  
Tarek Sakakini ◽  
Jong Yoon Lee ◽  
Aditya Duri ◽  
Renato F.L. Azevedo ◽  
Victor Sadauskas ◽  
...  

2020 ◽  
Vol 30 (02) ◽  
pp. 2050008
Author(s):  
Akihiro Katsuta ◽  
Kazuhide Yamamoto

In recent years, simple Japanese has been attracting attention as a means of conveying information to foreigners. Automatic text simplification aims to reduce the complexity of vocabulary and expressions in a sentence while retaining its original meaning. This paper focuses on lexical simplification and aims at compressing the vocabulary. Since constructing or expanding a simplification corpus is very costly, we build the simplification model with Unsupervised Statistical Machine Translation, which does not require a parallel corpus. Based on a predetermined vocabulary, a pseudo-corpus for simplification is constructed from a web corpus, and the simplification model is trained on this pseudo-corpus. Training therefore needs only a vocabulary and a plain-text corpus. Moreover, we propose cleaning the phrase table with WordNet, which improves performance on the BLEU and SARI metrics. Suppressing distant paraphrases with WordNet makes it easier to select the correct paraphrase candidate.
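
The abstract does not spell out how the pseudo-corpus is built from the predetermined vocabulary. One plausible reading, offered here only as a minimal sketch and not as the authors' procedure, is to split a plain-text corpus into a "simple" side (sentences whose tokens all fall inside the vocabulary) and a "complex" side (everything else), giving two non-parallel monolingual corpora for an unsupervised translation model. The function names `tokenize` and `split_pseudo_corpus` and the whitespace tokenizer are assumptions made for illustration; real Japanese text would need a morphological analyzer.

```python
from typing import Iterable, List, Set, Tuple


def tokenize(sentence: str) -> List[str]:
    """Hypothetical whitespace tokenizer (stand-in for a morphological analyzer)."""
    return sentence.lower().split()


def split_pseudo_corpus(
    sentences: Iterable[str], simple_vocab: Set[str]
) -> Tuple[List[str], List[str]]:
    """Split a plain-text corpus into 'simple' and 'complex' sides.

    Assumption: a sentence counts as simple only if every token is in the
    predetermined vocabulary. The two sides are NOT sentence-aligned.
    """
    simple_side: List[str] = []
    complex_side: List[str] = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        if not tokens:
            continue  # skip empty lines
        if all(tok in simple_vocab for tok in tokens):
            simple_side.append(sentence)
        else:
            complex_side.append(sentence)
    return simple_side, complex_side
```

Under this reading, the only inputs are a vocabulary set and an iterable of raw sentences, which matches the abstract's statement that a vocabulary and a plain-text corpus suffice.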

