Experiments on Applying a Text Summarization System for Question Answering

Purpose: Pandemic COVID-19 has created an emergency for the medical community. Researchers require extensive study of scientific literature in order to discover drugs and vaccines. In this situation where every minute is valuable to save the lives of hundreds of people, a quick understanding of scientific articles will help the medical community. Automatic text summarization makes this possible. Materials and Methods: In this study, a recurrent neural network-based extractive summarization is proposed. The extractive method identifies the informative parts of the text. Recurrent neural network is very powerful for analyzing sequences such as text. The proposed method has three phases: sentence encoding, sentence ranking, and summary generation. To improve the performance of the summarization system, a coreference resolution procedure is used. Coreference resolution identifies the mentions in the text that refer to the same entity in the real world. This procedure helps to summarization process by discovering the central subject of the text. Results: The proposed method is evaluated on the COVID-19 research articles extracted from the CORD-19 dataset. The results show that the combination of using recurrent neural network and coreference resolution embedding vectors improves the performance of the summarization system. The Proposed method by achieving the value of ROUGE1-recall 0.53 demonstrates the improvement of summarization performance by using coreference resolution embedding vectors in the RNN-based summarization system. Conclusion: In this study, coreference information is stored in the form of coreference embedding vectors. Jointly use of recurrent neural network and coreference resolution results in an efficient summarization system.

Download Full-text

INFORMATION OVERLAP IN MULTILINGUAL WIKIPEDIA AND SUMMARIZATION

International Journal of Cooperative Information Systems ◽

10.1142/s0218843013500019 ◽

2012 ◽

Vol 21 (04) ◽

pp. 383-403 ◽

Cited By ~ 1

Author(s):

ELENA FILATOVA

Keyword(s):

Question Answering ◽

Human Subjects ◽

Amazon Mechanical Turk ◽

Information Selection ◽

Training Corpus ◽

Selection Systems ◽

Pyramid Model ◽

Selection Tasks ◽

Summarization System

Wikipedia is used as a training corpus for many information selection tasks: summarization, question-answering, etc. The information presented in Wikipedia articles as well as the order in which this information is presented, is treated as the gold standard and is used for improving the quality of information selection systems. However, the Wikipedia articles corresponding to the same entry (person, location, event, etc.) written in different languages have substantial differences regarding what information is included in these articles. In this paper we analyze the regularities of information overlap among the articles about the same Wikipedia entry written in different languages: some information facts are covered in the Wikipedia articles in many languages, while others are covered only in a few languages. We introduce a hypothesis that the structure of this information overlap is similar to the information overlap structure (pyramid model) used in summarization evaluation, as well as the information overlap/repetition structure used to identify important information for multidocument summarization. We prove the correctness of our hypothesis by building a summarization system according to the presented information overlap hypothesis. This system summarizes English Wikipedia articles given the articles about the same Wikipedia entries written in other languages. To evaluate the quality of the created summaries, we use Amazon Mechanical Turk as the source of human subjects who can reliably judge the quality of the created text. We also compare the summaries generated according to the information overlap hypothesis against the lead line baseline which is considered to be the most reliable way to generate summaries of Wikipedia articles. The summarization experiment proves the correctness of the introduced multilingual Wikipedia information overlap hypothesis.

Download Full-text

Story Summarization Using a Question-Answering Approach

Handbook of Research on Natural Language Processing and Smart Service Systems - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-4730-4.ch003 ◽

2021 ◽

pp. 46-69

Author(s):

Sanah Nashir Sayyed ◽

Namrata Mahender C.

Keyword(s):

Statistical Methods ◽

Question Answering ◽

Text Summarization ◽

Loss Of Information ◽

Representative Data ◽

Minimal Loss ◽

Speech Data ◽

Linguistic Methods ◽

The Given ◽

Text Images

Summarization is the process of selecting representative data to produce a reduced version of the given data with a minimal loss of information; so, it generally works on text, images, videos, and speech data. The chapter deals with not only concepts of text summarization (types, stages, issues, and criteria) but also with applications. The two main categories of approaches generally used in text summaries (i.e., abstractive and extractive) are discussed. Abstractive techniques use linguistic methods to interpret the text; they produce understandable and semantically equivalent sentences with a shorter length. Extractive techniques mostly rely on statistical methods for extracting essential sentences from the given text. In addition, the authors explore the SACAS model to exemplify the process of summarization. The SACAS system analyzed 50 stories, and its evaluation is presented in terms of a new measurement based on question-answering MOS, which is also introduced in this chapter.

Download Full-text

Extractive text summarization system to aid data extraction from full text in systematic review development

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2016.10.014 ◽

2016 ◽

Vol 64 ◽

pp. 265-272 ◽

Cited By ~ 17

Author(s):

Duy Duc An Bui ◽

Guilherme Del Fiol ◽

John F. Hurdle ◽

Siddhartha Jonnalagadda

Keyword(s):

Systematic Review ◽

Full Text ◽

Data Extraction ◽

Text Summarization ◽

Summarization System

Download Full-text

COMPENDIUM: A Text Summarization System for Generating Abstracts of Research Papers

Natural Language Processing and Information Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-642-22327-3_2 ◽

2011 ◽

pp. 3-14 ◽

Cited By ~ 4

Author(s):

Elena Lloret ◽

María Teresa Romá-Ferri ◽

Manuel Palomar

Keyword(s):

Text Summarization ◽

Research Papers ◽

Summarization System

Download Full-text

A Novel Hybrid Text Summarization System for Punjabi Text

Cognitive Computation ◽

10.1007/s12559-015-9359-3 ◽

2015 ◽

Vol 8 (2) ◽

pp. 261-277 ◽

Cited By ~ 10

Author(s):

Vishal Gupta ◽

Narvinder Kaur

Keyword(s):

Text Summarization ◽

Summarization System

Download Full-text

Personalized Text Content Summarizer for Mobile Learning: An Automatic Text Summarization System with Relevance Based Language Model

2012 IEEE Fourth International Conference on Technology for Education ◽

10.1109/t4e.2012.23 ◽

2012 ◽

Cited By ~ 6

Author(s):

Guangbing Yang ◽

Dunwei Wen ◽

Kinshuk ◽

Nian-Shing Chen ◽

Erkki Sutinen

Keyword(s):

Mobile Learning ◽

Language Model ◽

Text Summarization ◽

Automatic Text Summarization ◽

Summarization System ◽

Text Content ◽

Automatic Text

Download Full-text

A HYBRID MODEL USING THE PRETRAINED BERT AND DEEP NEURAL NETWORKS WITH RICH FEATURE FOR EXTRACTIVE TEXT SUMMARIZATION

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/37/2/15980 ◽

2021 ◽

Vol 37 (2) ◽

pp. 123-143

Author(s):

Tuan Minh Luu ◽

Huong Thanh Le ◽

Tan Minh Hoang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

Text Summarization ◽

Training Dataset ◽

Extractive Summarization ◽

Input Text ◽

Summarization System ◽

Fully Connected

Deep neural networks have been applied successfully to extractive text summarization tasks with the accompany of large training datasets. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system’s summary. In this paper, we propose an extractive summarization system basing on a Convolutional Neural Network and a Fully Connected network for sentence selection. The pretrained BERT multilingual model is used to generate embeddings vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences from the output summary are eliminated by the Maximal Marginal Relevance method. Our system is evaluated with both English and Vietnamese languages using CNN and Baomoi datasets, respectively. Experimental results show that our system achieves better results comparing to existing works using the same dataset. It confirms that our approach can be effectively applied to summarize both English and Vietnamese languages.

Download Full-text