Survey of Scientific Document Summarization Techniques

Sheena Kurian K; Sheena Mathew

doi:10.7494/csci.2020.21.2.3356

Computer Science ◽

10.7494/csci.2020.21.2.3356 ◽

2020 ◽

Vol 21 (2) ◽

Author(s):

Sheena Kurian K ◽

Sheena Mathew

Keyword(s):

Text Summarization ◽

Exponential Rate ◽

Research Papers ◽

Document Summarization ◽

Automatic Text Summarization ◽

Scientific Document Summarization ◽

Pros And Cons ◽

Comparison Of The Results ◽

Evaluation Techniques ◽

Automatic Text

The number of scientific or research papers published every year is growing at an exponential rate, which has led to intensive research in scientific document summarization. This paper discusses the methods commonly used in automatic text summarization, along with their pros and cons. Commonly used evaluation techniques and datasets in this field are also discussed. ROUGE and Pyramid scores of the different methods are tabulated for easy comparison of the results.
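As a rough illustration of the ROUGE scores mentioned in this abstract, the sketch below computes ROUGE-N recall as the fraction of reference n-grams that also appear in a candidate summary. It assumes plain whitespace tokenization and lowercasing, with no stemming or stop-word handling, so its numbers will not match the official ROUGE toolkit. The function names and example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical example sentences, for illustration only.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n_recall(candidate, reference, 1)  # 5 of 6 reference unigrams matched
r2 = rouge_n_recall(candidate, reference, 2)  # 3 of 5 reference bigrams matched
print(round(r1, 3), round(r2, 3))
```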

A Quantum-Inspired Genetic Algorithm for Extractive Text Summarization

International Journal of Natural Computing Research ◽

10.4018/ijncr.2021040103 ◽

2021 ◽

Vol 10 (2) ◽

pp. 42-60

Author(s):

Khadidja Chettah ◽

Amer Draa

Keyword(s):

Genetic Algorithm ◽

State Of The Art ◽

Text Summarization ◽

Automated System ◽

Evaluation Metrics ◽

Document Summarization ◽

Automatic Text Summarization ◽

Reference Methods ◽

Textual Data ◽

Automatic Text

Automatic text summarization has recently become a key instrument for reducing the huge quantity of textual data. In this paper, the authors propose a quantum-inspired genetic algorithm (QGA) for extractive single-document summarization. The QGA is used inside a fully automated system as an optimizer to search for the best combination of sentences to be put in the final summary. The presented approach is compared with 11 reference methods, including supervised and unsupervised summarization techniques. Its performance is evaluated on the DUC 2001 and DUC 2002 datasets using the ROUGE-1 and ROUGE-2 evaluation metrics. The obtained results show that the proposal can compete with other state-of-the-art methods. It is ranked first out of 12, outperforming all other algorithms.
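The idea of searching for the best combination of sentences can be sketched with an ordinary genetic algorithm. This is not the quantum-inspired variant the authors propose: it is a classical GA over bit masks (one bit per sentence), and the sentence scores, sentence lengths, length budget, and all parameter values are hypothetical assumptions made for illustration.

```python
import random


def fitness(mask, scores, lengths, budget):
    """Sum of selected sentence scores; negative if the length budget is exceeded."""
    length = sum(l for bit, l in zip(mask, lengths) if bit)
    if length > budget:
        return float(budget - length)  # infeasible masks always rank below feasible ones
    return sum(s for bit, s in zip(mask, scores) if bit)


def evolve(scores, lengths, budget, pop_size=30, generations=60,
           mutation_rate=0.1, seed=0):
    """Classical GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(scores)

    def score(mask):
        return fitness(mask, scores, lengths, budget)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population.append([0] * n)  # the empty summary is always feasible
    best = max(population, key=score)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)
    return best


# Hypothetical per-sentence relevance scores and word counts.
scores = [0.9, 0.2, 0.7, 0.4, 0.6]
lengths = [20, 10, 25, 15, 12]
budget = 50
best = evolve(scores, lengths, budget)
print(best, fitness(best, scores, lengths, budget))
```

Because the empty summary is seeded into the population and the best mask is kept each generation, the returned mask always respects the length budget, although it is not guaranteed to be the global optimum.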

Implementing Supervised Approach to Summarization of Research Papers

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtst061275 ◽

2020 ◽

Vol 6 (12) ◽

pp. 398-401

Author(s):

Shaguna Awasth

Keyword(s):

Supervised Learning ◽

Research Paper ◽

Text Summarization ◽

Research Papers ◽

Global Context ◽

Automatic Text Summarization ◽

Automatic Text ◽

Intuitive Model

Using automatic text summarization, we can reduce a document to its main information, or to what is known as the crux of the document. Recent research in this area has focused on neural approaches to summarisation, which can be very data hungry. This paper aims to explore a quicker way by implementing a supervised-learning-based extractive summarisation system for research papers. It also uses ROUGE scores to explore whether any single section of a research paper can serve as the prime section for generating summaries. An intuitive, easy-to-implement model is developed using GloVe embeddings and doc2vec to encode sentences and documents in their local and global context, producing grammatically coherent summaries.

Automatic Text Summarization Using Latent Drichlet Allocation (LDA) for Document Clustering

International Journal of Advances in Intelligent Informatics ◽

10.26555/ijain.v1i3.43 ◽

2015 ◽

Vol 1 (3) ◽

pp. 132 ◽

Cited By ~ 5

Author(s):

Erwin Yudi Hidayat ◽

Fahri Firdausillah ◽

Khafiizh Hastuti ◽

Ika Novita Dewi ◽

Azhari Azhari

Keyword(s):

Clustering Algorithm ◽

Document Clustering ◽

Text Summarization ◽

Data Set ◽

Document Summarization ◽

Automatic Text Summarization ◽

Improve Accuracy ◽

Automatic Document Summarization ◽

Document Compression ◽

Automatic Text

In this paper, we present Latent Dirichlet Allocation (LDA) in automatic text summarization to improve accuracy in document clustering. The experiments use a data set of 398 public blog articles obtained with the Python Scrapy crawler and scraper. The clustering steps in this research are preprocessing, automatic document compression using a feature method, automatic document compression using LDA, word weighting, and the clustering algorithm. The results show that automatic document summarization with LDA reaches 72% in LDA 40%, compared to the traditional k-means method, which only reaches 66%.

Automatic Text Summarization in Digital Libraries

Handbook of Research on Digital Libraries ◽

10.4018/978-1-59904-879-6.ch016 ◽

2009 ◽

pp. 159-172 ◽

Cited By ~ 2

Author(s):

Shiyan Ou ◽

Christopher S.G. Khoo ◽

Dion Hoe-Lian Goh

Keyword(s):

Digital Libraries ◽

Text Summarization ◽

Automatic Text Summarization ◽

History Of ◽

Multidocument Summarization ◽

Evaluation Approaches ◽

Evaluation Techniques ◽

Automatic Text

This chapter describes various text summarization and evaluation techniques that have been proposed in the literature and discusses the application of text summarization in digital libraries. First, it introduces the history of automatic text summarization and various types of summaries. Next, it reviews various approaches that have been used for single-document and multidocument summarization. Then, it describes the major evaluation approaches for assessing the generated summaries. Finally, it outlines the principal trends in the area of automatic text summarization. This chapter aims to help the reader obtain a clear overview of the text summarization field and to facilitate the application of text summarization in digital libraries.

A Systematic Survey on Multi-document Text Summarization

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/111062021 ◽

2021 ◽

Vol 10 (6) ◽

pp. 3148-3153

Keyword(s):

Deep Learning ◽

Text Summarization ◽

Evaluation Metrics ◽

Automatic Process ◽

Document Summarization ◽

Text Document ◽

Automatic Text Summarization ◽

As Graph ◽

Abstractive Summarization ◽

Automatic Text

Automatic text summarization is a technique for generating a short and accurate summary of a longer text document. Text summarization can be classified based on the number of input documents (single-document and multi-document summarization) and based on the characteristics of the summary generated (extractive and abstractive summarization). Multi-document summarization is an automatic process of creating a relevant, informative and concise summary from a cluster of related documents. This paper presents a detailed survey of the existing literature on the various approaches to text summarization. A few of the most popular approaches, such as graph-based, cluster-based and deep-learning-based summarization techniques, are discussed along with the evaluation metrics, which can provide insight to future researchers.
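To make the graph-based family mentioned in this abstract more concrete, the sketch below ranks sentences with a TextRank-style power iteration. It is an assumption-laden illustration, not a method taken from the survey: sentences are nodes, edge weights are Jaccard word overlap, and the damping factor, iteration count, and example sentences are hypothetical choices.

```python
def similarity(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Weighted PageRank-style scores over the sentence similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight j -> i.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Hypothetical sentences; the last one shares no words with the others.
sentences = [
    "Text summarization produces a short summary of a document",
    "Graph based summarization ranks sentences in a document graph",
    "Extractive summarization selects sentences from the document",
    "Unrelated weather report about sunny skies",
]
scores = rank_sentences(sentences)
for sentence, value in sorted(zip(sentences, scores), key=lambda p: -p[1]):
    print(round(value, 3), sentence)
```

An extractive summary would then take the top-ranked sentences up to a length limit; the isolated last sentence receives only the baseline score and ranks lowest.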

A Multi-document Summarization System for News Articles in Portuguese using Integer Linear Programming

10.5753/eniac.2019.9320 ◽

2019 ◽

Author(s):

Laerth Gomes ◽

Hilário Oliveira

Keyword(s):

Linear Programming ◽

Integer Linear Programming ◽

Relevant Information ◽

Brazilian Portuguese ◽

Text Summarization ◽

Document Summarization ◽

Automatic Text Summarization ◽

Summarization System ◽

Automatic Text ◽

Intense Research

Automatic Text Summarization (ATS) has been the subject of intense research in recent years. Its importance stems from the fact that ATS systems can aid in the processing of large amounts of textual documents. The ATS task aims to create a summary of one or more documents by extracting their most relevant information. Despite the existence of several works, studies involving the development of ATS systems for documents written in Brazilian Portuguese are still few. In this paper, we propose a multi-document summarization system following a concept-based approach using Integer Linear Programming for the generation of summaries from news articles written in Portuguese. Experiments using the CSTNews corpus were performed to evaluate different aspects of the proposed system. The experimental results obtained with the ROUGE measures demonstrate that the developed system presents encouraging results, outperforming other works in the literature.
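The objective behind a concept-based formulation can be sketched as follows: choose the set of sentences that maximizes the total weight of the distinct concepts covered, subject to a summary length budget. The sketch below is not the authors' system and does not call an ILP solver; it enumerates all subsets by brute force, which is only practical for a handful of sentences. The concepts, weights, lengths, and budget are hypothetical.

```python
from itertools import combinations


def best_summary(sentence_concepts, lengths, concept_weights, budget):
    """Exhaustively search for the subset with the highest covered-concept weight."""
    n = len(sentence_concepts)
    best_value, best_subset = 0.0, ()
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if sum(lengths[i] for i in subset) > budget:
                continue  # violates the length constraint
            covered = set().union(*(sentence_concepts[i] for i in subset))
            # Each concept counts once, however many selected sentences contain it.
            value = sum(concept_weights[c] for c in covered)
            if value > best_value:
                best_value, best_subset = value, subset
    return best_subset, best_value


# Hypothetical concepts per sentence, sentence lengths, and concept weights.
sentence_concepts = [
    {"election", "result"},
    {"election", "candidate"},
    {"economy"},
    {"result", "candidate", "economy"},
]
lengths = [10, 12, 6, 20]
concept_weights = {"election": 3, "result": 2, "candidate": 2, "economy": 1}
subset, value = best_summary(sentence_concepts, lengths, concept_weights, budget=25)
print(subset, value)
```

Counting each concept once is what discourages redundant sentences; an ILP solver reaches the same optimum without enumerating every subset.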
