A New Approach for Single Text Document Summarization

Author(s):  
Chandra Shekhar Yadav ◽  
Aditi Sharan ◽  
Rakesh Kumar ◽  
Payal Biswas
2015 ◽  
Vol 5 (4) ◽  
pp. 46-70 ◽  
Author(s):  
Chandra Shekhar Yadav ◽  
Aditi Sharan

Summarization is a way to represent same information in concise way with equal sense. This can be categorized in two type Abstractive and Extractive type. Our work is focused around Extractive summarization. A generic approach to extractive summarization is to consider sentence as an entity, score each sentence based on some indicative features to ascertain the quality of sentence for inclusion in summary. Sort the sentences on the score and consider top n sentences for summarization. Mostly statistical features have been used for scoring the sentences. A hybrid model for a single text document summarization is being proposed. This hybrid model is an extraction based approach, which is combination of Statistical and semantic technique. The hybrid model depends on the linear combination of statistical measures: sentence position, TF-IDF, Aggregate similarity, centroid, and semantic measure. The idea to include sentiment analysis for salient sentence extraction is derived from the concept that emotion plays an important role in communication to effectively convey any message hence, it can play a vital role in text document summarization. For comparison, five system summaries have been generated: Proposed Work, MEAD system, Microsoft system, OPINOSIS system, and Human generated summary, and evaluation is done using ROUGE score.


2020 ◽  
Vol 143 ◽  
pp. 112958 ◽  
Author(s):  
Mudasir Mohd ◽  
Rafiya Jan ◽  
Muzaffar Shah

Author(s):  
Amal M. Al-Numai ◽  
Aqil M. Azmi

As the number of electronic text documents is increasing so is the need for an automatic text summarizer. The summary can be extractive, compression, or abstractive. In the former, the more important sentences are retained, more or less in their original structure, while the second one involves reducing the length of each sentence. For the latter, it requires a fusion of multiple sentences and/or paraphrasing. This chapter focuses on the abstractive text summarization (ATS) of a single text document. The study explores what ATS is. Additionally, the literature of the field of ATS is investigated. Different datasets and evaluation techniques used in assessing the summarizers are discussed. The fact is that ATS is much more challenging than its extractive counterpart, and as such, there are a few works in this area for all the languages.


Author(s):  
Arti Jain ◽  
Anuja Arora ◽  
Divakar Yadav ◽  
Jorge Morato ◽  
Amanpreet Kaur

In the contemporary world, utilization of digital content has risen exponentially. For example, newspaper and web articles, status updates, advertisements etc. have become an integral part of our daily routine. Thus, there is a need to build an automated system to summarize such large documents of text in order to save time and effort. Although, there are summarizers for languages such as English since the work has started in the 1950s and at present has led it up to a matured stage but there are several languages that still need special attention such as Punjabi language. The Punjabi language is highly rich in morphological structure as compared to English and other foreign languages. In this work, we provide three phase extractive summarization methodology using neural networks. It induces compendious summary of Punjabi single text document. The methodology incorporates pre-processing phase that cleans the text; processing phase that extracts statistical and linguistic features; and classification phase. The classification based neural network applies an activation function- sigmoid and weighted error reduction-gradient descent optimization to generate the resultant output summary. The proposed summarization system is applied over monolingual Punjabi text corpus from Indian languages corpora initiative phase-II. The precision, recall and F-measure are achieved as 90.0%, 89.28% an 89.65% respectively which is reasonably good in comparison to the performance of other existing Indian languages’ summarizers.


Author(s):  
Gabriel Silva ◽  
Rafael Ferreira ◽  
Rafael Dueire Lins ◽  
Luciano Cabral ◽  
Hilário Oliveira ◽  
...  

Author(s):  
Sandhya P. ◽  
Mahek Laxmikant Kantesaria

Named entity recognition (NER) is a subtask of the information extraction. NER system reads the text and highlights the entities. NER will separate different entities according to the project. NER is the process of two steps. The steps are detection of names and classifications of them. The first step is further divided into the segmentation. The second step will consist to choose an ontology which will organize the things categorically. Document summarization is also called automatic summarization. It is a process in which the text document with the help of software will create a summary by selecting the important points of the original text. In this chapter, the authors explain how document summarization is performed using named entity recognition. They discuss about the different types of summarization techniques. They also discuss about how NER works and its applications. The libraries available for NER-based information extraction are explained. They finally explain how NER is applied into document summarization.


Sign in / Sign up

Export Citation Format

Share Document