Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features

Author(s):  
Archana N. Gulati ◽  
Sudhir D. Sawarkar
2020 ◽  
Vol 13 (5) ◽  
pp. 977-986
Author(s):  
Srinivasa Rao Kongara ◽  
Dasika Sree Rama Chandra Murthy ◽  
Gangadhara Rao Kancherla

Background: Text summarization is the process of generating a short description of the entire document which is more difficult to read. This method provides a convenient way of extracting the most useful information and a short summary of the documents. In the existing research work, this is focused by introducing the Fuzzy Rule-based Automated Summarization Method (FRASM). Existing work tends to have various limitations which might limit its applicability to the various real-world applications. The existing method is only suitable for the single document summarization where various applications such as research industries tend to summarize information from multiple documents. Methods: This paper proposed Multi-document Automated Summarization Method (MDASM) to introduce the summarization framework which would result in the accurate summarized outcome from the multiple documents. In this work, multi-document summarization is performed whereas in the existing system only single document summarization was performed. Initially document clustering is performed using modified k means cluster algorithm to group the similar kind of documents that provides the same meaning. This is identified by measuring the frequent term measurement. After clustering, pre-processing is performed by introducing the Hybrid TF-IDF and Singular value decomposition technique which would eliminate the irrelevant content and would result in the required content. Then sentence measurement is one by introducing the additional metrics namely Title measurement in addition to the existing work metrics to accurately retrieve the sentences with more similarity. Finally, a fuzzy rule system is applied to perform text summarization. Results: The overall evaluation of the research work is conducted in the MatLab simulation environment from which it is proved that the proposed research method ensures the optimal outcome than the existing research method in terms of accurate summarization. MDASM produces 89.28% increased accuracy, 89.28% increased precision, 89.36% increased recall value and 70% increased the f-measure value which performs better than FRASM. Conclusion: The summarization processes carried out in this work provides the accurate summarized outcome.


Author(s):  
Towhid Ahmed Foysal ◽  
Mohaimen Abid Mahadi ◽  
Md. Mahadi Hasan Nahid ◽  
Ayesha Tasnim

2017 ◽  
Vol 44 (2) ◽  
pp. 139-147
Author(s):  
Kyeong-rim Kim ◽  
Da-yeong Lee ◽  
Hwan-Gue Cho

Author(s):  
Suneetha S. ◽  
Venugopal Reddy A.

Text summarization from multiple documents is an active research area in the current scenario as the data in the World Wide Web (WWW) is found in abundance. The text summarization process is time-consuming and hectic for the users to retrieve the relevant contents from this mass collection of the data. Numerous techniques have been proposed to provide the relevant information to the users in the form of the summary. Accordingly, this article presents the majority voting based hybrid learning model (MHLM) for multi-document summarization. First, the multiple documents are subjected to pre-processing, and the features, such as title-based, sentence length, numerical data and TF-IDF features are extracted for all the individual sentences of the document. Then, the feature set is sent to the proposed MHLM classifier, which includes the Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Neural Network (NN) classifiers for evaluating the significance of the sentences present in the document. These classifiers provide the significance scores based on four features extracted from the sentences in the document. Then, the majority voting model decides the significant texts based on the significance scores and develops the summary for the user and thereby, reduces the redundancy, increasing the quality of the summary similar to the original document. The experiment performed with the DUC 2002 data set is used to analyze the effectiveness of the proposed MHLM that attains the precision and recall at a rate of 0.94, f-measure at a rate of 0.93, and ROUGE-1 at a rate of 0.6324.


2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Nedunchelian Ramanujam ◽  
Manivannan Kaliappan

Nowadays, automatic multidocument text summarization systems can successfully retrieve the summary sentences from the input documents. But, it has many limitations such as inaccurate extraction to essential sentences, low coverage, poor coherence among the sentences, and redundancy. This paper introduces a new concept of timestamp approach with Naïve Bayesian Classification approach for multidocument text summarization. The timestamp provides the summary an ordered look, which achieves the coherent looking summary. It extracts the more relevant information from the multiple documents. Here, scoring strategy is also used to calculate the score for the words to obtain the word frequency. The higher linguistic quality is estimated in terms of readability and comprehensibility. In order to show the efficiency of the proposed method, this paper presents the comparison between the proposed methods with the existing MEAD algorithm. The timestamp procedure is also applied on the MEAD algorithm and the results are examined with the proposed method. The results show that the proposed method results in lesser time than the existing MEAD algorithm to execute the summarization process. Moreover, the proposed method results in better precision, recall, andF-score than the existing clustering with lexical chaining approach.


2015 ◽  
Vol 7 (2) ◽  
pp. 01-22 ◽  
Author(s):  
N.Adilah Hanin Zahri ◽  
Fumiyo Fukumoto ◽  
Matsyoshi Suguru ◽  
Ong Bi Lynn

Sign in / Sign up

Export Citation Format

Share Document