Similitude Based Segment Graph Construction and Segment Ranking for Automatic Summarization of Text Document

With the growing volume of data and documents on the web, text summarization has become a significant field in today’s digital era. Automatic text summarization gives the user a quick summary based on the information presented in text documents. This paper presents automated single-document summarization by constructing similitude graphs from extracted text segments. After the text segments are extracted, feature values are computed for every segment by comparing it with the title and with the entire document, and by computing segment significance using the information gain ratio. Based on the computed features, the similarity between segments is evaluated to construct a graph in which the vertices are the segments and the edges specify the similarity between them. The segments are ranked for inclusion in the extractive summary by computing the graph score and the sentence segment score. The experimental analysis uses ROUGE metrics. The proposed model is compared with various existing models on 4 different datasets, where it placed in the top 2 positions by average rank computed over precision, recall, and F-score.

HIGHLIGHTS
The paper presents automated single-document summarization by constructing similitude graphs from extracted text segments.
It uses the information gain ratio, graph construction, the graph score, and the sentence segment score.
Results are analyzed using ROUGE metrics on 4 popular datasets in the document summarization domain.
The model placed in the top 2 positions by average rank computed over precision, recall, and F-score.
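The abstract above describes a pipeline of segment similarity, graph construction, and score-based ranking, but it does not give the formulas. The following is a minimal sketch under stated assumptions, not the authors' method: word-count cosine similarity stands in for the paper's feature-based similarity, the graph score is taken to be the sum of a vertex's edge weights, and similarity to the title stands in for the sentence segment score. The information gain ratio feature is omitted, and all function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for the paper's segment features."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(segments):
    """Symmetric matrix of pairwise similarities: vertices are segments, entries are edge weights."""
    vectors = [bag_of_words(s) for s in segments]
    n = len(segments)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = cosine(vectors[i], vectors[j])
    return weights


def rank_segments(segments, title, top_k=2):
    """Pick the top_k segments by (assumed) graph score plus title similarity."""
    weights = build_graph(segments)
    title_vec = bag_of_words(title)
    scores = []
    for i, seg in enumerate(segments):
        graph_score = sum(weights[i])                       # weighted degree of vertex i
        segment_score = cosine(bag_of_words(seg), title_vec)
        scores.append(graph_score + segment_score)
    best = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)[:top_k]
    return sorted(best)  # restore document order for the extractive summary


segments = [
    "Graph ranking selects summary segments.",
    "Segments are ranked by graph similarity.",
    "The weather was pleasant yesterday.",
]
summary_indices = rank_segments(segments, "Graph based segment ranking")
```

In this toy input the third segment shares no words with the others or with the title, so it receives a score of zero and is left out of the two-segment summary.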


Text Document Summarization Using POS tagging for Kannada Text Documents

2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence) ◽

10.1109/confluence51648.2021.9377106 ◽

2021 ◽

Author(s):

Jayashree R ◽

Basavaraj S Anami ◽

Poornima B K

Keyword(s):

Text Documents ◽

Document Summarization ◽

POS Tagging ◽

Text Document


Gene selection and classification combining information gain ratio with fruit fly optimisation algorithm for single-cell RNA-seq data

International Journal of Computational Science and Engineering ◽

10.1504/ijcse.2021.10041500 ◽

2021 ◽

Vol 24 (5) ◽

pp. 495

Author(s):

Jie Zhang ◽

Junhong Feng ◽

Xiani Yang ◽

Jianming Liu

Keyword(s):

Single Cell ◽

Gene Selection ◽

Information Gain ◽

Fruit Fly ◽

RNA Seq ◽

Gain Ratio ◽

Optimisation Algorithm ◽

Information Gain Ratio ◽

Combining Information


Gene selection and classification combining information gain ratio with fruit fly optimisation algorithm for single-cell RNA-seq data

International Journal of Computational Science and Engineering ◽

10.1504/ijcse.2021.118098 ◽

2021 ◽

Vol 24 (5) ◽

pp. 495

Author(s):

Jie Zhang ◽

Junhong Feng ◽

Xiani Yang ◽

Jianming Liu

Keyword(s):

Single Cell ◽

Gene Selection ◽

Information Gain ◽

Fruit Fly ◽

RNA Seq ◽

Gain Ratio ◽

Optimisation Algorithm ◽

Information Gain Ratio ◽

Combining Information


An Information Gain Ratio based Discovery of User Similarity in Sina Blog Community

Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence - ACAI 2018 ◽

10.1145/3302425.3302449 ◽

2018 ◽

Author(s):

Wei Ren ◽

Yepeng Qiu ◽

Xianghua Li

Keyword(s):

Information Gain ◽

Gain Ratio ◽

User Similarity ◽

Information Gain Ratio


Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

International Journal of Engineering ◽

10.5829/ije.2017.30.09c.05 ◽

2017 ◽

Vol 30 (9) ◽

Keyword(s):

Feature Selection ◽

Information Gain ◽

Gain Ratio ◽

Ratio Approach ◽

Information Gain Ratio ◽

Wrapper Feature Selection


Predicting Financial Savings Decisions Using Sigmoid Function and Information Gain Ratio

Procedia Computer Science ◽

10.1016/j.procs.2016.07.176 ◽

2016 ◽

Vol 93 ◽

pp. 19-25 ◽

Cited By ~ 3

Author(s):

P.R. Mahalingam ◽

S. Vivek

Keyword(s):

Information Gain ◽

Sigmoid Function ◽

Gain Ratio ◽

Information Gain Ratio


The Development of Single-Document Abstractive Text Summarizer During the Last Decade

Trends and Applications of Text Summarization Techniques - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-9373-7.ch002 ◽

2020 ◽

pp. 32-60

Author(s):

Amal M. Al-Numai ◽

Aqil M. Azmi

Keyword(s):

Text Summarization ◽

Electronic Text ◽

Original Structure ◽

Text Documents ◽

Text Document ◽

Single Text ◽

Evaluation Techniques ◽

Automatic Text

As the number of electronic text documents increases, so does the need for an automatic text summarizer. A summary can be extractive, compressive, or abstractive. In the first, the more important sentences are retained, more or less in their original structure; the second involves reducing the length of each sentence; the third requires fusing multiple sentences and/or paraphrasing. This chapter focuses on abstractive text summarization (ATS) of a single text document. The study explores what ATS is and investigates the literature of the field. Different datasets and evaluation techniques used in assessing the summarizers are discussed. ATS is much more challenging than its extractive counterpart, and as such, there are few works in this area for any language.


Association Rules Mining Algorithm Based on Information Gain Ratio Attribute Reduction

Business Intelligence and Information Technology - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-3-030-92632-8_18 ◽

2021 ◽

pp. 181-189

Author(s):

Tongtong Han ◽

Wenjing Wang ◽

Min Guo ◽

Shiyong Ning

Keyword(s):

Association Rules ◽

Information Gain ◽

Attribute Reduction ◽

Association Rules Mining ◽

Gain Ratio ◽

Mining Algorithm ◽

Information Gain Ratio


Automated Kannada Text Summarization using Sentence features

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1531.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 470-474

Keyword(s):

Experimental Studies ◽

Text Summarization ◽

Sentence Length ◽

Original Text ◽

Text Documents ◽

Term Frequency ◽

Exponential Increase ◽

Proposed Model ◽

The World ◽

Feature Score

There is a growing need for text summarization because of the difficulty of managing the exponential increase in information accessible on the World Wide Web. Text summarization condenses the contents of the original text into a shorter form that gives the user the important information. The summarizer presented in this paper produces extractive summaries of Kannada text documents. The proposed system considers five features to determine the important sentences in the document: Term Frequency, Term Frequency-Inverse Sentence Frequency, Keywords, Sentence length, and Sentence position. The value of each feature is computed, and the score for each sentence in the document is the average of all its feature scores. The sentences with the top scores are selected for inclusion in the extractive summary. The results of the proposed model are evaluated using the ROUGE toolkit, which measures performance based on the F-score of the generated summaries. Experimental studies on a custom-built dataset of 50 Kannada text documents show significantly better performance in producing extractive summaries when compared against human summaries.
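The scoring rule in this abstract (five feature values per sentence, averaged, top scores selected) can be sketched as follows. This is an illustrative sketch only: the abstract names the five features but does not define them, so every feature formula here, the per-feature normalization by the column maximum, the whitespace tokenizer, and the function names are assumptions.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Whitespace tokenizer with basic punctuation stripping (assumed preprocessing)."""
    words = (w.strip(".,;:!?") for w in sentence.lower().split())
    return [w for w in words if w]


def score_sentences(sentences, keywords):
    """Average of five normalized feature values per sentence."""
    if not sentences:
        return []
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    tf = Counter(w for d in docs for w in d)        # term counts over the whole document
    sf = Counter(w for d in docs for w in set(d))   # number of sentences containing each term
    max_len = max(len(d) for d in docs) or 1
    keyword_set = {k.lower() for k in keywords}

    raw = []
    for pos, d in enumerate(docs):
        size = len(d) or 1
        raw.append([
            sum(tf[w] for w in d) / size,                        # Term Frequency
            sum(tf[w] * math.log(n / sf[w]) for w in d) / size,  # TF-Inverse Sentence Frequency
            sum(w in keyword_set for w in d) / size,             # Keywords
            len(d) / max_len,                                    # Sentence length
            1.0 - pos / n,                                       # Sentence position (earlier is higher)
        ])
    # Scale each feature by its maximum so the five values are comparable before averaging.
    col_max = [max(row[k] for row in raw) or 1.0 for k in range(5)]
    return [sum(row[k] / col_max[k] for k in range(5)) / 5 for row in raw]


def top_sentences(sentences, keywords, k=2):
    """Indices of the k highest-scoring sentences, returned in document order."""
    scores = score_sentences(sentences, keywords)
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(best)


example = [
    "summarization selects important sentences",
    "important sentences score high",
    "cats sleep",
]
selected = top_sentences(example, keywords=["summarization"])
```

Whitespace tokenization is used here because a word-character regular expression can split Kannada words at combining vowel signs; a real system would likely use a language-specific tokenizer.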
