Automatic Text Document Summarization Based on Machine Learning

2018 ◽ Vol 14 (4) ◽ pp. 1-32 ◽ Cited By ~ 4

Author(s): Chandra Yadav ◽ Aditi Sharan

Keyword(s): Semantic Analysis ◽ Evaluation Criteria ◽ Research Area ◽ Algebraic Model ◽ Document Summarization ◽ Text Document ◽ Proposed Model ◽ Active Research ◽ Value Decomposition ◽ Automatic Text

Automatic text document summarization is an active research area in the text mining field. In this article, the authors propose two new approaches (three models) for sentence selection and a new entropy-based summary evaluation criterion. The first approach is based on an algebraic model, Singular Value Decomposition (SVD), i.e. Latent Semantic Analysis (LSA), and is termed proposed_model-1. The second approach is based on entropy and is further divided into proposed_model-2 and proposed_model-3. The first proposed model uses the right singular matrix, while the second and third proposed models are based on Shannon entropy. The advantages of these models are that they are not length-dominated, give better results, and have low redundancy. Along with these three new models, an entropy-based summary evaluation criterion is proposed and tested. The authors also show that their entropy-based models are statistically closer to the DUC-2002 standard (gold) summaries. The dataset used in this article is taken from the Document Understanding Conference 2002 (DUC-2002).
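The abstract does not say how Shannon entropy is applied to sentence selection or summary evaluation, so the following is only a minimal sketch of the underlying quantity: the entropy of the empirical word distribution of a text. The function name `shannon_entropy` and the whitespace tokenization in the usage note are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical distribution of tokens.

    Hypothetical helper: the abstract does not specify how entropy is
    computed or used, so this shows only the textbook definition
    H = -sum(p * log2(p)) over the observed token frequencies.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

For instance, `shannon_entropy("a b".split())` evaluates to 1.0 bit, while a text that repeats a single word has zero entropy.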

A Systematic Survey on Multi-document Text Summarization

International Journal of Advanced Trends in Computer Science and Engineering ◽ 10.30534/ijatcse/2021/111062021 ◽ 2021 ◽ Vol 10 (6) ◽ pp. 3148-3153

Keyword(s): Deep Learning ◽ Text Summarization ◽ Evaluation Metrics ◽ Automatic Process ◽ Document Summarization ◽ Text Document ◽ Automatic Text Summarization ◽ As Graph ◽ Abstractive Summarization ◽ Automatic Text

Automatic text summarization is a technique for generating a short and accurate summary of a longer text document. Text summarization can be classified by the number of input documents (single-document and multi-document summarization) and by the characteristics of the summary generated (extractive and abstractive summarization). Multi-document summarization is the automatic process of creating a relevant, informative and concise summary from a cluster of related documents. This paper presents a detailed survey of the existing literature on the various approaches to text summarization. A few of the most popular approaches, such as graph-based, cluster-based and deep learning-based summarization techniques, are discussed along with the evaluation metrics, which can provide insight for future researchers.

Automatic Text Document Summarization Using Graph Based Centrality Measures on Lexical Network

International Journal of Information Retrieval Research ◽ 10.4018/ijirr.2018070102 ◽ 2018 ◽ Vol 8 (3) ◽ pp. 14-32 ◽ Cited By ~ 2

Author(s): Chandra Shakhar Yadav ◽ Aditi Sharan

Keyword(s): Cosine Similarity ◽ Semantic Relations ◽ Centrality Measures ◽ Post Processing ◽ Document Summarization ◽ Processing Task ◽ Text Document ◽ Similarity Threshold ◽ Eigen Value ◽ Automatic Text

This article proposes a new concept, the Lexical Network, for automatic text document summarization. Instead of a number of chains, the authors obtain a network of sentences, termed LexNetwork. This network is created between sentences based on different lexical and semantic relations. In this network, nodes represent sentences and edges represent the strength between two sentences, where strength means the number of relations present between them. The importance of each sentence is decided using different centrality measures, and sentences are extracted for the summary accordingly. Word sense disambiguation (WSD) is done with the Simple Lesk technique, and a cosine-similarity threshold (Ɵ, TH) is used as a post-processing step. The authors suggest that a cosine similarity threshold of 10% is better than 5%, and that an eigenvalue-based centrality measure is better for the summarization process. Finally, for comparison, they use the Semantrica-Lexalytics System.
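The idea of ranking sentences by centrality in a weighted sentence network can be sketched as follows. This is an illustration under stated assumptions, not the LexNetwork implementation: the hypothetical `relation_strength` simply counts shared words as a stand-in for the lexical and semantic relations the abstract mentions, and `eigenvector_centrality` is a plain power iteration written for this example.

```python
def relation_strength(sentence_a, sentence_b):
    """Hypothetical stand-in for the relation count: number of shared words."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    return len(words_a & words_b)


def build_network(sentences):
    """Weighted adjacency matrix: nodes are sentences, weights are strengths."""
    n = len(sentences)
    return [
        [0 if i == j else relation_strength(sentences[i], sentences[j])
         for j in range(n)]
        for i in range(n)
    ]


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on (A + I), with scores normalized to sum to 1.

    Adding the identity keeps the same leading eigenvector as A but avoids
    oscillation on bipartite graphs. Assumes a symmetric, non-negative matrix.
    """
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [
            scores[i] + sum(weights[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        total = sum(updated)
        updated = [value / total for value in updated]
        converged = max(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores
```

Calling `eigenvector_centrality(build_network(sentences))` gives one score per sentence, and the highest-scoring sentences would be the extraction candidates. The cosine-similarity post-processing step is not shown because the abstract does not describe how the threshold is applied.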

Comparative study of the Performance of Machine Learning Text Classifiers Applied to Afaan Oromo Text

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit20645 ◽ 2020 ◽ pp. 77-83

Author(s): Etana Fikadu

Keyword(s): Machine Learning ◽ The Other ◽ Optimal Method ◽ Text Document ◽ Learning Classifier ◽ Automatic Text Classification ◽ Long Time ◽ Text Classifiers ◽ Automatic Text ◽ F Measure

The aim of this study is to find the optimal method for classifying Afaan Oromo text by comparing different classifiers on the same number of text documents. Automatic text classification has been needed in many fields for a long time, and many methods are used to classify text. The performance of the classifiers used in this study is measured in terms of recall, precision and F-measure. Finally, the efficiencies of the Bayesian Network, Naïve Bayesian, IBK and SMO classifiers in classifying Afaan Oromo text are compared. Experimental results on the same set of Afaan Oromo documents used before show that SMO slightly outperforms the other methods. The comparison reported in this paper shows that the SMO classifier exceeds the other machine learning classifiers.
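For reference, the three measures named above can be computed for a single class as in the sketch below. The function name, the label values, and the choice to return 0.0 when a denominator is zero are assumptions for this example; the abstract does not state how the study averaged scores across classes.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-measure (F1) for one class label.

    Illustrative helper only; undefined ratios are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With true labels `["sport", "sport", "news", "news"]` and predictions `["sport", "news", "news", "news"]`, the "sport" class has precision 1.0, recall 0.5 and an F-measure of about 0.667.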

An Automatic Document Summarization Approach based on Fuzzy Ontology and Machine Learning

2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA) ◽ 10.1109/dsaa49011.2020.00098 ◽ 2020

Author(s): Hongfei Liu ◽ Qian Gao

Keyword(s): Machine Learning ◽ Document Summarization ◽ Fuzzy Ontology ◽ Automatic Document Summarization

Text Document Summarization Using POS tagging for Kannada Text Documents

2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence) ◽ 10.1109/confluence51648.2021.9377106 ◽ 2021

Author(s): Jayashree R ◽ Basavaraj S Anami ◽ Poornima B K

Keyword(s): Text Documents ◽ Document Summarization ◽ Pos Tagging ◽ Text Document

Multiple Text Document Summarization System using hybrid Summarization technique

2015 1st International Conference on Next Generation Computing Technologies (NGCT) ◽ 10.1109/ngct.2015.7375231 ◽ 2015 ◽ Cited By ~ 3

Author(s): Harsha Dave ◽ Shree Jaswal

Keyword(s): Document Summarization ◽ Text Document ◽ Summarization System

Multilingual Sentiment Analysis on Short Text Document Using Semi-Supervised Machine Learning

10.1145/3485768.3485775 ◽ 2021

Author(s): Joshua Lois Cruz Paulino ◽ Lexter Carl Antoja Almirol ◽ Jun Marco Cruz Favila ◽ Kent Alvin Gerald Loria Aquino ◽ Angelica Hernandez De La Cruz ◽ ...

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Supervised Machine Learning ◽ Short Text ◽ Text Document

A Quantum-Inspired Genetic Algorithm for Extractive Text Summarization

International Journal of Natural Computing Research ◽ 10.4018/ijncr.2021040103 ◽ 2021 ◽ Vol 10 (2) ◽ pp. 42-60

Author(s): Khadidja Chettah ◽ Amer Draa

Keyword(s): Genetic Algorithm ◽ State Of The Art ◽ Text Summarization ◽ Automated System ◽ Evaluation Metrics ◽ Document Summarization ◽ Automatic Text Summarization ◽ Reference Methods ◽ Textual Data ◽ Automatic Text

Automatic text summarization has recently become a key instrument for reducing the huge quantity of textual data. In this paper, the authors propose a quantum-inspired genetic algorithm (QGA) for extractive single-document summarization. The QGA is used inside a fully automated system as an optimizer to search for the best combination of sentences to put in the final summary. The presented approach is compared with 11 reference methods, including supervised and unsupervised summarization techniques. The authors evaluated the performance of the proposed approach on the DUC 2001 and DUC 2002 datasets using the ROUGE-1 and ROUGE-2 evaluation metrics. The results show that the proposal can compete with other state-of-the-art methods: it is ranked first out of 12, outperforming all other algorithms.
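ROUGE-1 and ROUGE-2 measure unigram and bigram overlap between a candidate summary and a reference summary. The sketch below computes the recall variant with clipped counts for a single reference. It assumes lowercase whitespace tokenization and omits the stemming, stop-word handling and multi-reference options of the official ROUGE toolkit, so its scores are not directly comparable to those reported in the paper.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams also found in the candidate.

    Matches are clipped: an n-gram is credited at most as many times
    as it occurs in the candidate.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For the candidate "the cat sat" and the reference "the cat sat on the mat", this gives a ROUGE-1 recall of 0.5 (3 of 6 reference unigrams) and a ROUGE-2 recall of 0.4 (2 of 5 reference bigrams).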

AUTOMATIC TEXT SUMMARIZATION USING SUPERVISED MACHINE LEARNING TECHNIQUE FOR HINDI LANGAUGE

International Journal of Research in Engineering and Technology ◽ 10.15623/ijret.2016.0506065 ◽ 2016 ◽ Vol 05 (06) ◽ pp. 361-367

Author(s): Nikita Desai

Keyword(s): Machine Learning ◽ Text Summarization ◽ Supervised Machine Learning ◽ Machine Learning Technique ◽ Automatic Text Summarization ◽ Learning Technique ◽ Automatic Text

A New LSA and Entropy-Based Approach for Automatic Text Document Summarization
