Improved Text Summarization of News Articles Using GA-HC and PSO-HC

Automatic Text Summarization (ATS) is gaining attention because a large volume of data is being generated at an exponential rate. Due to easy internet availability globally, a large amount of data is being generated from social networking websites, news websites and blog websites. Manual summarization is time consuming, and it is difficult to read and summarize a large amount of content. Automatic text summarization is the solution to deal with this problem. This study proposed two automatic text summarization models which are Genetic Algorithm with Hierarchical Clustering (GA-HC) and Particle Swarm Optimization with Hierarchical Clustering (PSO-HC). The proposed models use a word embedding model with Hierarchal Clustering Algorithm to group sentences conveying almost same meaning. Modified GA and adaptive PSO based sentence ranking models are proposed for text summary in news text documents. Simulations are conducted and compared with other understudied algorithms to evaluate the performance of proposed methodology. Simulations results validate the superior performance of the proposed methodology.

Download Full-text

Automatic Text Summarization Using Gensim Word2Vec and K-Means Clustering Algorithm

2020 IEEE Region 10 Symposium (TENSYMP) ◽

10.1109/tensymp50017.2020.9230670 ◽

2020 ◽

Author(s):

Mofiz Mojib Haider ◽

Md. Arman Hossin ◽

Hasibur Rashid Mahi ◽

Hossain Arif

Keyword(s):

Clustering Algorithm ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text

Download Full-text

Document Summarization with VHTM: Variational Hierarchical Topic-Aware Mechanism

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6277 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7740-7747 ◽

Cited By ~ 1

Author(s):

Xiyan Fu ◽

Jun Wang ◽

Jinghan Zhang ◽

Jinmao Wei ◽

Zhenglu Yang

Keyword(s):

Language Processing ◽

Topic Model ◽

Research Field ◽

Text Summarization ◽

Superior Performance ◽

The Past ◽

Automatic Text Summarization ◽

Latent Topics ◽

Further Development ◽

Automatic Text

Automatic text summarization focuses on distilling summary information from texts. This research field has been considerably explored over the past decades because of its significant role in many natural language processing tasks; however, two challenging issues block its further development: (1) how to yield a summarization model embedding topic inference rather than extending with a pre-trained one and (2) how to merge the latent topics into diverse granularity levels. In this study, we propose a variational hierarchical model to holistically address both issues, dubbed VHTM. Different from the previous work assisted by a pre-trained single-grained topic model, VHTM is the first attempt to jointly accomplish summarization with topic inference via variational encoder-decoder and merge topics into multi-grained levels through topic embedding and attention. Comprehensive experiments validate the superior performance of VHTM compared with the baselines, accompanying with semantically consistent topics.

Download Full-text

Automatic Text Summarization Using Latent Drichlet Allocation (LDA) for Document Clustering

International Journal of Advances in Intelligent Informatics ◽

10.26555/ijain.v1i3.43 ◽

2015 ◽

Vol 1 (3) ◽

pp. 132 ◽

Cited By ~ 5

Author(s):

Erwin Yudi Hidayat ◽

Fahri Firdausillah ◽

Khafiizh Hastuti ◽

Ika Novita Dewi ◽

Azhari Azhari

Keyword(s):

Clustering Algorithm ◽

Document Clustering ◽

Text Summarization ◽

Data Set ◽

Document Summarization ◽

Automatic Text Summarization ◽

Improve Accuracy ◽

Automatic Document Summarization ◽

Document Compression ◽

Automatic Text

In this paper, we present Latent Drichlet Allocation in automatic text summarization to improve accuracy in document clustering. The experiments involving 398 data set from public blog article obtained by using python scrapy crawler and scraper. Several steps of clustering in this research are preprocessing, automatic document compression using feature method, automatic document compression using LDA, word weighting and clustering algorithm The results show that automatic document summarization with LDA reaches 72% in LDA 40%, compared to traditional k-means method which only reaches 66%.

Download Full-text

Techniques and Issues in Text Mining

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9079 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4368-4374

Author(s):

Perpetua F. Noronha ◽

Madhu Bhan

Keyword(s):

Language Processing ◽

Text Summarization ◽

Digital Data ◽

Text Documents ◽

Significant Information ◽

Digital Era ◽

Automatic Text Summarization ◽

Text Content ◽

Available Information ◽

Automatic Text

Digital data in huge amount is being persistently generated at an unparalleled and exponential rate. In this digital era where internet stands the prime source for generating incredible information, it is vital to develop better means to mine the available information rapidly and most capably. Manual extraction of the salient information from the large input text documents is a time consuming and inefficient task. In this fast-moving world, it is difficult to read all the text-content and derive insights from it. Automatic methods are required. The task of probing for relevant documents from the large number of sources available, and consuming apt information from it is a challenging task and is need of the hour. Automatic text summarization technique can be used to generate relevant and quality information in less time. Text Summarization is used to condense the source text into a brief summary maintaining its salient information and readability. Generating summaries automatically is in great demand to attend to the growing and increasing amount of text data that is obtainable online in order to mark out the significant information and to consume it faster. Text summarization is becoming extremely popular with the advancement in Natural Language Processing (NLP) and deep learning methods. The most important gain of automatic text summarization is, it reduces the analysis time. In this paper we focus on key approaches to automatic text summarization and also about their efficiency and limitations.

Download Full-text

An Extractive Summarization Technique for Text Documents

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8369.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 1202-1206

Keyword(s):

Text Summarization ◽

Text Documents ◽

Inverse Document Frequency ◽

Extractive Summarization ◽

Term Frequency ◽

Automatic Text Summarization ◽

Document Frequency ◽

Search Information ◽

Automatic Text

In order to read as well as search information quickly, there was a need to reduce the size of the documents without any changes to its content. Therefore, in order to solve this problem, there was a solution to it by introducing a technique called as automatic text summarization which is used to generate summaries from the input document by condensing large sized input documents into smaller documents without losing its meaning as well as relevancy with respect to the original document. Text summarization stands for shortening of text into accurate, meaningful sentences. The paper shows an implementation of summarization of the original document by scoring the sentence based on term frequency and inverse document frequency matrix. The entire record was compressed so that only the relevant sentences in the document were retained. This technique can be applicable in various applications like automating text documents, quicker understanding of documents because of summarization

Download Full-text

A boundary-based tokenization technique for extractive text summarization

World Journal of Advanced Research and Reviews ◽

10.30574/wjarr.2021.11.2.0351 ◽

2021 ◽

Vol 11 (2) ◽

pp. 303-312

Author(s):

Nnaemeka M Oparauwah ◽

Juliet N Odii ◽

Ikechukwu I Ayogu ◽

Vitalis C Iwuchukwu

Keyword(s):

Academic Research ◽

Text Summarization ◽

Text Documents ◽

Health Records ◽

Extractive Summarization ◽

Content Creation ◽

Text Document ◽

Automatic Text Summarization ◽

Automatic Text ◽

Selection Of

The need to extract and manage vital information contained in copious volumes of text documents has given birth to several automatic text summarization (ATS) approaches. ATS has found application in academic research, medical health records analysis, content creation and search engine optimization, finance and media. This study presents a boundary-based tokenization method for extractive text summarization. The proposed method performs word tokenization by defining word boundaries in place of specific delimiters. An extractive summarization algorithm was further developed based on the proposed boundary-based tokenization method, as well as word length consideration to control redundancy in summary output. Experimental results showed that the proposed approach enhanced word tokenization by enhancing the selection of appropriate keywords from text document to be used for summarization.

Download Full-text

EM Clustering Algorithm for Automatic Text Summarization

Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-642-25324-9_26 ◽

2011 ◽

pp. 305-315 ◽

Cited By ~ 7

Author(s):

Yulia Ledeneva ◽

René García Hernández ◽

Romyna Montiel Soto ◽

Rafael Cruz Reyes ◽

Alexander Gelbukh

Keyword(s):

Clustering Algorithm ◽

Text Summarization ◽

Automatic Text Summarization ◽

Em Clustering ◽

Automatic Text

Download Full-text

Automatic Text Summarization on Social Media

Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control ◽

10.1145/3440084.3441182 ◽

2020 ◽

Author(s):

Zhang Kerui ◽

Hu Haichao ◽

Liu Yuxia

Keyword(s):

Social Media ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text

Download Full-text

Using librarian techniques in automatic text summarization for information retrieval

Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries - JCDL '02 ◽

10.1145/544220.544227 ◽

2002 ◽

Cited By ~ 7

Author(s):

Min-Yen Kan ◽

Judith L. Klavans

Keyword(s):

Information Retrieval ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text

Download Full-text

A Quantum-Inspired Genetic Algorithm for Extractive Text Summarization

International Journal of Natural Computing Research ◽

10.4018/ijncr.2021040103 ◽

2021 ◽

Vol 10 (2) ◽

pp. 42-60

Author(s):

Khadidja Chettah ◽

Amer Draa

Keyword(s):

Genetic Algorithm ◽

State Of The Art ◽

Text Summarization ◽

Automated System ◽

Evaluation Metrics ◽

Document Summarization ◽

Automatic Text Summarization ◽

Reference Methods ◽

Textual Data ◽

Automatic Text

Automatic text summarization has recently become a key instrument for reducing the huge quantity of textual data. In this paper, the authors propose a quantum-inspired genetic algorithm (QGA) for extractive single-document summarization. The QGA is used inside a totally automated system as an optimizer to search for the best combination of sentences to be put in the final summary. The presented approach is compared with 11 reference methods including supervised and unsupervised summarization techniques. They have evaluated the performances of the proposed approach on the DUC 2001 and DUC 2002 datasets using the ROUGE-1 and ROUGE-2 evaluation metrics. The obtained results show that the proposal can compete with other state-of-the-art methods. It is ranked first out of 12, outperforming all other algorithms.

Download Full-text