Latent Semantic Analysis in Automatic Text Summarization: A state of the art analysis

Rapid technological development makes it easier to find the information we need. Problems arise when that information is very abundant. The more information a module contains, the longer its text, and the longer it takes to grasp the module's core information. One way to obtain the core information of an entire module quickly is to read its summary, and a quick way to obtain a summary of a document is automatic text summarization. Automatic text summarization produces a text from one or more documents that conveys the important information of the source and is no longer than half the length of the original. This study aims to produce automatic text summaries of Indonesian-language learning modules and to determine the accuracy of automatic text summarization using the Cross Latent Semantic Analysis (CLSA) method. The data consist of 10 learning module files written by lecturers at Universitas Mercu Buana: 5 files in .docx format and 5 files in .pdf format. The study applies Term Frequency-Inverse Document Frequency (TF-IDF) for term weighting and Cross Latent Semantic Analysis (CLSA) for text summarization. Accuracy was tested by comparing manual summaries written by humans with the summaries produced by the system. The highest average f-measure, precision, and recall were obtained at a compression rate of 20%, with values of 0.3853, 0.432, and 0.3715 respectively.
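As an illustration of the TF-IDF term weighting mentioned in this abstract, here is a minimal Python sketch. It is not the authors' implementation: the function name `tf_idf`, the whitespace tokenization, the choice to treat each sentence as a document, and the raw-count times log(N/df) variant are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter

def tf_idf(sentences):
    """Return one {term: weight} dict per sentence (hypothetical helper).

    Assumption: each sentence is treated as a document, and the weight is
    raw term frequency multiplied by log(N / document frequency).
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights
```

A term that occurs in every sentence gets weight 0 under this variant, so only terms that distinguish sentences contribute to later scoring.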


Single Document Text Summarization of a Resource-Poor Language using an Unsupervised Technique

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a2250.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 6278-6281

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Singular Value ◽

Text Summarization ◽

Text Document ◽

Automatic Text Summarization ◽

Resource Poor ◽

Value Decomposition ◽

Scarcity Of Resources ◽

Automatic Text

Automatic text summarization of a resource-poor language is a challenging task. Unsupervised extractive techniques are often preferred for such languages because resources are scarce. Latent Semantic Analysis (LSA) is an unsupervised technique that automatically identifies semantically important sentences in a text document. Two LSA-based methods were evaluated on two datasets of a resource-poor language, applying Singular Value Decomposition (SVD) to different vector-space models. Performance is measured with ROUGE-L scores obtained by comparing the system-generated summaries with human-written model summaries. Both methods perform better on shorter documents than on longer ones.
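The abstract does not say which two LSA-based methods were evaluated. As a hedged illustration only, the sketch below shows one common selection scheme: decompose a term-by-sentence matrix with SVD and, for each leading latent topic, pick the sentence with the largest absolute weight. The function name `lsa_select` and the input layout (rows are terms, columns are sentences) are assumptions, and the code relies on NumPy.

```python
import numpy as np

def lsa_select(term_sentence, k):
    """Pick up to k sentence indices, one per leading latent topic.

    Assumption: term_sentence has one row per term and one column per
    sentence (for example, TF-IDF weights).
    """
    a = np.asarray(term_sentence, dtype=float)
    # Rows of vt are right singular vectors: for each topic, one weight per sentence.
    _, _, vt = np.linalg.svd(a, full_matrices=False)
    chosen = []
    for topic in vt[:k]:
        # Rank sentences by absolute weight in this topic; skip any already chosen.
        for idx in np.argsort(-np.abs(topic)):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return sorted(chosen)
```

Returning the indices in document order keeps the extracted sentences in their original sequence, which usually reads better than ordering them by topic strength.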


An Automatic Text Summarization on Naive Bayes Classifier Using Latent Semantic Analysis

Data, Engineering and Applications ◽

10.1007/978-981-13-6347-4_16 ◽

2019 ◽

pp. 171-180

Author(s):

Chintan Shah ◽

Anjali Jivani

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Text Summarization ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Automatic Text Summarization ◽

Automatic Text


Automatic text summarization using latent semantic analysis

Programming and Computer Software ◽

10.1134/s0361768811060041 ◽

2011 ◽

Vol 37 (6) ◽

pp. 299-305 ◽

Cited By ~ 18

Author(s):

I. V. Mashechkin ◽

M. I. Petrovskiy ◽

D. S. Popov ◽

D. V. Tsarev

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text


A Quantum-Inspired Genetic Algorithm for Extractive Text Summarization

International Journal of Natural Computing Research ◽

10.4018/ijncr.2021040103 ◽

2021 ◽

Vol 10 (2) ◽

pp. 42-60

Author(s):

Khadidja Chettah ◽

Amer Draa

Keyword(s):

Genetic Algorithm ◽

State Of The Art ◽

Text Summarization ◽

Automated System ◽

Evaluation Metrics ◽

Document Summarization ◽

Automatic Text Summarization ◽

Reference Methods ◽

Textual Data ◽

Automatic Text

Automatic text summarization has recently become a key instrument for reducing the huge quantity of textual data. In this paper, the authors propose a quantum-inspired genetic algorithm (QGA) for extractive single-document summarization. The QGA is used inside a fully automated system as an optimizer that searches for the best combination of sentences to include in the final summary. The approach is compared with 11 reference methods, including supervised and unsupervised summarization techniques, and is evaluated on the DUC 2001 and DUC 2002 datasets using the ROUGE-1 and ROUGE-2 metrics. The results show that the proposal can compete with other state-of-the-art methods: it ranks first out of 12, outperforming all other algorithms.
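For readers unfamiliar with the ROUGE-1 and ROUGE-2 metrics named in this abstract, the following is a simplified, recall-only Python sketch of ROUGE-N with clipped n-gram counts. It is not the official ROUGE toolkit and not the authors' evaluation setup: the helper names `ngrams` and `rouge_n_recall` and the lowercase whitespace tokenization are assumptions, and real evaluations typically add options such as stemming.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also appear in the candidate.

    Each match is clipped to the number of times the n-gram occurs in
    the candidate, so repeated n-grams are not over-counted.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

Calling `rouge_n_recall(summary, model_summary, 1)` gives the unigram variant and passing `2` gives the bigram variant.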


Otomatisasi Peringkasan Teks Pada Dokumen Hukum Menggunakan Metode Latent Semantic Analysis

Jurnal Informatika Polinema ◽

10.33795/jip.v7i3.515 ◽

2021 ◽

Vol 7 (3) ◽

pp. 9-16

Author(s):

Millenia Rusbandi ◽

Imam Fahrur Rozi ◽

Kadek Suarjuna Batubulan

Keyword(s):

Law Enforcement ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

Compression Rate ◽

Analysis Method ◽

Legal Documents ◽

Automatic Text Summarization ◽

Long Time ◽

Law Enforcement Officials ◽

Automatic Text

Indonesia currently records a large number of crimes, which in turn increases the number of legal documents that law enforcement officials must handle. To understand a legal document, officials such as lawyers, judges, and prosecutors must read it in full, which takes a long time. A summary lets them grasp the document more easily, so one solution is to summarize legal documents automatically, where the documents are in PDF form. One method that can be used for text summarization is the Latent Semantic Analysis algorithm, which analyzes the hidden meaning of a language, code, or other type of representation in order to obtain important information. Testing on 10 documents summarized by experts gave precision, recall, f-measure, and accuracy of 53%, 27%, 35%, and 71% at a compression rate of 75%; 54%, 56%, 55%, and 75% at a compression rate of 50%; and 51%, 79%, 61%, and 75% at a compression rate of 25%. Based on these results, the authors conclude that the Latent Semantic Analysis method can be used to summarize legal documents.
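The precision, recall, and f-measure figures reported in this abstract compare a system summary with an expert summary. A minimal sketch of that comparison follows, assuming both summaries are represented as sets of selected sentence indices; that representation and the function name `precision_recall_f` are assumptions, since the abstract does not describe the matching procedure.

```python
def precision_recall_f(system, reference):
    """Compare two extractive summaries given as collections of sentence indices."""
    system, reference = set(system), set(reference)
    # Sentences chosen by both the system and the expert.
    hits = len(system & reference)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

Under this definition a longer system summary tends to raise recall, because more expert-chosen sentences are covered, while precision depends on how many of the extra sentences the expert also chose.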


Developing a new approach to summarize Arabic text automatically using syntactic and semantic analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v9i2.30324 ◽

2020 ◽

Vol 9 (2) ◽

pp. 342

Author(s):

Amal Alkhudari

Keyword(s):

Language Processing ◽

Automatic System ◽

Semantic Analysis ◽

Text Summarization ◽

Original Text ◽

Arabic Text ◽

Wide Spread ◽

New Approach ◽

Automatic Text Summarization ◽

Automatic Text

Because information is widespread and its sources are diverse, there is a need to produce accurate text summaries with minimal time and effort. A summary must preserve the key information content and overall meaning of the original text. Text summarization is one of the most important applications of Natural Language Processing (NLP). The goal of automatic text summarization is to create summaries that are similar to human-written ones. In many cases, however, the readability of generated summaries is unsatisfactory, because the summaries do not consider the meaning of the words and do not cover all the semantically relevant aspects of the data. This paper uses syntactic and semantic analysis to propose an automatic system for summarizing Arabic texts. The system is capable of understanding the meaning of information and retrieves only the relevant part. The proposed work is evaluated on the EASC corpus using the ROUGE measure, and the generated summaries are compared against human-written summaries and the results of previous research.


Automatic Text Summarization: A State-of-the-Art Review

Proceedings of the 22nd International Conference on Enterprise Information Systems ◽

10.5220/0009723306480655 ◽

2020 ◽

Author(s):

Oleksandra Klymenko ◽

Daniel Braun ◽

Florian Matthes

Keyword(s):

State Of The Art ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text


Abstractive Summarization: A Survey of the State of the Art

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019815 ◽

2019 ◽

Vol 33 ◽

pp. 9815-9822 ◽

Cited By ~ 5

Author(s):

Hui Lin ◽

Vincent Ng

Keyword(s):

Machine Translation ◽

State Of The Art ◽

The State ◽

Text Summarization ◽

Abstract Representation ◽

Automatic Text Summarization ◽

Input Text ◽

Gradual Shift ◽

Abstractive Summarization ◽

Automatic Text

The focus of automatic text summarization research has exhibited a gradual shift from extractive methods to abstractive methods in recent years, owing in part to advances in neural methods. Originally developed for machine translation, neural methods provide a viable framework for obtaining an abstract representation of the meaning of an input text and generating informative, fluent, and human-like summaries. This paper surveys existing approaches to abstractive summarization, focusing on the recently developed neural approaches.
