Latent semantic analysis in automatic text summarisation: a state-of-the-art analysis

The number of crimes in Indonesia is currently quite large, which increases the number of legal documents that law enforcement officials must handle. To understand a legal document, officials such as lawyers, judges, and prosecutors must read it in full, which takes a long time. A summary lets them grasp the content more easily, so one solution is to summarize legal documents automatically, where the documents are in PDF form. One method that can be used for text summarization is the Latent Semantic Analysis (LSA) algorithm, which analyzes the hidden meaning of a language, code, or other type of representation in order to extract important information. In tests on 10 documents summarized by experts, automatic text summarization with LSA achieved precision, recall, F-measure, and accuracy of 53%, 27%, 35%, and 71% at a compression rate of 75%; 54%, 56%, 55%, and 75% at a compression rate of 50%; and 51%, 79%, 61%, and 75% at a compression rate of 25%. Based on these results, the authors conclude that LSA can be used to summarize legal documents.
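The four evaluation measures reported above can be illustrated with a small sketch. It assumes that both the system summary and the expert summary are sets of selected sentence indices, and that the compression rate is the share of sentences removed; the abstract does not spell out either point. The function name `summary_metrics` and the toy data are hypothetical.

```python
def summary_metrics(system, reference, n_sentences):
    """Score an extractive summary against an expert summary.

    system, reference: sets of selected sentence indices (hypothetical inputs).
    n_sentences: total number of sentences in the source document.
    """
    tp = len(system & reference)      # sentences chosen by both
    fp = len(system - reference)      # chosen by the system only
    fn = len(reference - system)      # chosen by the expert only
    tn = n_sentences - tp - fp - fn   # chosen by neither
    precision = tp / (tp + fp) if system else 0.0
    recall = tp / (tp + fn) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / n_sentences
    return precision, recall, f_measure, accuracy


# Toy example: a 20-sentence document. The expert keeps 8 sentences and the
# system keeps 5, which corresponds to a 75% compression rate under the
# assumption stated above.
expert = {0, 1, 3, 5, 8, 12, 15, 19}
system = {0, 3, 4, 8, 17}
p, r, f, acc = summary_metrics(system, expert, 20)
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f} accuracy={acc:.3f}")
```

A short system summary can only cover part of the expert summary, so recall tends to fall as the compression rate rises. This is consistent with the pattern in the reported figures.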

Automatic Text Summarization of Indonesian-Language Learning Modules Using the Cross Latent Semantic Analysis (CLSA) Method

Jurnal Edukasi dan Penelitian Informatika (JEPIN) ◽

10.26418/jp.v7i2.47768 ◽

2021 ◽

Vol 7 (2) ◽

pp. 153

Author(s):

Yunita Maulidia Sari ◽

Nenden Siti Fatonah

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Text Summarization ◽

Compression Rate ◽

Inverse Document Frequency ◽

Term Frequency ◽

Automatic Text Summarization ◽

Document Frequency ◽

Automatic Text ◽

F Measure

Rapid technological development makes it easier to find the information we need. Problems arise when there is a great deal of that information. The more information a module contains, the longer its text becomes, and the longer it takes to understand the module's core information. One way to get the core information of an entire module quickly is to read a summary of it, and a fast way to obtain a summary of a document is automatic text summarization. An automatic text summary is a text produced from one or more documents that conveys the important information of the source documents and is no longer than half the length of the source. This study aims to produce automatic text summaries of Indonesian-language learning modules and to determine the accuracy of automatic text summarization using the Cross Latent Semantic Analysis (CLSA) method. The data comprise 10 learning module files from lecturers at Universitas Mercu Buana, 5 in .docx format and 5 in .pdf format. The study applies Term Frequency-Inverse Document Frequency (TF-IDF) for term weighting and CLSA for text summarization. Accuracy was tested by comparing manual summaries written by humans with the summaries produced by the system. The highest average F-measure, precision, and recall were obtained at a compression rate of 20%, with values of 0.3853, 0.432, and 0.3715, respectively.
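The TF-IDF weighting step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes whitespace tokenization, treats each sentence as a "document" for the IDF count, and uses the plain `tf * log(N / df)` form. The function name `tf_idf` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tf_idf(sentences):
    """Return one {term: weight} dict per sentence.

    Each sentence counts as a 'document' for the IDF part, which is an
    assumption made for this sketch.
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights


sentences = [
    "networks connect computers",
    "networks need routers",
    "summaries save time",
]
for sentence, w in zip(sentences, tf_idf(sentences)):
    print(sentence, {t: round(v, 3) for t, v in w.items()})
```

Terms that occur in many sentences (here "networks") receive lower weights than terms confined to one sentence, and a term that occurs in every sentence gets a weight of zero.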

Single Document Text Summarization of a Resource-Poor Language using an Unsupervised Technique

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a2250.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 6278-6281

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Singular Value ◽

Text Summarization ◽

Text Document ◽

Automatic Text Summarization ◽

Resource Poor ◽

Value Decomposition ◽

Scarcity Of Resources ◽

Automatic Text

Automatic text summarization of a resource-poor language is a challenging task. Unsupervised extractive techniques are often preferred for such languages because resources are scarce. Latent Semantic Analysis (LSA) is an unsupervised technique that automatically identifies semantically important sentences in a text document. Two LSA-based methods were evaluated on two datasets of a resource-poor language, applying Singular Value Decomposition (SVD) to different vector-space models. Performance was measured with ROUGE-L scores obtained by comparing the system-generated summaries with human-written model summaries. Both methods were found to perform better on shorter documents than on longer ones.
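The general idea of LSA-based extractive summarization with SVD can be sketched as below. The abstract does not say which two LSA methods were evaluated, so this is one common scoring scheme chosen for illustration, not the paper's method: each sentence is scored by the length of its vector in the top-k latent space, weighted by the singular values. The raw-count term-by-sentence matrix, the function name `lsa_summarize`, and the `rank` parameter are all assumptions.

```python
import numpy as np


def lsa_summarize(sentences, n_select, rank=2):
    """Pick n_select sentences using an SVD of the term-by-sentence matrix."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-sentence matrix of raw counts (one simple vector-space model).
    a = np.zeros((len(vocab), len(sentences)))
    for j, tokens in enumerate(tokenized):
        for t in tokens:
            a[index[t], j] += 1.0

    # Thin SVD: a = u @ diag(s) @ vt, singular values in descending order.
    _, s, vt = np.linalg.svd(a, full_matrices=False)
    k = min(rank, len(s))

    # Sentence score: norm of its coordinates in the top-k latent space,
    # with each latent dimension scaled by its singular value.
    scores = np.sqrt(((s[:k, None] * vt[:k, :]) ** 2).sum(axis=0))

    # Keep the best-scoring sentences, restored to document order.
    top = sorted(np.argsort(-scores)[:n_select].tolist())
    return [sentences[j] for j in top], scores


docs = ["cats chase mice", "dogs chase cats", "stocks fell"]
summary, scores = lsa_summarize(docs, n_select=2, rank=1)
print(summary)
```

With `rank=1` only the dominant latent topic counts, so the sentence that shares no vocabulary with the others scores close to zero. Other vector-space models, such as TF-IDF weights instead of raw counts, would change only how the matrix `a` is filled.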

An Automatic Text Summarization on Naive Bayes Classifier Using Latent Semantic Analysis

Data, Engineering and Applications ◽

10.1007/978-981-13-6347-4_16 ◽

2019 ◽

pp. 171-180

Author(s):

Chintan Shah ◽

Anjali Jivani

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Text Summarization ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Automatic Text Summarization ◽

Automatic Text

Automatic text summarization using latent semantic analysis

Programming and Computer Software ◽

10.1134/s0361768811060041 ◽

2011 ◽

Vol 37 (6) ◽

pp. 299-305 ◽

Cited By ~ 18

Author(s):

I. V. Mashechkin ◽

M. I. Petrovskiy ◽

D. S. Popov ◽

D. V. Tsarev

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Text Summarization ◽

Automatic Text Summarization ◽

Automatic Text

Forecasting in Service Supply Chain Systems: A State-of-the-Art Review Using Latent Semantic Analysis

Advances in Business and Management Forecasting ◽

10.1108/s1477-407020170000012011 ◽

2017 ◽

pp. 181-212 ◽

Cited By ~ 2

Author(s):

Sudhanshu Joshi ◽

Manu Sharma ◽

Shalu Rathi

Keyword(s):

Supply Chain ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

State Of The Art ◽

Service Supply Chain

Improving Website Usability with Latent Semantic Analysis

PsycEXTRA Dataset ◽

10.1037/e577712012-027 ◽

2006 ◽

Author(s):

Sarah A. Nuehring ◽

Peter W. Foltz

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Website Usability

Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words

IEEJ Transactions on Electronics Information and Systems ◽

10.1541/ieejeiss.132.1473 ◽

2012 ◽

Vol 132 (9) ◽

pp. 1473-1480

Author(s):

Masashi Kimura ◽

Shinta Sawada ◽

Yurie Iribe ◽

Kouichi Katsurada ◽

Tsuneo Nitta

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Spoken Words ◽

Visual Scenes

Similarity Detection Using Latent Semantic Analysis Algorithm

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i8.124 ◽

2018 ◽

Vol 6 (8) ◽

pp. 102

Author(s):

Priyanka R. Patil ◽

Shital A. Patil

Keyword(s):

Latent Semantic Analysis ◽

Latent Dirichlet Allocation ◽

Semantic Analysis ◽

Mining Method ◽

Research Papers ◽

Information Measures ◽

Automated Software ◽

Day By Day ◽

Ways Of Life ◽

Dirichlet Allocation

Similarity View is an application for visually comparing and exploring multiple models of text and collections of documents. Friendbook discovers users' lifestyles from user-centric sensor data, measures the similarity of lifestyles between users, and recommends friends to users whose lifestyles are highly similar. A user's daily life is modeled as life documents, from which lifestyles are extracted using the Latent Dirichlet Allocation (LDA) algorithm. Manual methods cannot be relied on for checking research papers, because the assigned reviewer may have insufficient knowledge of the research discipline and may hold different subjective views, causing possible misinterpretations. There is an urgent need for an effective and feasible approach to checking submitted research papers with the support of automated software. Text mining offers a way to check research papers semantically and automatically. The proposed method finds the similarity of text across a collection of documents using the LDA algorithm and two variants of Latent Semantic Analysis (LSA): LSA with synonyms, which looks up synonyms of the text index-wise in the English WordNet dictionary, and LSA without synonyms, which finds text similarity based on the index alone. LSA achieves higher accuracy when synonyms are considered for matching.
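The effect of synonym matching on text similarity can be illustrated with a simplified sketch. It computes cosine similarity over plain term-count vectors and omits the SVD projection that full LSA would add. The small `SYNONYMS` dictionary is a hypothetical stand-in for a WordNet lookup, and `cosine_similarity` is an illustrative name, not part of the described system.

```python
import math
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: word -> canonical synonym.
SYNONYMS = {"automobile": "car", "quick": "fast", "rapid": "fast"}


def cosine_similarity(text_a, text_b, use_synonyms=False):
    """Cosine similarity of two texts over term-count vectors."""
    def vectorize(text):
        tokens = text.lower().split()
        if use_synonyms:
            # Map each word to its canonical synonym when one is known.
            tokens = [SYNONYMS.get(t, t) for t in tokens]
        return Counter(tokens)

    va, vb = vectorize(text_a), vectorize(text_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


a = "the quick car"
b = "the rapid automobile"
print(cosine_similarity(a, b))                     # without synonyms
print(cosine_similarity(a, b, use_synonyms=True))  # with synonyms
```

Without synonym handling the two sentences share only one term, so their similarity is low. Once synonyms are mapped to a common form they match on every term, which mirrors the abstract's observation that matching improves when synonyms are considered.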
