Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure

Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) has been proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for having low discriminative power in representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. First, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Second, we propose SVD on clusters for LSI and explain theoretically that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Third, two corpora, one Chinese and one English, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods.
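For orientation, the baseline that this abstract builds on (plain SVD-based LSI followed by an interdocument similarity measure) can be sketched as follows. This is a minimal illustration, not the paper's SVD-on-clusters method: the toy term-document matrix, the rank k = 2, the use of cosine similarity, and the function names are all assumptions made for the example.

```python
import numpy as np

def lsi_document_vectors(term_doc, k):
    """Return one k-dimensional vector per document from a rank-k truncated SVD."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Columns of diag(s_k) @ Vt_k are the documents in the latent space;
    # transpose so that each row is one document.
    return (s[:k, None] * Vt[:k, :]).T

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

# Hypothetical toy corpus: rows are terms, columns are documents.
# Documents 0 and 1 share a term; documents 2 and 3 share a different term.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)

docs = lsi_document_vectors(A, k=2)
sims = cosine_similarity_matrix(docs)
print(np.round(sims, 3))
```

In this toy case the rank-2 projection collapses each pair of related documents onto the same direction, which illustrates both the representative quality of LSI and the loss of discriminative power the abstract refers to.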

Two uses for updating the partial singular value decomposition in latent semantic indexing

Applied Numerical Mathematics ◽

10.1016/j.apnum.2007.01.016 ◽

2008 ◽

Vol 58 (4) ◽

pp. 499-510 ◽

Cited By ~ 5

Author(s):

Jane E. Tougas ◽

Raymond J. Spiteri

Keyword(s):

Singular Value Decomposition ◽

Latent Semantic Indexing ◽

Singular Value ◽

Semantic Indexing ◽

Partial Singular Value Decomposition ◽

Value Decomposition

Clustering and latent semantic indexing aspects of the singular value decomposition

International Journal of Information and Decision Sciences ◽

10.1504/ijids.2016.075790 ◽

2016 ◽

Vol 8 (1) ◽

pp. 53 ◽

Cited By ~ 2

Author(s):

Andri Mirzal

Keyword(s):

Singular Value Decomposition ◽

Latent Semantic Indexing ◽

Singular Value ◽

Semantic Indexing ◽

Value Decomposition

A COMPARISON OF METHODS FOR MODIFYING THE PARTIAL SINGULAR VALUE DECOMPOSITION IN LATENT SEMANTIC INDEXING

Proceedings of the Nova Scotian Institute of Science (NSIS) ◽

10.15273/pnsis.v43i2.3645 ◽

2006 ◽

Vol 43 (2) ◽

Author(s):

Jane E. Tougas

Keyword(s):

Singular Value Decomposition ◽

Matrix Factorization ◽

Latent Semantic Indexing ◽

Factorization Method ◽

Singular Value ◽

Semantic Indexing ◽

Partial Singular Value Decomposition ◽

Database ◽

Computationally Expensive ◽

Value Decomposition

The tremendous size of the Internet and modern databases has made efficient searching and information retrieval (IR) important. Latent semantic indexing (LSI) is an IR method that represents a dataset as a term-document matrix. LSI uses a matrix factorization method known as the partial singular value decomposition (PSVD). Calculating the PSVD of a large term-document matrix is computationally expensive. In a rapidly expanding environment, a term-document matrix is altered often as new documents and terms are added. Recomputing the PSVD of the term-document matrix each time these slight alterations occur can be prohibitively expensive. Folding-in is one method of adding new documents or terms to an LSI database; updating the PSVD of the existing LSI database is another. The folding-in method is computationally inexpensive, but may cause deterioration in the accuracy of the PSVD. The PSVD-updating method is computationally more expensive than the folding-in method, but better maintains the accuracy of the PSVD. Folding-up is a new method that combines folding-in and PSVD-updating. Folding-up is faster than either recomputing the PSVD or PSVD-updating, but avoids the degradation in the PSVD that can occur when the folding-in method is used on its own.
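The folding-in step described in this abstract can be sketched with the standard LSI formula, in which a new document vector d is projected as S_k^{-1} U_k^T d and appended to the document factor. The code below is a minimal sketch under that assumption; the function names, the random test matrix, and k = 2 are illustrative and do not come from the paper, and the PSVD-updating and folding-up methods are not shown.

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k partial SVD: returns U_k (m x k), s_k (k,), Vt_k (k x n)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_document(U_k, s_k, Vt_k, d):
    """Append a new document column d without recomputing the SVD.

    U_k and s_k are left unchanged, which is why folding-in is cheap
    and also why accuracy can degrade as more documents are folded in.
    """
    v_new = (U_k.T @ d) / s_k  # coordinates of d in the latent space
    return np.hstack([Vt_k, v_new[:, None]])

# Hypothetical 6-term, 4-document matrix.
rng = np.random.default_rng(0)
A = rng.random((6, 4))
U_k, s_k, Vt_k = truncated_svd(A, k=2)

# Folding in a copy of document 0 should reproduce its existing coordinates.
Vt_extended = fold_in_document(U_k, s_k, Vt_k, A[:, 0])
print(Vt_extended.shape)
```

Because the term factor and singular values are never refreshed, the columns of the extended document factor are in general no longer orthonormal, which is one way to see the loss of accuracy the abstract mentions.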

Updating the partial singular value decomposition in latent semantic indexing

Computational Statistics & Data Analysis ◽

10.1016/j.csda.2006.12.018 ◽

2007 ◽

Vol 52 (1) ◽

pp. 174-183 ◽

Cited By ~ 12

Author(s):

Jane E. Tougas ◽

Raymond J. Spiteri

Keyword(s):

Singular Value Decomposition ◽

Latent Semantic Indexing ◽

Singular Value ◽

Semantic Indexing ◽

Partial Singular Value Decomposition ◽

Value Decomposition

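PENERAPAN LATENT SEMANTIC INDEXING PADA SISTEM TEMU BALIK INFORMASI PADA UNDANG-UNDANG PEMILU BERDASARKAN KASUS [Application of Latent Semantic Indexing in a Case-Based Information Retrieval System for Election Law]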

Jurnal Mnemonic ◽

10.36040/mnemonic.v4i2.4165 ◽

2021 ◽

Vol 4 (2) ◽

pp. 64-70

Author(s):

Agung Hasbi Ardiansyah ◽

Kurnia Paranita Kartika ◽

Saiful Nur Budiman

Keyword(s):

Singular Value Decomposition ◽

Latent Semantic Indexing ◽

Singular Value ◽

Semantic Indexing ◽

Value Decomposition ◽

F Measure

When a finding or report of a suspected election violation is received, election supervisors clarify it and search for sufficient evidence before deciding whether the finding or report constitutes a violation. During the clarification process, election supervisors look for the articles that may have been violated in the incoming finding or report. The large number of reference articles for each case sometimes hampers the work of election supervisory officers, so a tool is needed to speed up the search for articles based on the violation case. In this study, an information retrieval system is used to find the articles of Law Number 10 of 2016 that are relevant to a case, based on the case description. The study uses the Latent Semantic Indexing (LSI) method. LSI uses the Singular Value Decomposition (SVD) technique to reduce dimensionality. The study uses 37 articles, and 4 cases or violation descriptions as queries. The system accepts a query or violation case description as input, then computes and determines the related articles. The success of the method in finding relevant search results is reflected in a recall of 100%, a precision of 70%, and an F-measure of 82%.
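The three reported figures are mutually consistent if the F-measure is the balanced F1 score, that is, the harmonic mean of precision and recall. The abstract does not state which variant was used, so the beta = 1 default below is an assumption, and the function name is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Values reported in the abstract: precision 70%, recall 100%.
f1 = f_measure(0.70, 1.00)
print(f"{f1:.2%}")  # 82.35%, which rounds to the reported 82%
```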

A Singular Value Decomposition Approach to Similarity Evaluation Between Servo Loops of CNC Machine Tools

Journal of Manufacturing Science and Engineering ◽

10.1115/1.3010713 ◽

2008 ◽

Vol 130 (6) ◽

Author(s):

Zhixiang Xu ◽

Tim Green

Keyword(s):

Time Series ◽

Singular Value Decomposition ◽

Similarity Measure ◽

Machine Tools ◽

Singular Value ◽

Cnc Machine Tools ◽

Cnc Machine ◽

Similarity Ratio ◽

Value Decomposition ◽

Circular Interpolation

In most cases, the servo loops of computer numerically controlled (CNC) machine tools consist of position controllers, drivers, power transmissions, and tables. In the diagnosis, adjustment, and calibration of CNC machine tools, it is crucial to make the performances of the servo loops as similar as possible, and ideally identical. This work is motivated by the need for a measure to evaluate the similarity between all coordinated axes. Based on the singular value decomposition (SVD) of time series, this contribution presents an approach for setting up a similarity measure to evaluate the performance of CNC machines. A circular interpolation is carried out to sample the displacements of the two axes involved into two independent time series. Then a special matrix called an attractor is constructed from each time series, and the SVD algorithm is applied to the attractors. As a result, a series of singular values is produced. From these values, the singular value ratio spectrum is formed and the similarity ratio, which numerically represents the similarity between the coordinated axes, is proposed. The similarity of the two series is then compared according to the similarity ratio. Finally, the approach has been validated by experimental measurements. The similarity measure presented in this paper provides an overall index for evaluating the mismatch between coordinated axes of CNC machine tools.
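The abstract does not define the attractor matrix, the singular value ratio spectrum, or the similarity ratio, so the following is only a rough sketch of the general idea of applying SVD to a matrix built from a time series. It assumes a delay-embedding (lagged-window) matrix and compares sum-normalized singular values; the window length, the synthetic cosine and sine signals, and the `spectrum_gap` quantity are hypothetical and are not the paper's definitions.

```python
import numpy as np

def trajectory_matrix(series, window):
    """Stack lagged windows of a 1-D series as rows (delay embedding)."""
    series = np.asarray(series, dtype=float)
    n_rows = series.size - window + 1
    return np.array([series[i:i + window] for i in range(n_rows)])

def normalized_singular_values(series, window):
    """Singular values of the trajectory matrix, scaled to sum to 1."""
    s = np.linalg.svd(trajectory_matrix(series, window), compute_uv=False)
    return s / s.sum()

# Idealized circular interpolation: the two axes trace cosine and sine.
t = np.linspace(0.0, 2.0 * np.pi, 200)
x_axis = np.cos(t)
y_axis = np.sin(t)

sx = normalized_singular_values(x_axis, window=10)
sy = normalized_singular_values(y_axis, window=10)

# Hypothetical dissimilarity: total absolute gap between the two spectra.
spectrum_gap = np.abs(sx - sy).sum()
print(np.round(sx[:3], 4), np.round(sy[:3], 4), round(float(spectrum_gap), 4))
```

A pure sinusoid yields a trajectory matrix of rank 2, so nearly all of the normalized spectrum sits in the first two singular values; deviations from that pattern are the kind of signal a spectrum-based comparison could pick up.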

A novel singular value decomposition-based similarity measure method for non-local means denoising

Signal Image and Video Processing ◽

10.1007/s11760-021-01948-9 ◽

2021 ◽

Author(s):

Yi Wang ◽

Xiao Song ◽

Kai Chen ◽

Xing Zhang ◽

Ming Tie ◽

...

Keyword(s):

Singular Value Decomposition ◽

Similarity Measure ◽

Singular Value ◽

Local Means ◽

Non Local ◽

Value Decomposition

Analysis and Linear Algebra: The Singular Value Decomposition and Applications

10.1090/stml/094 ◽

2021 ◽

Author(s):

James Bisgard

Keyword(s):

Singular Value Decomposition ◽

Linear Algebra ◽

Singular Value ◽

Value Decomposition

Linear Algebra for Pattern Processing: Projection, Singular Value Decomposition, and Pseudoinverse

Synthesis Lectures on Signal Processing ◽

10.2200/s01084ed1v01y202104spr021 ◽

2021 ◽

Vol 12 (1) ◽

pp. 1-155

Author(s):

Kenichi Kanatani

Keyword(s):

Singular Value Decomposition ◽

Linear Algebra ◽

Singular Value ◽

Pattern Processing ◽

Value Decomposition

Truncated singular value decomposition in ripped photo recovery

ITM Web of Conferences ◽

10.1051/itmconf/20213604008 ◽

2021 ◽

Vol 36 ◽

pp. 04008

Author(s):

Kong Hoong Lem

Keyword(s):

Singular Value Decomposition ◽

Linear Algebra ◽

Singular Value ◽

Frobenius Norm ◽

Truncated Singular Value Decomposition ◽

Matrix Decompositions ◽

Truncated Svd ◽

Value Decomposition

Singular value decomposition (SVD) is one of the most useful matrix decompositions in linear algebra. Here, a novel application of SVD to the recovery of ripped photos is explored. Recovery was done by applying truncated SVD iteratively. Performance was evaluated using the Frobenius norm. Results from a few experimental photos were decent.
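One common way to apply truncated SVD iteratively is to fill the missing pixels with a guess, take a low-rank approximation, copy the approximation back into the missing positions only, and repeat. The sketch below follows that scheme as an assumption; the abstract does not give the paper's exact procedure. The synthetic rank-1 "image", the random mask standing in for the ripped region, the rank, and the iteration count are all hypothetical.

```python
import numpy as np

def truncated_svd_approx(matrix, rank):
    """Best rank-`rank` approximation in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def iterative_svd_fill(image, known, rank, n_iter=100):
    """Fill pixels where `known` is False by repeated truncated SVD."""
    # Start by filling the missing pixels with the mean of the known ones.
    filled = np.where(known, image, image[known].mean())
    for _ in range(n_iter):
        low_rank = truncated_svd_approx(filled, rank)
        # Keep known pixels fixed; update only the missing ones.
        filled = np.where(known, image, low_rank)
    return filled

def relative_frobenius_error(estimate, reference):
    """Frobenius norm of the error, relative to the reference."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)

# Synthetic rank-1 grayscale image with about 20% of pixels missing.
rng = np.random.default_rng(0)
image = np.outer(np.linspace(1.0, 2.0, 30), np.linspace(1.0, 3.0, 40))
known = rng.random(image.shape) > 0.2

baseline = np.where(known, image, image[known].mean())
recovered = iterative_svd_fill(image, known, rank=1)

err_before = relative_frobenius_error(baseline, image)
err_after = relative_frobenius_error(recovered, image)
print(err_before, err_after)
```

Real photos are not exactly low rank and a tear removes a contiguous region rather than scattered pixels, so the rank and the number of iterations would need tuning in practice.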