A PSO Based Approach for Producing Optimized Latent Factor in Special Reference to Big Data

Author(s):  
Bharat Singh ◽  
Om Prakash Vyas

Applications dealing with Big Data are now widely used in many popular areas, and researchers have developed various approaches over the last few decades to handle such data. One recently investigated technique for factoring a data matrix into a known number of latent factors in a lower-dimensional space is the so-called matrix factorization. A known problem with NMF approaches is that their randomized initial values cannot guarantee a global optimum within a limited number of iterations, yielding only a local optimum. To address this, the authors have proposed a new approach that chooses the initial values of the decomposition so as to tackle the issue of computational expense. They have devised an algorithm, based on PSO, for initializing the values of the decomposed matrices. In this paper, the authors present a genetic algorithm-based technique incorporating nonnegative matrix factorization. Through experimental results, they show that the proposed method converges very fast in comparison with other low-rank approximation techniques such as the simple multiplicative-update NMF and ACLS.
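To make the idea concrete, the following is a minimal Python sketch of PSO-initialized NMF: a toy particle swarm searches for nonnegative starting factors W and H with small reconstruction error, and standard multiplicative updates then refine them. The swarm size, inertia and acceleration coefficients, cost function, and iteration counts are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np

def nmf_multiplicative(V, W, H, iters=200, eps=1e-9):
    """Standard multiplicative-update NMF starting from the given W, H."""
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def pso_init(V, rank, n_particles=20, iters=50, seed=0):
    """Toy PSO over flattened (W, H) entries to pick a good nonnegative starting point."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    dim = rank * (m + n)
    pos = rng.uniform(0, 1, (n_particles, dim))
    vel = np.zeros_like(pos)

    def cost(x):
        W = x[:m * rank].reshape(m, rank)
        H = x[m * rank:].reshape(rank, n)
        return np.linalg.norm(V - W @ H)

    pbest = pos.copy()
    pbest_cost = np.array([cost(x) for x in pos])
    gbest = pbest[pbest_cost.argmin()].copy()
    w, c1, c2 = 0.7, 1.5, 1.5                       # inertia and acceleration coefficients
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0, None)           # keep candidate factors nonnegative
        costs = np.array([cost(x) for x in pos])
        improved = costs < pbest_cost
        pbest[improved], pbest_cost[improved] = pos[improved], costs[improved]
        gbest = pbest[pbest_cost.argmin()].copy()
    W0 = gbest[:m * rank].reshape(m, rank)
    H0 = gbest[m * rank:].reshape(rank, n)
    return W0, H0

# Usage on a small random nonnegative matrix
V = np.random.default_rng(1).random((30, 20))
W0, H0 = pso_init(V, rank=5)
W, H = nmf_multiplicative(V, W0, H0)
print(np.linalg.norm(V - W @ H))
```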

2019 ◽  
Vol 364 ◽  
pp. 129-137
Author(s):  
Peitao Wang ◽  
Zhaoshui He ◽  
Kan Xie ◽  
Junbin Gao ◽  
Michael Antolovich ◽  
...  

2021 ◽  
Vol 37 ◽  
pp. 583-597
Author(s):  
Patrick Groetzner

In data science and machine learning, the method of nonnegative matrix factorization (NMF) is a powerful tool that enjoys great popularity. Depending on the concrete application, there exist several subclasses, each of which performs an NMF under certain constraints. Consider a given square matrix $A$. The symmetric NMF aims for a nonnegative low-rank approximation $A\approx XX^T$ to $A$, where $X$ is entrywise nonnegative and of given order. For a rectangular input matrix $A$, the general NMF again aims for a nonnegative low-rank approximation to $A$, now of the type $A\approx XY$ for entrywise nonnegative matrices $X,Y$ of given order. In this paper, we introduce a new heuristic method to tackle the exact nonnegative matrix factorization problem (of type $A=XY$), based on projection approaches to solve a certain feasibility problem.
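For orientation, here is a generic alternating nonnegative least squares sketch of the general NMF problem $A\approx XY$. It is a baseline illustration only, not the projection-based feasibility heuristic proposed in the paper; the rank, iteration count, and random initialization are assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def nmf_anls(A, rank, iters=100, seed=0):
    """Alternating nonnegative least squares: fix Y and solve for X row-wise, then fix X
    and solve for Y column-wise, each step being a small NNLS problem."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.random((m, rank))
    Y = rng.random((rank, n))
    for _ in range(iters):
        for i in range(m):                 # update row i of X: min ||Y^T x - A[i]||, x >= 0
            X[i], _ = nnls(Y.T, A[i])
        for j in range(n):                 # update column j of Y: min ||X y - A[:, j]||, y >= 0
            Y[:, j], _ = nnls(X, A[:, j])
    return X, Y

# Usage: approximate a random nonnegative 20 x 15 matrix with rank 4 factors
A = np.random.default_rng(2).random((20, 15))
X, Y = nmf_anls(A, rank=4)
print(np.linalg.norm(A - X @ Y))
```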


Author(s):  
Xunpeng Huang ◽  
Le Wu ◽  
Enhong Chen ◽  
Hengshu Zhu ◽  
Qi Liu ◽  
...  

Matrix Factorization (MF) is among the most widely used techniques for collaborative filtering based recommendation. Along this line, a critical demand is to incrementally refine MF models as new ratings arrive in an online scenario. However, most existing incremental MF algorithms are limited to specific MF models or carry strict usage restrictions. In this paper, we propose a general incremental MF framework by designing a linear transformation of user and item latent vectors over time. This framework achieves relatively high accuracy with a computation- and space-efficient training process in an online scenario. Meanwhile, we explain the framework from a low-rank approximation perspective and give an upper bound on the training error when the framework is used for incremental learning in some special cases. Finally, extensive experimental results on two real-world datasets clearly validate the effectiveness, efficiency, and storage performance of the proposed framework.
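As background for the incremental setting, the sketch below shows a common online baseline: refining fixed-size user and item latent factors by stochastic gradient descent on newly arrived ratings only. It is not the linear-transformation framework proposed in the paper; the learning rate, regularization, and epoch count are illustrative assumptions.

```python
import numpy as np

def incremental_sgd_update(P, Q, new_ratings, lr=0.01, reg=0.02, epochs=5):
    """Online SGD refinement of user factors P (n_users x k) and item factors Q (n_items x k)
    using only the batch of newly arrived ratings.
    new_ratings: iterable of (user_index, item_index, rating)."""
    for _ in range(epochs):
        for u, i, r in new_ratings:
            err = r - P[u] @ Q[i]                      # prediction error on the new rating
            P[u] += lr * (err * Q[i] - reg * P[u])     # gradient step on the user vector
            Q[i] += lr * (err * P[u] - reg * Q[i])     # gradient step on the item vector
    return P, Q

# Usage: refine a small random model with three new ratings
rng = np.random.default_rng(0)
P, Q = rng.random((100, 8)), rng.random((50, 8))
P, Q = incremental_sgd_update(P, Q, [(3, 10, 4.0), (7, 2, 1.0), (3, 49, 5.0)])
```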


2021 ◽  
pp. 000370282110447
Author(s):  
Joseph Dubrovkin

Storage, processing, and transfer of huge matrices are becoming challenging tasks in process analytical technology and scientific research. Matrix compression can solve these problems successfully. We developed a novel compression method for spectral data matrices based on low-rank approximation and the fast Fourier transform of the singular vectors. This method differs from known ones in that it does not require restoring the low-rank approximated matrix for further Fourier processing; therefore, the compression ratio increases. A compromise between the loss of accuracy in restoring the data matrix and the compression ratio was achieved by selecting the processing parameters. The method was applied to multivariate chemometric analysis of cow milk for determining fat and protein content, using two data matrices (file sizes of 5.7 and 12.0 MB) restored from their compressed form. The corresponding compression ratios were about 52 and 114, while the loss of accuracy of the analysis was less than 1% compared with processing of the non-compressed matrix. A huge simulated matrix, compressed from 400 MB to 1.9 MB, was successfully used for multivariate calibration and segment cross-validation. The data set simulated a large matrix of 10 000 low-noise infrared spectra measured in the range 4000–400 cm−1 with a resolution of 0.5 cm−1. The corresponding file was compressed from 262.8 MB to 19.8 MB. The discrepancies between the original and restored spectra were less than the standard deviation of the noise. The method developed in this article clearly demonstrates its potential for future applications to chemometrics-enhanced spectrometric analysis under limited memory size and data transfer rate. The algorithm uses standard routines of Matlab software.
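A schematic Python sketch of the two ingredients named in the abstract, truncated SVD followed by Fourier truncation of the singular vectors, is given below. It is not the author's Matlab implementation: the rank, the number of retained FFT coefficients, and the largest-magnitude selection rule are illustrative assumptions, and a real implementation would store the truncated coefficients sparsely rather than zeroing them.

```python
import numpy as np

def compress_spectra(D, rank, keep):
    """Truncated SVD of the spectra matrix D (rows = spectra), then keep only the
    `keep` largest-magnitude FFT coefficients of each singular vector."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :rank], s[:rank], Vt[:rank]

    def fft_truncate(M):
        F = np.fft.rfft(M, axis=1)                        # FFT along each singular vector
        small = np.argsort(np.abs(F), axis=1)[:, :-keep]  # indices of the smaller coefficients
        np.put_along_axis(F, small, 0.0, axis=1)          # drop them (store the rest sparsely)
        return F

    return fft_truncate(U_r.T), s_r, fft_truncate(Vt_r), D.shape

def restore_spectra(FU, s, FVt, shape):
    m, n = shape
    U = np.fft.irfft(FU, n=m, axis=1).T                   # back to (m, rank)
    Vt = np.fft.irfft(FVt, n=n, axis=1)                   # back to (rank, n)
    return (U * s) @ Vt

# Usage on a toy "spectra" matrix: 200 Gaussian bands sampled at 1000 points
x = np.linspace(0, 1, 1000)
D = np.vstack([np.exp(-(x - c) ** 2 / 0.01) for c in np.linspace(0.2, 0.8, 200)])
FU, s, FVt, shape = compress_spectra(D, rank=10, keep=100)
print(np.linalg.norm(D - restore_spectra(FU, s, FVt, shape)) / np.linalg.norm(D))
```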


Acta Numerica ◽  
2006 ◽  
Vol 15 ◽  
pp. 327-384 ◽  
Author(s):  
Lars Eldén

Ideas and algorithms from numerical linear algebra are important in several areas of data mining. We give an overview of linear algebra methods in text mining (information retrieval), pattern recognition (classification of handwritten digits), and PageRank computations for web search engines. The emphasis is on rank reduction as a method of extracting information from a data matrix, low-rank approximation of matrices using the singular value decomposition and clustering, and on eigenvalue methods for network analysis.
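Since the survey's central tool is rank reduction via the SVD, a minimal numpy example of the best rank-k approximation (Eckart-Young) may help fix ideas; the matrix and rank here are arbitrary placeholders.

```python
import numpy as np

def low_rank_approx(A, k):
    """Best rank-k approximation of A in the 2-norm / Frobenius norm (Eckart-Young), via the SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Example: a 100 x 50 term-document-style matrix reduced to rank 10
A = np.random.default_rng(0).random((100, 50))
A10 = low_rank_approx(A, 10)
print(np.linalg.matrix_rank(A10), np.linalg.norm(A - A10))
```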


2019 ◽  
Vol 12 (S10) ◽  
Author(s):  
Junning Gao ◽  
Lizhi Liu ◽  
Shuwei Yao ◽  
Xiaodi Huang ◽  
Hiroshi Mamitsuka ◽  
...  

Background: As a standardized vocabulary of phenotypic abnormalities associated with human diseases, the Human Phenotype Ontology (HPO) has been widely used by researchers to annotate the phenotypes of genes/proteins. To save the cost and time spent on experiments, many computational approaches have been proposed. They alleviate the problem to some extent, but their performance is still far from satisfactory. Method: For inferring large-scale protein-phenotype associations, we propose HPOAnnotator, which incorporates multiple sources of Protein-Protein Interaction (PPI) information and the hierarchical structure of the HPO. Specifically, we use a dual graph to regularize Non-negative Matrix Factorization (NMF) in a way that the information from different sources can be seamlessly integrated. In essence, HPOAnnotator solves the sparsity problem of the protein-phenotype association matrix by using a low-rank approximation. Results: By combining the hierarchical structure of the HPO and the co-annotations of proteins, our model can well capture HPO semantic similarities. Moreover, graph Laplacian regularizations are imposed in the latent space so as to utilize multiple PPI networks. The performance of HPOAnnotator has been validated under cross-validation and an independent test. Experimental results show that HPOAnnotator significantly outperforms the competing methods. Conclusions: Through extensive comparisons with state-of-the-art methods, we conclude that the proposed HPOAnnotator achieves superior performance as a result of using a low-rank approximation with graph regularization. Our approach is promising in that it can serve as a starting point for studying more efficient matrix factorization-based algorithms.
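In the same spirit, the sketch below shows a generic graph-regularized NMF with multiplicative updates, where summed PPI Laplacians penalize the protein-side latent factors. It is a standard GNMF-style illustration, not the exact dual-graph HPOAnnotator model; the penalty placement, weight lam, and iteration count are assumptions.

```python
import numpy as np

def graph_regularized_nmf(A, ppi_laplacians, rank, lam=0.1, iters=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF A ~ W H (proteins x phenotypes) with a graph penalty
    lam * tr(W^T L W) on the protein factors, where L = D - S sums the PPI Laplacians."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    # Split the summed Laplacian into its degree (D) and adjacency (S) parts for the update rule
    L_sum = sum(ppi_laplacians)
    D = np.diag(np.diag(L_sum))
    S = D - L_sum
    for _ in range(iters):
        W *= (A @ H.T + lam * S @ W) / (W @ H @ H.T + lam * D @ W + eps)
        H *= (W.T @ A) / (W.T @ W @ H + eps)
    return W, H

# Usage on random toy data: 40 proteins, 25 phenotype terms, one random PPI network
rng = np.random.default_rng(3)
A = (rng.random((40, 25)) > 0.8).astype(float)           # sparse 0/1 association matrix
adj = (rng.random((40, 40)) > 0.9).astype(float)
adj = np.triu(adj, 1) + np.triu(adj, 1).T                # symmetric PPI adjacency
L = np.diag(adj.sum(axis=1)) - adj                       # its graph Laplacian
W, H = graph_regularized_nmf(A, [L], rank=5)
```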

