Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content

Author(s):  
Xiaolong Gong ◽  
Linpeng Huang ◽  
Fuwei Wang

Real web datasets are often associated with multiple views, such as long and short commentaries, user preferences, and so on. However, with the rapid growth of user-generated text, each view of a dataset has a large feature space, which makes the matrix decomposition process computationally challenging. In this paper, we propose a novel multi-view clustering algorithm based on non-negative matrix factorization that uses a feature sampling strategy to reduce the complexity of the iteration process. In particular, our method exploits unsupervised semantic information in the learning process to capture the intrinsic similarity through graph regularization. Moreover, we use the Hilbert-Schmidt Independence Criterion (HSIC) to explore the unsupervised semantic diversity information among the multi-view contents of a single web item. The overall objective is to minimize the loss function of multi-view non-negative matrix factorization combined with an intra-semantic similarity graph regularizer and an inter-semantic diversity term. We demonstrate the effectiveness of the proposed method against several state-of-the-art methods on a large real-world dataset, Doucom, and on three smaller datasets.
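As a rough illustration of the HSIC term mentioned in this abstract, the sketch below computes the standard biased empirical HSIC estimate between two views. The linear kernel, the function names, and the random data are assumptions made for illustration; the abstract does not state which kernel or estimator the authors use.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of inner products between the rows (items) of X."""
    return X @ X.T

def hsic(K, L):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Hypothetical low-dimensional representations of the same 50 items in two views.
rng = np.random.default_rng(0)
A = rng.random((50, 4))
B_dep = A + 0.01 * rng.random((50, 4))  # almost a copy of A: strongly dependent
B_ind = rng.random((50, 4))             # drawn independently of A

hsic_dependent = hsic(linear_kernel(A), linear_kernel(B_dep))
hsic_independent = hsic(linear_kernel(A), linear_kernel(B_ind))
print(hsic_dependent, hsic_independent)
```

A larger value indicates stronger dependence between the two representations, so a diversity term built on HSIC would typically be minimized to push the views toward complementary information.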

2019 ◽  
Vol 13 (S1) ◽  
Author(s):  
Na Yu ◽  
Ying-Lian Gao ◽  
Jin-Xing Liu ◽  
Juan Wang ◽  
Junliang Shang

Background: As one of the most popular data representation methods, non-negative matrix factorization (NMF) has attracted wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometrical structure in the data. At the same time, noise and outliers are inevitably present in the data. Results: To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures only the relationship between pairs of sample points, it captures high-order relationships among more than two sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual, because the L2,1-norm is insensitive to noise and outliers. Conclusions: Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that our proposed model outperforms other state-of-the-art methods.
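The robustness claim for the L2,1-norm can be made concrete with a small numerical sketch. It assumes the common convention in which each column of the residual matrix is one sample and the L2,1-norm is the sum of the column-wise Euclidean norms; the abstract does not spell out the convention, and the matrices below are made up for illustration.

```python
import numpy as np

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E (one column per sample)."""
    return np.sum(np.linalg.norm(E, axis=0))

def squared_frobenius(E):
    """Sum of squared entries, the loss used by standard NMF."""
    return np.sum(E ** 2)

clean = np.ones((3, 5))   # hypothetical residual: small error on every sample
corrupted = clean.copy()
corrupted[:, 0] = 10.0    # one sample is an outlier with a 10x larger error

l21_ratio = l21_norm(corrupted) / l21_norm(clean)                    # 2.8
fro_ratio = squared_frobenius(corrupted) / squared_frobenius(clean)  # 20.8
print(l21_ratio, fro_ratio)
```

The outlier's error enters the L2,1-norm linearly but enters the squared Frobenius loss quadratically, so a single corrupted sample dominates the latter far more than the former.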


2012 ◽  
Vol 226-228 ◽  
pp. 760-764
Author(s):  
Ning Li ◽  
Hai Ting Chen

Blind source separation (BSS) has been successfully used to extract undetected fault vibration sources from mixed observation signals by assuming that the unknown vibration sources are mutually independent. However, conventional BSS algorithms cannot handle the situation in which the fault source is partially dependent on, or correlated with, other sources. To address this, a matrix decomposition method called Non-negative Matrix Factorization (NMF) is introduced to separate these partially correlated signals. In this paper, the observed temporal signals are transformed into the frequency domain to satisfy the non-negativity constraint of NMF. A constraint of least correlation between the separated sources is added to the cost function of NMF to enhance its stability, yielding the proposed constrained non-negative matrix factorization (CNMF). The simulation results show that the separation performance of CNMF is superior to that of common BSS algorithms, and the experimental results verify the practical performance of CNMF.
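The preprocessing step described in this abstract, moving temporal signals into the frequency domain so that NMF can be applied, can be sketched as follows. This is plain NMF with the well-known Lee-Seung multiplicative updates, not the authors' CNMF: the least-correlation constraint is omitted, and the signals, mixing weights, and function names are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF via Lee-Seung multiplicative updates for ||V - W H||_F^2."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical observations: two sensors, each a mixture of two sinusoidal sources.
t = np.arange(1024) / 1024.0
s1 = np.sin(2 * np.pi * 50 * t)
s2 = np.sin(2 * np.pi * 120 * t)
X = np.vstack([0.8 * s1 + 0.2 * s2, 0.3 * s1 + 0.7 * s2])

# Time-domain samples take both signs; magnitude spectra are non-negative.
V = np.abs(np.fft.rfft(X, axis=1))

W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(V.shape, rel_error)
```

Here the rows of H play the role of estimated source spectra and W holds the mixing weights; a constrained variant would add a penalty on the correlation between the rows of H to the cost function.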


2020 ◽  
Vol 2 (4) ◽  
pp. 630-646
Author(s):  
Nannan Li ◽  
Shengfa Wang ◽  
Haohao Li ◽  
Zhiyang Li

Feature analysis is a fundamental research area in computer graphics, and meaningful, part-aware feature bases are always in demand. This paper proposes a framework for conducting feature analysis on a three-dimensional (3D) model by introducing a modified Non-negative Matrix Factorization (NMF) model into the graphical feature space and using it to enable further applications. By analyzing and utilizing the intrinsic ideas behind NMF, we propose conducting the factorization on feature matrices constructed from descriptors or graphs, which provides a simple but effective way to produce compressed and scale-aware descriptors. To enable part-aware model analysis, we modify the NMF model to be sparse and constrained with regard to both bases and encodings, which gives rise to Sparse and Constrained Non-negative Matrix Factorization (SAC-NMF). Subsequently, by adapting the analytical components (including hidden variables, bases, and encodings) to design descriptors, several applications are realized easily and effectively. Extensive experimental results demonstrate that the proposed framework has several attractive advantages, including efficiency and extensibility.

