Semi-supervised Orthogonal Graph Embedding with Recursive Projections

Author(s):  
Hanyang Liu ◽  
Junwei Han ◽  
Feiping Nie

Many graph-based semi-supervised dimensionality reduction algorithms use a projection matrix to linearly map the data matrix from the original feature space to a lower-dimensional representation. However, the dimensionality after reduction is inevitably restricted to the number of classes, and the learned non-orthogonal projection matrix usually fails to preserve distances well or to balance the weights on different projection directions. This paper proposes a novel dimensionality reduction method called semi-supervised orthogonal graph embedding with recursive projections (SOGE). We integrate manifold smoothness and label fitness with a penalty on the linear mapping mismatch, and learn an orthogonal projection on the Stiefel manifold, which empirically demonstrates better performance. Moreover, we recursively update the projection matrix in its orthocomplemented space to continuously learn more projection vectors, so as to better control the reduced dimension. Comprehensive experiments on several benchmarks demonstrate significant improvement over existing methods.
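The idea of learning each new projection vector inside the orthogonal complement of the earlier ones can be illustrated with a minimal sketch. It is not the SOGE algorithm: the objective here is a stand-in (maximizing w^T S w by power iteration, i.e. PCA-style deflation), and the matrix `S` and the function names are hypothetical. Only the recursive orthocomplement step reflects the abstract.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def project_out(v, basis):
    """Remove from v its components along each orthonormal vector in basis."""
    for b in basis:
        c = dot(v, b)
        v = [a - c * x for a, x in zip(v, b)]
    return v

def recursive_orthogonal_directions(S, k, iters=500):
    """Return k orthonormal directions, each found in the orthogonal
    complement of the directions already learned.
    Stand-in objective (assumption): maximize w^T S w via power iteration."""
    d = len(S)
    W = []
    for j in range(k):
        # Deterministic start vector, restricted to the orthocomplement of W.
        w = normalize(project_out([1.0 + 0.1 * (i + j) for i in range(d)], W))
        for _ in range(iters):
            # Power step, then re-project so w never leaves the orthocomplement.
            w = normalize(project_out(matvec(S, w), W))
        W.append(w)
    return W

# Hypothetical symmetric positive semi-definite matrix standing in for the
# quantity a real method would optimize over.
S = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 1.0]]
W = recursive_orthogonal_directions(S, k=2)

# Embed a point by taking its coordinates along the learned directions.
x = [1.0, 2.0, 3.0]
z = [dot(w, x) for w in W]
```

Because every new direction is constrained to the orthocomplement, the number of directions is limited only by the feature dimension, and the rows of `W` stay orthonormal by construction.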

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 75748-75766 ◽  
Author(s):  
Jianping Gou ◽  
Zhang Yi ◽  
David Zhang ◽  
Yongzhao Zhan ◽  
Xiangjun Shen ◽  
...  

2021 ◽  
Author(s):  
Rogini Runghen ◽  
Daniel B Stouffer ◽  
Giulio Valentino Dalla Riva

Collecting network interaction data is difficult. Non-exhaustive sampling and complex hidden processes often result in an incomplete data set. Thus, identifying potentially present but unobserved interactions is crucial both for understanding the structure of large-scale data and for predicting how previously unseen elements will interact. Recent studies in network analysis have shown that accounting for metadata (such as node attributes) can improve both our understanding of how nodes interact with one another and the accuracy of link prediction. However, the dimension of the object we need to learn to predict interactions in a network grows quickly with the number of nodes, so the problem becomes computationally and conceptually challenging for large networks. Here, we present a new predictive procedure combining a graph embedding method with machine learning techniques to predict interactions on the basis of nodes' metadata. Graph embedding methods project the nodes of a network onto a low-dimensional latent feature space. The position of the nodes in the latent feature space can then be used to predict interactions between nodes. Learning a mapping from the nodes' metadata to their positions in a latent feature space corresponds to a classic, low-dimensional machine learning problem. In our current study we used the Random Dot Product Graph model to estimate the embedding of an observed network, and we tested different neural network architectures to predict the position of nodes in the latent feature space. Flexible machine learning techniques for mapping the nodes onto their latent positions make it possible to account for multivariate and possibly complex node metadata. To illustrate the utility of the proposed procedure, we apply it to a large dataset of tourist visits to destinations across New Zealand.
We found that our procedure accurately predicts interactions for both existing nodes and nodes newly added to the network, while remaining computationally feasible even for very large networks. Overall, our study highlights that by exploiting the properties of a well-understood statistical model for complex networks and combining it with standard machine learning techniques, we can simplify the link prediction problem when incorporating multivariate node metadata. Our procedure can be applied immediately to different types of networks and to a wide variety of data from different systems. As such, from both a network science and a data science perspective, our work offers a flexible and generalisable procedure for link prediction.
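As a rough illustration of the prediction step described above: under a random dot product graph, the probability of an interaction between two nodes is the dot product of their latent positions. The sketch below assumes the latent positions are already available. The linear map `predict_position` is a hypothetical placeholder for the neural networks mentioned in the abstract, and all numbers are made up for illustration.

```python
def predict_position(metadata, weights):
    """Hypothetical stand-in for a learned metadata-to-position model:
    a plain linear map (the abstract uses neural networks instead)."""
    return [sum(w * m for w, m in zip(row, metadata)) for row in weights]

def link_probability(x_source, x_target):
    """Random dot product graph rule: interaction probability is the dot
    product of the two latent positions. Estimated positions can give
    values slightly outside [0, 1], so the result is clipped."""
    p = sum(a * b for a, b in zip(x_source, x_target))
    return min(1.0, max(0.0, p))

# Illustrative values only (not taken from the study).
weights = [[0.5, 0.0],
           [0.0, 0.5]]
tourist_metadata = [1.0, 0.2]
destination_position = [0.8, 0.3]

tourist_position = predict_position(tourist_metadata, weights)
p = link_probability(tourist_position, destination_position)
```

A node that was never observed in the network can still be scored this way, since only its metadata is needed to place it in the latent space.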


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 121 ◽  
Author(s):  
Yongsheng Qi ◽  
Xuebin Meng ◽  
Chenxi Lu ◽  
Xuejin Gao ◽  
Lin Wang

Multiple phases with phase-to-phase transitions are important characteristics of many batch processes. Traditional algorithms consider the linear characteristics between phases but neglect nonlinearities, which can lead to inaccurate and inefficient monitoring. This paper focuses on nonlinear multi-phase batch processes. A similarity metric is defined based on kernel entropy component analysis (KECA), and a KECA similarity-based method is proposed for phase division and fault monitoring. First, nonlinear characteristics are extracted in feature space by performing KECA on each preprocessed time-slice data matrix. Then, phase division is achieved using the similarity variation of the extracted feature information. Next, a series of KECA models and slide-KECA models are established for steady and transition phases, respectively, which can reflect the diversity of transitional characteristics objectively and better handle the stage-transition monitoring problem in multistage batch processes. Furthermore, to overcome the problem that the traditional contribution plot cannot be applied in the kernel mapping space, a nonlinear contribution plot diagnosis algorithm is proposed, which is easier, more intuitive, and more implementable than the traditional one. Finally, simulations are performed on penicillin fermentation and an industrial application. Specifically, the proposed method detects the abnormal agitation power and the abnormal substrate supply at 47 h and 86 h, respectively. Compared with traditional methods, it has better real-time performance and higher efficiency. The results demonstrate the ability of the proposed method to detect faults accurately and effectively in practice.


2020 ◽  
Vol 144 ◽  
pp. 113079 ◽  
Author(s):  
Jianping Gou ◽  
Yuanyuan Yang ◽  
Zhang Yi ◽  
Jiancheng Lv ◽  
Qirong Mao ◽  
...  

2015 ◽  
Vol 4 (2) ◽  
pp. 336
Author(s):  
Alaa Najim

Using dimensionality reduction to visualize graph data sets can preserve the properties of the original space and reveal the underlying information shared among data points. Continuity Trustworthy Graph Embedding (CTGE) is a new method introduced in this paper to improve the faithfulness of graph visualization. We use CTGE on graph data to find a new, understandable representation that is easier to analyze and study. Several experiments on real graph data sets test the effectiveness and efficiency of the proposed method and show that CTGE generates a highly faithful graph representation compared with other methods.


2020 ◽  
Vol 98 ◽  
pp. 107023 ◽  
Author(s):  
Xiang-Jun Shen ◽  
Si-Xing Liu ◽  
Bing-Kun Bao ◽  
Chun-Hong Pan ◽  
Zheng-Jun Zha ◽  
...  

Biostatistics ◽  
2018 ◽  
Vol 21 (3) ◽  
pp. 610-624
Author(s):  
Ziyi Li ◽  
Changgee Chang ◽  
Suprateek Kundu ◽  
Qi Long

Summary: Biclustering techniques can identify local patterns of a data matrix by clustering feature space and sample space at the same time. Various biclustering methods have been proposed and successfully applied to the analysis of gene expression data. While existing biclustering methods have many desirable features, most of them were developed for continuous data, and few can efficiently handle -omics data of various types, for example, binomial data as in single nucleotide polymorphism data or negative binomial data as in RNA-seq data. In addition, none of the existing methods can utilize biological information such as that from functional genomics or proteomics. Recent work has shown that incorporating biological information can improve variable selection and prediction performance in analyses such as linear regression and multivariate analysis. In this article, we propose a novel Bayesian biclustering method that can handle multiple data types, including Gaussian, binomial, and negative binomial. In addition, our method uses a Bayesian adaptive structured shrinkage prior that enables feature selection guided by existing biological information. Our simulation studies and application to multi-omics datasets demonstrate the robust and superior performance of the proposed method compared to other existing biclustering methods.

