Wasserstein Soft Label Propagation on Hypergraphs: Algorithm and Generalization Error Bounds

Author(s):
Tingran Gao
Shahab Asoodeh
Yi Huang
James Evans

Inspired by recent interest in developing machine learning and data mining algorithms on hypergraphs, we investigate in this paper the semi-supervised learning algorithm of propagating "soft labels" (e.g., probability distributions, class membership scores) over hypergraphs by means of optimal transportation. Borrowing insights from Wasserstein propagation on graphs [Solomon et al. 2014], we reformulate the label propagation procedure as a message-passing algorithm, which lends itself naturally to a generalization applicable to hypergraphs through Wasserstein barycenters. Furthermore, in a PAC learning framework, we provide generalization error bounds for propagating one-dimensional distributions on graphs and hypergraphs using the 2-Wasserstein distance, by establishing the algorithmic stability of the proposed semi-supervised learning algorithm. These theoretical results also shed new light on Wasserstein propagation on graphs.
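
The barycentric message-passing update can be prototyped with the POT library: each unlabeled vertex repeatedly replaces its distribution with the entropic Wasserstein barycenter of its hyperedge neighbors' distributions. This is a minimal sketch under assumed choices (a toy hypergraph, a fixed 1-D bin grid, entropic regularization, a fixed number of sweeps), not the authors' exact algorithm.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

n_bins = 32
grid = np.linspace(0.0, 1.0, n_bins).reshape(-1, 1)
M = ot.dist(grid, grid)        # squared-Euclidean ground cost on the bin grid
M /= M.max()

# Toy hypergraph on vertices 0..3; hyperedges are vertex sets (assumption).
hyperedges = [{0, 1, 2}, {2, 3}]
labeled = {0: ot.datasets.make_1D_gauss(n_bins, m=8, s=3),
           3: ot.datasets.make_1D_gauss(n_bins, m=24, s=3)}
dist = {v: labeled.get(v, np.full(n_bins, 1.0 / n_bins)) for v in range(4)}

for _ in range(20):  # fixed-point sweeps over the unlabeled vertices
    for v in set(range(4)) - set(labeled):
        # Neighbors of v: all other vertices sharing a hyperedge with v.
        nbrs = sorted({u for e in hyperedges if v in e for u in e if u != v})
        A = np.column_stack([dist[u] for u in nbrs])
        # Message-passing update: entropic Wasserstein barycenter of neighbors.
        dist[v] = ot.bregman.barycenter(A, M, reg=1e-2)
```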

Author(s):
Pinar Demetci
Rebecca Santorella
Björn Sandstede
William Stafford Noble
Ritambhara Singh

Data integration of single-cell measurements is critical for understanding cell development and disease, but the lack of correspondence between different types of measurements makes such efforts challenging. Several unsupervised algorithms can align heterogeneous single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data domains. However, these algorithms require hyperparameter tuning for high-quality alignments, which is difficult in an unsupervised setting without correspondence information for validation. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov-Wasserstein-based optimal transport to align single-cell multi-omics datasets. We compare the alignment performance of SCOT with state-of-the-art algorithms on four simulated and two real-world datasets. SCOT performs on par with state-of-the-art methods but is faster and requires tuning fewer hyperparameters. Furthermore, we provide an algorithm that uses the Gromov-Wasserstein distance to guide SCOT's parameter selection. Thus, unlike previous methods, SCOT aligns well without using any orthogonal correspondence information to pick the hyperparameters. Our source code and scripts for replicating the results are available at https://github.com/rsinghlab/SCOT.
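
The core primitive behind SCOT, aligning two unpaired point clouds by matching their intra-domain geometries, can be sketched with the POT library. The data, distance choices, regularization strength, and barycentric projection below are illustrative assumptions; SCOT's actual pipeline (graph-based intra-domain distances, normalization, and the self-tuning procedure) is in the repository linked above.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))   # domain 1, e.g. gene-expression features
Y = rng.normal(size=(100, 5))    # domain 2, e.g. chromatin-accessibility features

# Intra-domain distance matrices stand in for each domain's geometry.
C1 = ot.dist(X, X); C1 /= C1.max()
C2 = ot.dist(Y, Y); C2 /= C2.max()
p, q = ot.unif(len(X)), ot.unif(len(Y))

# Coupling matrix: soft correspondences between cells across the two domains.
coupling = ot.gromov.entropic_gromov_wasserstein(
    C1, C2, p, q, loss_fun="square_loss", epsilon=5e-2)

# Barycentric projection: map domain-2 cells into domain 1's coordinates.
Y_on_X = (coupling / coupling.sum(axis=0)).T @ X
```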


Author(s):
Bao Bing-Kun
Yan Shuicheng

Graph-based learning provides a useful approach for modeling data in image annotation problems. In this chapter, the authors introduce how to construct a region-based graph to annotate large-scale multi-label images. It is well recognized that analysis at the semantic region level may greatly improve image annotation performance compared to analysis at the whole-image level. However, the region-level approach increases the data scale by several orders of magnitude and poses new challenges to most existing algorithms. To this end, each image is first encoded as a Bag-of-Regions based on multiple image segmentations. Then, all image regions are assembled into a large k-nearest-neighbor graph with an efficient Locality-Sensitive Hashing (LSH) method. Finally, a sparse and region-aware image-based graph is fed into the multi-label extension of the entropic graph regularized semi-supervised learning algorithm (Subramanya & Bilmes, 2009). In combination, these steps naturally yield the capability to handle large-scale datasets. Extensive experiments on the NUS-WIDE (260k images) and COREL-5k datasets validate the effectiveness and efficiency of the framework for region-aware and scalable multi-label propagation.
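
The graph-construction step can be sketched as follows: hash region descriptors into buckets with random-hyperplane LSH, then run an exact k-nearest-neighbor search only within each bucket. The descriptor dimensionality, hash width, and k below are illustrative assumptions, and the sketch omits the multiple-segmentation encoding and the entropic semi-supervised learner.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
regions = rng.normal(size=(10_000, 64))  # one descriptor per image region
n_bits, k = 12, 5

# Random-hyperplane LSH: each descriptor hashes to a 12-bit binary signature.
planes = rng.normal(size=(64, n_bits))
codes = regions @ planes > 0
buckets = defaultdict(list)
for i, code in enumerate(codes):
    buckets[code.tobytes()].append(i)

# Approximate kNN graph: exact search restricted to each LSH bucket.
edges = set()
for idx in buckets.values():
    pts = regions[idx]
    d = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    for row, i in enumerate(idx):
        for col in np.argsort(d[row])[1:k + 1]:   # skip self at position 0
            edges.add((min(i, idx[col]), max(i, idx[col])))
```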


2012
Vol 24 (3)
pp. 700-723
Author(s):
Aykut Erdem
Marcello Pelillo

Graph transduction is a popular class of semisupervised learning techniques that aims to estimate a classification function defined over a graph of labeled and unlabeled data points. The general idea is to propagate the provided label information to unlabeled nodes in a consistent way. In contrast to the traditional view, in which the process of label propagation is defined as a graph Laplacian regularization, this article proposes a radically different perspective, based on game-theoretic notions. Within the proposed framework, the transduction problem is formulated in terms of a noncooperative multiplayer game whereby equilibria correspond to consistent labelings of the data. An attractive feature of this formulation is that it is inherently a multiclass approach and imposes no constraint whatsoever on the structure of the pairwise similarity matrix, being able to naturally deal with asymmetric and negative similarities alike. Experiments on a number of real-world problems demonstrate that the proposed approach performs well compared with state-of-the-art algorithms, and it can deal effectively with various types of similarity relations.
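
A minimal way to see the game dynamics at work: give each unlabeled point a mixed strategy over classes, define its payoff for a class as the similarity-weighted agreement with its neighbors, and iterate the discrete replicator dynamics until the strategies settle. The payoff definition and the nonnegative similarity matrix below are illustrative assumptions rather than the article's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_classes = 30, 3
W = rng.random((n, n))                        # nonnegative pairwise similarities
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

X = np.full((n, n_classes), 1.0 / n_classes)  # mixed strategies, one per player
labeled = {0: 0, 1: 1, 2: 2}                  # player -> observed class
for i, c in labeled.items():
    X[i] = np.eye(n_classes)[c]               # labeled players play pure strategies

for _ in range(100):
    payoff = W @ X                            # expected payoff per (player, class)
    X = X * payoff                            # discrete replicator update
    X /= X.sum(axis=1, keepdims=True)
    for i, c in labeled.items():
        X[i] = np.eye(n_classes)[c]           # labeled strategies stay fixed

pred = X.argmax(axis=1)                       # consistent labeling at equilibrium
```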


2011
Vol 23 (12)
pp. 3287-3302
Author(s):
Henning Sprekeler

The past decade has seen a rise of interest in Laplacian eigenmaps (LEMs) for nonlinear dimensionality reduction. LEMs have been used in spectral clustering, in semisupervised learning, and for providing efficient state representations for reinforcement learning. Here, we show that LEMs are closely related to slow feature analysis (SFA), a biologically inspired, unsupervised learning algorithm originally designed for learning invariant visual representations. We show that SFA can be interpreted as a function approximation of LEMs, where the topological neighborhoods required for LEMs are implicitly defined by the temporal structure of the data. Based on this relation, we propose a generalization of SFA to arbitrary neighborhood relations and demonstrate its applicability for spectral clustering. Finally, we review previous work with the goal of providing a unifying view on SFA and LEMs.
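
The SFA-LEM connection can be made concrete in the linear case: slow feature analysis minimizes the variance of the temporal derivative under a unit-variance constraint, a generalized eigenproblem whose implicit "neighborhood graph" links temporally adjacent samples. The toy mixing below is an assumption for illustration.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
t = np.linspace(0, 10 * np.pi, 2000)
# Observations: a slow sinusoid mixed with a fast one, plus a little noise.
S = np.column_stack([np.sin(0.1 * t), np.sin(5.0 * t)])
X = S @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(2000, 5))
X -= X.mean(axis=0)

dX = np.diff(X, axis=0)  # temporal derivative = edges between adjacent samples
# Linear SFA: minimize <dy^2> subject to unit variance, i.e. the generalized
# eigenproblem (dX'dX) v = lambda (X'X) v; smallest eigenvalue = slowest feature.
evals, evecs = eigh(dX.T @ dX / len(dX), X.T @ X / len(X))
slow = X @ evecs[:, 0]   # recovers the slow sinusoid (up to sign and scale)
```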


1997
Vol 9 (6)
pp. 1211-1243
Author(s):
David H. Wolpert

This article presents several additive corrections to the conventional quadratic loss bias-plus-variance formula. One of these corrections is appropriate when both the target is not fixed (as in Bayesian analysis) and training sets are averaged over (as in the conventional bias-plus-variance formula). Another additive correction casts conventional fixed-training-set Bayesian analysis directly in terms of bias plus variance. Another correction is appropriate for measuring full generalization error over a test set rather than (as with conventional bias plus variance) error at a single point. Yet another correction can help explain the recent counterintuitive bias-variance decomposition of Friedman for zero-one loss. After presenting these corrections, this article discusses some other loss-function-specific aspects of supervised learning. In particular, there is a discussion of the fact that if the loss function is a metric (e.g., zero-one loss), then there is a bound on the change in generalization error accompanying changing the algorithm's guess from h1 to h2, a bound that depends only on h1 and h2 and not on the target. This article ends by presenting versions of the bias-plus-variance formula appropriate for logarithmic and quadratic scoring, and then all the additive corrections appropriate to those formulas. All the correction terms presented are a covariance between the learning algorithm and the posterior distribution over targets. Accordingly, in the (very common) contexts in which those terms apply, there is not a “bias-variance trade-off” or a “bias-variance dilemma,” as one often hears. Rather, there is a bias-variance-covariance trade-off.
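
For quadratic loss, the covariance correction is already visible in the elementary identity that holds when both the target t and the algorithm's guess h are treated as random; the following is that generic identity, not Wolpert's specific correction terms.

```latex
% Quadratic loss with a random target t and a random guess h:
% elementary algebra, not the article's exact correction terms.
\mathbb{E}\bigl[(t - h)^2\bigr]
  = \underbrace{\bigl(\mathbb{E}[t] - \mathbb{E}[h]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}(t)}_{\text{noise}}
  + \underbrace{\operatorname{Var}(h)}_{\text{variance}}
  - \underbrace{2\operatorname{Cov}(t, h)}_{\text{covariance correction}}
```

With a fixed target, Var(t) and Cov(t, h) vanish and the usual bias-plus-variance formula is recovered.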


Author(s):
Yunsheng Shi
Zhengjie Huang
Shikun Feng
Hui Zhong
Wenjing Wang
...

Graph neural networks (GNNs) and the label propagation algorithm (LPA) are both message-passing algorithms, which have achieved superior performance in semi-supervised classification. A GNN performs feature propagation through a neural network to make predictions, while LPA propagates labels across the graph adjacency matrix to obtain results. However, there is still no effective way to combine these two kinds of algorithms directly. To address this issue, we propose a novel Unified Message Passing model (UniMP) that can incorporate feature and label propagation at both training and inference time. First, UniMP adopts a Graph Transformer network, taking feature embeddings and label embeddings as input information for propagation. Second, to train the network without overfitting to its own input label information through self-loops, UniMP introduces a masked label prediction strategy, in which some percentage of the input label information is masked at random and then predicted. UniMP conceptually unifies feature propagation and label propagation and is empirically powerful. It obtains new state-of-the-art semi-supervised classification results on the Open Graph Benchmark (OGB).
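
The masked label prediction strategy can be sketched in a few lines: label embeddings are added to node features as propagation input, but each training step hides a random fraction of the known labels and computes the loss only on those hidden nodes. The single normalized-adjacency hop below stands in for the paper's Graph Transformer, and all sizes and rates are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_classes = 200, 16, 4
X = rng.normal(size=(n, d))                    # node features
y = rng.integers(0, n_classes, size=n)         # ground-truth labels
train = rng.random(n) < 0.5                    # nodes whose labels are observed
label_emb = rng.normal(size=(n_classes, d))    # learnable label embeddings

A = (rng.random((n, n)) < 0.05).astype(float)  # toy graph
A = np.maximum(A, A.T)
A = np.maximum(A, np.eye(n))                   # add self-loops
A /= A.sum(axis=1, keepdims=True)              # row-normalize for propagation

def training_step(mask_rate=0.3):
    # Hide a random fraction of the observed labels; they become the targets.
    masked = train & (rng.random(n) < mask_rate)
    visible = train & ~masked
    H = X.copy()
    H[visible] += label_emb[y[visible]]        # inject only the unmasked labels
    H = A @ H                                  # one message-passing hop
    logits = H @ label_emb.T                   # class scores for every node
    return logits[masked], y[masked]           # loss is computed on masked nodes
```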

