Entropy-Regularized Optimal Transport on Multivariate Normal and q-normal Distributions

Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 302
Author(s):  
Qijun Tong ◽  
Kei Kobayashi

Distances and divergences between probability measures play a central role in statistics, machine learning, and many other related fields. The Wasserstein distance has received much attention in recent years because of its distinctions from other distances and divergences. Although computing the Wasserstein distance is costly, entropy-regularized optimal transport was proposed as a computationally efficient approximation. The purpose of this study is to understand the theoretical aspects of entropy-regularized optimal transport. In this paper, we focus on entropy-regularized optimal transport on multivariate normal distributions and q-normal distributions. We obtain the explicit form of the entropy-regularized optimal transport cost on multivariate normal and q-normal distributions; this provides a perspective from which to understand the effect of entropy regularization, which was previously known only experimentally. Furthermore, we obtain the entropy-regularized Kantorovich estimator for probability measures that satisfy certain conditions. We also demonstrate experimentally how the Wasserstein distance, optimal coupling, geometric structure, and statistical efficiency are affected by entropy regularization. In particular, our results on the explicit form of the optimal coupling for Tsallis entropy-regularized optimal transport on multivariate q-normal distributions and on the entropy-regularized Kantorovich estimator are novel, and constitute a first step towards understanding a more general setting.
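Since the paper's closed-form expressions are its main contribution, a numerical comparison is easy to set up. The sketch below (our illustration, not the authors' code) compares the classical closed-form W₂² between two Gaussians with a log-domain Sinkhorn estimate of the entropy-regularized cost on samples; the regularization strength eps, iteration count, and sample sizes are arbitrary choices.

```python
import numpy as np
from scipy.linalg import sqrtm
from scipy.special import logsumexp

def gaussian_w2_sq(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between two Gaussians."""
    r = sqrtm(S1)
    cross = sqrtm(r @ S2 @ r)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2 * np.real(cross)))

def sinkhorn_cost(X, Y, eps=0.05, n_iter=200):
    """Entropy-regularized OT cost <P, C> via log-domain Sinkhorn iterations."""
    n, m = len(X), len(Y)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        f = eps * (np.log(a) - logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (np.log(b) - logsumexp((f[:, None] - C) / eps, axis=0))
    P = np.exp((f[:, None] + g[None, :] - C) / eps)  # regularized optimal coupling
    return float((P * C).sum())

rng = np.random.default_rng(0)
m1, S1 = np.zeros(2), np.eye(2)
m2, S2 = np.ones(2), np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(m1, S1, 500)
Y = rng.multivariate_normal(m2, S2, 500)
print("closed-form W2^2:     ", gaussian_w2_sq(m1, S1, m2, S2))
print("entropic OT, eps=0.05:", sinkhorn_cost(X, Y))  # lies above the unregularized value
```

The regularized cost exceeds the unregularized one because the entropic coupling is suboptimal for the linear cost, which is exactly the bias the paper's explicit formulas quantify for Gaussians.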

2019 ◽  
Vol 31 (5) ◽  
pp. 827-848 ◽  
Author(s):  
Shun-ichi Amari ◽  
Ryo Karakida ◽  
Masafumi Oizumi ◽  
Marco Cuturi

We propose a new divergence on the manifold of probability distributions, building on the entropic regularization of optimal transport problems. As Cuturi (2013) showed, regularizing the optimal transport problem with an entropic term brings several computational benefits. However, because of that regularization, the resulting approximation of the optimal transport cost does not define a proper distance or divergence between probability distributions. We recently introduced a family of divergences connecting the Wasserstein distance and the Kullback-Leibler divergence from an information-geometry point of view (see Amari, Karakida, & Oizumi, 2018). However, that proposal was not able to retain key intuitive aspects of the Wasserstein geometry, such as translation invariance, which plays a key role in the more general problem of computing optimal transport barycenters. The divergence we propose in this work retains such properties and admits an intuitive interpretation.
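For context, one widely used way to restore the missing divergence axioms, distinct from the information-geometric construction proposed here, is the debiased Sinkhorn divergence of Genevay et al. (2018). A minimal sketch using the POT library (the example data and eps are ours):

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def ot_eps(X, Y, eps):
    """Entropy-regularized OT cost between two empirical point clouds."""
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    M = ot.dist(X, Y)  # squared Euclidean cost matrix
    return ot.sinkhorn2(a, b, M, reg=eps, method='sinkhorn_log')

def sinkhorn_divergence(X, Y, eps=0.5):
    """Debiased Sinkhorn divergence: vanishes when the two measures coincide."""
    return ot_eps(X, Y, eps) - 0.5 * ot_eps(X, X, eps) - 0.5 * ot_eps(Y, Y, eps)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
Y = rng.normal(size=(300, 2)) + 1.0
print("raw OT_eps(X, X):", ot_eps(X, X, 0.5))          # nonzero: not a divergence
print("S_eps(X, X):     ", sinkhorn_divergence(X, X))  # exactly zero after debiasing
print("S_eps(X, Y):     ", sinkhorn_divergence(X, Y))
```

The debiasing subtracts the self-transport terms; the divergence proposed in this paper instead works at the level of the information geometry of the regularized problem, which is what lets it keep translation invariance.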


Author(s):  
Nhan Dam ◽  
Quan Hoang ◽  
Trung Le ◽  
Tu Dinh Nguyen ◽  
Hung Bui ◽  
...  

We propose a new formulation for learning generative adversarial networks (GANs) using the optimal transport cost (the general form of the Wasserstein distance) as the objective criterion measuring the dissimilarity between the target distribution and the learned distribution. Our formulation is based on the general form of the Kantorovich duality, which applies to optimal transport with a wide range of cost functions that are not necessarily metric. To make optimising this dual form amenable to gradient-based methods, we employ a function that acts as an amortised optimiser for the innermost optimisation problem. Interestingly, the amortised optimiser can be viewed as a mover, since it strategically shifts data points around. The resulting formulation is a sequential min-max-min game with three players: the generator, the critic, and the mover, where the new player, the mover, attempts to fool the critic by shifting the data around. Despite involving three players, we demonstrate that our proposed formulation can be trained effectively via a simple alternating gradient learning strategy. Compared with existing Lipschitz-constrained formulations of the Wasserstein GAN on CIFAR-10, our model yields significantly better diversity scores than weight clipping, and performance comparable to the gradient penalty method.
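A toy sketch of the three-player alternating updates may help fix ideas. The objective below is our simplified reading of the amortised Kantorovich dual (the mover M approximates the inner infimum of the c-transform); the network sizes, learning rates, and 2-D toy data are placeholders, not the paper's setup:

```python
import torch
import torch.nn as nn

def mlp(din, dout):
    return nn.Sequential(nn.Linear(din, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, dout))

torch.manual_seed(0)
zdim, xdim = 8, 2
G = mlp(zdim, xdim)   # generator: noise -> fake sample
f = mlp(xdim, 1)      # critic: Kantorovich potential
M = mlp(xdim, xdim)   # mover: amortised optimiser for the c-transform infimum

opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_f = torch.optim.Adam(f.parameters(), lr=1e-4)
opt_M = torch.optim.Adam(M.parameters(), lr=1e-4)
cost = lambda x, y: ((x - y) ** 2).sum(dim=1)  # ground cost c(x, y)

for step in range(2000):
    x = torch.randn(128, xdim) + torch.tensor([2.0, 0.0])  # toy "real" data
    z = torch.randn(128, zdim)
    # mover (inner min): approximate f^c(x) = inf_y c(x, y) - f(y)
    m_loss = (cost(x, M(x)) - f(M(x)).squeeze(1)).mean()
    opt_M.zero_grad(); m_loss.backward(); opt_M.step()
    # critic (max): dual objective E_data[f^c] + E_gen[f]
    d = (cost(x, M(x)) - f(M(x)).squeeze(1)).mean() + f(G(z)).squeeze(1).mean()
    opt_f.zero_grad(); (-d).backward(); opt_f.step()
    # generator (outer min): lower the critic's value on fake samples
    g_loss = f(G(z)).squeeze(1).mean()
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```

The mover shifting real points to where c(x, y) - f(y) is small is what "fooling the critic by shifting the data around" amounts to in this reading.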


2020 ◽  
Vol 24 ◽  
pp. 703-717
Author(s):  
Aurélien Alfonsi ◽  
Benjamin Jourdain

In this paper, we remark that any optimal coupling for the quadratic Wasserstein distance W₂²(μ,ν) between two probability measures μ and ν with finite second-order moments on ℝᵈ is the composition of a martingale coupling with an optimal transport map T. We check the existence of an optimal coupling in which this map gives the unique optimal coupling between μ and T#μ. Next, we give a direct proof that σ ↦ W₂²(σ,ν) is differentiable at μ in the Lions sense (Cours au Collège de France, 2008) if and only if there is a unique optimal coupling between μ and ν and this coupling is given by a map. It was known, by combining results of Ambrosio, Gigli and Savaré (Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 2005) and Ambrosio and Gangbo (Comm. Pure Appl. Math., 61:18–53, 2008), that geometric differentiability holds under the latter condition. Moreover, the two notions of differentiability are equivalent according to the recent paper of Gangbo and Tudorascu (J. Math. Pures Appl., 125:119–174, 2019). Besides, we give a self-contained probabilistic proof that mere Fréchet differentiability of a law-invariant function F on L²(Ω, ℙ; ℝᵈ) is enough for the Fréchet differential at X to be a measurable function of X.
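In our notation (not the paper's), one way to read the decomposition is the following Pythagorean splitting of the quadratic cost: for an optimal coupling (X, Y), set T(x) := 𝔼[Y | X = x], so the residual Y - T(X) is centred given X and the cross term vanishes by conditioning:

```latex
% (X, Y) an optimal coupling for W_2^2(mu, nu); T(x) := E[Y | X = x].
\mathbb{E}|Y-X|^2
  = \mathbb{E}|T(X)-X|^2 + \mathbb{E}|Y-T(X)|^2
    + 2\,\mathbb{E}\big\langle T(X)-X,\ \mathbb{E}[\,Y-T(X)\mid X\,]\big\rangle
  = \mathbb{E}|T(X)-X|^2 + \mathbb{E}|Y-T(X)|^2,
\qquad\text{hence}\qquad
W_2^2(\mu,\nu) = W_2^2(\mu, T_{\#}\mu) + \mathbb{E}|Y-T(X)|^2 .
```

The identification of the first term with W₂²(μ, T#μ) uses the paper's conclusion that T is an optimal transport map between μ and T#μ; the second term is the cost carried by the martingale part of the coupling.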


SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A111-A112
Author(s):  
Austin Vandegriffe ◽  
V A Samaranayake ◽  
Matthew Thimgan

Abstract

Introduction: Technological innovations have broadened the type and amount of activity data that can be captured in the home under normal living conditions. Yet converting naturalistic activity patterns into sleep and wakefulness states has remained a challenge. Despite the successes of current algorithms, they do not fill all actigraphy needs. We have developed a novel statistical approach to determine sleep and wakefulness times, called the Wasserstein Algorithm for Classifying Sleep and Wakefulness (WACSAW), and validated the algorithm in a small cohort of healthy participants.

Methods: WACSAW's functional routines are: 1) conversion of the triaxial movement data into a univariate time series; 2) construction of a Wasserstein weighted sum (WSS) time series by measuring the Wasserstein distance between the distributions of movement data in equal-length windows before and after the time point of interest (see the sketch after this abstract); 3) segmentation of the time series by identifying changepoints based on the behavior of the WSS series; 4) merging of segments deemed similar by the Levene test; 5) comparison of segments by optimal transport methodology to determine their difference from a flat, invariant distribution at zero. The resulting histogram can be used to determine sleep and wakefulness parameters around a threshold determined for each individual from histogram properties. To validate the algorithm, participants wore the GENEActiv and a commercial-grade actigraphy watch for 48 hours. The accuracy of WACSAW was compared to a detailed activity log and benchmarked against the output of the commercial wrist actigraph.

Results: WACSAW performed with average accuracy, sensitivity, and specificity above 95% compared to detailed activity logs in 10 healthy-sleeping individuals of mixed sexes and ages. We then compared WACSAW's performance against a common wrist-worn commercial sleep monitor. WACSAW outperformed the commercial-grade system for every participant relative to the activity logs, and between-subject variability was cut substantially.

Conclusion: WACSAW demonstrates good performance in a small test cohort. In addition, WACSAW is 1) open-source, 2) individually adaptive, 3) an indicator of individual reliability, 4) based on the activity data stream alone, and 5) requires little human intervention. WACSAW is worth validating against polysomnography and in patients with sleep disorders to determine its overall effectiveness.
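To make step 2 concrete, here is a minimal sketch of a WSS-style series using SciPy's one-dimensional Wasserstein distance; the window length, the magnitude reduction used for step 1, and the omission of the authors' weighting scheme are all our simplifications:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def wss_series(activity, window=60):
    """Wasserstein distance between the empirical distributions of movement
    in equal-length windows before and after each time point (step 2,
    sketched; the authors' weighting scheme is omitted)."""
    n = len(activity)
    wss = np.full(n, np.nan)
    for t in range(window, n - window):
        wss[t] = wasserstein_distance(activity[t - window:t],
                                      activity[t:t + window])
    return wss

# step 1, one common reduction: magnitude of the triaxial signal
rng = np.random.default_rng(2)
ax, ay, az = rng.normal(size=(3, 2000))
activity = np.sqrt(ax**2 + ay**2 + az**2)
wss = wss_series(activity)  # changepoints (step 3) are then sought in wss
```

Large values of the series flag time points where the movement distribution changes, which is what the subsequent changepoint segmentation exploits.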


Author(s):  
Pinar Demetci ◽  
Rebecca Santorella ◽  
Björn Sandstede ◽  
William Stafford Noble ◽  
Ritambhara Singh

Abstract

Data integration of single-cell measurements is critical for understanding cell development and disease, but the lack of correspondence between different types of measurements makes such efforts challenging. Several unsupervised algorithms can align heterogeneous single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data domains. However, these algorithms require hyperparameter tuning for high-quality alignments, which is difficult in an unsupervised setting without correspondence information for validation. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov-Wasserstein-based optimal transport to align single-cell multi-omics datasets. We compare the alignment performance of SCOT with state-of-the-art algorithms on four simulated and two real-world datasets. SCOT performs on par with state-of-the-art methods but is faster and requires tuning fewer hyperparameters. Furthermore, we provide an algorithm for SCOT that uses the Gromov-Wasserstein distance to guide parameter selection. Thus, unlike previous methods, SCOT aligns well without using any orthogonal correspondence information to pick the hyperparameters. Our source code and scripts for replicating the results are available at https://github.com/rsinghlab/SCOT.
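The repository above contains the authors' implementation; for orientation, a minimal Gromov-Wasserstein alignment of two unpaired feature spaces can be sketched with the POT library (the random data, normalization, and epsilon below are placeholder choices, not SCOT's tuned settings):

```python
import numpy as np
import ot  # POT; SCOT's own code lives at github.com/rsinghlab/SCOT
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 20))  # e.g. gene-expression features
Y = rng.normal(size=(150, 10))  # e.g. chromatin-accessibility features

# Intra-domain distance matrices: Gromov-Wasserstein compares only the
# geometry within each domain, so no shared feature space is needed.
C1 = cdist(X, X); C1 /= C1.max()
C2 = cdist(Y, Y); C2 /= C2.max()
p, q = ot.unif(len(X)), ot.unif(len(Y))

# Entropically regularized Gromov-Wasserstein coupling; SCOT's unsupervised
# trick is to tune such hyperparameters via the GW distance itself.
T = ot.gromov.entropic_gromov_wasserstein(C1, C2, p, q, 'square_loss',
                                          epsilon=5e-3)
# Row i of T gives soft correspondences of cell i in X to cells in Y.
```

The coupling T is then used to project one domain onto the other, which is the "mapping between single cells in different data domains" the abstract refers to.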


Author(s):  
Renjun Xu ◽  
Pelen Liu ◽  
Yin Zhang ◽  
Fang Cai ◽  
Jindong Wang ◽  
...  

Domain adaptation (DA) has achieved resounding success in learning a good classifier by leveraging labeled data from a source domain to adapt to an unlabeled target domain. However, in the more general setting where the target domain contains classes never observed in the source domain, namely Open Set Domain Adaptation (OSDA), existing DA methods fail because of the interference of the extra unknown classes. This is a much more challenging problem, since it can easily result in negative transfer due to the mismatch between the unknown and known classes. Existing approaches are susceptible to misclassification when unknown target-domain samples are distributed near the decision boundary learned from the labeled source domain in the feature space. To overcome this, we propose Joint Partial Optimal Transport (JPOT), which fully utilizes information from both the labeled source domain and the discriminative representation of the unknown class in the target domain. The proposed joint discriminative prototypical compactness loss not only achieves intra-class compactness and inter-class separability, but also estimates the mean and variance of the unknown class through backpropagation, which remained intractable for previous methods because of their blindness to the structure of the unknown classes. To the best of our knowledge, this is the first optimal transport model for OSDA. Extensive experiments demonstrate that our proposed model significantly boosts the performance of open set domain adaptation on standard DA datasets.
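While JPOT's joint formulation is more elaborate, the core mechanism of partial optimal transport, leaving part of the target mass unmatched so unknown-class samples are not forced onto source classes, can be sketched with the POT library (the mass fraction m and the toy data are our placeholders, not JPOT's learned values):

```python
import numpy as np
import ot  # POT: pip install pot

rng = np.random.default_rng(4)
source = rng.normal(size=(100, 2))                        # labeled source features
target = np.vstack([rng.normal(size=(80, 2)),             # shared classes
                    rng.normal(loc=6.0, size=(20, 2))])   # unknown-class cluster

a, b = ot.unif(len(source)), ot.unif(len(target))
M = ot.dist(source, target)  # squared Euclidean cost

# Transport only a fraction m of the mass, so target samples far from the
# source (candidate unknown-class samples) can stay unmatched.
T = ot.partial.partial_wasserstein(a, b, M, m=0.8)
unmatched = T.sum(axis=0) < 1e-8  # target points receiving essentially no mass
print("flagged as unknown-like:", np.where(unmatched)[0])
```

Target points left without incoming mass are exactly the ones a partial-transport-based OSDA method can treat as belonging to the unknown class rather than negatively transferring them onto known classes.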

