Domain decomposition for entropy regularized optimal transport

Numerische Mathematik ◽

10.1007/s00211-021-01245-0 ◽

2021 ◽

Author(s):

Mauro Bonafini ◽

Bernhard Schmitzer

Keyword(s):

Domain Decomposition ◽

Large Scale ◽

Optimal Transport ◽

Optimal Solution ◽

Linear Convergence ◽

Efficient Implementation ◽

Computationally Efficient ◽

Solution Quality ◽

Leibler Divergence ◽

Transport Problems

We study Benamou’s domain decomposition algorithm for optimal transport in the entropy regularized setting. The key observation is that the regularized variant converges to the globally optimal solution under very mild assumptions. We prove linear convergence of the algorithm with respect to the Kullback–Leibler divergence and illustrate the (potentially very slow) rates with numerical examples. On problems with sufficient geometric structure (such as Wasserstein distances between images) we expect much faster convergence. We then discuss important aspects of a computationally efficient implementation, such as adaptive sparsity, a coarse-to-fine scheme and parallelization, paving the way to numerically solving large-scale optimal transport problems. We demonstrate efficient numerical performance for computing the Wasserstein-2 distance between 2D images and observe that, even without parallelization, domain decomposition compares favorably to applying a single efficient implementation of the Sinkhorn algorithm in terms of runtime, memory and solution quality.
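For orientation, the Sinkhorn algorithm that the abstract uses as its point of comparison can be sketched in a few lines. This is a minimal illustration of plain Sinkhorn scaling for entropy regularized optimal transport, not the domain decomposition algorithm of the paper; the function name `sinkhorn`, the toy marginals, the squared-distance cost and the parameter values `eps=0.5` and `iters=200` are all assumptions chosen for the example.

```python
import math

def sinkhorn(mu, nu, cost, eps=0.5, iters=200):
    """Entropy regularized OT between discrete marginals mu and nu (hypothetical helper)."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows so the plan matches mu: u = mu / (K v)
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan matches nu: v = nu / (K^T u)
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan pi[i][j] = u[i] * K[i][j] * v[j]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy problem (assumed data): three points on a line, squared-distance cost.
points = [0.0, 0.5, 1.0]
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]
cost = [[(x - y) ** 2 for y in points] for x in points]

plan = sinkhorn(mu, nu, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

After enough iterations, `row_sums` and `col_sums` should agree with `mu` and `nu` up to numerical tolerance. Smaller values of `eps` bring the plan closer to the unregularized optimum but typically slow convergence and can underflow the kernel, which is one motivation for the sparse and coarse-to-fine implementations the abstract mentions.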


The back-and-forth method for Wasserstein gradient flows

ESAIM Control Optimisation and Calculus of Variations ◽

10.1051/cocv/2021029 ◽

2021 ◽

Vol 27 ◽

pp. 28

Author(s):

Matt Jacobs ◽

Wonjun Lee ◽

Flavien Léger

Keyword(s):

Large Class ◽

Large Scale ◽

Dual Problem ◽

Optimal Transport ◽

Gradient Flow ◽

Gradient Flows ◽

Primal Problem ◽

Transport Problems ◽

Wasserstein Gradient Flows ◽

Numer Math

We present a method to efficiently compute Wasserstein gradient flows. Our approach is based on a generalization of the back-and-forth method (BFM) introduced in Jacobs and Léger [Numer. Math. 146 (2020) 513–544] to solve optimal transport problems. We evolve the gradient flow by solving the dual problem to the JKO scheme. In general, the dual problem is much better behaved than the primal problem. This allows us to efficiently run large-scale gradient flow simulations for a large class of internal energies including singular and non-convex energies.


Visualizing Single-Cell RNA-seq Data with Semisupervised Principal Component Analysis

International Journal of Molecular Sciences ◽

10.3390/ijms21165797 ◽

2020 ◽

Vol 21 (16) ◽

pp. 5797

Author(s):

Zhenqiu Liu

Keyword(s):

Principal Component Analysis ◽

Dimension Reduction ◽

Single Cell ◽

Optimal Solution ◽

Principal Component ◽

Component Analysis ◽

Biological Information ◽

Rna Seq ◽

Computationally Efficient ◽

Leibler Divergence

Single-cell RNA-seq (scRNA-seq) is a powerful tool for analyzing heterogeneous and functionally diverse cell populations. Visualizing scRNA-seq data can help us effectively extract meaningful biological information and identify novel cell subtypes. Currently, the most popular methods for scRNA-seq visualization are principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). While PCA is an unsupervised dimension reduction technique, t-SNE incorporates cluster information into pairwise probability, and then minimizes the Kullback–Leibler divergence. Uniform Manifold Approximation and Projection (UMAP) is another recently developed visualization method similar to t-SNE. However, one limitation of UMAP and t-SNE is that they can only capture the local structure of the data; the global structure of the data is not faithfully preserved. In this manuscript, we propose a semisupervised principal component analysis (ssPCA) approach for scRNA-seq visualization. The proposed approach incorporates cluster labels into dimension reduction and discovers principal components that maximize both data variance and cluster dependence. ssPCA must have cluster labels as its input. Therefore, it is most useful for visualizing clusters produced by scRNA-seq clustering software. Our experiments with simulated and real scRNA-seq data demonstrate that ssPCA is able to preserve both local and global structures of the data, and uncover the transitions and progressions in the data, if they exist. In addition, ssPCA is convex and has a global optimal solution. It is also robust and computationally efficient, making it viable for scRNA-seq cluster visualization.


A First-Order Optimization Algorithm for Statistical Learning with Hierarchical Sparsity Structure

INFORMS Journal on Computing ◽

10.1287/ijoc.2021.1069 ◽

2021 ◽

Author(s):

Dewei Zhang ◽

Yin Liu ◽

Sam Davanloo Tajbakhsh

Keyword(s):

Numerical Simulation ◽

Statistical Learning ◽

Optimization Algorithm ◽

Large Scale ◽

Optimal Solution ◽

Linear Convergence ◽

Simulation Studies ◽

Speed Of Convergence ◽

Proximal Operator ◽

Sparsity Structure

In many statistical learning problems, it is desired that the optimal solution conform to an a priori known sparsity structure represented by a directed acyclic graph. Inducing such structures by means of convex regularizers requires nonsmooth penalty functions that exploit group overlapping. Our study focuses on evaluating the proximal operator of the latent overlapping group lasso developed by Jacob et al. in 2009. We implemented an alternating direction method of multipliers with a sharing scheme to solve large-scale instances of the underlying optimization problem efficiently. In the absence of strong convexity, global linear convergence of the algorithm is established using the error bound theory. More specifically, the paper contributes to establishing primal and dual error bounds when the nonsmooth component in the objective function does not have a polyhedral epigraph. We also investigate the effect of the graph structure on the speed of convergence of the algorithm. Detailed numerical simulation studies over different graph structures supporting the proposed algorithm and two applications in learning are provided. Summary of Contribution: The paper proposes a computationally efficient optimization algorithm to evaluate the proximal operator of a nonsmooth hierarchical sparsity-inducing regularizer and establishes its convergence properties. The computationally intensive subproblem of the proposed algorithm can be fully parallelized, which allows solving large-scale instances of the underlying problem. Comprehensive numerical simulation studies benchmarking the proposed algorithm against five other methods on the speed of convergence to optimality are provided. Furthermore, performance of the algorithm is demonstrated on two statistical learning applications related to topic modeling and breast cancer classification. The code, along with the simulation studies and benchmarks, is available on the corresponding author’s GitHub website for evaluation and future use.


Cloud-Based Multimedia Content Protection System

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207448 ◽

2020 ◽

pp. 164-169

Author(s):

B. Aparna ◽

S. Madhavi ◽

G. Mounika ◽

P. Avinash ◽

S. Chakravarthi

Keyword(s):

Large Scale ◽

Protection System ◽

Multimedia Content ◽

Computationally Efficient ◽

Content Protection ◽

Protection Systems ◽

Multimedia Content Protection ◽

High Scalability ◽

Cloud Infrastructures ◽

Rapid Deployment

We propose a new design for large-scale multimedia content protection systems. Our design leverages cloud infrastructures to provide cost efficiency, rapid deployment, scalability, and elasticity to accommodate varying workloads. The proposed system can be used to protect different multimedia content types, including videos, images, audio clips, songs, and music clips. The system can be deployed on private and/or public clouds. Our system has two novel components: (i) a method to create signatures of videos, and (ii) a distributed matching engine for multimedia objects. The signature method creates robust and representative signatures of videos that capture the depth signals in these videos; the signatures are computationally efficient to compute and compare, and they require little storage. The distributed matching engine achieves high scalability and is designed to support different multimedia objects. We implemented the proposed system and deployed it on two clouds: Amazon cloud and our private cloud. Our experiments with more than 11,000 videos and 1 million images show the high accuracy and scalability of the proposed system. In addition, we compared our system to the protection system used by YouTube and our results show that the YouTube protection system fails to detect most copies of videos, while our system detects more than 98% of them.


Simulation and Performance Analysis of Tilted Time Window and Support Vector Machine Based Learning Object Ranking Method

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2213111607666190215120017 ◽

2020 ◽

Vol 13 (2) ◽

pp. 153-164

Author(s):

Narina Thakur ◽

Deepti Mehrotra ◽

Abhay Bansal ◽

Manju Bala

Keyword(s):

Support Vector Machine ◽

Classification Accuracy ◽

Time Window ◽

Optimal Solution ◽

Similarity Score ◽

Learning Objects ◽

Retrieval Algorithm ◽

Support Vector ◽

Computationally Efficient ◽

User Query

Objective: Because the adequacy of a Learning Object (LO) is a dynamic concept that changes with its use, needs and evolution, the main objective of the proposed research is to account for the importance of an LO over time when assessing its relevance. Another goal is to increase the classification accuracy and precision. Methods: With existing IR and ranking algorithms, MAP optimization either does not lead to a comprehensively optimal solution or is expensive and time-consuming. Nevertheless, Support Vector Machine (SVM) learning competently leads to a globally optimal solution. SVM is a powerful classifier with high classification accuracy, and the Tilted Time window based model is computationally efficient. Results: This paper proposes and implements an LO ranking and retrieval algorithm based on the Tilted Time window and the Support Vector Machine, which combines the merits of both methods. The proposed model is implemented in MATLAB for the NCBI dataset. Conclusion: The experiments have been carried out on the NCBI dataset, and LO weights are assigned to be relevant and non-relevant for a given user query according to the Tilted Time series and the Cosine similarity score. Results showed that the proposed model has much better accuracy.


GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation

Nature Communications ◽

10.1038/s41467-021-25006-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Hongyi Zhang ◽

Xiaowei Zhan ◽

Bo Li

Keyword(s):

T Cell ◽

T Cell Receptor ◽

Large Scale ◽

Cell Receptor ◽

Alignment Algorithm ◽

Computationally Efficient ◽

Antigen Specificity ◽

Non Invasive ◽

Isometric Transformation ◽

Specific Receptors

Similarity in T-cell receptor (TCR) sequences implies shared antigen specificity between receptors, and could be used to discover novel therapeutic targets. However, existing methods that cluster T-cell receptor sequences by similarity are computationally inefficient, making them impractical to use on the ever-expanding datasets of the immune repertoire. Here, we developed GIANA (Geometric Isometry-based TCR AligNment Algorithm), a computationally efficient tool for this task that provides the same level of clustering specificity as TCRdist at 600 times its speed, and without sacrificing accuracy. GIANA also allows the rapid query of large reference cohorts within minutes. Using GIANA to cluster large-scale TCR datasets provides candidate disease-specific receptors, and provides a new solution to repertoire classification. Querying unseen TCR-seq samples against an existing reference differentiates samples from patients across various cohorts associated with cancer, infectious and autoimmune disease. Our results demonstrate how GIANA could be used as the basis for a TCR-based non-invasive multi-disease diagnostic platform.


Least-looping stepping-stone-based ASM approach for transportation and triangular intuitionistic fuzzy transportation problems

Complex & Intelligent Systems ◽

10.1007/s40747-021-00472-0 ◽

2021 ◽

Author(s):

Kedar Nath Das ◽

Rajeev Das ◽

Debi Prasanna Acharjya

Keyword(s):

Optimal Solution ◽

Transportation Engineering ◽

Key Factors ◽

Solution Quality ◽

Transportation Problems ◽

Stepping Stone ◽

Intuitionistic Fuzzy ◽

Real World Problem ◽

Statistical Results ◽

Made In

The transportation problem (TP) is a popular branch of linear programming in the field of transportation engineering. Over the years, attempts have been made to find improved approaches to solving TPs. Recently, in Quddoos et al. (Int J Comput Sci Eng (IJCSE) 4(7): 1271–1274, 2012), an efficient approach, namely ASM, was proposed for solving crisp TPs. However, it is found that ASM fails to provide a better optimal solution in some cases. Therefore, a new and efficient ASM approach is proposed in this paper to enhance the inherent mechanism of the existing ASM method to solve both crisp TPs and Triangular Intuitionistic Fuzzy Transportation Problems (TIFTPs). A least-looping stepping-stone method, which is an improved version of the existing stepping-stone method (Roy and Hossain, Operation Research, Titus Publication, 2015), has been employed as one of the key factors to improve the solution quality. Unlike the stepping-stone method, the least-looping stepping-stone method deals with only a few selected non-basic cells under some prescribed conditions and hence minimizes the computational burden. Therefore, the framework of the proposed method (namely LS-ASM) is a combination of ASM (Quddoos et al. 2012) and the least-looping stepping-stone approach. To validate the performance of LS-ASM, a set of six case studies and a real-world problem (which include both crisp TPs and TIFTPs) have been solved. The statistical results obtained by LS-ASM have been compared with the existing popular modified distribution (MODI) method and the original ASM method as well. The statistical results confirm the superiority of LS-ASM over the other compared algorithms, with less computational effort.


Temporal concatenation for Markov decision processes

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964821000206 ◽

2021 ◽

pp. 1-28

Author(s):

Ruiyang Song ◽

Kuang Xu

Keyword(s):

Markov Decision Processes ◽

Large Scale ◽

Optimal Solution ◽

Upper Bounds ◽

Black Box ◽

Decision Processes ◽

Optimal Solutions ◽

Wide Range ◽

Markov Decision ◽

Speed Up

We propose and analyze a temporal concatenation heuristic for solving large-scale finite-horizon Markov decision processes (MDP), which divides the MDP into smaller sub-problems along the time horizon and generates an overall solution by simply concatenating the optimal solutions from these sub-problems. As a “black box” architecture, temporal concatenation works with a wide range of existing MDP algorithms. Our main results characterize the regret of temporal concatenation compared to the optimal solution. We provide upper bounds for general MDP instances, as well as a family of MDP instances in which the upper bounds are shown to be tight. Together, our results demonstrate temporal concatenation's potential of substantial speed-up at the expense of some performance degradation.
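The concatenation idea can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the authors' implementation: the two-state MDP, its rewards, and the helper names `backward_induction` and `evaluate` are hypothetical. It solves a horizon-4 MDP exactly, then solves a horizon-2 sub-problem, repeats that sub-policy twice, and compares the two values.

```python
def backward_induction(P, R, horizon):
    """Solve a finite-horizon MDP with zero terminal value.

    P[a][s][s2] is a transition probability and R[s][a] an immediate reward.
    Returns (policy, value): policy[t][s] is the action at step t, and
    value[s] is the optimal expected total reward from state s at step 0.
    """
    n_states, n_actions = len(R), len(R[0])
    value = [0.0] * n_states
    policy = []
    for _ in range(horizon):
        q = [[R[s][a] + sum(P[a][s][s2] * value[s2] for s2 in range(n_states))
              for a in range(n_actions)] for s in range(n_states)]
        step = [max(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
        value = [q[s][step[s]] for s in range(n_states)]
        policy.insert(0, step)  # steps are computed last-to-first
    return policy, value

def evaluate(P, R, policy):
    """Expected total reward of a time-dependent policy, per start state."""
    n_states = len(R)
    value = [0.0] * n_states
    for step in reversed(policy):
        value = [R[s][step[s]]
                 + sum(P[step[s]][s][s2] * value[s2] for s2 in range(n_states))
                 for s in range(n_states)]
    return value

# Toy MDP (assumed data): action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move
]
# State 0 pays 1 per step; state 1 pays 2 per step; moving out of state 0 costs 0.5.
R = [[1.0, -0.5], [2.0, 0.0]]

_, optimal_value = backward_induction(P, R, horizon=4)   # [5.5, 8.0]
sub_policy, _ = backward_induction(P, R, horizon=2)
concatenated = sub_policy + sub_policy                   # two horizon-2 solutions, back to back
concat_value = evaluate(P, R, concatenated)              # [4.0, 8.0]
regret = [o - c for o, c in zip(optimal_value, concat_value)]  # [1.5, 0.0]
```

In this toy instance the short sub-problems never find it worthwhile to pay the cost of moving to the better state, so the concatenated policy loses 1.5 from state 0 while each sub-problem is only half as long to solve. This mirrors, in miniature, the trade-off between speed-up and performance degradation described above.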


Fault diagnosis of industrial robot reducer by an extreme learning machine with a level-based learning swarm optimizer

Advances in Mechanical Engineering ◽

10.1177/16878140211019540 ◽

2021 ◽

Vol 13 (5) ◽

pp. 168781402110195

Author(s):

Jianwen Guo ◽

Xiaoyan Li ◽

Zhenpeng Lao ◽

Yandong Luo ◽

Jiapeng Wu ◽

...

Keyword(s):

Fault Diagnosis ◽

Extreme Learning Machine ◽

Large Scale ◽

Production Efficiency ◽

Industrial Robot ◽

Optimal Solution ◽

Industrial Robots ◽

Generalization Performance ◽

Gradient Descent Algorithm ◽

Learning Machine

Fault diagnosis is of great significance for improving the production efficiency and accuracy of industrial robots. Compared with the traditional gradient descent algorithm, the extreme learning machine (ELM) has the advantage of fast computing speed, but the input weights and the hidden node biases, which are obtained at random, affect the accuracy and generalization performance of ELM. However, the level-based learning swarm optimizer algorithm (LLSO) can quickly and effectively find the global optimal solution of large-scale problems, and can be used to find the optimal combination of large-scale input weights and hidden biases in ELM. This paper proposes an extreme learning machine with a level-based learning swarm optimizer (LLSO-ELM) for fault diagnosis of industrial robot RV reducers. The model is tested by combining the attitude data of the reducer gear under different fault modes. Compared with ELM, the experimental results show that this method has good stability and generalization performance.


Space–time reduced order model for large-scale linear dynamical systems with application to Boltzmann transport problems

Journal of Computational Physics ◽

10.1016/j.jcp.2020.109845 ◽

2021 ◽

Vol 424 ◽

pp. 109845

Author(s):

Youngsoo Choi ◽

Peter Brown ◽

William Arrighi ◽

Robert Anderson ◽

Kevin Huynh

Keyword(s):

Dynamical Systems ◽

Large Scale ◽

Space Time ◽

Reduced Order Model ◽

Boltzmann Transport ◽

Order Model ◽

Linear Dynamical Systems ◽

Reduced Order ◽

Transport Problems
