Non-parametric co-clustering of large scale sparse bipartite networks on the GPU

Author(s):  
Toke Jansen Hansen ◽  
Morten Mørup ◽  
Lars Kai Hansen
2018 ◽  
Vol 38 (1) ◽  
pp. 3-22 ◽  
Author(s):  
Ajay Kumar Tanwani ◽  
Sylvain Calinon

Small-variance asymptotics is emerging as a useful technique for inference in large-scale Bayesian non-parametric mixture models. This paper analyzes the online learning of robot manipulation tasks with Bayesian non-parametric mixture models under small-variance asymptotics. The analysis yields a scalable online sequence clustering (SOSC) algorithm that is non-parametric in both the number of clusters and the subspace dimension of each cluster. SOSC groups each new datapoint into low-dimensional subspaces by online inference in a non-parametric mixture of probabilistic principal component analyzers (MPPCA) based on a Dirichlet process, and captures state-transition and state-duration information online in a hidden semi-Markov model (HSMM) based on a hierarchical Dirichlet process. A task-parameterized formulation of our approach autonomously adapts the model to changing environmental situations during manipulation. We apply the algorithm in a teleoperation setting to recognize the intention of the operator and remotely adjust the movement of the robot using the learned model. The generative model is used to synthesize both time-independent and time-dependent behaviors by relying on the principles of shared and autonomous control. Experiments with the Baxter robot yield parsimonious clusters that adapt online with new demonstrations and assist the operator in performing remote manipulation tasks.
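The core idea behind small-variance asymptotics in Dirichlet-process mixtures can be illustrated with a DP-means-style online update: assign each arriving point to its nearest cluster unless every cluster is farther than a penalty threshold, in which case a new cluster is spawned. This is a minimal sketch of that general principle, not the SOSC algorithm itself (it omits the subspace/MPPCA and HSMM components); the function name and threshold parameter `lam` are illustrative assumptions.

```python
import numpy as np

def online_dp_means(stream, lam):
    """Hedged sketch of small-variance asymptotics (DP-means style):
    a point farther than lam from every centroid spawns a new cluster;
    otherwise the nearest centroid is updated as a running mean."""
    centroids, counts, labels = [], [], []
    for x in stream:
        x = np.asarray(x, dtype=float)
        if centroids:
            d = [np.linalg.norm(x - c) for c in centroids]
            k = int(np.argmin(d))
            if d[k] > lam:
                centroids.append(x.copy())   # spawn a new cluster
                counts.append(1)
                k = len(centroids) - 1
            else:
                counts[k] += 1
                centroids[k] += (x - centroids[k]) / counts[k]  # running mean
        else:
            centroids.append(x.copy())       # first point starts cluster 0
            counts.append(1)
            k = 0
        labels.append(k)
    return centroids, labels
```

Because each point is processed once and only distances to current centroids are computed, the update is suitable for the online, streaming setting the abstract describes.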


2019 ◽  
Vol 12 (10) ◽  
pp. 1139-1152 ◽  
Author(s):  
Kai Wang ◽  
Xuemin Lin ◽  
Lu Qin ◽  
Wenjie Zhang ◽  
Ying Zhang

Author(s):  
Young Jin Ko ◽  
Mina Hur ◽  
Hanah Kim ◽  
Sang Gyeu Choi ◽  
Hee-Won Moon ◽  
...  

Abstract
The recently introduced hematology analyzer, the Sysmex XN modular system (Sysmex, Kobe, Japan), has adopted a new fluorescent channel to detect platelets and the immature platelet fraction (IPF). This study aimed to establish new reference intervals for %-IPF and the absolute number of IPF (A-IPF) on the Sysmex XN. Platelet counts, %-IPF, and A-IPF were also compared between the Sysmex XN and XE-2100 systems (Sysmex).

After excluding outliers, blood samples from 2104 healthy individuals and 140 umbilical cord blood samples were analyzed using both the Sysmex XN and XE-2100. The results of the two systems were compared using Bland-Altman plots. The reference intervals for %-IPF and A-IPF were defined using non-parametric percentile methods according to the Clinical and Laboratory Standards Institute guideline (C28-A3).

Platelet counts, %-IPF, and A-IPF showed non-normal distributions. The mean difference between the Sysmex XN and XE-2100 in healthy individuals revealed a positive bias in platelet counts (+8.0×10…).

This large-scale study demonstrates a clear difference in platelet counts and IPF between the Sysmex XN and XE-2100. The new reference intervals for IPF on the Sysmex XN would provide fundamental data for clinical practice and future research.
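The non-parametric percentile method used to define reference intervals can be sketched directly: the interval is the central 95% of the observed values, bounded by the 2.5th and 97.5th sample percentiles. This is a generic illustration of the percentile approach (the function name and the use of NumPy's default interpolation are assumptions, not details from the study).

```python
import numpy as np

def nonparametric_reference_interval(values, lower=2.5, upper=97.5):
    """Non-parametric reference interval in the CLSI C28-A3 style:
    the central 95% of the data, bounded by the 2.5th and 97.5th
    sample percentiles (no distributional assumption)."""
    v = np.asarray(values, dtype=float)
    return float(np.percentile(v, lower)), float(np.percentile(v, upper))
```

The method makes no normality assumption, which matters here because the measured quantities showed non-normal distributions.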


2015 ◽  
Vol 2 (8) ◽  
pp. 140436 ◽  
Author(s):  
Wasiu A. Akanni ◽  
Mark Wilkinson ◽  
Christopher J. Creevey ◽  
Peter G. Foster ◽  
Davide Pisani

Since their advent, supertrees have been increasingly used in large-scale evolutionary studies requiring a phylogenetic framework and substantial efforts have been devoted to developing a wide variety of supertree methods (SMs). Recent advances in supertree theory have allowed the implementation of maximum likelihood (ML) and Bayesian SMs, based on using an exponential distribution to model incongruence between input trees and the supertree. Such approaches are expected to have advantages over commonly used non-parametric SMs, e.g. matrix representation with parsimony (MRP). We investigated new implementations of ML and Bayesian SMs and compared these with some currently available alternative approaches. Comparisons include hypothetical examples previously used to investigate biases of SMs with respect to input tree shape and size, and empirical studies based either on trees harvested from the literature or on trees inferred from phylogenomic scale data. Our results provide no evidence of size or shape biases and demonstrate that the Bayesian method is a viable alternative to MRP and other non-parametric methods. Computation of input tree likelihoods allows the adoption of standard tests of tree topologies (e.g. the approximately unbiased test). The Bayesian approach is particularly useful in providing support values for supertree clades in the form of posterior probabilities.
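The exponential model of incongruence mentioned above can be written down concretely: if d_i is the incongruence (e.g. a tree distance) between input tree i and a candidate supertree, the model takes p(d_i) = β·exp(−β·d_i), giving a closed-form log-likelihood and ML estimate of the rate. The sketch below treats the distances as given numbers; how they are computed (e.g. via Robinson-Foulds distance) and the function names are illustrative assumptions, not the paper's implementation.

```python
import math

def exp_supertree_loglik(distances, beta):
    """Log-likelihood of a candidate supertree under an exponential model
    of input-tree incongruence: p(d_i) = beta * exp(-beta * d_i), so
    log L = n*log(beta) - beta * sum(d_i)."""
    return len(distances) * math.log(beta) - beta * sum(distances)

def beta_mle(distances):
    """Closed-form ML estimate of the exponential rate: n / sum(d_i)."""
    return len(distances) / sum(distances)
```

Having an explicit per-input-tree likelihood is what enables the standard topology tests (such as the approximately unbiased test) discussed in the abstract.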


Author(s):  
Rui Zhang ◽  
Christian Walder ◽  
Marian-Andrei Rizoiu ◽  
Lexing Xie

In this paper, we develop an efficient non-parametric Bayesian estimation of the kernel function of Hawkes processes. The non-parametric Bayesian approach is important because it provides flexible Hawkes kernels and quantifies their uncertainty. Our method is based on the cluster representation of Hawkes processes. Utilizing the stationarity of the Hawkes process, we efficiently sample random branching structures and thus split the Hawkes process into clusters of Poisson processes. We derive two algorithms, a block Gibbs sampler and a maximum a posteriori estimator based on expectation maximization, and we show that both have linear time complexity, theoretically and empirically. On synthetic data, we show that our methods infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state of the art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We also observe that, on diffusions related to online videos, the learned kernels reflect the perceived longevity of different content types, such as music or pet videos.
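The cluster representation underlying the method assigns each event a parent: either the background process (an "immigrant" event) or an earlier event that triggered it, with probability proportional to the baseline rate versus the kernel evaluated at the time lag. The sketch below samples one such branching structure for a parametric exponential kernel; it illustrates the representation only, not the paper's non-parametric kernel estimate, and all names and parameter values are assumptions.

```python
import math
import random

def sample_branching(times, mu, alpha, beta, rng=random.Random(0)):
    """Sample a branching structure for a Hawkes process with an
    exponential kernel phi(t) = alpha * beta * exp(-beta * t).
    Event i's parent is the background process (-1) with weight mu,
    or an earlier event j with weight phi(t_i - t_j)."""
    parents = []
    for i, ti in enumerate(times):
        weights = [mu] + [alpha * beta * math.exp(-beta * (ti - tj))
                          for tj in times[:i]]
        u = rng.random() * sum(weights)
        acc, choice = 0.0, 0
        for k, w in enumerate(weights):
            acc += w
            if u <= acc:
                choice = k
                break
        parents.append(choice - 1)  # -1 = background, else parent index
    return parents
```

Conditioned on a sampled branching structure, the events split into independent Poisson-process clusters, which is what makes the block Gibbs sampler and EM updates tractable.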


2021 ◽  
Vol 12 ◽  
Author(s):  
Yuanyuan Ma ◽  
Lifang Liu ◽  
Qianjun Chen ◽  
Yingjun Ma

Metabolites are closely related to human disease. The interaction between metabolites and drugs has drawn increasing attention in the field of pharmacomicrobiomics. However, only a small portion of drug-metabolite interactions have been experimentally observed, because experimental validation is labor-intensive, costly, and time-consuming. Although a few computational approaches have been proposed to predict latent associations in various bipartite networks, such as miRNA-disease and drug-target interaction networks, to the best of our knowledge the associations between drugs and metabolites have not been studied on a large scale. In this study, we propose a novel algorithm, inductive logistic matrix factorization (ILMF), to predict latent associations between drugs and metabolites. Specifically, ILMF integrates drug–drug interactions, metabolite–metabolite interactions, and drug-metabolite interactions into one framework to model the probability that a drug would interact with a metabolite. Moreover, we exploit inductive matrix completion to guide the learning of projection matrices U and V, which depend on the low-dimensional feature representation matrices of drugs and metabolites, Fd and Fm. These two matrices can be obtained by fusing multiple data sources. Thus, FdU and FmV can be viewed as drug-specific and metabolite-specific latent representations, unlike in classical LMF. Furthermore, we utilize the Vicus spectral matrix, which reveals the refined local geometrical structure inherent in the original data, to encode the relationships between drugs and metabolites. Extensive experiments are conducted on a manually curated "DrugMetaboliteAtlas" dataset. The experimental results show that ILMF achieves competitive performance compared with other state-of-the-art approaches, demonstrating its effectiveness in predicting potential drug-metabolite associations.
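For context, the classical logistic matrix factorization that ILMF extends models each interaction probability as a sigmoid of an inner product of latent factors, fit by gradient ascent on the regularized log-likelihood. The sketch below shows that classical LMF baseline only, not the paper's inductive variant (no side-information matrices Fd, Fm, no Vicus regularization); all names and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def lmf_fit(Y, rank=2, lr=0.05, reg=0.01, iters=500, seed=0):
    """Classical logistic matrix factorization (sketch, not ILMF):
    model p(Y_ij = 1) = sigmoid(u_i . v_j) and fit U, V by gradient
    ascent on the L2-regularized Bernoulli log-likelihood."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    for _ in range(iters):
        P = 1.0 / (1.0 + np.exp(-(U @ V.T)))  # predicted probabilities
        G = Y - P                             # gradient of the log-likelihood
        U += lr * (G @ V - reg * U)
        V += lr * (G.T @ U - reg * V)
    return U, V
```

ILMF's "inductive" twist replaces the free rows of U and V with projections of drug and metabolite feature vectors, so predictions can be made for entities with features but no observed interactions.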


2020 ◽  
Author(s):  
Daniel Tward ◽  
Xu Li ◽  
Bingxing Huo ◽  
Brian Lee ◽  
Michael I Miller ◽  
...  

ABSTRACT
Mapping information from different brains, gathered using different modalities, into a common coordinate space corresponding to a reference brain is an aspirational goal in modern neuroscience, analogous in importance to mapping genomic data to a reference genome. While brain-atlas mapping workflows exist for single-modality data (3D MRI or STPT image volumes), data sets generally need to be combined across modalities with different contrast mechanisms and scales, in the presence of missing data as well as signals not present in the reference. This has so far been an unsolved problem. We have solved this problem in its full generality by developing and implementing a rigorous, non-parametric generative framework that learns unknown mappings between contrast mechanisms from data and infers missing data. Our methodology permits rigorous quantification of the local scale changes between different individual brains, which have so far been neglected. We are also able to quantitatively characterize the individual variation in shape. Our work establishes a quantitative, scalable and streamlined workflow for unifying a broad spectrum of multi-modal whole-brain light microscopic data volumes into a coordinate-based atlas framework, a step that is a prerequisite for large-scale integration of whole-brain data sets in modern neuroscience.

Summary
A current focus of research in neuroscience is to enumerate, map and annotate neuronal cell types in whole vertebrate brains using different modalities of data acquisition. A key challenge remains: can the large multiplicities of molecular anatomical data sets from many different modalities, and at widely different scales, all be assembled into a common reference space? Solving this problem is as important for modern neuroscience as mapping to reference genomes was for molecular biology. While workable brain-to-atlas mapping workflows exist for single modalities (e.g. 
mapping serial two-photon (STP) brains to STP references), and largely for clean data, this is generally not a solved problem for mapping across contrast modalities, where data sets can be partial and often carry signal not present in the reference brain (e.g. tracer injections). Bringing these types of anatomical data into a common reference frame for all to use is an aspirational goal for the neuroscience community. However, so far this goal has been elusive due to the difficulties noted above, and real integration is lacking. We have solved this problem in its full generality by developing and implementing a rigorous generative framework that learns unknown mappings between contrast mechanisms from data and infers missing data. The key idea in the framework is to minimize the difference between synthetic image volumes and real data over function classes of non-parametric mappings, including a diffeomorphic mapping, the contrast map, and the locations and types of missing data and non-reference signals. The non-parametric mappings are instantiated as regularized but over-parameterized functional forms over spatial grids. A final, manual refinement step is included to ensure the scientific quality of the results. Our framework permits rigorous quantification of the local metric distortions between different individual brains, which is important for quantitative joint analysis of data gathered in multiple animals. Existing methods for atlas mapping do not provide metric quantifications and analyses of the resulting individual variations. We apply this pipeline to data modalities including various combinations of in-vivo and ex-vivo MRI, 3D STP and fMOST data sets, 2D serial histology sections (including a 3D reassembly step), and brains processed for snRNAseq with tissue partially removed. 
Median local linear scale change with respect to a histologically processed Nissl reference brain, as measured using the Jacobian of the diffeomorphic transformations, was found to be 0.93 for STPT imaged brains (7% shrinkage) and 0.84 for fMOST imaged brains (16% shrinkage between reference brains and imaged volumes). Shrinkage between in-vivo and ex-vivo MRI for a mouse brain was found to be 0.96, and the distortion between the perfused brain and tape-cut digital sections was shown to be minimal (1.02 for Nissl histology sections). We were able to quantitatively characterize the individual variation in shape across individuals by studying variations in the tangent space of the diffeomorphic transformation around the reference brain. Based on this work we are able to establish co-variation patterns in metric distortions across the entire brain, across a large population set. We note that the magnitude of individual variation is often greater than differences between different sample preparation techniques. Our work establishes a quantitative, scalable and streamlined workflow for unifying a broad spectrum of multi-modal whole-brain light microscopic data volumes into a coordinate-based atlas framework, a step that is a prerequisite for large scale integration of whole brain data sets in modern neuroscience.
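The local linear scale quoted above (e.g. 0.93 for STPT, i.e. 7% shrinkage) is derived from the Jacobian of the diffeomorphic transformation: for a 3D map, the local volume change is the Jacobian determinant, and the local linear scale is its cube root. The sketch below computes this for any smooth orientation-preserving map via central finite differences; the function names and step size are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def local_linear_scale(phi, x, eps=1e-5):
    """Local linear scale of a smooth 3D mapping phi at point x:
    cube root of det(Jacobian), with the Jacobian estimated by
    central finite differences. Assumes det(J) > 0 (orientation-
    preserving diffeomorphism)."""
    x = np.asarray(x, dtype=float)
    J = np.zeros((3, 3))
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = eps
        J[:, j] = (phi(x + dx) - phi(x - dx)) / (2 * eps)  # column j of J
    return float(np.linalg.det(J)) ** (1.0 / 3.0)
```

For a uniform shrinkage map phi(x) = 0.93·x, the Jacobian is 0.93·I and the local linear scale recovers 0.93 everywhere, matching the interpretation of the shrinkage figures in the text.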

