Active Learning in the Geometric Block Model

2020 ◽  
Vol 34 (04) ◽  
pp. 3641-3648 ◽  
Author(s):  
Eli Chien ◽  
Antonia Tulino ◽  
Jaime Llorca

The geometric block model is a recently proposed generative model for random graphs that is able to capture the inherent geometric properties of many community detection problems, providing more accurate characterizations of practical community structures compared with the popular stochastic block model. Galhotra et al. recently proposed a motif-counting algorithm for unsupervised community detection in the geometric block model that is proved to be near-optimal. They also characterized the regimes of the model parameters for which the proposed algorithm can achieve exact recovery. In this work, we initiate the study of active learning in the geometric block model. That is, we are interested in the problem of exactly recovering the community structure of random graphs following the geometric block model under arbitrary model parameters, by possibly querying the labels of a limited number of chosen nodes. We propose two active learning algorithms that combine the use of motif-counting with two different label query policies. Our main contribution is to show that sampling the labels of a vanishingly small fraction of nodes (sub-linear in the total number of nodes) is sufficient to achieve exact recovery in the regimes under which the state-of-the-art unsupervised method fails. We validate the superior performance of our algorithms via numerical simulations on both real and synthetic datasets.
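To make the model concrete, here is a minimal sketch of a one-dimensional geometric block model together with the common-neighbor count that motif-counting relies on. It is not the algorithm of Galhotra et al. or of this paper. It assumes two equal-sized communities, node positions drawn uniformly on a circle of circumference 1, and connection radii `r_s` (same community) and `r_d` (different communities). The function names `sample_gbm` and `common_neighbors` are hypothetical.

```python
import random


def sample_gbm(n, r_s, r_d, seed=0):
    """Sample a two-community geometric block model on the unit circle.

    Assumption: nodes alternate between communities 0 and 1, and two nodes
    are joined when their circular distance is at most r_s (same community)
    or r_d (different communities).
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    pos = [rng.random() for _ in range(n)]
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            d = abs(pos[u] - pos[v])
            d = min(d, 1.0 - d)  # distance measured around the circle
            radius = r_s if labels[u] == labels[v] else r_d
            if d <= radius:
                adj[u].add(v)
                adj[v].add(u)
    return labels, pos, adj


def common_neighbors(adj, u, v):
    """Number of triangles through the pair (u, v): the motif being counted."""
    return len(adj[u] & adj[v])


labels, pos, adj = sample_gbm(200, r_s=0.10, r_d=0.02, seed=1)
u = 0
for v in sorted(adj[u])[:5]:
    same = labels[u] == labels[v]
    print(v, "same" if same else "diff", common_neighbors(adj, u, v))
```

In an active learning setting, one could query the true labels of the few nodes whose motif counts fall near the decision threshold. That query policy is only an illustration here, not the policy analyzed in the paper.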

2019 ◽  
Author(s):  
Adriaan Sticker ◽  
Ludger Goeminne ◽  
Lennart Martens ◽  
Lieven Clement

Label-free quantitative mass spectrometry-based workflows for differential expression (DE) analysis of proteins impose important challenges on the data analysis due to peptide-specific effects and context-dependent missingness of peptide intensities. Peptide-based workflows, like MSqRob, test for DE directly from peptide intensities and outperform summarization methods, which first aggregate MS1 peptide intensities to protein intensities before DE analysis. However, peptide-based methods are computationally expensive, often hard to understand for the non-specialised end-user, and do not provide protein summaries, which are important for visualisation or downstream processing. In this work, we therefore evaluate state-of-the-art summarization strategies using a benchmark spike-in dataset and discuss why and when these fail compared to the state-of-the-art peptide-based model, MSqRob. Based on this evaluation, we propose a novel summarization strategy, MSqRobSum, which estimates MSqRob’s model parameters in a two-stage procedure, circumventing the drawbacks of peptide-based workflows. MSqRobSum maintains MSqRob’s superior performance, while providing useful protein expression summaries for plotting and downstream analysis. Summarising peptide to protein intensities considerably reduces the computational complexity, the memory footprint and the model complexity, and makes it easier to disseminate DE inferred on protein summaries. Moreover, MSqRobSum provides a highly modular analysis framework, which gives researchers full flexibility to develop data analysis workflows tailored towards their specific applications.


2013 ◽  
Vol 34 (1) ◽  
pp. 23-39 ◽  
Author(s):  
Donniell E. Fishkind ◽  
Daniel L. Sussman ◽  
Minh Tang ◽  
Joshua T. Vogelstein ◽  
Carey E. Priebe

Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 65
Author(s):  
Feng Zhao ◽  
Min Ye ◽  
Shao-Lun Huang

In this paper, we study the phase transition property of an Ising model defined on a special random graph, the stochastic block model (SBM). Based on the Ising model, we propose a stochastic estimator that achieves exact recovery for the SBM. The stochastic algorithm can be transformed into an optimization problem, which includes the special cases of maximum likelihood and maximum modularity. Additionally, we give an unbiased convergent estimator for the model parameters of the SBM, which can be computed in constant time. Finally, we use Metropolis sampling to realize the stochastic estimator and verify the phase transition phenomenon through experiments.
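The Metropolis step mentioned above can be illustrated with a generic single-spin-flip sampler. This is a sketch under stated assumptions, not the estimator defined in the paper. The energy used here rewards equal spins on edges and penalizes equal spins on non-adjacent pairs with a hypothetical weight `gamma`. The functions `energy`, `delta_energy` and `metropolis` are illustrative names.

```python
import math
import random


def energy(adj, spins, gamma):
    """Assumed energy: -1 per aligned edge, +gamma per aligned non-edge."""
    n = len(spins)
    h = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if j in adj[i]:
                h -= spins[i] * spins[j]
            else:
                h += gamma * spins[i] * spins[j]
    return h


def delta_energy(adj, spins, i, gamma):
    """Energy change caused by flipping spin i, computed locally."""
    s_nb = sum(spins[j] for j in adj[i])    # neighbors of i
    s_non = sum(spins) - spins[i] - s_nb    # non-neighbors of i
    return 2 * spins[i] * (s_nb - gamma * s_non)


def metropolis(adj, beta, gamma, sweeps, seed=0):
    """Run single-spin-flip Metropolis updates at inverse temperature beta."""
    rng = random.Random(seed)
    n = len(adj)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(sweeps * n):
        i = rng.randrange(n)
        d = delta_energy(adj, spins, i, gamma)
        # Always accept downhill moves; accept uphill moves with prob exp(-beta * d).
        if d <= 0 or rng.random() < math.exp(-beta * d):
            spins[i] = -spins[i]
    return spins


# Toy graph: two triangles joined by a single edge (2, 3).
adj = [{1, 2}, {0, 2}, {0, 1, 3}, {2, 4, 5}, {3, 5}, {3, 4}]
print(metropolis(adj, beta=2.0, gamma=0.5, sweeps=50, seed=0))
```

The sign pattern of the returned spins is read as a two-community labeling, defined only up to a global flip.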


2020 ◽  
pp. 2695-2704
Author(s):  
Ali Falah Yaqoob ◽  
Basad Al-Sarray

Finding the structure of a network, known as community detection, has received great attention in diverse fields, including the social sciences, biology, and politics. A large number of studies and practical approaches have been designed to solve this problem. Defining a complex network model based on clustering is a non-deterministic polynomial-time hard (NP-hard) problem, and there are no ideal techniques to define the clustering. Here, we present a statistical approach based on the likelihood function of a Stochastic Block Model (SBM). The objective is to define the general model and select the best model with high quality. To this end, the Tabu Search method is integrated with Fuzzy c-Means (FCM) and applied in different settings. The experiments are designed to find the best structure for different types of networks by maximizing the objective functions. SBM selection is performed using two criteria, namely the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The results show the ability of the proposed method to find the best community structure of the given networks.
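As a rough illustration of likelihood-based model selection for an SBM, the sketch below scores a candidate partition with AIC and BIC. It does not reproduce the Tabu Search and FCM procedure of the paper. It assumes a Bernoulli SBM whose block-pair edge probabilities are set to their maximum likelihood estimates, counts only those probabilities as free parameters, and takes the number of node pairs as the sample size for BIC. The function names are hypothetical.

```python
import math
from itertools import combinations


def sbm_log_likelihood(adj, labels):
    """Profile log-likelihood of a Bernoulli SBM for a fixed partition."""
    edges, pairs = {}, {}
    for u, v in combinations(range(len(labels)), 2):
        key = tuple(sorted((labels[u], labels[v])))
        pairs[key] = pairs.get(key, 0) + 1
        if v in adj[u]:
            edges[key] = edges.get(key, 0) + 1
    ll = 0.0
    for key, n_pairs in pairs.items():
        m = edges.get(key, 0)
        if 0 < m < n_pairs:  # blocks with p = 0 or p = 1 contribute zero
            p = m / n_pairs
            ll += m * math.log(p) + (n_pairs - m) * math.log(1.0 - p)
    return ll


def aic_bic(adj, labels):
    """Return (AIC, BIC) for the partition; lower values are preferred."""
    n = len(labels)
    k_blocks = len(set(labels))
    n_params = k_blocks * (k_blocks + 1) // 2  # assumed parameter count
    n_obs = n * (n - 1) // 2                   # assumed sample size: node pairs
    ll = sbm_log_likelihood(adj, labels)
    return 2 * n_params - 2 * ll, n_params * math.log(n_obs) - 2 * ll


# Toy graph: two disjoint triangles.
adj = [{1, 2}, {0, 2}, {0, 1}, {4, 5}, {3, 5}, {3, 4}]
print(aic_bic(adj, [0, 0, 0, 0, 0, 0]))  # one block
print(aic_bic(adj, [0, 0, 0, 1, 1, 1]))  # two blocks
```

On this toy graph both criteria prefer the two-block partition, since it explains every edge and non-edge exactly.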

