Estimating group differences in network models using moderation analysis

Author(s):  
Jonas M. B. Haslbeck

Abstract Statistical network models such as the Gaussian Graphical Model and the Ising model have become popular tools to analyze multivariate psychological datasets. In many applications, the goal is to compare such network models across groups. In this paper, I introduce a method to estimate group differences in network models that is based on moderation analysis. This method is attractive because it allows one to make comparisons across more than two groups for all parameters within a single model and because it is implemented for all commonly used cross-sectional network models. In addition to introducing the method, I evaluate the performance of the proposed method and existing approaches in a simulation study. Finally, I provide a fully reproducible tutorial on how to use the proposed method to compare a network model across three groups using the R-package mgm.
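The core idea of comparing groups through moderation can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the mgm implementation: it uses ordinary least squares rather than the regularized estimation a network package would use, only two variables and two groups, and simulated data whose slopes (0.2 and 0.6) are hypothetical. It shows that the coefficient on the predictor-by-group interaction in a single regression is the group difference in the pairwise association.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group = 2000

# Hypothetical data: the x2 -> x1 slope is 0.2 in group 0 and 0.6 in group 1.
group = np.repeat([0, 1], n_per_group)
x2 = rng.normal(size=2 * n_per_group)
true_slope = np.where(group == 1, 0.6, 0.2)
x1 = true_slope * x2 + rng.normal(scale=0.5, size=2 * n_per_group)

# One regression for both groups: intercept, x2, group, and x2 * group.
design = np.column_stack([np.ones_like(x2), x2, group, x2 * group])
coef, *_ = np.linalg.lstsq(design, x1, rcond=None)
slope_group0 = coef[1]       # association in the reference group
slope_difference = coef[3]   # moderation (interaction) effect


def slope(x, y):
    """Slope of a simple linear regression of y on x."""
    return np.polyfit(x, y, 1)[0]


# Fitting each group separately gives the same difference in slopes.
separate_difference = (
    slope(x2[group == 1], x1[group == 1]) - slope(x2[group == 0], x1[group == 0])
)
print(slope_group0, slope_difference, separate_difference)
```

With more than two groups, the same design would carry one dummy variable and one interaction column per non-reference group, which is what allows all comparisons to sit in a single model.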

2020 ◽  
Author(s):  
Jonas M B Haslbeck

Statistical network models such as the Gaussian Graphical Model and the Ising model have become popular tools to analyze multivariate psychological data sets. In many applications the goal is to compare such network models across groups. In this paper I introduce a method to estimate differences in network models across groups that is based on moderation analysis. This method is attractive because it allows one to make comparisons across more than two groups within a single model, and because it is implemented for all commonly used cross-sectional network models. In addition to introducing the method, I evaluate the performance of the proposed method and existing approaches in a simulation study. Finally, I provide a fully reproducible tutorial on how to use the moderation method to compare a network model across three groups using the R-package mgm.


Author(s):  
Mingyang Ren ◽  
Sanguo Zhang ◽  
Qingzhao Zhang ◽  
Shuangge Ma

Abstract Summary Heterogeneity is a hallmark of many complex human diseases, and unsupervised heterogeneity analysis has been extensively conducted using high-throughput molecular measurements and histopathological imaging features. ‘Classic’ heterogeneity analysis has been based on simple statistics such as mean, variance and correlation. Network-based analysis takes interconnections as well as individual variable properties into consideration and can be more informative. Several Gaussian graphical model (GGM)-based heterogeneity analysis techniques have been developed, but friendly and portable software is still lacking. To facilitate more extensive usage, we develop the R package HeteroGGM, which conducts GGM-based heterogeneity analysis using advanced penalization techniques, can provide informative summary and graphical presentation, and is efficient and friendly. Availability and implementation The package is available at https://CRAN.R-project.org/package=HeteroGGM. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 53 (4) ◽  
pp. 453-480 ◽  
Author(s):  
Sacha Epskamp ◽  
Lourens J. Waldorp ◽  
René Mõttus ◽  
Denny Borsboom

2019 ◽  
Author(s):  
Maarten Marsman

The Ising model is a graphical model that has played an essential role in the field of network psychometrics, where it has been used as a theoretical model to re-conceptualize psychometric concepts and as a statistical model for the analysis of psychological data. But in network psychometrics, the psychological data that are analyzed often come from cross-sectional applications, and the practice of using graphical models such as the Ising model to analyze these data has been heavily critiqued in the past few years. The primary voiced concern centers around the inability of the Ising model to express heterogeneity in the population, and the necessity to then assume that the population is homogeneous w.r.t. the network's structure in practice. But associations at the group-level may be entirely different from associations at the individual level, and it is unclear what the estimated relations from cross-sectional data imply for associations at the individual level. In this paper, an idiographic interpretation of the Ising model is developed that does not require that persons are exchangeable replications of a single topological structure. Working with a clear, formal connection between network relations at the individual- and the group level, we have unique topological structures that characterize individuals and aggregate into an Ising model cross-sectionally.


2015 ◽  
Author(s):  
Marco Antoniotti ◽  
Giulio Caravagna ◽  
Luca De Sano ◽  
Alex Graudenzi ◽  
Giancarlo Mauri ◽  
...  

Models of cancer progression provide insights on the order of accumulation of genetic alterations during cancer development. Algorithms to infer such models from the currently available mutational profiles collected from different cancer patients (cross-sectional data) have been defined in the literature since the late 1990s. These algorithms differ in the way they extract a graphical model of the events modelling the progression, e.g., somatic mutations or copy-number alterations. TRONCO is an R package for TRanslational ONcology which provides a series of functions to assist the user in the analysis of cross-sectional genomic data and, in particular, it implements algorithms that aim to model cancer progression by means of the notion of selective advantage. These algorithms are proved to outperform the current state-of-the-art in the inference of cancer progression models. TRONCO also provides functionalities to load input cross-sectional data, set up the execution of the algorithms, assess the statistical confidence in the results and visualize the models. Availability. Freely available at http://www.bioconductor.org/ under GPL license; project hosted at http://bimib.disco.unimib.it/ and https://github.com/BIMIB-DISCo/TRONCO. Contact. [email protected]


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilla Lingjærde ◽  
Tonje G. Lien ◽  
Ørnulf Borgan ◽  
Helga Bergholtz ◽  
Ingrid K. Glad

Abstract Background Identifying gene interactions is a topic of great importance in genomics, and approaches based on network models provide a powerful tool for studying these. Assuming a Gaussian graphical model, a gene association network may be estimated from multiomic data based on the non-zero entries of the inverse covariance matrix. Inferring such biological networks is challenging because of the high dimensionality of the problem, making traditional estimators unsuitable. The graphical lasso is constructed for the estimation of sparse inverse covariance matrices in such situations, using $L_1$-penalization on the matrix entries. The weighted graphical lasso is an extension in which prior biological information from other sources is integrated into the model. There are however issues with this approach, as it naïvely forces the prior information into the network estimation, even if it is misleading or does not agree with the data at hand. Further, if an associated network based on other data is used as the prior, the method often fails to utilize the information effectively. Results We propose a novel graphical lasso approach, the tailored graphical lasso, that aims to handle prior information of unknown accuracy more effectively. We provide an R package implementing the method, . Applying the method to both simulated and real multiomic data sets, we find that it outperforms the unweighted and weighted graphical lasso in terms of all performance measures we consider. In fact, the graphical lasso and weighted graphical lasso can be considered special cases of the tailored graphical lasso, and a parameter determined by the data measures the usefulness of the prior information. We also find that among a larger set of methods, the tailored graphical lasso is the most suitable for network inference from high-dimensional data with prior information of unknown accuracy. 
With our method, mRNA data are demonstrated to provide highly useful prior information for protein–protein interaction networks. Conclusions The method we introduce utilizes useful prior information more effectively without involving any risk of loss of accuracy should the prior information be misleading.
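The link between a Gaussian graphical model and the non-zero entries of the inverse covariance matrix can be made concrete with a small sketch. The example below is an illustration under stated assumptions: the three-variable precision matrix is hypothetical, and the code inverts the covariance matrix directly, without any penalty, so it is not the graphical lasso or either of its weighted or tailored variants. It shows that two variables can be marginally correlated while having no edge between them.

```python
import numpy as np

# Hypothetical 3-variable chain A - B - C: A and C are linked only through B.
precision = np.array([
    [1.0, -0.4, 0.0],
    [-0.4, 1.0, -0.4],
    [0.0, -0.4, 1.0],
])
covariance = np.linalg.inv(precision)


def partial_correlations(cov):
    """Convert a covariance matrix into a partial correlation matrix."""
    prec = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


pcor = partial_correlations(covariance)

# A and C are marginally correlated even though they share no edge.
marginal_ac = covariance[0, 2] / np.sqrt(covariance[0, 0] * covariance[2, 2])

# Edges of the network are the non-zero off-diagonal partial correlations.
edges = [(i, j) for i in range(3) for j in range(i + 1, 3) if abs(pcor[i, j]) > 1e-8]
print(edges, marginal_ac)
```

In high-dimensional settings the sample covariance matrix cannot be inverted reliably, which is the situation the penalized estimators described above are designed for.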


2020 ◽  
Author(s):  
Camilla Lingjærde ◽  
Tonje G Lien ◽  
Ørnulf Borgan ◽  
Ingrid K Glad

Abstract Background Identifying gene interactions is a topic of great importance in genomics, and approaches based on network models provide a powerful tool for studying these. Assuming a Gaussian graphical model, a gene association network may be estimated from multiomic data based on the non-zero entries of the inverse covariance matrix. Inferring such biological networks is challenging because of the high dimensionality of the problem, making traditional estimators unsuitable. The graphical lasso is constructed for the estimation of sparse inverse covariance matrices in Gaussian graphical models in such situations, using L1-penalization on the matrix entries. An extension of the graphical lasso is the weighted graphical lasso, in which prior biological information from other (data) sources is integrated into the model through the weights. There are however issues with this approach, as it naïvely forces the prior information into the network estimation, even if it is misleading or does not agree with the data at hand. Further, if an associated network based on other data is used as the prior, weighted graphical lasso often fails to utilize the information effectively. Results We propose a novel graphical lasso approach, the tailored graphical lasso, that aims to handle prior information of unknown accuracy more effectively. We provide an R package implementing the method, tailoredGlasso. Applying the method to both simulated and real multiomic data sets, we find that it outperforms the unweighted and weighted graphical lasso in terms of all performance measures we consider. In fact, the graphical lasso and weighted graphical lasso can be considered special cases of the tailored graphical lasso, and a parameter determined by the data measures the usefulness of the prior information. 
With our method, mRNA data are demonstrated to provide highly useful prior information for protein-protein interaction networks. Conclusions The method we introduce utilizes useful prior information more effectively without involving any risk of loss of accuracy should the prior information be misleading.


2021 ◽  
Author(s):  
Mihai Alexandru Constantin ◽  
Noémi Katalin Schuurman ◽  
Jeroen Vermunt

We introduce a general method for sample size computations in the context of cross-sectional network models. The method takes the form of an automated Monte Carlo algorithm, designed to find an optimal sample size while iteratively concentrating the computations on the sample sizes that seem most relevant. The method requires three inputs: 1) a hypothesized network structure or desired characteristics of that structure, 2) an estimation performance measure and its corresponding target value (e.g., a sensitivity of 0.6), and 3) a statistic and its corresponding target value that determine how the target value for the performance measure should be reached (e.g., reaching a sensitivity of 0.6 with a probability of 0.8). The method consists of a Monte Carlo simulation step for computing the performance measure and the statistic for several sample sizes selected from an initial candidate sample size range, a curve-fitting step for interpolating the statistic across the entire candidate range, and a stratified bootstrapping step to quantify the uncertainty around the recommendation provided. We evaluated the performance of the method for the Gaussian Graphical Model, but it can easily extend to other models. It displayed good performance, with the sample size recommendations provided being, on average, at most 1.14 sample sizes away from the truth, with a highest standard deviation of 26.25 sample sizes. The method is implemented in the form of an R package called powerly, available on GitHub and CRAN.
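The Monte Carlo simulation step described above can be sketched as follows. This is a rough illustration under several assumptions and does not reproduce the powerly package or its interface: the four-node chain network, the partial correlation of 0.2, the thresholding estimator, the threshold of 0.15, the candidate sample sizes, and the function names are all hypothetical, and the curve-fitting and bootstrapping steps are omitted. Only the targets (a sensitivity of 0.6 reached with a probability of 0.8) are taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true network: a 4-node chain with partial correlations of 0.2.
p = 4
precision = np.eye(p)
for i in range(p - 1):
    precision[i, i + 1] = precision[i + 1, i] = -0.2
covariance = np.linalg.inv(precision)
true_edges = [(i, i + 1) for i in range(p - 1)]


def sensitivity(n, threshold=0.15):
    """Share of true edges recovered in one simulated data set of size n."""
    data = rng.multivariate_normal(np.zeros(p), covariance, size=n)
    prec_hat = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(prec_hat))
    pcor_hat = -prec_hat / np.outer(scale, scale)
    # Stand-in estimator: an edge is "found" if its estimated partial
    # correlation exceeds the threshold in absolute value.
    found = [abs(pcor_hat[i, j]) > threshold for i, j in true_edges]
    return float(np.mean(found))


def probability_of_target(n, target=0.6, replications=200):
    """Monte Carlo estimate of P(sensitivity >= target) at sample size n."""
    hits = [sensitivity(n) >= target for _ in range(replications)]
    return float(np.mean(hits))


candidates = [25, 50, 100, 200, 400]
statistic = {n: probability_of_target(n) for n in candidates}

# Smallest candidate whose estimated probability reaches the 0.8 target.
recommended = next((n for n in candidates if statistic[n] >= 0.8), None)
print(statistic, recommended)
```

Evaluating only a handful of candidate sample sizes keeps the simulation cheap; interpolating the statistic between them is what the curve-fitting step in the described method adds.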


2019 ◽  
pp. 1-9 ◽  
Author(s):  
Jill de Ron ◽  
Eiko I. Fried ◽  
Sacha Epskamp

Abstract Background In clinical research, populations are often selected on the sum-score of diagnostic criteria such as symptoms. Estimating statistical models where a subset of the data is selected based on a function of the analyzed variables introduces Berkson's bias, which presents a potential threat to the validity of findings in the clinical literature. The aim of the present paper is to investigate the effect of Berkson's bias on the performance of the two most commonly used psychological network models: the Gaussian Graphical Model (GGM) for continuous and ordinal data, and the Ising Model for binary data. Methods In two simulation studies, we test how well the two models recover a true network structure when estimation is based on a subset of the data typically seen in clinical studies. The network is based on a dataset of 2807 patients diagnosed with major depression, and nodes in the network are items from the Hamilton Rating Scale for Depression (HRSD). The simulation studies test different scenarios by varying (1) sample size and (2) the cut-off value of the sum-score which governs the selection of participants. Results The results of both studies indicate that higher cut-off values are associated with worse recovery of the network structure. As expected from the Berkson's bias literature, selection reduced recovery rates by inducing negative connections between the items. Conclusion Our findings provide evidence that Berkson's bias is a considerable and underappreciated problem in the clinical network literature. Furthermore, we discuss potential solutions to circumvent Berkson's bias and their pitfalls.
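The mechanism behind Berkson's bias can be shown with a short simulation. The sketch below is an illustration under stated assumptions and is not the authors' simulation design: the five symptom scores are hypothetical, drawn as independent standard normal variables, and the cut-off of 2.0 is arbitrary. It shows that selecting people on their sum-score induces a negative association between items that are unrelated in the full population.

```python
import random

random.seed(0)


def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical population: five independent symptom scores per person.
population = [[random.gauss(0, 1) for _ in range(5)] for _ in range(20000)]

# Keep only people whose sum-score exceeds a (hypothetical) clinical cut-off.
cutoff = 2.0
selected = [person for person in population if sum(person) > cutoff]

# Correlation between the first two symptoms, before and after selection.
full_r = correlation([s[0] for s in population], [s[1] for s in population])
selected_r = correlation([s[0] for s in selected], [s[1] for s in selected])
print(len(selected), full_r, selected_r)
```

Raising `cutoff` shrinks the selected sample and changes the size of the induced negative correlation, which mirrors the cut-off manipulation described in the simulation studies.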

