Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis

2016 ◽  
Author(s):  
Fan Yang ◽  
Jiebiao Wang ◽  
Brandon L. Pierce ◽  
Lin S. Chen ◽  

ABSTRACT The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTL mechanism is “mediation” by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are “cis-mediators” of trans-eQTLs, including those “cis-hubs” involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multi-tissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types.
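The idea of a cis transcript mediating a trans-eQTL can be illustrated with a minimal regression sketch on simulated data. This is not GMAC and performs no confounder selection; the effect sizes, the sample size, and the helper `ols_coefs` are assumptions chosen only for illustration.

```python
import numpy as np

def ols_coefs(y, X):
    """Least-squares coefficients of y on X, with an intercept prepended."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 2000                                            # hypothetical sample size
snp = rng.integers(0, 3, size=n).astype(float)      # genotype coded 0/1/2
cis = 0.8 * snp + rng.normal(size=n)                # cis transcript driven by the SNP
trans = 0.5 * cis + rng.normal(size=n)              # trans transcript driven only via cis

# Total effect: SNP -> trans gene, ignoring the cis transcript.
total_effect = ols_coefs(trans, snp)[1]
# Direct effect: SNP -> trans gene after conditioning on the cis transcript.
direct_effect = ols_coefs(trans, np.column_stack([snp, cis]))[1]
# Share of the total effect that disappears once the mediator is included.
mediated_share = 1 - direct_effect / total_effect
```

In this simulation the SNP acts on the trans gene only through the cis transcript, so the direct effect shrinks toward zero once the mediator enters the model. With real data, unadjusted hidden confounders of the cis-trans relationship can bias this comparison, which is the problem the abstract describes.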

Author(s):  
Jeffrey D. Graham ◽  
Bolun Zhang ◽  
Denver M.Y. Brown ◽  
John Cairney

This study examined the home advantage effect in decisive National Basketball Association Conference Finals and Finals series playoff games from 1979 to 2019 (the 3-point shot era). We also examined the potential contribution of various offensive- and defensive-based skills and whether these skills mediated the relationship between game status (decisive vs. nondecisive) and outcome (win vs. loss). Overall, we found evidence of a home court advantage with the home team winning 63% of the decisive playoff games and 66% of the nondecisive playoff games. After adjusting for multiple comparisons and regular season win percentage, the home team had significantly more defensive rebounds and steals in Game 5 when trailing 3–1 going into that game. Mediation analyses did not reveal any significant findings when examining the impact of decisive game status on performance through offensive and defensive skills, thus suggesting there are other explanations for the home advantage effect.


2016 ◽  
Vol 113 (18) ◽  
pp. 5130-5135 ◽  
Author(s):  
Juan Zhao ◽  
Yiwei Zhou ◽  
Xiujun Zhang ◽  
Luonan Chen

Quantitatively identifying direct dependencies between variables is an important task in data analysis, in particular for reconstructing various types of networks and causal relations in science and engineering. One of the most widely used criteria is partial correlation, but it can only measure linear direct associations and misses nonlinear ones. Conditional mutual information (CMI), which is based on conditional independence, can quantify nonlinear direct relationships among variables from the observed data and is in that respect superior to linear measures. However, it suffers from a serious problem of underestimation, in particular for variables with tight associations in a network, which severely limits its applications. In this work, we propose a new concept, “partial independence,” with a new measure, “part mutual information” (PMI), which not only can overcome the problem of CMI but also retains the quantification properties of both mutual information (MI) and CMI. Specifically, we first defined PMI to measure nonlinearly direct dependencies between variables and then derived its relations with MI and CMI. Finally, we used a number of simulated datasets as benchmark examples to numerically demonstrate PMI features, and then used real gene expression data from Escherichia coli and yeast to reconstruct gene regulatory networks, all of which validated the advantages of PMI for accurately quantifying nonlinearly direct associations in networks.
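The limitation of partial correlation mentioned above can be seen in a small simulated sketch. It computes partial correlation by regressing the conditioning variable out of both variables, and compares a linear with a quadratic direct dependency. The simulated model and the helper `partial_corr` are assumptions for illustration; the sketch does not implement CMI or PMI.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones(len(z)), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)                                 # shared third variable
x = rng.normal(size=n)
y_linear = x + z + 0.5 * rng.normal(size=n)            # y depends linearly on x
y_nonlinear = x**2 + z + 0.5 * rng.normal(size=n)      # y depends on x, but not linearly

pc_linear = partial_corr(x, y_linear, z)        # clearly nonzero
pc_nonlinear = partial_corr(x, y_nonlinear, z)  # close to zero despite direct dependence
```

Both `y_linear` and `y_nonlinear` depend directly on `x`, yet only the linear case yields a large partial correlation, which is the gap that information-theoretic measures are meant to close.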


2020 ◽  
Author(s):  
S. Thomas Kelly ◽  
Michael A. Black

Summary Transcriptomic analysis is used to capture the molecular state of a cell or sample in many biological and medical applications. In addition to identifying alterations in activity at the level of individual genes, understanding changes in the gene networks that regulate fundamental biological mechanisms is also an important objective of molecular analysis. As a result, databases that describe biological pathways are increasingly used to assist with the interpretation of results from large-scale genomics studies. Incorporating information from biological pathways and gene regulatory networks into a genomic data analysis is a popular strategy, and there are many methods that provide this functionality for gene expression data. When developing or comparing such methods, it is important to gain an accurate assessment of their performance. Simulation-based validation studies are frequently used for this purpose, which necessitates simulated data that correctly accounts for pathway relationships and correlations. Here we present a versatile statistical framework to simulate correlated gene expression data from biological pathways, by sampling from a multivariate normal distribution derived from a graph structure. This procedure has been released as the graphsim R package on CRAN and GitHub (https://github.com/TomKellyGenetics/graphsim) and is compatible with any graph structure that can be described using the igraph package. This package allows the simulation of biological pathways from a graph structure based on a statistical model of gene expression.
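The general idea of sampling expression from a multivariate normal distribution derived from a graph can be sketched as follows. This is not the graphsim algorithm or its API. The construction used here (inverting the graph Laplacian plus the identity to obtain a covariance matrix) is an assumption, chosen because it always yields a valid positive-definite matrix, and the four-gene chain is a hypothetical pathway.

```python
import numpy as np

# Hypothetical pathway: a chain of four genes A - B - C - D.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Laplacian + identity is symmetric positive definite, so it is a valid
# precision (inverse covariance) matrix. This choice is an assumption.
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
precision = laplacian + np.eye(len(adjacency))
cov = np.linalg.inv(precision)

# Rescale to a correlation matrix so every gene has unit variance.
scale = np.sqrt(np.diag(cov))
corr = cov / np.outer(scale, scale)

# Draw 500 simulated samples; rows are samples, columns are genes.
rng = np.random.default_rng(2)
expr = rng.multivariate_normal(mean=np.zeros(len(corr)), cov=corr, size=500)
```

Under this construction, genes that are adjacent in the graph are more strongly correlated than genes several steps apart, so the simulated matrix carries the pathway structure that a network-aware method would be expected to recover.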


2020 ◽  
Author(s):  
Kayla A Johnson ◽  
Arjun Krishnan

Abstract. Background: Constructing gene coexpression networks is a powerful approach for analyzing high-throughput gene expression data towards module identification, gene function prediction, and disease-gene prioritization. While optimal workflows for constructing coexpression networks – including good choices for data pre-processing, normalization, and network transformation – have been developed for microarray-based expression data, such well-tested choices do not exist for RNA-seq data. Almost all studies that compare data processing/normalization methods for RNA-seq focus on the end goal of determining differential gene expression. Results: Here, we present a comprehensive benchmarking and analysis of 30 different workflows, each with a unique set of normalization and network transformation methods, for constructing coexpression networks from RNA-seq datasets. We tested these workflows on both large, homogeneous datasets (Genotype-Tissue Expression project) and small, heterogeneous datasets from various labs (submitted to the Sequence Read Archive). We analyzed the workflows in terms of aggregate performance, individual method choices, and the impact of multiple dataset experimental factors. Our results demonstrate that between-sample normalization has the biggest impact, with trimmed mean of M-values or upper quartile normalization producing networks that most accurately recapitulate known tissue-naive and tissue-specific gene functional relationships. Conclusions: Based on this work, we provide concrete recommendations on robust procedures for building an accurate coexpression network from an RNA-seq dataset. In addition, researchers can examine all the results in great detail at https://krishnanlab.github.io/norm_for_RNAseq_coexp to make appropriate choices for coexpression analysis based on the experimental factors of their RNA-seq dataset.
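As a rough illustration of between-sample normalization, the sketch below applies a simplified upper quartile scaling to a toy count matrix. It is not the workflow benchmarked in the study, and established packages differ in details such as which genes are excluded. The function name, the rescaling of factors to mean one, and the toy counts are all assumptions.

```python
import numpy as np

def upper_quartile_normalize(counts):
    """Scale each sample (column) by the 75th percentile of its nonzero counts.

    Simplified sketch: factors are divided by their mean so that the
    normalized values stay on roughly the original scale.
    """
    counts = np.asarray(counts, dtype=float)
    factors = np.array([np.percentile(col[col > 0], 75) for col in counts.T])
    factors = factors / factors.mean()
    return counts / factors

# Toy matrix: rows are genes, columns are samples. The second sample was
# "sequenced twice as deeply", so every count is doubled.
counts = np.array([
    [10, 20],
    [0, 0],
    [30, 60],
    [50, 100],
    [70, 140],
])
normalized = upper_quartile_normalize(counts)
```

After scaling, the two columns agree, since the only difference between the toy samples was depth. Coexpression is then usually computed on the normalized (and often further transformed) values.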


2020 ◽  
Author(s):  
Ayan Chatterjee ◽  
Ram Bajpai ◽  
Pankaj Khatiwada

BACKGROUND Lifestyle diseases are the primary cause of death worldwide. The gradual growth of negative behavior in humans due to physical inactivity, unhealthy habits, and improper nutrition expedites lifestyle diseases. In this study, we develop a mathematical model to analyze the impact of regular physical activity, healthy habits, and a proper diet on weight change, targeting obesity as a case study. We then design an algorithm to verify the proposed mathematical model with simulated data of artificial participants. OBJECTIVE This study intends to analyze the effect of healthy behavior (physical activity, healthy habits, and proper dietary pattern) on weight change with a proposed mathematical model and its verification with an algorithm where personalized habits are designed to change dynamically based on the rule. METHODS We developed a weight-change mathematical model as a function of activity, habit, and nutrition with the first law of thermodynamics, basal metabolic rate (BMR), total daily energy expenditure (TDEE), and body-mass-index (BMI) to establish a relationship between health behavior and weight change. We then verified the model with simulated data. RESULTS The proposed provable mathematical model showed a strong relationship between health behavior and weight change. We verified the mathematical model with the proposed algorithm using simulated data following the necessary constraints. The adoption of BMR and TDEE calculation following Harris-Benedict’s equation has increased the model's accuracy under defined settings. CONCLUSIONS This study helped us understand the impact of healthy behavior on obesity and overweight with numeric implications and the importance of adopting a healthy lifestyle abstaining from negative behavior change.
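The quantities named in the methods (BMR, TDEE, BMI, and an energy balance) can be sketched as below. This is not the authors' model. The coefficients are those of the original Harris-Benedict equations, and which variant the study used is not stated here. The activity factor and the 7700 kcal per kg conversion are common rules of thumb included only as assumptions.

```python
def harris_benedict_bmr(weight_kg, height_cm, age_years, sex):
    """Basal metabolic rate in kcal/day (original Harris-Benedict coefficients)."""
    if sex == "male":
        return 66.4730 + 13.7516 * weight_kg + 5.0033 * height_cm - 6.7550 * age_years
    if sex == "female":
        return 655.0955 + 9.5634 * weight_kg + 1.8496 * height_cm - 4.6756 * age_years
    raise ValueError("sex must be 'male' or 'female'")

def tdee(bmr_kcal, activity_factor):
    """Total daily energy expenditure: BMR scaled by an activity multiplier."""
    return bmr_kcal * activity_factor

def bmi(weight_kg, height_m):
    """Body-mass index in kg/m^2."""
    return weight_kg / height_m**2

def daily_weight_change_kg(intake_kcal, tdee_kcal, kcal_per_kg=7700.0):
    """Energy-balance estimate; kcal_per_kg is a rule-of-thumb assumption."""
    return (intake_kcal - tdee_kcal) / kcal_per_kg

# Hypothetical participant: 30-year-old male, 70 kg, 175 cm, lightly active.
bmr = harris_benedict_bmr(70, 175, 30, "male")
expenditure = tdee(bmr, activity_factor=1.375)   # assumed multiplier
change = daily_weight_change_kg(intake_kcal=expenditure + 500, tdee_kcal=expenditure)
```

Here a sustained surplus of 500 kcal per day corresponds to roughly 0.065 kg of estimated gain per day under the assumed conversion; a surplus of zero gives no change.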


2020 ◽  
Vol 7 (1) ◽  
pp. e000755
Author(s):  
Matthew Moll ◽  
Sharon M. Lutz ◽  
Auyon J. Ghosh ◽  
Phuwanat Sakornsakolpat ◽  
Craig P. Hersh ◽  
...  

Introduction: Family history is a risk factor for chronic obstructive pulmonary disease (COPD). We previously developed a COPD risk score from genome-wide genetic markers (Polygenic Risk Score, PRS). Whether the PRS and family history provide complementary or redundant information for predicting COPD and related outcomes is unknown. Methods: We assessed the predictive capacity of family history and PRS on COPD and COPD-related outcomes in non-Hispanic white (NHW) and African American (AA) subjects from COPDGene and ECLIPSE studies. We also performed interaction and mediation analyses. Results: In COPDGene, family history and PRS were significantly associated with COPD in a single model (P_FamHx < 0.0001; P_PRS < 0.0001). Similar trends were seen in ECLIPSE. The area under the receiver operator characteristic curve for a model containing family history and PRS was significantly higher than a model with PRS (p=0.00035) in NHWs and a model with family history (p<0.0001) alone in NHWs and AAs. Both family history and PRS were significantly associated with measures of quantitative emphysema and airway thickness. There was a weakly positive interaction between family history and the PRS under the additive, but not multiplicative scale in NHWs (relative excess risk due to interaction=0.48, p=0.04). Mediation analyses found that a significant proportion of the effect of family history on COPD was mediated through PRS in NHWs (16.5%, 95% CI 9.4% to 24.3%), but not AAs. Conclusion: Family history and the PRS provide complementary information for predicting COPD and related outcomes. Future studies can address the impact of obtaining both measures in clinical practice.
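The distinction between additive and multiplicative interaction reported above can be made concrete with the standard definition of the relative excess risk due to interaction, RERI = RR11 - RR10 - RR01 + 1. The relative risks below are hypothetical numbers chosen for illustration, not values from the study, and the function names are assumptions.

```python
def reri(rr_both, rr_first_only, rr_second_only):
    """Relative excess risk due to interaction (additive scale).

    All relative risks are taken against the group with neither exposure.
    A value of 0 means no interaction on the additive scale.
    """
    return rr_both - rr_first_only - rr_second_only + 1

def multiplicative_ratio(rr_both, rr_first_only, rr_second_only):
    """Interaction on the multiplicative scale; 1 means no interaction."""
    return rr_both / (rr_first_only * rr_second_only)

# Hypothetical relative risks: family history only, high PRS only, and both.
rr_family = 1.5
rr_prs = 2.0
rr_joint = 3.0

additive = reri(rr_joint, rr_family, rr_prs)                       # 0.5
multiplicative = multiplicative_ratio(rr_joint, rr_family, rr_prs) # 1.0
```

With these numbers the joint risk exceeds what the two separate excess risks would add up to (RERI above zero), while equalling exactly the product of the separate relative risks, so there is interaction on the additive scale but none on the multiplicative scale.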

