Systematic tissue-specific functional annotation of the human genome highlights immune-related DNA elements for late-onset Alzheimer’s disease

2016 ◽  
Author(s):  
Qiongshi Lu ◽  
Ryan L. Powles ◽  
Sarah Abdallah ◽  
Derek Ou ◽  
Qian Wang ◽  
...  

Abstract: Continuing efforts from large international consortia have made genome-wide epigenomic and transcriptomic annotation data publicly available for a variety of cell and tissue types. However, synthesis of these datasets into effective summary metrics to characterize the functional non-coding genome remains a challenge. Here, we present GenoSkyline-Plus, an extension of our previous work through integration of an expanded set of epigenomic and transcriptomic annotations to produce high-resolution, single tissue annotations. After validating our annotations with a catalog of tissue-specific non-coding elements previously identified in the literature, we apply our method using data from 127 different cell and tissue types to present an atlas of heritability enrichment across 45 different GWAS traits. We show that broader organ system categories (e.g. immune system) increase statistical power in identifying biologically relevant tissue types for complex diseases while annotations of individual cell types (e.g. monocytes or B-cells) provide deeper insights into disease etiology. Additionally, we use our GenoSkyline-Plus annotations in an in-depth case study of late-onset Alzheimer’s disease (LOAD). Our analyses suggest a strong connection between LOAD heritability and genetic variants contained in regions of the genome functional in monocytes. Furthermore, we show that LOAD shares a similar localization of SNPs to monocyte-functional regions with Parkinson’s disease. Overall, we demonstrate that integrated genome annotations at the single tissue level provide a valuable tool for understanding the etiology of complex human diseases.
Our GenoSkyline-Plus annotations are freely available at http://genocanyon.med.yale.edu/GenoSkyline.

Author Summary: After years of community efforts, many experimental and computational approaches have been developed and applied for functional annotation of the human genome, yet proper annotation still remains challenging, especially in non-coding regions. As complex disease research rapidly advances, increasing evidence suggests that non-coding regulatory DNA elements may be the primary regions harboring risk variants in human complex diseases. In this paper, we introduce GenoSkyline-Plus, a principled annotation framework to identify tissue and cell type-specific functional regions in the human genome through integration of diverse high-throughput epigenomic and transcriptomic data. Through validation of known non-coding tissue-specific regulatory regions, enrichment analyses on 45 complex traits, and an in-depth case study of neurodegenerative diseases, we demonstrate the ability of GenoSkyline-Plus to accurately identify tissue-specific functionality in the human genome and provide unbiased, genome-wide insights into the genetic basis of human complex diseases.

PLoS Genetics ◽  
2017 ◽  
Vol 13 (7) ◽  
pp. e1006933 ◽  
Author(s):  
Qiongshi Lu ◽  
Ryan L. Powles ◽  
Sarah Abdallah ◽  
Derek Ou ◽  
Qian Wang ◽  
...  

2019 ◽  
Author(s):  
Fengzhe Xu ◽  
Yuanqing Fu ◽  
Ting-yu Sun ◽  
Zengliang Jiang ◽  
Zelei Miao ◽  
...  

Abstract: There is increasing interest in the interplay between host genetics and the gut microbiome in human complex diseases, with prior evidence mainly derived from animal models. In addition, the shared and distinct microbiome features among human complex diseases remain largely unclear. We performed a microbiome genome-wide association study to identify host genetic variants associated with the gut microbiome in a Chinese population with 1475 participants. We then conducted bi-directional Mendelian randomization analyses to examine the potential causal associations between the gut microbiome and human complex diseases. We found that Saccharibacteria (also known as TM7 phylum) could potentially improve renal function by affecting renal function biomarkers (i.e., creatinine and estimated glomerular filtration rate). In contrast, atrial fibrillation, chronic kidney disease and prostate cancer, as predicted by host genetics, had potential causal effects on the gut microbiome. Further disease-microbiome feature analysis suggested that gut microbiome features revealed novel relationships among human complex diseases. These results suggest that different human complex diseases share common and distinct gut microbiome features, which may help reshape our understanding of disease etiology in humans.
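
To make the idea of a Mendelian randomization estimate concrete, the following is a minimal sketch of the inverse-variance weighted (IVW) estimator, a commonly used approach for two-sample summary data. The abstract does not state which estimator the authors used, so the choice of IVW, the function names, and the input numbers are all assumptions made for illustration only.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-variant causal estimates: outcome effect divided by exposure effect."""
    return [bo / be for be, bo in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    num = sum(be * bo / se ** 2
              for be, bo, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(be ** 2 / se ** 2
              for be, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for two genetic instruments.
beta_x = [0.1, 0.2]      # variant effects on the exposure (e.g. a taxon)
beta_y = [0.05, 0.1]     # variant effects on the outcome (e.g. a biomarker)
se_y = [0.01, 0.01]      # standard errors of the outcome effects

estimate, std_err = ivw_estimate(beta_x, beta_y, se_y)
```

In a bi-directional analysis, the same calculation would be repeated with the roles of exposure and outcome swapped, using instruments selected for the other trait.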


2020 ◽  
Author(s):  
Mahfuzur Rahman ◽  
Maximilian Billmann ◽  
Michael Costanzo ◽  
Michael Aregger ◽  
Amy H. Y. Tong ◽  
...  

We present FLEX (Functional evaluation of experimental perturbations), a pipeline that leverages several functional annotation resources to establish reference standards for benchmarking human genome-wide CRISPR screen data and methods for analyzing them. We apply FLEX to analyze data from the diverse cell line screens generated by the DepMap project. We identify a dominant mitochondria-associated signal, which our time-resolved CRISPR screens and analysis suggest may reflect screen dynamics and protein stability effects rather than genetic dependencies.


2016 ◽  
Author(s):  
Daniel Backenroth ◽  
Zihuai He ◽  
Krzysztof Kiryluk ◽  
Valentina Boeva ◽  
Lynn Pethukova ◽  
...  

Abstract: We describe here a new method based on a latent Dirichlet allocation model for predicting functional effects of noncoding genetic variants in a cell type- and tissue-specific way (FUN-LDA) by integrating diverse epigenetic annotations for specific cell types and tissues from large-scale epigenomics projects such as ENCODE and Roadmap Epigenomics. Using this unsupervised approach we predict tissue-specific functional effects for every position in the human genome. We demonstrate the usefulness of our predictions using several validation experiments. Using eQTL data from several sources, including the Genotype-Tissue Expression project, the Geuvadis project and the TwinsUK cohort, we show that eQTLs in specific tissues tend to be most enriched among the predicted functional variants in relevant tissues in Roadmap. We further show how these integrated functional scores can be used to derive the most likely cell/tissue type causally implicated for a complex trait using summary statistics from genome-wide association studies, and estimate a tissue-based correlation matrix of various complex traits. We find large enrichment of heritability in functional components of relevant tissues for various complex traits, with FUN-LDA yielding the highest enrichment estimates relative to existing methods. Finally, using experimentally validated functional variants from the literature and variants possibly implicated in disease by previous studies, we rigorously compare FUN-LDA to state-of-the-art functional annotation methods such as GenoSkyline, ChromHMM, Segway, and IDEAS, and show that FUN-LDA has better prediction accuracy and higher resolution compared to these methods. In summary, we describe a new approach and perform rigorous comparisons with the most commonly used functional annotation methods, providing a valuable resource for the community interested in the functional annotation of noncoding variants.
Scores for each position in the human genome and for each ENCODE/Roadmap tissue are available from http://www.columbia.edu/~ii2135/funlda.html.


2017 ◽  
Vol 7 (7) ◽  
pp. 2271-2279 ◽  
Author(s):  
Chao Xu ◽  
Ji-Gang Zhang ◽  
Dongdong Lin ◽  
Lan Zhang ◽  
Hui Shen ◽  
...  

Abstract: Integrating diverse genomics data can provide a global view of the complex biological processes related to human complex diseases. Although substantial efforts have been made to integrate different omics data, there are at least three challenges for multi-omics integration methods: (i) how to simultaneously consider the effects of various genomic factors, since these factors jointly influence the phenotypes; (ii) how to effectively incorporate the information from publicly accessible databases and omics datasets to fully capture the interactions among (epi)genomic factors from diverse omics data; and (iii) how to combine more than two omics datasets, which has been poorly explored to date. Current integration approaches are not sufficient to address all of these challenges together. We proposed a novel integrative analysis framework that incorporates a sparse model, multivariate analysis, a Gaussian graphical model, and network analysis to address these three challenges simultaneously. Based on this strategy, we performed a systemic analysis for glioblastoma multiforme (GBM) integrating genome-wide gene expression, DNA methylation, and miRNA expression data. We identified three regulatory modules of genomic factors associated with GBM survival time and revealed a global regulatory pattern for GBM by combining the three modules, with respect to the common regulatory factors. Our method can not only identify disease-associated dysregulated genomic factors from different omics, but more importantly, it can incorporate the information from publicly accessible databases and omics datasets to infer a comprehensive interaction map of all these dysregulated genomic factors. Our work represents an innovative approach to enhance our understanding of molecular genomic mechanisms underlying human complex diseases.


2015 ◽  
Author(s):  
Qiongshi Lu ◽  
Ryan Lee Powles ◽  
Qian Wang ◽  
Beixin Julie He ◽  
Hongyu Zhao

Extensive efforts have been made to understand genomic function through both experimental and computational approaches, yet proper annotation still remains challenging, especially in non-coding regions. In this manuscript, we introduce GenoSkyline, an unsupervised learning framework to predict tissue-specific functional regions through integrating high-throughput epigenetic annotations. GenoSkyline successfully identified a variety of non-coding regulatory machinery including enhancers, regulatory miRNA, and hypomethylated transposable elements in extensive case studies. Integrative analysis of GenoSkyline annotations and results from genome-wide association studies (GWAS) led to novel biological insights on the etiologies of a number of human complex traits. We also explored using tissue-specific functional annotations to prioritize GWAS signals and predict relevant tissue types for each risk locus. Brain and blood-specific annotations led to better prioritization performance for schizophrenia than standard GWAS p-values and non-tissue-specific annotations. As for coronary artery disease, heart-specific functional regions were highly enriched for GWAS signals, but previously identified risk loci were found to be most functional in other tissues, suggesting that a substantial proportion of heart-related loci remain undetected. In summary, GenoSkyline annotations can guide genetic studies at multiple resolutions and provide valuable insights in understanding complex diseases. GenoSkyline is available at http://genocanyon.med.yale.edu/GenoSkyline.


2019 ◽  
Vol 47 (14) ◽  
pp. 7247-7261 ◽  
Author(s):  
Nitin Narwade ◽  
Sonal Patel ◽  
Aftab Alam ◽  
Samit Chattopadhyay ◽  
Smriti Mittal ◽  
...  

Abstract: Scaffold/matrix attachment regions (S/MARs) are DNA elements that serve to compartmentalize the chromatin into structural and functional domains. These elements are involved in the control of gene expression, which governs the phenotype and also plays a role in disease biology. Therefore, genome-wide understanding of these elements holds great therapeutic promise. Several attempts have been made toward identification of S/MARs in genomes of various organisms including humans. However, a comprehensive genome-wide map of human S/MARs is not yet available. Toward this objective, ChIP-Seq data of 14 S/MAR binding proteins were analyzed and the binding site coordinates of these proteins were used to prepare a non-redundant S/MAR dataset for the human genome. Along with coordinate (location) details of S/MARs, the dataset also revealed details of S/MAR features, namely, length, inter-SMAR length (the chromatin loop size), nucleotide repeats, motif abundance, chromosomal distribution and genomic context. The S/MARs identified in the present study and their subsequent analysis also suggest that these elements act as hotspots for integration of retroviruses. Therefore, these data will help toward better understanding of genome functioning and designing effective anti-viral therapeutics. In order to facilitate user-friendly browsing and retrieval of the data obtained in the present study, a web interface, MARome (http://bioinfo.net.in/MARome), has been developed.
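
One plausible way to turn binding site coordinates from several proteins into a non-redundant set of regions is to merge overlapping intervals per chromosome; the gaps between consecutive merged regions then give an inter-region length of the kind the abstract calls inter-SMAR length. The sketch below illustrates that idea only. The abstract does not describe the authors' actual procedure, and the function names, the half-open coordinate convention, and the example coordinates are assumptions.

```python
from collections import defaultdict


def merge_binding_sites(sites):
    """Merge overlapping or touching (chrom, start, end) intervals per chromosome.

    Coordinates are assumed to be half-open, [start, end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in sites:
        by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        regions = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1][1] = max(regions[-1][1], end)
            else:
                regions.append([start, end])
        merged[chrom] = [tuple(r) for r in regions]
    return merged


def inter_region_lengths(regions):
    """Gap sizes between consecutive sorted, non-overlapping regions."""
    return [nxt[0] - cur[1] for cur, nxt in zip(regions, regions[1:])]


# Hypothetical binding sites pooled from several proteins.
sites = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 500, 600),
    ("chr2", 10, 20),
]
merged = merge_binding_sites(sites)
gaps_chr1 = inter_region_lengths(merged["chr1"])
```

Sorting before merging keeps the pass linear after the sort, so the approach would scale to genome-wide peak lists.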

