Detection of pathways affected by positive selection in primate lineages ancestral to humans

2016 ◽  
Author(s):  
J.T. Daub ◽  
S. Moretti ◽  
I. I. Davidov ◽  
L. Excoffier ◽  
M. Robinson-Rechavi

Abstract. Gene set enrichment approaches have been increasingly successful in finding signals of recent polygenic selection in the human genome. In this study, we aim at detecting biological pathways affected by positive selection in more ancient human evolutionary history. Focusing on four branches of the primate tree that lead to modern humans, we tested all available protein coding gene trees of the Primates clade for signals of adaptation in these branches, using the likelihood-based branch site test of positive selection. The results of these locus-specific tests were then used as input for a gene set enrichment test, where whole pathways are globally scored for a signal of positive selection, instead of focusing only on outlier “significant” genes. We identified signals of positive selection in several pathways that are mainly involved in immune response, sensory perception, metabolism, and energy production. These pathway-level results are highly significant, even though there is no functional enrichment when only focusing on top scoring genes. Interestingly, several gene sets are found significant at multiple levels in the phylogeny, but different genes are responsible for the selection signal in the different branches. This suggests that the same function has been optimized in different ways at different times in primate evolution.

2016 ◽  
Vol 2 (1) ◽  
pp. 33 ◽  
Author(s):  
Jean Fred Fontaine ◽  
Miguel A Andrade-Navarro

Large sets of candidate genes derived from high-throughput biological experiments can be characterized by functional enrichment analysis. The analysis consists of comparing the functions of one gene set against those of a background gene set. Functions associated with a significantly large number of genes in the gene set are then expected to be relevant. Web tools offering disease enrichment analysis on gene sets are often based on gene-disease associations from manually curated or experimental data that is accurate but does not cover all diseases discussed in the literature. Using associations automatically derived from literature data could be a cost-effective method to improve the coverage of diseases for enrichment analysis at comparable levels of accuracy. We have implemented a method named Gene set to Diseases, GS2D, as a web tool performing disease enrichment analysis on human protein coding gene sets. It uses an automatically built dataset of more than 63 thousand gene-disease associations defined as statistically significant co-occurrences of genes and diseases in annotations of biomedical citations from PubMed. The dataset covers more diseases for enrichment analysis than the largest comparable curated database, Comparative Toxicogenomics Database, and its performance compared favourably to similar approaches based on manually curated or experimental data. Graphical and programmatic interfaces are available at http://cbdm.uni-mainz.de/geneset2diseases.
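The comparison of a gene set against a background set described above is commonly scored with a one-sided hypergeometric test. The following is a minimal sketch of that general idea, not GS2D's own implementation (which the abstract does not specify); the function name and its parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_background, n_annotated, n_query, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    Assumed model (not taken from the abstract): the query set is a random
    draw of n_query genes, without replacement, from n_background genes,
    of which n_annotated carry the annotation (e.g. a disease).
    """
    max_overlap = min(n_annotated, n_query)
    total = comb(n_background, n_query)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only: 20,000 background genes, 150 annotated,
    # a query set of 200 genes, 8 of which are annotated.
    print(hypergeom_enrichment_pvalue(20_000, 150, 200, 8))
```

In practice the p-values obtained for many annotations would still need a multiple-testing correction, which this sketch leaves out.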


2019 ◽  
Vol 116 (42) ◽  
pp. 21094-21103 ◽  
Author(s):  
Amir Marcovitz ◽  
Yatish Turakhia ◽  
Heidi I. Chen ◽  
Michael Gloudemans ◽  
Benjamin A. Braun ◽  
...  

Distantly related species entering similar biological niches often adapt by evolving similar morphological and physiological characters. How much genomic molecular convergence (particularly of highly constrained coding sequence) contributes to convergent phenotypic evolution, such as echolocation in bats and whales, is a long-standing fundamental question. Like others, we find that convergent amino acid substitutions are not more abundant in echolocating mammals compared to their outgroups. However, we also ask a more informative question about the genomic distribution of convergent substitutions by devising a test to determine which, if any, of more than 4,000 tissue-affecting gene sets is most statistically enriched with convergent substitutions. We find that the gene set most overrepresented (q-value = 2.2e-3) with convergent substitutions in echolocators, affecting 18 genes, regulates development of the cochlear ganglion, a structure with empirically supported relevance to echolocation. Conversely, when comparing to nonecholocating outgroups, no significant gene set enrichment exists. For aquatic and high-altitude mammals, our analysis highlights 15 and 16 genes from the gene sets most affected by molecular convergence which regulate skin and lung physiology, respectively. Importantly, our test requires that the most convergence-enriched set cannot also be enriched for divergent substitutions, such as in the pattern produced by inactivated vision genes in subterranean mammals. Showing a clear role for adaptive protein-coding molecular convergence, we discover nearly 2,600 convergent positions, highlight 77 of them in 3 organs, and provide code to investigate other clades across the tree of life.


2019 ◽  
Vol 8 (10) ◽  
pp. 1580 ◽  
Author(s):  
Kyoung Min Moon ◽  
Kyueng-Whan Min ◽  
Mi-Hye Kim ◽  
Dong-Hoon Kim ◽  
Byoung Kwan Son ◽  
...  

Ninety percent of patients with scrub typhus (SC) with vasculitis-like syndrome recover after mild symptoms; however, 10% can suffer serious complications, such as acute respiratory failure (ARF) and admission to the intensive care unit (ICU). Predictors for the progression of SC have not yet been established, and conventional scoring systems for ICU patients are insufficient to predict severity. We aimed to identify simple and robust indicators to predict aggressive behaviors of SC. We evaluated 91 patients with SC and 81 non-SC patients who were admitted to the ICU, and 32 cases from the public functional genomics data repository for gene expression analysis. We analyzed the relationships between several predictors and clinicopathological characteristics in patients with SC. We performed gene set enrichment analysis (GSEA) to identify SC-specific gene sets. The acid-base imbalance (ABI), measured 24 h before serious complications, was higher in patients with SC than in non-SC patients. A high ABI was associated with an increased incidence of ARF, leading to mechanical ventilation and worse survival. GSEA revealed that SC correlated with gene sets reflecting inflammation/apoptotic response and airway inflammation. ABI can be used to indicate ARF in patients with SC and assist with early detection.


2018 ◽  
Vol 314 (4) ◽  
pp. L617-L625 ◽  
Author(s):  
Arjun Mohan ◽  
Anagha Malur ◽  
Matthew McPeek ◽  
Barbara P. Barna ◽  
Lynn M. Schnapp ◽  
...  

To advance our understanding of the pathobiology of sarcoidosis, we developed a multiwall carbon nanotube (MWCNT)-based murine model that shows marked histological and inflammatory signal similarities to this disease. In this study, we compared the alveolar macrophage transcriptional signatures of our animal model with human sarcoidosis to identify overlapping molecular programs. Whole genome microarrays were used to assess gene expression of alveolar macrophages in six MWCNT-exposed and six control animals. The results were compared with the transcriptional profiles of alveolar immune cells in 15 sarcoidosis patients and 12 healthy humans. Rigorous statistical methods were used to identify differentially expressed genes. To better elucidate activated pathways, integrated network and gene set enrichment analysis (GSEA) was performed. We identified over 1,000 differentially expressed genes between control and MWCNT mice. Gene ontology functional analysis showed overrepresentation of processes primarily involved in immunity and inflammation in MWCNT mice. Applying GSEA to both mouse and human samples revealed upregulation of 92 gene sets in MWCNT mice and 142 gene sets in sarcoidosis patients. Commonly activated pathways in both MWCNT mice and sarcoidosis included adaptive immunity, T-cell signaling, IL-12/IL-17 signaling, and oxidative phosphorylation. Differences in gene set enrichment between MWCNT mice and sarcoidosis patients were also observed. We applied network analysis to differentially expressed genes common between the MWCNT model and sarcoidosis to identify key drivers of disease. In conclusion, an integrated network and transcriptomics approach revealed substantial functional similarities between a murine model and human sarcoidosis, particularly with respect to activation of immune-specific pathways.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Mike Fang ◽  
Brian Richardson ◽  
Cheryl M. Cameron ◽  
Jean-Eudes Dazard ◽  
Mark J. Cameron

Abstract. Background: In this study, we demonstrate that our modified Gene Set Enrichment Analysis (GSEA) method, drug perturbation GSEA (dpGSEA), can detect phenotypically relevant drug targets through a unique transcriptomic enrichment that emphasizes biological directionality of drug-derived gene sets. Results: We detail our dpGSEA method and show its effectiveness in detecting specific perturbation of drugs in independent public datasets by confirming fluvastatin, paclitaxel, and rosiglitazone perturbation in gastroenteropancreatic neuroendocrine tumor cells. In drug discovery experiments, we found that dpGSEA was able to detect phenotypically relevant drug targets in previously published differentially expressed genes of CD4+ T regulatory cells from immune responders and non-responders to antiviral therapy in HIV-infected individuals, such as those involved with virion replication, cell cycle dysfunction, and mitochondrial dysfunction. dpGSEA is publicly available at https://github.com/sxf296/drug_targeting. Conclusions: dpGSEA is an approach that uniquely enriches on drug-defined gene sets while considering directionality of gene modulation. We recommend dpGSEA as an exploratory tool to screen for possible drug targeting molecules.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 129 ◽  
Author(s):  
Michael Prummer

Differential gene expression (DGE) studies often suffer from poor interpretability of their primary results, i.e., thousands of differentially expressed genes. This has led to the introduction of gene set analysis (GSA) methods that aim at identifying interpretable global effects by grouping genes into sets of common context, such as molecular pathways, biological function, or tissue localization. In practice, GSA often results in hundreds of differentially regulated gene sets. Similar to the genes they contain, gene sets are often regulated in a correlative fashion because they share many of their genes or they describe related processes. Using this kind of neighborhood information to construct networks of gene sets makes it possible to identify highly connected sub-networks as well as poorly connected islands or singletons. We show here how topological information and other network features can be used to filter and prioritize gene sets in routine DGE studies. Community detection in combination with automatic labeling and the network representation of gene set clusters further constitutes an appealing and intuitive visualization of GSA results. The RICHNET workflow described here does not require human intervention and can thus be conveniently incorporated in automated analysis pipelines.
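The idea of linking gene sets that share many genes can be sketched as follows. This is an illustration under stated assumptions, not the RICHNET workflow itself: the Jaccard index as similarity measure, the 0.25 threshold, and all function names are hypothetical choices, and connected components stand in for the community detection step mentioned in the abstract.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets of gene identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_set_network(gene_sets, min_jaccard=0.25):
    """Undirected adjacency map linking gene sets with enough shared genes.

    gene_sets maps a gene set name to a set of gene identifiers.
    min_jaccard is an assumed cut-off, not a value from the abstract.
    """
    adjacency = {name: set() for name in gene_sets}
    for left, right in combinations(gene_sets, 2):
        if jaccard(gene_sets[left], gene_sets[right]) >= min_jaccard:
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency


def connected_components(adjacency):
    """Group gene sets into sub-networks; singletons come out as size-1 groups."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components
```

With toy input such as `{"A": {"g1", "g2", "g3"}, "B": {"g2", "g3", "g4"}, "C": {"g9"}}`, sets A and B share two of four genes and end up in one sub-network, while C remains a singleton.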


2020 ◽  
Vol 11 ◽  
Author(s):  
Cheng Liu ◽  
Xiang Li ◽  
Hua Shao ◽  
Dan Li

Background: Lung adenocarcinoma (LUAD) is one of the main types of lung cancer. Because of its low early diagnosis rate, poor late prognosis, and high mortality, it is of great significance to find biomarkers for diagnosis and prognosis. Methods: Five hundred and twelve LUADs from The Cancer Genome Atlas were used for differential expression analysis and short time-series expression miner (STEM) analysis to identify the LUAD-development characteristic genes. Survival analysis was used to identify the LUAD-unfavorable genes and LUAD-favorable genes. Gene set variation analysis (GSVA) was used to score individual samples against the two gene sets. Receiver operating characteristic (ROC) curve analysis and univariate and multivariate Cox regression analysis were used to explore the diagnostic and prognostic ability of the two GSVA score systems. Two independent data sets from Gene Expression Omnibus (GEO) were used for verifying the results. Functional enrichment analysis was used to explore the potential biological functions of LUAD-unfavorable genes. Results: With the development of LUAD, 185 differentially expressed genes (DEGs) were gradually upregulated, of which 84 were associated with LUAD survival and were named the LUAD-unfavorable gene set. Meanwhile, 237 DEGs were gradually downregulated, of which 39 were associated with LUAD survival and were named the LUAD-favorable gene set. ROC curve analysis and univariate/multivariate Cox proportional hazards analyses indicated that both the LUAD-unfavorable GSVA score and the LUAD-favorable GSVA score were biomarkers of LUAD. Moreover, both GSVA score systems were independent factors for LUAD prognosis. The LUAD-unfavorable genes were significantly involved in the p53 signaling pathway, oocyte meiosis, and the cell cycle. Conclusion: We identified and validated two LUAD-development characteristic gene sets that have not only diagnostic but also prognostic value. These gene sets may provide new insight for further research on LUAD.
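The diagnostic ability of a per-sample score, as assessed by ROC curve analysis, is often summarized by the area under the curve (AUC). The sketch below computes the AUC through its rank interpretation; it is a generic illustration with a hypothetical function name and made-up inputs, not the authors' analysis code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one, with ties counted as 0.5.
    labels are truthy for positives (e.g. tumor) and falsy for negatives.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

An AUC near 0.5 means the score does not separate the two groups, while values near 1 (or near 0, for a score that decreases in the positive group) indicate strong separation.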


Author(s):  
Konstantina Charmpi ◽  
Bernard Ycart

Abstract. Gene Set Enrichment Analysis (GSEA) is a basic tool for genomic data treatment. Its test statistic is based on a cumulated weight function, and its distribution under the null hypothesis is evaluated by Monte-Carlo simulation. Here, it is proposed to subtract its asymptotic expectation from the cumulated weight function, then scale the result. Under the null hypothesis, the convergence in distribution of the new test statistic is proved, using the theory of empirical processes. The limiting distribution needs to be computed only once, and can then be used for many different gene sets. This results in large savings in computing time. The test defined in this way has been called the Weighted Kolmogorov Smirnov (WKS) test. Using expression data from the GEO repository, tested against the MSig Database C2, a comparison between the classical GSEA test and the new procedure has been conducted. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could be more informative in many cases than the classical GSEA test.
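For orientation, the classical scheme that the abstract contrasts with (a cumulated-weight statistic whose null distribution is estimated by Monte-Carlo simulation) can be sketched as below. This is a simplified illustration, not the WKS test and not the reference GSEA implementation; the function names, the use of random gene sets of equal size as the null, and the default of 1,000 simulations are assumptions.

```python
import random


def enrichment_score(ranked_genes, weights, gene_set):
    """Signed maximum deviation of a weighted running sum along a ranked list.

    weights are assumed non-negative (e.g. absolute ranking statistics) and
    aligned with ranked_genes. Genes in the set add weight / total hit weight;
    genes outside the set subtract 1 / number of misses.
    """
    in_set = [g in gene_set for g in ranked_genes]
    hit_total = sum(w for w, h in zip(weights, in_set) if h)
    miss_total = sum(1 for h in in_set if not h)
    if hit_total == 0 or miss_total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for w, h in zip(weights, in_set):
        running += w / hit_total if h else -1.0 / miss_total
        if abs(running) > abs(best):
            best = running
    return best


def permutation_pvalue(ranked_genes, weights, gene_set, n_perm=1000, seed=0):
    """Monte-Carlo p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, weights, gene_set)
    size = sum(1 for g in ranked_genes if g in gene_set)
    exceed = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        null_score = enrichment_score(ranked_genes, weights, random_set)
        if abs(null_score) >= abs(observed):
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

The simulation loop has to be repeated for every gene set size, which is the computational cost that a precomputed limiting distribution, as proposed in the abstract, would avoid.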


2019 ◽  
Author(s):  
Rani K. Powers ◽  
Anthony Sun ◽  
James C. Costello

Abstract. Summary: GSEA-InContext Explorer is a Shiny app that allows users to perform two methods of gene set enrichment analysis (GSEA). The first, GSEAPreranked, applies the GSEA algorithm in which statistical significance is estimated from a null distribution of enrichment scores generated for randomly permuted gene sets. The second, GSEA-InContext, incorporates a user-defined set of background experiments to define the null distribution and calculate statistical significance. GSEA-InContext Explorer allows the user to build custom background sets from a compendium of over 5,700 curated experiments, run both GSEAPreranked and GSEA-InContext on their own uploaded experiment, and explore the results using an interactive interface. This tool will allow researchers to visualize gene sets that are commonly enriched across experiments and identify gene sets that are uniquely significant in their experiment, thus complementing current methods for interpreting gene set enrichment results. Availability and implementation: The code for GSEA-InContext Explorer is available at: https://github.com/CostelloLab/GSEA-InContext_Explorer and the interactive tool is at: http://gsea-incontext_explorer.ngrok.io

