Horizontal meta-analysis identifies common deregulated genes across AML subgroups providing a robust prognostic signature

2020 ◽  
Vol 4 (20) ◽  
pp. 5322-5335
Author(s):  
Ali Nehme ◽  
Hassan Dakik ◽  
Frédéric Picou ◽  
Meyling Cheok ◽  
Claude Preudhomme ◽  
...  

Abstract Advances in transcriptomics have improved our understanding of leukemic development and helped to enhance the stratification of patients. The tendency of transcriptomic studies to combine AML samples, regardless of cytogenetic abnormalities, could lead to bias in differential gene expression analysis because of the differential representation of AML subgroups. Hence, we performed a horizontal meta-analysis that integrated transcriptomic data on AML from multiple studies, to enrich the less frequent cytogenetic subgroups and to uncover common genes involved in the development of AML and response to therapy. A total of 28 Affymetrix microarray data sets containing 3940 AML samples were downloaded from the Gene Expression Omnibus database. After stringent quality control, transcriptomic data on 1534 samples from 11 data sets, covering 10 AML cytogenetically defined subgroups, were retained and merged with the data on 198 healthy bone marrow samples. Differentially expressed genes between each cytogenetic subgroup and normal samples were extracted, enabling the unbiased identification of 330 commonly deregulated genes (CODEGs), which showed enriched profiles of myeloid differentiation, leukemic stem cell status, and relapse. Most of these genes were downregulated, in accordance with DNA hypermethylation. CODEGs were then used to create a prognostic score based on the weighted sum of expression of 22 core genes (CODEG22). The score was validated with microarray data of 5 independent cohorts and by quantitative real time-polymerase chain reaction in a cohort of 142 samples. CODEG22-based stratification of patients, globally and into subpopulations of cytologically healthy and elderly individuals, may complement the European LeukemiaNet classification, for a more accurate prediction of AML outcomes.
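A weighted-sum prognostic score of the kind described above can be sketched as follows. This is a minimal illustration only: the gene names, weights, expression values, and the median cutoff are all hypothetical placeholders, not the CODEG22 genes, coefficients, or threshold from the study.

```python
from statistics import median


def weighted_score(expression, weights):
    """Return the weighted sum of expression values over the genes in `weights`."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def stratify(scores, cutoff=None):
    """Label each sample 'high' or 'low'; the median cutoff is an assumption."""
    if cutoff is None:
        cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical weights (e.g. regression coefficients) for three placeholder genes.
weights = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}

# Hypothetical normalized expression values per sample.
samples = {
    "s1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "s2": {"GENE_A": 6.0, "GENE_B": 2.0, "GENE_C": 3.0},
    "s3": {"GENE_A": 1.0, "GENE_B": 8.0, "GENE_C": 0.5},
}

scores = {name: weighted_score(expr, weights) for name, expr in samples.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

In practice the weights would be fitted on a training cohort and the cutoff fixed before the score is applied to independent validation cohorts.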

2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Tzu-Hao Chang ◽  
Shih-Lin Wu ◽  
Wei-Jen Wang ◽  
Jorng-Tzong Horng ◽  
Cheng-Wei Chang

Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations that cause diseases might have been disregarded. This paper proposes an approach for discovering the condition-specific correlations of gene expressions within biological pathways. Because analyzing gene expression correlations is time consuming, an Apache Hadoop cloud computing platform was implemented. Three microarray data sets of breast cancer were collected from the Gene Expression Omnibus, and pathway information from the Kyoto Encyclopedia of Genes and Genomes was applied for discovering meaningful biological correlations. The results showed that adopting the Hadoop platform considerably decreased the computation time. Several correlations of differential gene expressions were discovered between the relapse and nonrelapse breast cancer samples, and most of them were involved in cancer regulation and cancer-related pathways. The results showed that breast cancer recurrence might be highly associated with the abnormal regulations of these gene pairs, rather than with their individual expression levels. The proposed method was computationally efficient and reliable, and stable results were obtained when different data sets were used. The proposed method is effective in identifying meaningful biological regulation patterns between conditions.
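The idea of a condition-specific correlation can be illustrated with a small sketch: compute the Pearson correlation of one gene pair separately within each condition and compare the results. The expression values and condition labels below are made up for illustration, and the sketch runs on a single machine; it does not reproduce the paper's Hadoop implementation or its pathway filtering.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_by_condition(pair_data):
    """Map each condition to the correlation of the gene pair within it."""
    return {cond: pearson(x, y) for cond, (x, y) in pair_data.items()}


# Hypothetical expression of one gene pair (gene X, gene Y) in each condition.
pair_data = {
    "relapse": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),
    "nonrelapse": ([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]),
}

correlations = correlation_by_condition(pair_data)
# A large difference flags a pair whose co-regulation changes between conditions.
shift = correlations["relapse"] - correlations["nonrelapse"]
print(correlations, shift)
```

Both genes have the same mean expression in the two conditions here, so a per-gene differential expression test would miss the change that the correlation difference picks up.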


2019 ◽  
Author(s):  
Gregory R. Gershkowitz ◽  
Zachary B. Abrams ◽  
Caitlin E. Coombes ◽  
Kevin R. Coombes

Abstract Background: Researchers commonly use online tools such as ToppGene to conduct enrichment analyses on gene expression data. This process does not easily allow multiple gene data sets to be analyzed and compared at once. ToppGene requires the user to manually enter gene symbols or other gene identifiers into a text box and to manually sift through forms with many adjustable parameters in order to obtain a downloadable text file of results. This process makes the analysis of multiple sets of genes tedious, time-consuming, and error prone. To address this problem, we developed Malachite, a Python package that enables researchers to perform gene enrichment analyses on multiple gene lists and concatenate the resulting enrichment statistics. In this way, Malachite enables meta-enrichment analyses across multiple data sets. Results: To illustrate its use, we applied Malachite to three data sets from the Gene Expression Omnibus comparing gene expression in the large airways of smokers and non-smokers. Biological processes enriched in all three data sets were related to xenobiotic stimulus; molecular functions typically involved nicotinamide adenine dinucleotide phosphate (NADP) activity. Conclusion: Malachite enables researchers to automate gene enrichment meta-analyses using ToppGene. Malachite also enhances ToppGene’s gene set analysis of drug-gene relationships by further filtering for FDA approved drugs.


mSystems ◽  
2021 ◽  
Vol 6 (2) ◽  
Author(s):  
Zhongyou Li ◽  
Katja Koeppen ◽  
Victoria I. Holden ◽  
Samuel L. Neff ◽  
Liviu Cengher ◽  
...  

ABSTRACT The NCBI Gene Expression Omnibus (GEO) provides tools to query and download transcriptomic data. However, less than 4% of microbial experiments include the sample group annotations required to assess differential gene expression for high-throughput reanalysis, and data deposited after 2014 universally lack these annotations. Our algorithm GAUGE (general annotation using text/data group ensembles) automatically annotates GEO microbial data sets, including microarray and RNA sequencing studies, increasing the percentage of data sets amenable to analysis from 4% to 33%. Eighty-nine percent of GAUGE-annotated studies matched group assignments generated by human curators. To demonstrate how GAUGE annotation can lead to scientific insight, we created GAPE (GAUGE-annotated Pseudomonas aeruginosa and Escherichia coli transcriptomic compendia for reanalysis), a Shiny Web interface to analyze 73 GAUGE-annotated P. aeruginosa studies, three times more than previously available. GAPE analysis revealed that PA3923, a gene of unknown function, was frequently differentially expressed in more than 50% of studies and significantly coregulated with genes involved in biofilm formation. Follow-up wet-bench experiments demonstrate that PA3923 mutants are indeed defective in biofilm formation, consistent with predictions facilitated by GAUGE and GAPE. We anticipate that GAUGE and GAPE, which we have made freely available, will make publicly available microbial transcriptomic data easier to reuse and lead to new data-driven hypotheses. IMPORTANCE GEO archives transcriptomic data from over 5,800 microbial experiments and allows researchers to answer questions not directly addressed in published papers. However, less than 4% of the microbial data sets include the sample group annotations required for high-throughput reanalysis. This limitation blocks a considerable amount of microbial transcriptomic data from being reused easily. 
Here, we demonstrate that the GAUGE algorithm could make 33% of microbial data accessible to parallel mining and reanalysis. GAUGE annotations increase statistical power and, thereby, make consistent patterns of differential gene expression easier to identify. In addition, we developed GAPE (GAUGE-annotated Pseudomonas aeruginosa and Escherichia coli transcriptomic compendia for reanalysis), a Shiny Web interface that performs parallel analyses on P. aeruginosa and E. coli compendia. Source code for GAUGE and GAPE is freely available and can be repurposed to create compendia for other bacterial species.


2016 ◽  
Vol 45 (1) ◽  
pp. e1-e1 ◽  
Author(s):  
Timothy E. Sweeney ◽  
Winston A. Haynes ◽  
Francesco Vallania ◽  
John P. Ioannidis ◽  
Purvesh Khatri

2015 ◽  
Vol 135 (10) ◽  
pp. 2455-2463 ◽  
Author(s):  
Lanlan Yin ◽  
Sergio G. Coelho ◽  
Julio C. Valencia ◽  
Dominik Ebsen ◽  
Andre Mahns ◽  
...  

2017 ◽  
Author(s):  
Sivateja Tangirala ◽  
Chirag J Patel

Abstract While both genes and environment contribute to phenotype, deciphering environmental contributions to phenotype is a challenge. Furthermore, elucidating how different phenotypes may share similar environmental etiologies is also challenging. One way to identify environmental influences is through a discordant monozygotic (MZ) twin study design. Here, we assessed differential gene expression in MZ discordant twin pairs (affected vs. non-affected) for seven phenotypes, including chronic fatigue syndrome, obesity, ulcerative colitis, major depressive disorder, intermittent allergic rhinitis, physical activity, and intelligence quotient, comparing the spectrum of genes differentially expressed across seven phenotypes individually. Second, we performed meta-analysis for each gene to identify commonalities and differences in gene expression signatures between the seven phenotypes. In our integrative analyses, we found that there may be a common gene expression signature (with small effect sizes) across the phenotypes; however, differences between phenotypes with respect to differentially expressed genes were more prominently featured. Therefore, defining common environmentally induced pathways in phenotypes remains elusive. We make our work accessible by providing a new database (DiscTwinExprDB: http://apps.chiragjpgroup.org/disctwinexprdb/) for investigators to study non-genotypic influence on gene expression.


2021 ◽  
Vol 11 ◽  
Author(s):  
Xiaoli Hu ◽  
Yang Liu ◽  
Zhitong Bing ◽  
Qian Ye ◽  
Chengcheng Li

Owing to metastases and drug resistance, the prognosis of breast cancer is still dismal. Therefore, it is necessary to find new prognostic markers to improve the efficacy of breast cancer treatment. Literature shows a controversy between moesin (MSN) expression and prognosis in breast cancer. Here, we aimed to conduct a systematic review and meta-analysis to evaluate the prognostic relationship between MSN and breast cancer. Literature retrieval was conducted in the following databases: PubMed, Web of Science, Embase, and Cochrane. Two reviewers independently performed the screening of studies and data extraction. The Gene Expression Omnibus (GEO) database including both breast cancer gene expression and follow-up datasets was selected to verify literature results. The R software was employed for the meta-analysis. A total of 9 articles with 3,039 patients and 16 datasets with 2,916 patients were ultimately included. Results indicated that there was a significant relationship between MSN and lymph node metastases (P < 0.05), and high MSN expression was associated with poor outcome of breast cancer patients (HR = 1.99; 95% CI 1.73–2.24). In summary, there is available evidence to support that high MSN expression has valuable importance for the poor prognosis in breast cancer patients. Systematic Review Registration: https://inplasy.com/inplasy-2020-8-0039/.
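A pooled hazard ratio with a confidence interval, like the one reported above, is commonly obtained by inverse-variance weighting on the log scale. The sketch below assumes a fixed-effect model and made-up study values; the abstract does not state which model was used, and the authors worked in R, so this Python version is an illustration of the general technique and not their analysis.

```python
from math import exp, log, sqrt


def pooled_hazard_ratio(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of (hr, ci_low, ci_high) tuples.

    Standard errors are recovered from the 95% CI width on the log scale.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, ci_low, ci_high in studies:
        se = (log(ci_high) - log(ci_low)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log(hr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return (
        exp(pooled_log),
        exp(pooled_log - z * pooled_se),
        exp(pooled_log + z * pooled_se),
    )


# Hypothetical per-study hazard ratios with 95% confidence intervals.
studies = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]
hr, low, high = pooled_hazard_ratio(studies)
print(hr, low, high)
```

Pooling two identical studies leaves the point estimate unchanged and narrows the interval, which is the expected behavior of inverse-variance weighting. A random-effects model would add a between-study variance term to each weight.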

