A COMPRESSED SENSING BASED APPROACH FOR SUBTYPING OF LEUKEMIA FROM GENE EXPRESSION DATA

WENLONG TANG; HONGBAO CAO; JUNBO DUAN; YU-PING WANG

doi:10.1142/s0219720011005689

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720011005689 ◽

2011 ◽

Vol 09 (05) ◽

pp. 631-645 ◽

Cited By ~ 12

Author(s):

WENLONG TANG ◽

HONGBAO CAO ◽

JUNBO DUAN ◽

YU-PING WANG

Keyword(s):

Gene Expression ◽

Compressed Sensing ◽

Gene Expression Analysis ◽

Expression Data ◽

Data Set ◽

New Methods ◽

Genome Wide ◽

Different Types ◽

Genome Wide Data ◽

Improved Accuracy

With the development of genomic techniques, the demand for new methods that can handle high-throughput genome-wide data effectively is stronger than ever. Compressed sensing (CS) is an emerging approach in statistics and signal processing. Under CS theory, a signal can be uniquely reconstructed or approximated from its sparse representation, and that representation can help distinguish different types of signals. However, the application of CS to genome-wide data analysis has rarely been investigated. We propose a novel CS-based approach for genomic data classification and test its performance in the subtyping of leukemia through gene expression analysis. Detecting subtypes of cancers such as leukemia according to their different genetic makeups is important because it holds promise for individualizing therapies and improving treatments. In our work, four statistical features were employed to select significant genes for the classification. With genes selected from a pool of 7,129, the proposed CS method achieved a classification accuracy of 97.4% under cross-validation and 94.3% on an independent data set. The robustness of the method to noise was also tested, with good performance. This work therefore demonstrates that the CS method can effectively detect subtypes of leukemia, which suggests improved accuracy in leukemia diagnosis.
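The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general sparse-representation idea it describes: code a test sample as a sparse combination of training samples, then assign the class whose training samples reconstruct it best. Everything here is an assumption for illustration: the greedy matching-pursuit coder stands in for whatever sparse solver the authors used, and the function names, subtype labels, and toy expression values are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(atoms, signal, n_iter=10):
    """Greedy sparse coding: return one coefficient per (unit-norm) atom."""
    coeffs = [0.0] * len(atoms)
    residual = list(signal)
    for _ in range(n_iter):
        scores = [dot(a, residual) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < 1e-12:
            break  # residual is orthogonal to every atom
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs

def classify(train, labels, sample, n_iter=10):
    """Assign the label whose training samples best reconstruct the sample."""
    atoms = [normalize(t) for t in train]
    coeffs = matching_pursuit(atoms, sample, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        # Reconstruct using only the coefficients that belong to this class.
        approx = [0.0] * len(sample)
        for c, a, l in zip(coeffs, atoms, labels):
            if l == label:
                approx = [p + c * x for p, x in zip(approx, a)]
        err = math.sqrt(sum((s - p) ** 2 for s, p in zip(sample, approx)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label

# Hypothetical toy data: four training samples, four "genes" each.
train = [[5, 1, 0, 0], [4, 2, 0, 1], [0, 1, 5, 4], [1, 0, 4, 5]]
labels = ["subtype_A", "subtype_A", "subtype_B", "subtype_B"]
print(classify(train, labels, [5, 1, 0, 1]))
```

In a real setting the gene-selection step mentioned in the abstract would run first, so that each sample vector holds only the selected genes.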

PathCORE-T: identifying and visualizing globally co-occurring pathways in large transcriptomic compendia

10.1101/147645 ◽

2017 ◽

Author(s):

Kathleen M. Chen ◽

Jie Tan ◽

Gregory P. Way ◽

Georgia Doing ◽

Deborah A. Hogan ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Original Data ◽

Knowledge Bases ◽

Expression Data ◽

Expression Levels ◽

Genome Wide ◽

Genome Wide Data ◽

Tcga Dataset ◽

Construction Algorithms

Background: Investigators often interpret genome-wide data by analyzing the expression levels of genes within pathways. While this within-pathway analysis is routine, the products of any one pathway can affect the activity of other pathways. Past efforts to identify relationships between biological processes have evaluated overlap in knowledge bases or evaluated changes that occur after specific treatments. Individual experiments can highlight condition-specific pathway-pathway relationships; however, constructing a complete network of such relationships across many conditions requires analyzing results from many studies. Results: We developed the PathCORE-T framework by implementing existing methods to identify pathway-pathway transcriptional relationships evident across a broad data compendium. PathCORE-T is applied to the output of feature construction algorithms; it identifies pairs of pathways that are observed together in features more often than expected by chance and treats them as functionally co-occurring. We demonstrate PathCORE-T by analyzing an existing eADAGE model of a microbial compendium and by building and analyzing NMF features from the TCGA dataset of 33 cancer types. The PathCORE-T framework includes a demonstration web interface, with source code, that users can launch to (1) visualize the network and (2) review the expression levels of associated genes in the original data. PathCORE-T creates and displays the network of globally co-occurring pathways based on features observed in a machine learning analysis of gene expression data. Conclusions: The PathCORE-T framework identifies transcriptionally co-occurring pathways from the results of unsupervised analysis of gene expression data and visualizes the relationships between pathways as a network. PathCORE-T recapitulated previously described pathway-pathway relationships and suggested additional, experimentally testable hypotheses that remain to be explored.
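The co-occurrence step described above can be illustrated with a small sketch. It is not the PathCORE-T implementation: the abstract does not specify the null model, so this version assumes a simple independence baseline (expected count = count of A × count of B / number of features), and the function name and pathway names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(features):
    """Compare observed pathway-pair counts with an independence baseline.

    features: list of sets, each holding the pathways enriched in one feature.
    Returns {(pathway_a, pathway_b): (observed, expected)}.
    """
    n = len(features)
    single = Counter(p for f in features for p in f)
    pairs = Counter(
        pair for f in features for pair in combinations(sorted(f), 2)
    )
    result = {}
    for (a, b), observed in pairs.items():
        # Assumed null: pathways land in features independently of each other.
        expected = single[a] * single[b] / n
        result[(a, b)] = (observed, expected)
    return result

# Hypothetical enrichment results for six constructed features.
features = [
    {"pathway_a", "pathway_c"},
    {"pathway_a", "pathway_c"},
    {"pathway_a", "pathway_b", "pathway_c"},
    {"pathway_b"},
    {"pathway_b"},
    {"pathway_b"},
]
for pair, (obs, exp) in sorted(cooccurrence(features).items()):
    flag = "co-occurring" if obs > exp else "not enriched"
    print(pair, obs, exp, flag)
```

Pairs whose observed count exceeds the baseline would become edges in the pathway network; a permutation-based null would be a natural replacement for the independence assumption.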

Fusion of single-cell transcriptome and DNA-binding data, for genomic network inference in cortical development

BMC Bioinformatics ◽

10.1186/s12859-021-04201-9 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Thomas Bartlett

Keyword(s):

Gene Expression ◽

Dna Binding ◽

Network Model ◽

Gene Expression Data ◽

Cortical Development ◽

Specific Gene ◽

Expression Data ◽

Genome Wide ◽

Binding Data ◽

Genome Wide Data

Background: Network models are well-established as very useful computational-statistical tools in cell biology. However, a genomic network model based only on gene expression data can, by definition, only infer gene co-expression networks. Hence, in order to infer gene regulatory patterns, it is necessary to also include data related to binding of regulatory factors to DNA. Results: We propose a new dynamic genomic network model, for inferring patterns of genomic regulatory influence in dynamic processes such as development. Our model fuses experiment-specific gene expression data with publicly available DNA-binding data. The method we propose is computationally efficient, and can be applied to genome-wide data with tens of thousands of transcripts. Thus, our method is well suited for use as an exploratory tool for genome-wide data. We apply our method to data from human fetal cortical development, and our findings confirm genomic regulatory patterns which are recognised as being fundamental to neuronal development. Conclusions: Our method provides a mathematical/computational toolbox which, when coupled with targeted experiments, will reveal and confirm important new functional genomic regulatory processes in mammalian development.

Fusion of single-cell transcriptome and DNA-binding data, for genomic network inference in cortical development

10.1101/2021.05.18.444638 ◽

2021 ◽

Author(s):

Thomas E Bartlett

Keyword(s):

Gene Expression ◽

Dna Binding ◽

Network Model ◽

Gene Expression Data ◽

Cortical Development ◽

Specific Gene ◽

Expression Data ◽

Genome Wide ◽

Binding Data ◽

Genome Wide Data

Network models are well-established as very useful computational-statistical tools in cell biology. However, a genomic network model based only on gene expression data can, by definition, only infer gene co-expression networks. Hence, in order to infer gene regulatory patterns, it is necessary to also include data related to binding of regulatory factors to DNA. We propose a new dynamic genomic network model, for inferring patterns of genomic regulatory influence in dynamic processes such as development. Our model fuses experiment-specific gene expression data with publicly available DNA-binding data. The method we propose is computationally efficient, and can be applied to genome-wide data with tens of thousands of transcripts. Thus, our method is well suited for use as an exploratory tool for genome-wide data. We apply our method to data from human fetal cortical development, and our findings confirm genomic regulatory patterns which are recognised as being fundamental to neuronal development. Our method provides a mathematical/computational toolbox which, when coupled with targeted experiments, will reveal and confirm important new functional genomic regulatory processes in mammalian development.

F53 GENETIC AND GENE EXPRESSION ANALYSIS IN CTBP2: A GENE DERIVED FROM GENOME-WIDE DATA IN ANOREXIA NERVOSA AND BODY WEIGHT REGULATION

European Neuropsychopharmacology ◽

10.1016/j.euroneuro.2018.08.133 ◽

2019 ◽

Vol 29 ◽

pp. S1138

Author(s):

Johanna Giuranna ◽

Sigrid Jall ◽

Triinu Peters ◽

Johannes Hebebrand ◽

Timo D. Müller ◽

...

Keyword(s):

Gene Expression ◽

Body Weight ◽

Anorexia Nervosa ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Weight Regulation ◽

Body Weight Regulation ◽

Genome Wide ◽

Genome Wide Data

Faculty Opinions recommendation of Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726582766.793522025 ◽

2016 ◽

Author(s):

Rik Lories

Keyword(s):

Gene Expression ◽

Network Analysis ◽

Gene Expression Data ◽

Association Studies ◽

Meta Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Expression Data ◽

Dupuytren's Disease ◽

Genome Wide

Genome-wide identification of suitable zebrafish Danio rerio reference genes for normalization of gene expression data by RT-qPCR

Journal of Fish Biology ◽

10.1111/jfb.12915 ◽

2016 ◽

Vol 88 (6) ◽

pp. 2095-2110 ◽

Cited By ~ 34

Author(s):

H. Xu ◽

C. Li ◽

Q. Zeng ◽

I. Agrawal ◽

X. Zhu ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Danio Rerio ◽

Reference Genes ◽

Expression Data ◽

Genome Wide

Genome‐wide chromatin occupancy of BRDT and gene expression analysis suggest transcriptional partners and specific epigenetic landscapes that regulate gene expression during spermatogenesis

Molecular Reproduction and Development ◽

10.1002/mrd.23449 ◽

2021 ◽

Author(s):

Yoon Ra Her ◽

Li Wang ◽

Iouri Chepelev ◽

Marcia Manterola ◽

Binyamin Berkovits ◽

...

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Regulate Gene Expression ◽

Genome Wide ◽

Regulate Gene

A Genome-Wide Integrative Association Study of DNA Methylation and Gene Expression Data and Later Life Cognitive Functioning in Monozygotic Twins

Frontiers in Neuroscience ◽

10.3389/fnins.2020.00233 ◽

2020 ◽

Vol 14 ◽

Cited By ~ 1

Author(s):

Mette Soerensen ◽

Dominika Marzena Hozakowska-Roszkowska ◽

Marianne Nygaard ◽

Martin J. Larsen ◽

Veit Schwämmle ◽

...

Keyword(s):

Gene Expression ◽

Dna Methylation ◽

Cognitive Functioning ◽

Association Study ◽

Gene Expression Data ◽

Monozygotic Twins ◽

Later Life ◽

Expression Data ◽

Genome Wide ◽

A Genome

Identification of unique venous thromboembolism-susceptibility variants in African-Americans

Thrombosis and Haemostasis ◽

10.1160/th16-08-0652 ◽

2017 ◽

Vol 117 (04) ◽

pp. 758-768 ◽

Cited By ~ 16

Author(s):

Sebastian Armasu ◽

Bryan McCauley ◽

Iftikhar Kullo ◽

Hugues Sicotte ◽

Jyotishman Pathak ◽

...

Keyword(s):

Gene Expression ◽

Venous Thromboembolism ◽

African Americans ◽

Differential Expression ◽

White Women ◽

Genome Wide Association Study ◽

Expression Data ◽

Significant Differential Expression ◽

Genome Wide ◽

A Genome

Summary: To identify novel single nucleotide polymorphisms (SNPs) associated with venous thromboembolism (VTE) in African-Americans (AAs), we performed a genome-wide association study (GWAS) of VTE in AAs using the Electronic Medical Records and Genomics (eMERGE) Network, comprising seven sites each with DNA biobanks (total ~39,200 unique DNA samples) with genome-wide SNP data (imputed to the 1000 Genomes Project cosmopolitan reference panel) and linked to electronic health records (EHRs). Using a validated EHR-driven phenotype extraction algorithm, we identified VTE cases and controls and tested for an association between each SNP and VTE using unconditional logistic regression, adjusted for age, sex, stroke, site-platform combination and sickle cell risk genotype. Among 393 AA VTE cases and 4,941 AA controls, three intragenic SNPs reached genome-wide significance: LEMD3 rs138916004 (OR=3.2; p=1.3E-08), LY86 rs3804476 (OR=1.8; p=2E-08) and LOC100130298 rs142143628 (OR=4.5; p=4.4E-08); all three SNPs were validated using internal cross-validation, parametric bootstrap and meta-analysis methods. LEMD3 rs138916004 and LOC100130298 rs142143628 are only present in Africans (1000G data). LEMD3 showed a significant differential expression in both the NCBI Gene Expression Omnibus (GEO) and the Mayo Clinic gene expression data, LOC100130298 showed a significant differential expression only in the GEO expression data, and LY86 showed a significant differential expression only in the Mayo expression data. LEMD3 encodes an antagonist of TGF-β-induced cell proliferation arrest. LY86 encodes MD-1, which down-regulates the pro-inflammatory response to lipopolysaccharide; LY86 variation was previously associated with VTE in white women; LOC100130298 is a non-coding RNA gene with unknown regulatory activity in gene expression and epigenetics. Supplementary Material to this article is available online at www.thrombosis-online.com.
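As a reading aid for the odds ratios and p-values reported above, here is a rough sketch of how an unadjusted odds ratio and its Wald 95% confidence interval come out of a 2×2 carrier table, plus a check of the reported p-values against a significance threshold. This is not the study's analysis, which used covariate-adjusted logistic regression. The carrier counts are invented for illustration, the function name is hypothetical, and the 5e-8 threshold is the conventional genome-wide cutoff, assumed here because the abstract does not state the one used.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional cutoff; an assumption, not from the abstract

def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    or_ = (case_carriers * ctrl_noncarriers) / (case_noncarriers * ctrl_carriers)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(
        1 / case_carriers + 1 / case_noncarriers
        + 1 / ctrl_carriers + 1 / ctrl_noncarriers
    )
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper

# Hypothetical counts, NOT taken from the study.
or_, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR={or_:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")

# p-values as reported in the abstract.
reported_p = {"rs138916004": 1.3e-8, "rs3804476": 2e-8, "rs142143628": 4.4e-8}
significant = {snp: p < GENOME_WIDE_ALPHA for snp, p in reported_p.items()}
print(significant)
```

With the hypothetical counts the odds ratio is (20 × 90) / (80 × 10) = 2.25; the adjusted estimates in the abstract would differ from any such unadjusted figure.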

Genome-Wide Gene Expression Analysis in Response to Organophosphorus Pesticide Chlorpyrifos and Diazinon in C. elegans

PLoS ONE ◽

10.1371/journal.pone.0012145 ◽

2010 ◽

Vol 5 (8) ◽

pp. e12145 ◽

Cited By ~ 30

Author(s):

Ana Viñuela ◽

L. Basten Snoek ◽

Joost A. G. Riksen ◽

Jan E. Kammenga

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Organophosphorus Pesticide ◽

C Elegans ◽

Genome Wide ◽

Genome Wide Gene Expression