Orymold: ontology based gene expression data integration and analysis tool applied to rice.

Summary Objectives: Array-comparative genomic hybridization (aCGH) is a high-throughput method to detect and map copy number aberrations in the genome. Multi-step analysis of high-dimensional data requires an integrated suite of bioinformatic tools. In this paper we detail an analysis pipeline for array CGH data. Methods: We developed an analysis tool for array CGH data which supports single and multi-chip analyses as well as combined analyses with paired mRNA gene expression data. The functions supporting the relevant steps of analysis were implemented using the open source software R and combined as the package aCGHPipeline. Analysis methods were illustrated using 189 CGH arrays of aggressive B-cell lymphomas. Results: The package covers data input, quality control, normalization, segmentation and classification. For multi-chip analysis, aCGHPipeline offers an algorithm for automatic delineation of recurrent regions. This task was performed manually up to now. The package also supports combined analysis with mRNA gene expression data. Outputs consist of HTML documents to facilitate communication with clinical partners. Conclusions: The R package aCGHPipeline supports basic tasks of single and multi-chip analysis of array CGH data.


Making Sense of Human Lung Carcinomas Gene Expression Data: Integration and Analysis of Two Affymetrix Platform Experiments

Methods of Microarray Data Analysis ◽

10.1007/0-387-23077-7_7 ◽

2006 ◽

pp. 81-94

Author(s):

Xiwu Lin ◽

Daniel Park ◽

Sergio Eslava ◽

Kwan R. Lee ◽

Raymond L.H. Lam ◽

...

Keyword(s):

Gene Expression ◽

Data Integration ◽

Gene Expression Data ◽

Human Lung ◽

Expression Data ◽

Affymetrix Platform ◽

Lung Carcinomas ◽

Making Sense


Gene-expression data integration to squamous cell lung cancer subtypes reveals drug sensitivity

British Journal of Cancer ◽

10.1038/bjc.2013.452 ◽

2013 ◽

Vol 109 (6) ◽

pp. 1599-1608 ◽

Cited By ~ 16

Author(s):

D Wu ◽

Y Pang ◽

M D Wilkerson ◽

D Wang ◽

P S Hammerman ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Data Integration ◽

Gene Expression Data ◽

Squamous Cell ◽

Cell Lung Cancer ◽

Drug Sensitivity ◽

Squamous Cell Lung Cancer ◽

Expression Data ◽

Cancer Subtypes


A sequence comparison and gene expression data integration add-on for the Pathway Tools software

Bioinformatics ◽

10.1093/bioinformatics/bts431 ◽

2012 ◽

Vol 28 (17) ◽

pp. 2283-2284

Author(s):

Peter M. Krempl ◽

Juergen Mairhofer ◽

Gerald Striedner ◽

Gerhard G. Thallinger

Keyword(s):

Gene Expression ◽

Data Integration ◽

Gene Expression Data ◽

Sequence Comparison ◽

Expression Data


Comparing alternative pipelines for cross-platform microarray gene expression data integration with RNA-seq data in breast cancer

10.1101/059600 ◽

2016 ◽

Cited By ~ 2

Author(s):

Alina Frolova ◽

Vladyslav Bondarenko ◽

Maria Obolenska

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Data Integration ◽

Gene Expression Data ◽

Statistical Power ◽

Meta Analysis ◽

Expression Data ◽

Rna Seq ◽

Microarray Gene Expression ◽

Cross Platform

Abstract Background: According to statistics from major public repositories, an overwhelming majority of existing and newly uploaded data originates from microarray experiments. Unfortunately, the potential of these data to bring new insights is limited by study-specific biases that arise from the small number of biological samples. Increasing the sample size by direct microarray data integration increases the statistical power to obtain a more precise estimate of gene expression in a population of individuals, resulting in lower false discovery rates. However, despite numerous recommendations for gene expression data integration, there is a lack of systematic comparison of processing approaches aimed at assessing microarray platform diversity and ambiguous probeset-to-gene correspondence, which leads to a low number of studies applying integration. Results: Here, we investigated five different approaches to microarray data processing in comparison with RNA-seq data on breast cancer samples. We aimed to evaluate different probeset annotations as well as different procedures for choosing between probesets mapped to the same gene. We show that pipeline rankings are mostly preserved across Affymetrix and Illumina platforms. The BrainArray approach, based on updated annotation and redesigned probeset definitions, and the approach of choosing the probeset with the maximum average signal across the samples have the best correlation with RNA-seq, while averaging probeset signals and scoring the quality of probe sequence mapping to the transcripts of the targeted gene have worse correlation. Finally, randomly selecting a probeset among those mapped to the same gene significantly decreases the correlation with RNA-seq. Conclusion: We show that methods that rely on actual probeset signal intensities are advantageous over methods that consider only the biological characteristics of the probe sequences, and that cross-platform integration of datasets improves correlation with the RNA-seq data. We consider the results obtained in this paper a contribution to integrative analysis as a worthwhile alternative to the classical meta-analysis of multiple gene expression datasets.
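The probeset selection rule highlighted in the abstract above (keep, for each gene, the probeset with the maximum average signal across samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `select_max_mean_probesets`, the probeset and gene identifiers, and the toy intensities are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def select_max_mean_probesets(expression, probeset_to_gene):
    """For each gene, keep the probeset with the highest mean signal across samples.

    expression: dict mapping probeset id -> list of signal values (one per sample)
    probeset_to_gene: dict mapping probeset id -> gene symbol
    Both structures are illustrative assumptions, not a real annotation format.
    """
    best = {}  # gene -> (probeset id, mean signal)
    for probeset, signals in expression.items():
        gene = probeset_to_gene.get(probeset)
        if gene is None:
            continue  # skip probesets without a gene annotation
        avg = mean(signals)
        # Keep this probeset if the gene is new or its mean signal is higher.
        if gene not in best or avg > best[gene][1]:
            best[gene] = (probeset, avg)
    return {gene: probeset for gene, (probeset, _) in best.items()}


# Hypothetical toy data: two probesets map to GENE_A, one to GENE_B.
expression = {
    "ps_1": [5.0, 6.0, 7.0],
    "ps_2": [8.0, 9.0, 10.0],
    "ps_3": [4.0, 4.5, 5.0],
}
probeset_to_gene = {"ps_1": "GENE_A", "ps_2": "GENE_A", "ps_3": "GENE_B"}

selected = select_max_mean_probesets(expression, probeset_to_gene)
print(selected)
```

With this toy input, `ps_2` is kept for `GENE_A` because its mean signal (9.0) exceeds that of `ps_1` (6.0). The alternatives compared in the abstract, averaging probeset signals or picking one at random, would replace only the selection step inside the loop.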


Batch effect removal methods for microarray gene expression data integration: a survey

Briefings in Bioinformatics ◽

10.1093/bib/bbs037 ◽

2012 ◽

Vol 14 (4) ◽

pp. 469-490 ◽

Cited By ~ 153

Author(s):

C. Lazar ◽

S. Meganck ◽

J. Taminau ◽

D. Steenhoff ◽

A. Coletta ◽

...

Keyword(s):

Gene Expression ◽

Data Integration ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Batch Effect ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene


BioGPS and GXD: mouse gene expression data—the benefits and challenges of data integration

Mammalian Genome ◽

10.1007/s00335-012-9408-0 ◽

2012 ◽

Vol 23 (9-10) ◽

pp. 550-558 ◽

Cited By ~ 5

Author(s):

Martin Ringwald ◽

Chunlei Wu ◽

Andrew I. Su

Keyword(s):

Gene Expression ◽

Data Integration ◽

Gene Expression Data ◽

Mouse Gene ◽

Expression Data ◽

Mouse Gene Expression


Microarray Gene Expression Data Integration: An Application to Brain Tumor Grade Determination

Advances in Intelligent Systems and Computing - 9th International Conference on Practical Applications of Computational Biology and Bioinformatics ◽

10.1007/978-3-319-19776-0_14 ◽

2015 ◽

pp. 127-135 ◽

Cited By ~ 1

Author(s):

Eduardo Valente ◽

Miguel Rocha

Keyword(s):

Gene Expression ◽

Brain Tumor ◽

Data Integration ◽

Gene Expression Data ◽

Tumor Grade ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene


sciCAN: Single-cell chromatin accessibility and gene expression data integration via Cycle-consistent Adversarial Network

10.1101/2021.11.30.470677 ◽

2021 ◽

Author(s):

Yang Xu ◽

Edmon Begoli ◽

Rachel Patton McCord

Keyword(s):

Gene Expression ◽

Data Integration ◽

Single Cell ◽

Gene Expression Data ◽

Chromatin Accessibility ◽

Cellular Systems ◽

Expression Data ◽

Adversarial Network ◽

Cell Technologies ◽

Cell Data

The booming single-cell technologies bring a surge of high-dimensional data that come from different sources and represent cellular systems from different views. With advances in single-cell technologies, integrating single-cell data across modalities has arisen as a new computational challenge and is gaining more and more attention within the community. Here, we present a novel adversarial approach, sciCAN, to integrate single-cell chromatin accessibility and gene expression data in an unsupervised manner. We benchmarked sciCAN against 3 state-of-the-art (SOTA) methods on 5 scATAC-seq/scRNA-seq datasets, and we demonstrated that our method handled data integration with a better balance of mutual transfer between modalities than the other 3 SOTA methods. We further applied sciCAN to 10X Multiome data and confirmed that the integrated representation preserves information about the hematopoietic hierarchy. Finally, we investigated CRISPR-perturbed single-cell K562 ATAC-seq and RNA-seq data to identify cells with related responses to different perturbations in these different modalities.
