MAQC Project: Latest Research Papers

pyAffy: An efficient Python/Cython implementation of the RMA method for processing raw data from Affymetrix expression microarrays

10.7287/peerj.preprints.1790v1 ◽ 2016 ◽ Cited By ~ 1

Author(s): Florian Wagner

Keyword(s): Microarray Data ◽ Source Code ◽ R Package ◽ Raw Data ◽ MAQC Project ◽ Reference Implementation ◽ Using Data ◽ Expression Microarrays ◽ Python Package ◽ Final Expression

Robust multi-array average (RMA) is a highly successful method for processing raw data from Affymetrix expression microarrays. However, most of the work on microarray data processing predates the widespread use of Python in scientific computing. Here, I describe pyAffy, an efficient implementation of the RMA method in Python/Cython. Using data from the MAQC project, I show that this implementation produces virtually identical results compared to the RMA reference implementation in the affy R package, while running more than five times faster and consuming significantly less memory. I also show how individual steps of the RMA method affect the final expression estimates. The source code for pyAffy is available from PyPI and GitHub (https://github.com/flo-compbio/pyaffy) under an OSI-approved license. I intend to periodically revise this article to ensure that it accurately reflects the functionalities available in the pyAffy Python package.

Estimation of isoform expression in RNA-seq data using a hierarchical Bayesian model

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720015420019 ◽ 2015 ◽ Vol 13 (06) ◽ pp. 1542001 ◽ Cited By ~ 3

Author(s): Zengmiao Wang ◽ Jun Wang ◽ Changjing Wu ◽ Minghua Deng

Keyword(s): Differential Expression Analysis ◽ Multinomial Distribution ◽ Simulated Data ◽ Ground Truth ◽ Hierarchical Bayesian ◽ RNA Seq ◽ Unified Framework ◽ MAQC Project ◽ Isoform Expression ◽ Hierarchical Bayesian Method

Estimation of gene or isoform expression is a fundamental step in many transcriptome analysis tasks, such as differential expression analysis, eQTL (or sQTL) studies, and biological network construction. RNA-seq technology enables us to monitor the expression on genome-wide scale at single base pair resolution and offers the possibility of accurately measuring expression at the level of isoform. However, challenges remain because of non-uniform read sampling and the presence of various biases in RNA-seq data. In this paper, we present a novel hierarchical Bayesian method to estimate isoform expression. While most of the existing methods treat gene expression as a by-product, we incorporate it into our model and explicitly describe its relationship with corresponding isoform expression using a Multinomial distribution. In this way, gene and isoform expression are included in a unified framework and it helps us achieve a better performance over other state-of-the-art algorithms for isoform expression estimation. The effectiveness of the proposed method is demonstrated using both simulated data with known ground truth and two real RNA-seq datasets from MAQC project. The codes are available at http://www.math.pku.edu.cn/teachers/dengmh/GIExp/ .

Comparing Imputation Procedures for Affymetrix Gene Expression Datasets Using MAQC Datasets

Advances in Bioinformatics ◽ 10.1155/2013/790567 ◽ 2013 ◽ Vol 2013 ◽ pp. 1-10 ◽ Cited By ~ 1

Author(s): Sreevidya Sadananda Sadasiva Rao ◽ Lori A. Shepherd ◽ Andrew E. Bruno ◽ Song Liu ◽ Jeffrey C. Miecznikowski

Keyword(s): Least Squares ◽ Missing Values ◽ Nearest Neighbor ◽ Least Squares Method ◽ Microarray Quality Control ◽ Imputation Methods ◽ Error Measures ◽ MAQC Project ◽ Overall Performance ◽ Microarray Datasets

Introduction. The microarray datasets from the MicroArray Quality Control (MAQC) project have enabled the assessment of the precision, comparability of microarrays, and other various microarray analysis methods. However, to date no studies that we are aware of have reported the performance of missing value imputation schemes on the MAQC datasets. In this study, we use the MAQC Affymetrix datasets to evaluate several imputation procedures in Affymetrix microarrays. Results. We evaluated several cutting edge imputation procedures and compared them using different error measures. We randomly deleted 5% and 10% of the data and imputed the missing values using imputation tests. We performed 1000 simulations and averaged the results. The results for both 5% and 10% deletion are similar. Among the imputation methods, we observe the local least squares method with is most accurate under the error measures considered. The k-nearest neighbor method with has the highest error rate among imputation methods and error measures. Conclusions. We conclude for imputing missing values in Affymetrix microarray datasets, using the MAS 5.0 preprocessing scheme, the local least squares method with has the best overall performance and k-nearest neighbor method with has the worst overall performance. These results hold true for both 5% and 10% missing values.
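The evaluation protocol described in this abstract (delete a fixed fraction of entries at random, impute them, score the imputed values against the known originals, and average over repeated simulations) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the imputer is a simple row-wise k-nearest-neighbor average rather than any of the specific procedures compared in the paper, root-mean-square error stands in for the paper's unspecified error measures, the value of k is arbitrary, and the input is synthetic data rather than the MAQC Affymetrix datasets.

```python
import numpy as np


def mask_random(data, fraction, rng):
    """Return a copy of `data` with `fraction` of its entries set to NaN, plus the mask."""
    flat_mask = np.zeros(data.size, dtype=bool)
    n_missing = int(round(fraction * data.size))
    flat_mask[rng.choice(data.size, size=n_missing, replace=False)] = True
    mask = flat_mask.reshape(data.shape)
    masked = data.copy()
    masked[mask] = np.nan
    return masked, mask


def knn_impute(masked, k):
    """Fill each missing entry with the mean of that column over the k nearest rows.

    Distance between two rows is the mean squared difference over the
    columns observed in both rows (an assumption made for this sketch).
    """
    filled = masked.copy()
    n_rows = masked.shape[0]
    for i in np.flatnonzero(np.isnan(masked).any(axis=1)):
        diff = masked - masked[i]
        n_shared = (~np.isnan(diff)).sum(axis=1)
        dist = np.full(n_rows, np.inf)
        ok = n_shared > 0
        dist[ok] = np.nansum(diff[ok] ** 2, axis=1) / n_shared[ok]
        dist[i] = np.inf  # a row is never its own neighbor
        order = np.argsort(dist)
        for j in np.flatnonzero(np.isnan(masked[i])):
            donors = [r for r in order
                      if np.isfinite(dist[r]) and not np.isnan(masked[r, j])][:k]
            if donors:
                filled[i, j] = np.mean(masked[donors, j])
            else:
                # Fallback: column mean over whatever is observed.
                filled[i, j] = np.nanmean(masked[:, j])
    return filled


def rmse(truth, imputed, mask):
    """Root-mean-square error over the artificially deleted entries only."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))


def evaluate(data, fraction, k, n_sim, seed=0):
    """Average the imputation error over `n_sim` random deletion patterns."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_sim):
        masked, mask = mask_random(data, fraction, rng)
        errors.append(rmse(data, knn_impute(masked, k), mask))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (rows: probesets, columns: arrays).
    synthetic = np.random.default_rng(42).normal(loc=8.0, scale=2.0, size=(200, 10))
    for fraction in (0.05, 0.10):
        print(fraction, evaluate(synthetic, fraction, k=5, n_sim=10))
```

The abstract reports averaging over 1000 simulations; the small `n_sim` above only keeps the sketch quick to run.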

The MicroArray Quality Control (MAQC) Project and Cross-Platform Analysis of Microarray Data

Handbook of Statistical Bioinformatics ◽ 10.1007/978-3-642-16345-6_9 ◽ 2011 ◽ pp. 171-192 ◽ Cited By ~ 3

Author(s): Zhining Wen ◽ Zhenqiang Su ◽ Jie Liu ◽ Baitang Ning ◽ Lei Guo ◽ ...

Keyword(s): Quality Control ◽ Microarray Data ◽ Microarray Quality Control ◽ MAQC Project ◽ Cross Platform

Evaluation of Inter-Laboratory and Cross-Platform Concordance of DNA Microarrays Through Discriminating Genes and Classifier Transferability

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720009004011 ◽ 2009 ◽ Vol 07 (01) ◽ pp. 157-173 ◽ Cited By ~ 13

Author(s): Shihong Mao ◽ Charles Wang ◽ Guozhu Dong

Keyword(s): DNA Microarrays ◽ P Value ◽ Microarray Technology ◽ Microarray Quality Control ◽ Practical Usefulness ◽ MAQC Project ◽ Cross Platform ◽ Microarray Datasets ◽ Consistency Rate

Microarray technology has great potential for improving our understanding of biological processes, medical conditions, and diseases. Often, microarray datasets are collected using different microarray platforms (provided by different companies) under different conditions in different laboratories. The cross-platform and cross-laboratory concordance of the microarray technology needs to be evaluated before it can be successfully and reliably applied in biological/clinical practice. New measures and techniques are proposed for comparing and evaluating the quality of microarray datasets generated from different platforms/laboratories. These measures and techniques are based on the following philosophy: the practical usefulness of the microarray technology may be confirmed if discriminating genes and classifiers, which are the focus of most, if not all, comparative investigations, discovered/trained from data collected in one lab/platform combination can be transferred to another lab/platform combination. The rationale is that the nondiscriminating genes might not be as strongly regulated as the discriminating genes, by the biological process of the tissue cells under study, and hence they may behave more randomly than the discriminating genes. Our experiment results, on microarray datasets generated from different platforms/laboratories using the reference mRNA samples in the Microarray Quality Control (MAQC) project, showed that DNA microarrays can produce highly repeatable data in a cross-platform cross-lab manner, when one focuses on the discriminating genes and classifiers. In our comparative study, we compare samples of one type against samples of another type; the methodology can be applied to situations where one compares one arbitrary class of data against another. 
Other findings include: (1) using three discriminating-gene/classifier-based methods to test the concordance between microarray datasets gave consistent results; (2) when noisy (nondiscriminating) genes were removed, the microarray datasets from different laboratories using common platform were found to be highly concordant, and the data generated using most of the commercial platforms studied here were also found to be concordant with each other; (3) several series of artificial datasets with known degree of difference were created, to establish a bridge between consistency rate and P-value, allowing us to estimate P-value if consistency rate between two datasets is known.

Faculty Opinions recommendation of The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.1091093.544482 ◽ 2007

Author(s): Patricia Finn

Keyword(s): Gene Expression ◽ Quality Control ◽ Microarray Quality Control ◽ MAQC Project

The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements

Nature Biotechnology ◽ 10.1038/nbt1239 ◽ 2006 ◽ Vol 24 (9) ◽ pp. 1151-1161 ◽ Cited By ~ 1413

Author(s):

Keyword(s): Gene Expression ◽ Quality Control ◽ Microarray Quality Control ◽ MAQC Project

Performance comparison of one-color and two-color platforms within the Microarray Quality Control (MAQC) project

Nature Biotechnology ◽ 10.1038/nbt1242 ◽ 2006 ◽ Vol 24 (9) ◽ pp. 1140-1150 ◽ Cited By ~ 323

Author(s): Tucker A Patterson ◽ Edward K Lobenhofer ◽ Stephanie B Fulmer-Smentek ◽ Patrick J Collins ◽ Tzu-Ming Chu ◽ ...

Keyword(s): Quality Control ◽ Performance Comparison ◽ Microarray Quality Control ◽ MAQC Project
