pyAffy: An efficient Python/Cython implementation of the RMA method for processing raw data from Affymetrix expression microarrays
Robust multi-array average (RMA) is a highly successful method for processing raw data from Affymetrix expression microarrays. However, most of the work on microarray data processing predates the widespread use of Python in scientific computing. Here, I describe pyAffy, an efficient implementation of the RMA method in Python/Cython. Using data from the MAQC project, I show that this implementation produces results virtually identical to those of the RMA reference implementation in the affy R package, while running more than five times faster and consuming significantly less memory. I also show how individual steps of the RMA method affect the final expression estimates. The source code for pyAffy is available from PyPI and GitHub (https://github.com/flo-compbio/pyaffy) under an OSI-approved license. I intend to revise this article periodically so that it accurately reflects the functionality available in the pyAffy Python package.
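
To make the "individual steps" of RMA more concrete: the method is conventionally described as background correction, followed by quantile normalization, followed by median-polish summarization of probe sets. The sketch below illustrates only the quantile normalization step in plain NumPy. It is not taken from pyAffy; the function name `quantile_normalize` and the probes-by-arrays matrix layout are assumptions made for illustration, and ties are broken by sort order rather than averaged.

```python
import numpy as np

def quantile_normalize(X):
    """Give every column (array) of X the same empirical distribution.

    X is assumed to be a probes-by-arrays matrix of intensities.
    This is an illustrative sketch, not the pyAffy implementation.
    """
    X = np.asarray(X, dtype=np.float64)
    # Per-column ordering of the probes, from smallest to largest intensity.
    order = np.argsort(X, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(X, order, axis=0)
    # Reference distribution: the mean of each quantile across all arrays.
    reference = sorted_vals.mean(axis=1)
    out = np.empty_like(X)
    # Write the r-th reference value at the position of the r-th smallest
    # value in each column.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out
```

After this step, sorting any column of the result yields the same reference vector, which is the property that makes intensities comparable across arrays before summarization.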