Dataset-specific thresholds significantly improve detection of low transcribed regulatory genes in polysome profiling experiments
Abstract

Motivation: Polysome profiling is a relatively new technique, yet it has proved to be an effective approach for detecting mRNAs with differential ribosomal load and for exploring the regulatory mechanisms that drive efficient translation. Genes encoding regulatory proteins, which have a great influence on the organism, are usually transcribed at moderate to low levels compared with, for example, genes of the housekeeping machinery. This complicates the reliable detection of such genes in the presence of technical and/or biological noise.

Results: In this work we investigate how cleaning polysome profiling data from Arabidopsis thaliana influences the ability to detect genes with a low level of total mRNA but a highly differential ribosomal load, i.e. translationally active genes. The suggested data modelling approach, which identifies a background level of mRNA counts individually for each dataset, shows higher power in detecting low-transcribed genes than either the use of thresholds for the minimal required mRNA counts or the use of raw data. A significant increase in the number of detected regulation-related genes was demonstrated. The described approach is applicable to a wide variety of RNA-seq data. All identified and classified mRNAs with high and low translation status are made available in the supplementary material.