Mining transcriptomic data to identify Saccharomyces cerevisiae signatures related to improved and repressed ethanol production under fermentation
Saccharomyces cerevisiae is known for its outstanding ability to produce ethanol in industry. Identifying the dynamics of gene expression in S. cerevisiae in response to fermentation is required for establishing any program to improve ethanol production. The goal of this study was to identify the genes that discriminate between improved and repressed ethanol production and to clarify the molecular responses to this process by mining transcriptomic data. Eleven machine learning based attribute weighting algorithms (AWAs) from RapidMiner were applied to available microarray datasets related to yeast fermentation performance under Mg²⁺ and Cu²⁺ supplementation, and 172 probe sets were identified by at least 5 AWAs. Some of these have been identified as being involved in carbohydrate metabolism, oxidative phosphorylation, and ethanol fermentation. Principal component analysis (PCA) and heatmap clustering also validated the top-ranked selected probe sets. According to decision tree models, 17 roots with 100% performance were identified. OLI1 and CYC3 were identified as the roots with the best performance; they were supported by the largest number of weighting algorithms and linked to the two most significantly enriched pathways, porphyrin biosynthesis and oxidative phosphorylation. ADH5 and PDA1 were also recognized as differential top-ranked genes that contribute to ethanol production. According to the regulatory clustering analysis, Tup1 has a significant effect on the top-ranked target genes CYC3 and ADH5. This study provides a basic understanding of the molecular mechanisms of S. cerevisiae cells and their responses to two different medium conditions (Mg²⁺ and Cu²⁺) during the fermentation process.
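
The selection rule described above, keeping probe sets identified by at least 5 of the weighting algorithms, can be read as a simple consensus vote. The following is a minimal sketch of that idea, not the RapidMiner workflow used in the study. The function name `consensus_features`, the algorithm labels, and the probe IDs are all hypothetical, and the sketch assumes each algorithm's output has already been reduced to a set of selected probe sets.

```python
from collections import Counter


def consensus_features(selections, min_votes=5):
    """Return (feature, votes) pairs for features picked by >= min_votes algorithms.

    `selections` maps an algorithm label to the collection of features it selected.
    All names here are illustrative assumptions, not taken from the study.
    """
    votes = Counter()
    for chosen in selections.values():
        # set() guarantees each algorithm contributes at most one vote per feature
        votes.update(set(chosen))
    kept = {feature: n for feature, n in votes.items() if n >= min_votes}
    # Sort by descending vote count, then by name, so the output order is stable
    return sorted(kept.items(), key=lambda item: (-item[1], item[0]))


# Toy input: six hypothetical algorithms and three made-up probe IDs
toy = {
    "algo_1": {"probe_A", "probe_B", "probe_C"},
    "algo_2": {"probe_A", "probe_B"},
    "algo_3": {"probe_A", "probe_B", "probe_C"},
    "algo_4": {"probe_A", "probe_B"},
    "algo_5": {"probe_A", "probe_B"},
    "algo_6": {"probe_A"},
}

selected = consensus_features(toy, min_votes=5)
```

With this toy input, `probe_A` (6 votes) and `probe_B` (5 votes) pass the threshold, while `probe_C` (2 votes) is dropped. Raising `min_votes` makes the consensus stricter and the resulting list shorter.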