ADAGE-Based Integration of Publicly Available Pseudomonas aeruginosa Gene Expression Data with Denoising Autoencoders Illuminates Microbe-Host Interactions

mSystems ◽  
2016 ◽  
Vol 1 (1) ◽  
Author(s):  
Jie Tan ◽  
John H. Hammond ◽  
Deborah A. Hogan ◽  
Casey S. Greene

ABSTRACT The increasing number of genome-wide assays of gene expression available from public databases presents opportunities for computational methods that facilitate hypothesis generation and biological interpretation of these data. We present an unsupervised machine learning approach, ADAGE (analysis using denoising autoencoders of gene expression), and apply it to the publicly available gene expression data compendium for Pseudomonas aeruginosa. In this approach, the machine-learned ADAGE model contained 50 nodes which we predicted would correspond to gene expression patterns across the gene expression compendium. While no biological knowledge was used during model construction, cooperonic genes had similar weights across nodes, and genes with similar weights across nodes were significantly more likely to share KEGG pathways. By analyzing newly generated and previously published microarray and transcriptome sequencing data, the ADAGE model identified differences between strains, modeled the cellular response to low oxygen, and predicted the involvement of biological processes based on low-level gene expression differences. 
ADAGE compared favorably with traditional principal component analysis and independent component analysis approaches in its ability to extract validated patterns, and based on our analyses, we propose that these approaches differ in the types of patterns they preferentially identify. We provide the ADAGE model with analysis of all publicly available P. aeruginosa GeneChip experiments and open source code for use with other species and settings. Extraction of consistent patterns across large-scale collections of genomic data using methods like ADAGE provides the opportunity to identify general principles and biologically important patterns in microbial biology. This approach will be particularly useful in less-well-studied microbial species. IMPORTANCE The quantity and breadth of genome-scale data sets that examine RNA expression in diverse bacterial and eukaryotic species are increasing more rapidly than for curated knowledge. Our ADAGE method integrates such data without requiring gene function, gene pathway, or experiment labeling, making practical its application to any large gene expression compendium. We built a Pseudomonas aeruginosa ADAGE model from a diverse set of publicly available experiments without any prespecified biological knowledge, and this model was accurate and predictive. We provide ADAGE results for the complete P. aeruginosa GeneChip compendium for use by researchers studying P. aeruginosa and source code that facilitates ADAGE’s application to other species and data types. Author Video: An author video summary of this article is available.
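The denoising autoencoder idea behind this abstract can be illustrated with a minimal sketch. This is not the authors' released code: the tied encoder/decoder weights, sigmoid units, cross-entropy loss, masking noise, learning rate, and the six-gene toy compendium with 3 nodes (rather than the 50 nodes reported above) are all assumptions chosen to keep the example small. The class `DenoisingAutoencoder` and the helper `toy_compendium` are hypothetical names.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class DenoisingAutoencoder:
    """Tiny tied-weight denoising autoencoder (illustrative assumption, not ADAGE itself)."""

    def __init__(self, n_genes, n_nodes, corruption=0.1, seed=0):
        self.rng = random.Random(seed)
        self.corruption = corruption
        # W[i][j] links gene i to hidden node j; the same matrix is reused for decoding.
        self.W = [[self.rng.uniform(-0.1, 0.1) for _ in range(n_nodes)]
                  for _ in range(n_genes)]
        self.b = [0.0] * n_nodes   # hidden-node biases
        self.c = [0.0] * n_genes   # reconstruction biases

    def encode(self, x):
        return [sigmoid(self.b[j] + sum(self.W[i][j] * x[i] for i in range(len(x))))
                for j in range(len(self.b))]

    def decode(self, h):
        return [sigmoid(self.c[i] + sum(self.W[i][j] * h[j] for j in range(len(h))))
                for i in range(len(self.c))]

    def loss(self, x):
        # Cross-entropy between the clean sample and its reconstruction.
        r = self.decode(self.encode(x))
        return -sum(xi * math.log(ri) + (1.0 - xi) * math.log(1.0 - ri)
                    for xi, ri in zip(x, r))

    def train_step(self, x, lr):
        # Masking noise: randomly zero some genes, then try to rebuild the clean sample.
        noisy = [0.0 if self.rng.random() < self.corruption else xi for xi in x]
        h = self.encode(noisy)
        r = self.decode(h)
        d_out = [ri - xi for ri, xi in zip(r, x)]
        d_hid = [h[j] * (1.0 - h[j]) * sum(self.W[i][j] * d_out[i] for i in range(len(x)))
                 for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                # Tied weights receive gradient from both the decoder and the encoder.
                self.W[i][j] -= lr * (d_out[i] * h[j] + noisy[i] * d_hid[j])
        for j in range(len(h)):
            self.b[j] -= lr * d_hid[j]
        for i in range(len(x)):
            self.c[i] -= lr * d_out[i]


def toy_compendium(n_samples=20, seed=1):
    """Made-up data: genes 0-2 switch on together in every other sample."""
    rng = random.Random(seed)
    data = []
    for s in range(n_samples):
        module_on = (s % 2 == 0)
        data.append([(0.9 if (module_on and g < 3) else 0.1) + rng.uniform(-0.05, 0.05)
                     for g in range(6)])
    return data


data = toy_compendium()
model = DenoisingAutoencoder(n_genes=6, n_nodes=3)
before = sum(model.loss(x) for x in data) / len(data)
for epoch in range(200):
    for x in data:
        model.train_step(x, lr=0.1)
after = sum(model.loss(x) for x in data) / len(data)
print("mean reconstruction loss before/after training:", before, after)
```

After training, each column of `model.W` holds one node's weights across genes, which is the kind of per-node gene weight vector the abstract compares between cooperonic genes.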

2020 ◽  
Author(s):  
Camden Jansen ◽  
Kitt D. Paraiso ◽  
Jeff J. Zhou ◽  
Ira L. Blitz ◽  
Margaret B. Fish ◽  
...  

Summary: Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low-throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating highly dimensional data comprising more than two data types is challenging. Here, we use linked self-organizing maps to combine ChIP-seq/ATAC-seq with temporal, spatial and perturbation RNA-seq data from Xenopus tropicalis mesendoderm development to build a high resolution genome scale mechanistic GRN. We recovered both known and previously unsuspected TF-DNA/TF-TF interactions and validated them through reporter assays. Our analysis provides new insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using highly-dimensional multi-omic data sets.
Highlights:
Built a generally applicable pipeline for creating GRNs using highly-dimensional multi-omic data sets.
Predicted new TF-DNA/TF-TF interactions during mesendoderm development.
Generated the first genome scale GRN for vertebrate mesendoderm and expanded the core mesendodermal developmental network with high fidelity.
Developed a resource to visualize hundreds of RNA-seq and ChIP-seq data sets using 2D SOM metaclusters.


Author(s):  
Soumya Raychaudhuri

The most interesting and challenging gene expression data sets to analyze are large multidimensional data sets that contain expression values for many genes across multiple conditions. In these data sets the use of scientific text can be particularly useful, since there are a myriad of genes examined under vastly different conditions, each of which may induce or repress expression of the same gene for different reasons. There is an enormous complexity to the data that we are examining: each gene is associated with dozens if not hundreds of expression values as well as multiple documents built up from vocabularies consisting of thousands of words. In Section 2.4 we reviewed common gene expression strategies, most of which revolve around defining groups of genes based on common profiles. A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present computational methods that leverage the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in gene expression data analysis offers an opportunity to incorporate background functional information about the genes when defining expression clusters. In Chapter 5 we saw how literature-based approaches could help in the analysis of single condition experiments. Here we will apply the strategies introduced in Chapter 6 to assess the coherence of groups of genes to enhance gene expression analysis approaches. The methods proposed here could, in fact, be applied to any multivariate genomics data type. The key concepts discussed in this chapter are listed in the frame box. We begin with a discussion of gene groups and their role in expression analysis; we briefly discuss strategies to assign keywords to groups and strategies to assess their functional coherence. 
We apply functional coherence measures to gene expression analysis; for examples we focus on a yeast expression data set. We first demonstrate how functional coherence can be used to focus in on the key biologically relevant gene groups derived by clustering methods such as self-organizing maps and k-means clustering.


2012 ◽  
Vol 79 (2) ◽  
pp. 718-721 ◽  
Author(s):  
F. Heath Damron ◽  
Elizabeth S. McKenney ◽  
Herbert P. Schweizer ◽  
Joanna B. Goldberg

ABSTRACT We describe a mini-Tn7-based broad-host-range expression cassette for arabinose-inducible gene expression from the PBAD promoter. This delivery vector, pTJ1, can integrate a single copy of a gene into the chromosome of Gram-negative bacteria for diverse genetic applications, of which several are discussed, using Pseudomonas aeruginosa as the model host.


mBio ◽  
2020 ◽  
Vol 11 (2) ◽  
Author(s):  
Stephen K. Dolan ◽  
Michael Kohlstedt ◽  
Stephen Trigg ◽  
Pedro Vallejo Ramirez ◽  
Clemens F. Kaminski ◽  
...  

ABSTRACT Pseudomonas aeruginosa is an opportunistic human pathogen, particularly noted for causing infections in the lungs of people with cystic fibrosis (CF). Previous studies have shown that the gene expression profile of P. aeruginosa appears to converge toward a common metabolic program as the organism adapts to the CF airway environment. However, we still have only a limited understanding of how these transcriptional changes impact metabolic flux at the systems level. To address this, we analyzed the transcriptome, proteome, and fluxome of P. aeruginosa grown on glycerol or acetate. These carbon sources were chosen because they are the primary breakdown products of an airway surfactant, phosphatidylcholine, which is known to be a major carbon source for P. aeruginosa in CF airways. We show that the fluxes of carbon throughout central metabolism are radically different among carbon sources. For example, the newly recognized “EDEMP cycle” (which incorporates elements of the Entner-Doudoroff [ED] pathway, the Embden-Meyerhof-Parnas [EMP] pathway, and the pentose phosphate [PP] pathway) plays an important role in supplying NADPH during growth on glycerol. In contrast, the EDEMP cycle is attenuated during growth on acetate, and instead, NADPH is primarily supplied by the reaction catalyzed by isocitrate dehydrogenase(s). Perhaps more importantly, our proteomic and transcriptomic analyses revealed a global remodeling of gene expression during growth on the different carbon sources, with unanticipated impacts on aerobic denitrification, electron transport chain architecture, and the redox economy of the cell. Collectively, these data highlight the remarkable metabolic plasticity of P. aeruginosa; that plasticity allows the organism to seamlessly segue between different carbon sources, maximizing the energetic yield from each. 
IMPORTANCE Pseudomonas aeruginosa is an opportunistic human pathogen that is well known for causing infections in the airways of people with cystic fibrosis. Although it is clear that P. aeruginosa is metabolically well adapted to life in the CF lung, little is currently known about how the organism metabolizes the nutrients available in the airways. In this work, we used a combination of gene expression and isotope tracer (“fluxomic”) analyses to find out exactly where the input carbon goes during growth on two CF-relevant carbon sources, acetate and glycerol (derived from the breakdown of lung surfactant). We found that carbon is routed (“fluxed”) through very different pathways during growth on these substrates and that this is accompanied by an unexpected remodeling of the cell’s electron transfer pathways. Having access to this “blueprint” is important because the metabolism of P. aeruginosa is increasingly being recognized as a target for the development of much-needed antimicrobial agents.


2017 ◽  
Vol 85 (5) ◽  
Author(s):  
Alexandria A. Reinhart ◽  
Angela T. Nguyen ◽  
Luke K. Brewer ◽  
Justin Bevere ◽  
Jace W. Jones ◽  
...  

ABSTRACT Pseudomonas aeruginosa is a Gram-negative opportunistic pathogen that requires iron for virulence. Iron homeostasis is maintained in part by the PrrF1 and PrrF2 small RNAs (sRNAs), which block the expression of iron-containing proteins under iron-depleted conditions. The PrrF sRNAs also promote the production of the Pseudomonas quinolone signal (PQS), a quorum sensing molecule that activates the expression of several virulence genes. The tandem arrangement of the prrF genes allows for expression of a third sRNA, PrrH, which is predicted to regulate gene expression through its unique sequence derived from the prrF1-prrF2 intergenic (IG) sequence (the PrrHIG sequence). Previous studies showed that the prrF locus is required for acute lung infection. However, the individual functions of the PrrF and PrrH sRNAs were not determined. Here, we describe a system for differentiating PrrF and PrrH functions by deleting the PrrHIG sequence [prrF(ΔHIG)]. Our analyses of this construct indicate that the PrrF sRNAs, but not PrrH, are required for acute lung infection by P. aeruginosa. Moreover, we show that the virulence defect of the ΔprrF1-prrF2 mutant is due to decreased bacterial burden during acute lung infection. In vivo analysis of gene expression in lung homogenates shows that PrrF-mediated regulation of genes for iron-containing proteins is disrupted in the ΔprrF1-prrF2 mutant during infection, while the expression of genes that mediate PrrF-regulated PQS production is not affected by prrF deletion in vivo. Combined, these studies demonstrate that regulation of iron utilization plays a critical role in P. aeruginosa's ability to survive during infection.


2014 ◽  
Vol 82 (4) ◽  
pp. 1638-1647 ◽  
Author(s):  
Ziyu Sun ◽  
Jing Shi ◽  
Chang Liu ◽  
Yongxin Jin ◽  
Kewei Li ◽  
...  

ABSTRACT Pseudomonas aeruginosa is an opportunistic pathogen that causes acute and chronic infections in humans. Pyocins are bacteriocins produced by P. aeruginosa that are usually released through lysis of the producer strains. Expression of pyocin genes is negatively regulated by PrtR, which gets cleaved under SOS response, leading to upregulation of pyocin synthetic genes. Previously, we demonstrated that PrtR is required for the expression of type III secretion system (T3SS), which is an important virulence component of P. aeruginosa. In this study, we demonstrate that mutation in prtR results in reduced bacterial colonization in a mouse acute pneumonia model. Examination of bacterial and host cells in the bronchoalveolar lavage fluids from infected mice revealed that expression of PrtR is induced by reactive oxygen species (ROS) released by neutrophils. We further demonstrate that treatment with hydrogen peroxide or ciprofloxacin, known to induce the SOS response and pyocin production, resulted in an elevated PrtR mRNA level. Overexpression of PrtR by a tac promoter repressed the endogenous prtR promoter activity, and electrophoretic mobility shift assay revealed that PrtR binds to its own promoter, suggesting an autorepressive mechanism of regulation. A high level of PrtR expressed from a plasmid resulted in increased T3SS gene expression during infection and higher resistance against ciprofloxacin. Overall, our results suggest that the autorepression of PrtR contributes to the maintenance of a relatively stable level of PrtR, which is permissive to T3SS gene expression in the presence of ROS while increasing bacterial tolerance to stresses, such as ciprofloxacin, by limiting pyocin production.


mBio ◽  
2011 ◽  
Vol 2 (1) ◽  
Author(s):  
Larry A. Gallagher ◽  
Jay Shendure ◽  
Colin Manoil

ABSTRACT We describe a deep-sequencing procedure for tracking large numbers of transposon mutants of Pseudomonas aeruginosa. The procedure employs a new Tn-seq methodology based on the generation and amplification of single-strand circles carrying transposon junction sequences (the Tn-seq circle method), a method which can be used with virtually any transposon. The procedure reliably identified more than 100,000 transposon insertions in a single experiment, providing near-saturation coverage of the genome. To test the effectiveness of the procedure for mutant identification, we screened for mutations reducing intrinsic resistance to the aminoglycoside antibiotic tobramycin. Intrinsic tobramycin resistance had been previously analyzed at genome scale using mutant-by-mutant screening and thus provided a benchmark for evaluating the new method. The new Tn-seq procedure identified 117 tobramycin resistance genes, the majority of which were then verified with individual mutants. The group of genes with the strongest mutant phenotypes included nearly all (13 of 14) of those with strong mutant phenotypes identified in the previous screening, as well as a nearly equal number of new genes. The results thus show the effectiveness of the Tn-seq method in defining the genetic basis of a complex resistance trait of P. aeruginosa and indicate that it can be used to analyze a variety of growth-related processes. IMPORTANCE Research progress in microbiology is technology limited in the sense that the analytical methods available dictate how questions are experimentally addressed and, to some extent, what questions are asked. This report describes a new transposon tracking procedure for defining the genetic basis of growth-related processes in Pseudomonas aeruginosa, an important bacterial pathogen. The method employs next-generation sequencing to monitor the makeup of mutant populations (Tn-seq) and has several potential advantages over other Tn-seq methodologies. 
The new method was validated through the analysis of a clinically relevant antibiotic resistance trait.


Author(s):  
Sardar Karash ◽  
Robert Nordell ◽  
Egon A. Ozer ◽  
Timothy L. Yahr

A common feature of microorganisms that cause chronic infections is a stealthy lifestyle that promotes immune avoidance and host tolerance. During chronic colonization of cystic fibrosis (CF) patients, Pseudomonas aeruginosa acquires numerous adaptations that include reduced expression of some factors, such as motility, O antigen, and the T3SS, and increased expression of other traits, such as biofilm formation.


2021 ◽  
Author(s):  
Dan Li ◽  
Hong Gu ◽  
Qiaozhen Chang ◽  
Jia Wang ◽  
Pan Qin

Abstract Clustering algorithms have been successfully applied to identify co-expressed gene groups from gene expression data. Missing values often occur in gene expression data, which presents a challenge for gene clustering. When partitioning incomplete gene expression data into co-expressed gene groups, missing value imputation and clustering are generally performed as two separate processes. These two-stage methods are likely to result in imputation values that are unsuitable for the clustering task and in unsatisfactory clustering performance. This paper proposes a multi-objective joint optimization framework for clustering incomplete gene expression data that addresses this problem. The proposed framework can impute the missing expression values under the guidance of clustering, and therefore realize the synergistic improvement of imputation and clustering. In addition, gene expression similarity and gene semantic similarity extracted from the Gene Ontology are combined, in the form of a functional neighbor interval for each missing expression value, to provide reasonable constraints for the joint optimization framework. Experiments on several benchmark data sets confirm the effectiveness of the proposed framework.
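The general idea of letting clustering guide imputation, as opposed to a two-stage pipeline, can be sketched with a much simpler stand-in. The example below is not the multi-objective framework from this abstract and uses no Gene Ontology constraints; it is an assumed alternating scheme in which k-means assignments and cluster-centroid imputation are repeated in turn. The function name `cluster_and_impute`, the farthest-first initialization, and the toy expression matrix are all hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_and_impute(rows, k, n_iter=20):
    """Alternate k-means clustering with centroid-based imputation of None entries."""
    n_cols = len(rows[0])
    # Start from column means so every row is complete (the usual first-stage guess).
    col_means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[col_means[c] if r[c] is None else r[c] for c in range(n_cols)]
              for r in rows]

    # Deterministic farthest-first initialization of the k centroids.
    centroids = [list(filled[0])]
    while len(centroids) < k:
        farthest = max(filled, key=lambda row: min(sq_dist(row, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(rows)
    for _ in range(n_iter):
        # 1) Assign each gene to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(row, centroids[j]))
                  for row in filled]
        # 2) Recompute centroids from the currently filled-in data.
        for j in range(k):
            members = [filled[i] for i in range(len(rows)) if labels[i] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        # 3) Re-impute only the originally missing entries from the gene's own cluster.
        for i, r in enumerate(rows):
            for c in range(n_cols):
                if r[c] is None:
                    filled[i][c] = centroids[labels[i]][c]
    return labels, filled


# Made-up expression matrix: rows are genes, columns are conditions, None is missing.
expression = [
    [1.0, 1.1, 0.9, 1.0],
    [1.1, None, 1.0, 0.9],
    [0.9, 1.0, None, 1.1],
    [5.0, 5.1, 4.9, 5.0],
    [5.1, 5.0, None, 4.9],
    [None, 4.9, 5.0, 5.1],
]
labels, completed = cluster_and_impute(expression, k=2)
print(labels)
print(completed[1][1], completed[5][0])
```

In this toy case the column-mean guess for a missing value sits between the two clusters, and the alternating updates pull it toward the values observed for other genes in the same cluster, which is the behavior a two-stage method cannot provide.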


2015 ◽  
Vol 197 (16) ◽  
pp. 2664-2674 ◽  
Author(s):  
Peter J. Intile ◽  
Grant J. Balzer ◽  
Matthew C. Wolfgang ◽  
Timothy L. Yahr

ABSTRACT The Pseudomonas aeruginosa type III secretion system (T3SS) is a primary virulence factor important for phagocytic avoidance, disruption of host cell signaling, and host cell cytotoxicity. ExsA is the master regulator of T3SS transcription. The expression, synthesis, and activity of ExsA are tightly regulated by both intrinsic and extrinsic factors. Intrinsic regulation consists of the well-characterized ExsECDA partner-switching cascade, while extrinsic factors include global regulators that alter exsA transcription and/or translation. To identify novel extrinsic regulators of ExsA, we conducted a transposon mutagenesis screen in the absence of intrinsic control. Transposon disruptions within gene PA2840, which encodes a homolog of the Escherichia coli RNA-helicase DeaD, significantly reduced T3SS gene expression. Recent studies indicate that E. coli DeaD can promote translation by relieving inhibitory secondary structures within target mRNAs. We report here that PA2840, renamed DeaD, stimulates ExsA synthesis at the posttranscriptional level. Genetic experiments demonstrate that the activity of an exsA translational fusion is reduced in a deaD mutant. In addition, exsA expression in trans fails to restore T3SS gene expression in a deaD mutant. We hypothesized that DeaD relaxes mRNA secondary structure to promote exsA translation and found that altering the mRNA sequence of exsA or the native exsA Shine-Dalgarno sequence relieved the requirement for DeaD in vivo. Finally, we show that purified DeaD promotes ExsA synthesis using in vitro translation assays. Together, these data reveal a novel regulatory mechanism for P. aeruginosa DeaD and add to the complexity of global regulation of T3SS. IMPORTANCE Although members of the DEAD box family of RNA helicases are appreciated for their roles in mRNA degradation and ribosome biogenesis, an additional role in gene regulation is now emerging in bacteria. 
By relaxing secondary structures in mRNAs, DEAD box helicases are now thought to promote translation by enhancing ribosomal recruitment. We identify here an RNA helicase that plays a critical role in promoting ExsA synthesis, the central regulator of the Pseudomonas aeruginosa type III secretion system, and provide additional evidence that DEAD box helicases directly stimulate translation of target genes. The finding that DeaD stimulates exsA translation adds to a growing list of transcriptional and posttranscriptional regulatory mechanisms that control type III gene expression.

