Recycler: an algorithm for detecting plasmids from de novo assembly graphs

2015 ◽  
Roye Rozov ◽  
Aya Brown Kav ◽  
David Bogumil ◽  
Naama Shterzer ◽  
Eran Halperin ◽  

Abstract. Plasmids are central contributors to microbial evolution and genome innovation. Recently, they have been found to have important roles in antibiotic resistance and in affecting production of metabolites used in industrial and agricultural applications. However, their characterization through deep sequencing remains challenging, in spite of rapid drops in cost and throughput increases for sequencing. Here, we attempt to ameliorate this situation by introducing a new plasmid-specific assembly algorithm, leveraging assembly graphs provided by a conventional de novo assembler and alignments of paired-end reads to assembled graph nodes. We introduce the first tool for this task, called Recycler, and demonstrate its merits in comparison with extant approaches. We show that Recycler greatly increases the number of true plasmids recovered while remaining highly accurate. On simulated plasmidomes, Recycler recovered 5-14% more true plasmids compared to the best extant method with overall precision of about 90%. We validated these results in silico on real data, as well as in vitro by PCR validation performed on a subset of Recycler’s predictions on different data types. All 12 of Recycler’s outputs on isolate samples matched known plasmids or phages, and had alignments with at least 97% identity over at least 99% of the reported reference sequence lengths. For the two E. coli strains examined, most known plasmid sequences were recovered, while in both cases additional plasmids only known to be present in different hosts were found. Recycler also generated plasmids in high agreement with known annotation on real plasmidome data. Moreover, in PCR validations performed on 77 sequences, Recycler showed mean accuracy of 89% across all data types: isolate, microbiome, and plasmidome. Recycler is available at
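The general idea of looking for plasmid candidates in an assembly graph can be illustrated with a toy sketch. It is not Recycler's actual algorithm: the abstract does not spell out the method, so the sketch simply assumes that circular sequences show up as cycles in a directed graph of assembled nodes, and that a cycle whose nodes have similar read coverage is a more plausible candidate. The graph, the coverage values, and the helper names `simple_cycles` and `coverage_cv` are all hypothetical.

```python
from statistics import mean, pstdev


def simple_cycles(graph):
    """Enumerate simple cycles in a small directed graph.

    graph maps each node to a list of successor nodes. Each cycle is
    reported once, rooted at its smallest node. Suitable only for toy graphs.
    """
    cycles = []

    def dfs(start, node, path, seen):
        for nxt in graph.get(node, []):
            if nxt == start:
                # Closed a loop back to the root: record the cycle.
                cycles.append(list(path))
            elif nxt not in seen and nxt > start:
                # Only visit nodes larger than the root to avoid duplicates.
                seen.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, seen)
                path.pop()
                seen.remove(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles


def coverage_cv(cycle, coverage):
    """Coefficient of variation of read coverage over the nodes of a cycle."""
    values = [coverage[node] for node in cycle]
    return pstdev(values) / mean(values)


# Hypothetical assembly graph: nodes are contigs, edges are overlaps.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
# Hypothetical mean read coverage per node.
coverage = {"a": 50.0, "b": 52.0, "c": 48.0, "d": 10.0, "e": 90.0}

# Rank cycles so that the most uniformly covered one comes first.
candidates = sorted(simple_cycles(graph), key=lambda c: coverage_cv(c, coverage))
```

In this toy input, the cycle through `a`, `b`, and `c` has nearly uniform coverage and ranks ahead of the `d`-`e` cycle. A real tool would also have to use paired-end alignments, which the abstract mentions and this sketch ignores.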

2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Krisztian Buza ◽  
Bartek Wilczynski ◽  
Norbert Dojer

Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome studied in the experiment and the reference genome of the species. However, in most typical protocols, this information is disregarded and the reference genome is used. Results. We provide a new approach that allows researchers to reconstruct genomes very closely related to the reference genome (e.g., mutants of the same species) directly from the reads used in the experiment. Our approach applies de novo assembly software to experimental reads and so-called pseudoreads and uses the resulting contigs to generate a modified reference sequence. In this way, it can very quickly, and at no additional sequencing cost, generate a new, modified reference sequence that is closer to the actual sequenced genome and has full coverage. In this paper, we describe our approach and test its implementation, called RECORD. We evaluate RECORD on both simulated and real data. We made our software publicly available on SourceForge. Conclusion. Our tests show that on closely related sequences RECORD outperforms more general assisted-assembly software.

2021 ◽  
Madeleine Huber

Operons were first described in 1961. It is now known that the prokaryotic domains Bacteria and Archaea can express genes in monocistronic as well as bi- or polycistronic transcripts. Genes frequently even overlap in their sequences. These overlapping gene pairs do not correlate with the compactness of the genome. This leads to the assumption that a form of regulation is at work that requires no additional proteins or genes. This could be translational coupling, meaning that translation of the downstream gene depends on translation of an upstream gene. This dependence can arise, for example, from long-range secondary structures that block the ribosome binding site (RBS) of the downstream gene. De novo initiation at the downstream gene can occur only when the first gene is translated and the secondary structure at the RBS is thereby melted. This mechanism is well studied for gene pairs in E. coli. Another example of translational coupling is termination-reinitiation, in which a ribosome translates the first gene up to the stop codon, terminates there, and reinitiates directly at the downstream start codon. The termination-reinitiation mechanism has so far been described only for eukaryotic viruses. In contrast to coupling via secondary structures, termination-reinitiation does not involve de novo initiation at the downstream gene; instead, the ribosome reinitiates. This work analyzes this type of translational coupling at genes of polycistronic mRNAs in one model organism each as representatives of the Archaea (Haloferax volcanii) and Bacteria (Escherichia coli). For this purpose, reporter gene vectors were constructed in which the overlapping gene pairs were fused to reporter genes.
For these reporter genes, transcript levels can be quantified and enzyme assays can be performed on the expressed proteins. From both values, translational efficiencies can be calculated by determining the enzyme activity per transcript amount. A premature stop codon in these constructs makes it possible to determine whether translation of the second gene requires the ribosome to reach the overlap. In this way it was shown for nine gene pairs in H. volcanii and four gene pairs in E. coli that coupling takes place and that it occurs by termination-reinitiation. Furthermore, the effects of intragenic Shine-Dalgarno (SD) sequences on translational coupling were analyzed. By mutating such motifs and comparing the translational efficiencies of the constructs with and without an SD sequence, it is shown for all analyzed gene pairs of both model organisms that the SD sequence influences this type of coupling. However, this influence varies strongly between gene pairs. In addition, the maximum distance between two bicistronic genes at which translational coupling via termination-reinitiation can still occur was investigated. To this end, site-directed mutagenesis was used to introduce a premature stop codon into the upstream gene, which increases the intergenic distance between the genes in the respective constructs. Comparison of all constructs of a gene pair shows in both model organisms that termination-reinitiation depends on the intergenic distance and that the translational efficiency of the downstream reporter already decreases at a distance of 15 nucleotides. A further aim of this work was to analyze the exact mechanism of termination-reinitiation.
After termination of translation, a ribosome on the mRNA has two options: either to remain as a 70S ribosome and search for another start codon on the mRNA, or to dissociate into its two subunits, with the 50S subunit leaving the mRNA while the 30S subunit can remain bound to the mRNA through interactions. To investigate this mechanism at the molecular level, an experimental procedure is presented. It makes it possible to analyze the termination-reinitiation event in vitro and to distinguish between 30S and 70S ribosomes during reinitiation of translation of the downstream gene. The idea is based on ribosome display, in which translation complexes cannot disassemble at the end of translation because the mRNA used contains no stop codon. The exact experimental procedure, the required components, and proof-of-principle experiments are presented in this work, and possible optimizations are discussed.
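The translational efficiency described in this abstract (enzyme activity per transcript amount) can be written as a short calculation. The sketch below is only an illustration under assumed inputs: the function names, the units, and all numeric values are hypothetical and are not taken from the thesis.

```python
def translational_efficiency(enzyme_activity, transcript_level):
    """Enzyme activity per transcript amount, as defined in the abstract.

    Units are arbitrary here; both inputs must come from the same construct.
    """
    if transcript_level <= 0:
        raise ValueError("transcript_level must be positive")
    return enzyme_activity / transcript_level


def relative_efficiency(construct, control):
    """Efficiency of a construct expressed as a fraction of a control.

    Each argument is a (enzyme_activity, transcript_level) tuple.
    """
    return translational_efficiency(*construct) / translational_efficiency(*control)


# Hypothetical measurements: (enzyme activity, transcript level).
native_construct = (120.0, 4.0)      # overlapping gene pair fused to a reporter
premature_stop = (6.0, 3.0)          # same construct with a premature stop codon

rel = relative_efficiency(premature_stop, native_construct)
print(f"downstream reporter efficiency relative to native: {rel:.1%}")
```

Normalizing to transcript level separates a translational effect from a change in mRNA abundance. A strong drop for the premature-stop construct would be the kind of signal the abstract interprets as evidence for coupling.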

1991 ◽  
Vol 174 (5) ◽  
pp. 1167-1177 ◽  
J Vuopio-Varkila ◽  
G K Schoolnik

Enteropathogenic Escherichia coli grow as discrete colonies on the mucous membranes of the small intestine. A similar pattern can be demonstrated in vitro; termed localized adherence (LA), it is characterized by the presence of circumscribed clusters of bacteria attached to the surfaces of cultured epithelial cells. The LA phenotype was studied using B171, an O111:NM enteropathogenic E. coli (EPEC) strain, and HEp-2 cell monolayers. LA could be detected 30-60 min after exposure of HEp-2 cells to B171. However, bacteria transferred from infected HEp-2 cells to fresh monolayers exhibited LA within 15 min, indicating that LA is an inducible phenotype. Induction of the LA phenotype was found to be associated with de novo protein synthesis and changes in the outer membrane proteins, including the production of a new 18.5-kD polypeptide. A partial NH2-terminal amino acid sequence of this polypeptide was obtained and showed it to be identical through residue 12 to the recently described bundle-forming pilus subunit of EPEC. Expression of the 18.5-kD polypeptide required the 57-megadalton enteropathogenic E. coli adherence plasmid previously shown to be required for the LA phenotype in vitro and full virulence in vivo. This observation, the correspondence of the 18.5-kD polypeptide to an EPEC-specific pilus protein, and the temporal correlation of its expression with the development of the LA phenotype suggest that it may contribute to the EPEC colonial mode of growth.

2006 ◽  
Vol 188 (3) ◽  
pp. 909-918 ◽  
Jianmin Zhong ◽  
Stephane Skouloubris ◽  
Qiyuan Dai ◽  
Hannu Myllykallio ◽  
Alan G. Barbour

ABSTRACT The thyX gene for thymidylate synthase of the Lyme borreliosis (LB) agent Borrelia burgdorferi is located in a 54-kb linear plasmid. In the present study, we identified an orthologous thymidylate synthase gene in the relapsing fever (RF) agent Borrelia hermsii, located it in a 180-kb linear plasmid, and demonstrated its expression. The functions of the B. hermsii and B. burgdorferi thyX gene products were evaluated both in vivo, by complementation of a thymidylate synthase-deficient Escherichia coli mutant, and in vitro, by testing their activities after purification. The B. hermsii thyX gene complemented the thyA mutation in E. coli, and purified B. hermsii ThyX protein catalyzed the conversion of dUMP to dTMP. In contrast, the B. burgdorferi ThyX protein had only weakly detectable activity in vitro, and the B. burgdorferi thyX gene did not provide complementation in vivo. The lack of activity of B. burgdorferi's ThyX protein was associated with the substitution of a cysteine for a highly conserved arginine at position 91. The B. hermsii thyX locus was further distinguished by the downstream presence in the plasmid of orthologues of nrdI, nrdE, and nrdF, which encode the subunits of ribonucleoside diphosphate reductase and which are not present in the LB agents B. burgdorferi and Borrelia garinii. Phylogenetic analysis suggested that the nrdIEF cluster of B. hermsii was acquired by horizontal gene transfer. These findings indicate that Borrelia spp. causing RF have a greater capability for de novo pyrimidine synthesis than those causing LB, thus providing a basis for some of the biological differences between the two groups of pathogens.

1999 ◽  
Vol 338 (3) ◽  
pp. 701-708 ◽  
Evelyne RAUX ◽  
Treasa McVEIGH ◽  
Sarah E. PETERS ◽  
Thomas LEUSTEK ◽  
Martin J. WARREN

MET1 and MET8 mutants of Saccharomyces cerevisiae can be complemented by Salmonella typhimurium cysG, indicating that the genes are involved in the transformation of uroporphyrinogen III into sirohaem. In the present study, we have demonstrated complementation of defined cysG mutants of Sal. typhimurium and Escherichia coli, with either MET1 or MET8 cloned in tandem with Pseudomonas denitrificans cobA. The conclusion drawn from these experiments is that MET1 encodes the S-adenosyl-l-methionine uroporphyrinogen III transmethylase activity, and MET8 encodes the dehydrogenase and chelatase activities (all three functions are encoded by Sal. typhimurium and E. coli cysG). MET8 was further cloned into pET14b to allow expression of the protein with an N-terminal His-tag. After purification, the functions of the His-tagged Met8p were studied in vitro by assay with precorrin-2 in the presence of NAD+ and Co2+. The results demonstrated that Met8p acts as a dehydrogenase and chelatase in the biosynthesis of sirohaem. Moreover, despite the fact that S. cerevisiae does not make cobalamins de novo, we have shown also that MET8 is able to complement cobalamin cobaltochelatase mutants and have revealed a subtle difference in the early stages of the anaerobic cobalamin biosynthetic pathways between Sal. typhimurium and Bacillus megaterium.

2021 ◽  
Vol 12 ◽  
Lacey R. Lopez ◽  
Cassandra J. Barlogio ◽  
Christopher A. Broberg ◽  
Jeremy Wang ◽  
Janelle C. Arthur

Inflammatory bowel diseases (IBDs) and inflammation-associated colorectal cancer (CRC) are linked to blooms of adherent-invasive Escherichia coli (AIEC) in the intestinal microbiota. AIEC are functionally defined by their ability to adhere/invade epithelial cells and survive/replicate within macrophages. Changes in micronutrient availability can alter AIEC physiology and interactions with host cells. Thus, culturing AIEC for mechanistic investigations often involves precise nutrient formulation. We observed that the pro-inflammatory and pro-carcinogenic AIEC strain NC101 failed to grow in minimal media (MM). We hypothesized that NC101 was unable to synthesize a vital micronutrient normally found in the host gut. Through nutrient supplementation studies, we identified that NC101 is a nicotinic acid (NA) auxotroph. NA auxotrophy was not observed in the other non-toxigenic E. coli or AIEC strains we tested. Sequencing revealed NC101 has a missense mutation in nadA, a gene encoding quinolinate synthase A that is important for de novo nicotinamide adenine dinucleotide (NAD) biosynthesis. Correcting the identified nadA point mutation restored NC101 prototrophy without impacting AIEC function, including motility and AIEC-defining survival in macrophages. Our findings, along with the generation of a prototrophic NC101 strain, will greatly enhance the ability to perform in vitro functional studies that are needed for mechanistic investigations on the role of intestinal E. coli in digestive disease.

2021 ◽  
Zhenya Chen ◽  
Tongtong Chen ◽  
Shengzhu Yu ◽  
Yi-Xin Huo

Abstract. Background: Gallic acid (GA) and pyrogallol are phenolic hydroxyl compounds with diverse biological activities. Microbial biosynthesis of GA and pyrogallol has emerged as an ecofriendly alternative to traditional chemical synthesis. In the GA and pyrogallol biosynthetic pathways, the low hydroxylation activity of p-hydroxybenzoate hydroxylase (PobA) towards 3,4-dihydroxybenzoic acid (3,4-DHBA) has limited high-level biosynthesis of GA and pyrogallol. Results: This work reports a highly active PobA mutant (Y385F/T294A/V349A PobA) towards 3,4-DHBA. This mutant was screened out from a PobA random mutagenesis library through a novel naked-eye visual screening method. An in vitro enzyme assay showed that this mutant has a kcat/Km of 0.059 μM⁻¹ s⁻¹ towards 3,4-DHBA, which was 4.92-fold higher than that of the reported mutant (Y385F/T294A PobA). Molecular docking simulation provided a mechanistic explanation of the high activity. Expression of this mutant in E. coli BW25113 (F’) generated 830 ± 33 mg/L GA from 1000 mg/L 3,4-DHBA. We then used this mutant to assemble a de novo GA biosynthetic pathway. Subsequently, this pathway was introduced into a 3,4-DHBA-producing strain (E. coli BW25113 (F’)ΔaroE) to achieve 301 ± 15 mg/L GA production from simple carbon sources. Similarly, assembling this mutant into a de novo pyrogallol biosynthetic pathway enabled 129 ± 15 mg/L pyrogallol production. Conclusions: This work established an efficient screening method and generated a highly active PobA mutant. Assembling this mutant into GA and pyrogallol biosynthetic pathways achieved the de novo production of these two compounds. In addition, this mutant has great potential for the production of GA or pyrogallol derivatives. The screening method could be used for other GA biosynthesis-related enzymes.
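The figures in this abstract can be put on a common footing with a little arithmetic. The sketch below converts the reported mass titers into an approximate molar conversion and back-calculates the catalytic efficiency implied for the earlier mutant. It assumes standard molar masses for 3,4-DHBA (C7H6O4) and gallic acid (C7H6O5), and it reads "4.92-fold higher" as a simple ratio of 4.92; both the reading and the helper name `molar_yield` are assumptions, not statements from the paper.

```python
# Molar masses in g/mol, computed from the molecular formulas.
MW_DHBA = 154.12  # 3,4-dihydroxybenzoic acid, C7H6O4
MW_GA = 170.12    # gallic acid, C7H6O5


def molar_yield(product_mg_l, substrate_mg_l, mw_product, mw_substrate):
    """Moles of product formed per mole of substrate supplied."""
    product_mmol_l = product_mg_l / mw_product
    substrate_mmol_l = substrate_mg_l / mw_substrate
    return product_mmol_l / substrate_mmol_l


# Reported: 830 mg/L GA from 1000 mg/L 3,4-DHBA (mean value, error ignored).
ga_yield = molar_yield(830.0, 1000.0, MW_GA, MW_DHBA)

# Reported: kcat/Km of 0.059 per uM per s, 4.92-fold above the earlier mutant.
# Assumption: "4.92-fold higher" means new / old = 4.92.
baseline_kcat_km = 0.059 / 4.92

print(f"approximate molar conversion to GA: {ga_yield:.1%}")
print(f"implied kcat/Km of Y385F/T294A PobA: {baseline_kcat_km:.4f} uM^-1 s^-1")
```

The molar view is useful because hydroxylation adds an oxygen atom, so a mass titer of 830 mg/L from 1000 mg/L understates how far from complete the conversion is on a per-molecule basis.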

2006 ◽  
Vol 188 (5) ◽  
pp. 1786-1797 ◽  
Ekaterina N. Andreishcheva ◽  
Willie F. Vann

ABSTRACT Escherichia coli K1 is responsible for 80% of E. coli neonatal meningitis and is a common pathogen in urinary tract infections. Bacteria of this serotype are encapsulated with the α(2-8)-polysialic acid NeuNAc(α2-8), common to several bacterial pathogens. The gene cluster encoding the pathway for synthesis of this polymer is organized into three regions: (i) kpsSCUDEF, (ii) neuDBACES, and (iii) kpsMT. The K1 polysialyltransferase, NeuS, cannot synthesize polysialic acid de novo without other products of the gene cluster. Membranes isolated from strains having the entire K1 gene cluster can synthesize polysialic acid de novo. We designed a series of plasmid constructs containing fragments of regions 1 and 2 in two compatible vectors to determine the minimum number of gene products required for de novo synthesis of the polysialic acid from CMP-NeuNAc in K1 E. coli. We measured the ability of the various combinations of region 1 and 2 fragments to restore polysialyltransferase activity in vitro in the absence of exogenously added polysaccharide acceptor. The products of region 2 genes neuDBACES alone were not sufficient to support de novo synthesis of polysialic acid in vitro. Only membrane fractions harboring NeuES and KpsCS could form sialic polymer in the absence of exogenous acceptor at the concentrations formed by wild-type E. coli K1 membranes. Membrane fractions harboring NeuES and KpsC together could form small quantities of the sialic polymer de novo.

2015 ◽  
Vol 112 (31) ◽  
pp. E4188-E4196 ◽  
Aleksandra Wawrzycka ◽  
Marta Gross ◽  
Anna Wasaznik ◽  
Igor Konieczny

Although the molecular basis for replisome activity has been extensively investigated, it is not clear what the exact mechanism for de novo assembly of the replication complex at the replication origin is, or how the directionality of replication is determined. Here, using the plasmid RK2 replicon, we analyze the protein interactions required for Escherichia coli polymerase III (Pol III) holoenzyme association at the replication origin. Our investigations revealed that in E. coli, replisome formation at the plasmid origin involves interactions of the RK2 plasmid replication initiation protein (TrfA) with both the polymerase β- and α-subunits. In the presence of other replication proteins, including DnaA, helicase, primase and the clamp loader, TrfA interaction with the β-clamp contributes to the formation of the β-clamp nucleoprotein complex on origin DNA. By reconstituting in vitro the replication reaction on ssDNA templates, we demonstrate that TrfA interaction with the β-clamp and sequence-specific TrfA interaction with one strand of the plasmid origin DNA unwinding element (DUE) contribute to strand-specific replisome assembly. Wild-type TrfA, but not the TrfA QLSLF mutant (which does not interact with the β-clamp), in the presence of primase, helicase, Pol III core, clamp loader, and β-clamp initiates DNA synthesis on ssDNA template containing 13-mers of the bottom strand, but not the top strand, of DUE. Results presented in this work uncovered requirements for anchoring polymerase at the plasmid replication origin and provide insight into how the directionality of DNA replication is determined.

