A comprehensive and high-quality collection of E. coli genomes and their genes

Author(s):  
Gal Horesh ◽  
Grace Blackwell ◽  
Gerry Tonkin-Hill ◽  
Jukka Corander ◽  
Eva Heinz ◽  
...  

Abstract
Escherichia coli is a highly diverse organism which includes a range of commensal and pathogenic variants found across a range of niches and worldwide. In addition to causing severe intestinal and extraintestinal disease, E. coli is considered a priority pathogen due to high levels of observed drug resistance. The diversity in the E. coli population is driven by high genome plasticity and a very large gene pool. All these have made E. coli one of the most well-studied organisms, as well as a commonly used laboratory strain. Today, there are thousands of sequenced E. coli genomes stored in public databases. While data is widely available, accessing the information in order to perform analyses can still be a challenge. Collecting relevant available data requires accessing different sources, where data may be stored in a range of formats, and often requires further manipulation and processing to apply various analyses and extract useful information. In this study, we collated and intensely curated a collection of over 10,000 E. coli and Shigella genomes to provide a single, uniform, high-quality dataset. Shigella were included as they are considered specialised pathovars of E. coli. We provide these data in a number of easily accessible formats which can be used as the foundation for future studies addressing the biological differences between E. coli lineages and the distribution and flow of genes in the E. coli population at a high resolution. The analysis we present emphasises our lack of understanding of the true diversity of the E. coli species, and the biased nature of our current understanding of the genetic diversity of such a key pathogen.

Author Notes
All supporting data have been provided within the article or through supplementary data files. All supporting code is provided in the git repository https://github.com/ghoresh11/ecoli_genome_collection.

Significance as a BioResource to the community
As of today, there are more than 140,000 E. coli genomes available on public databases. While data is widely available, collating the data and extracting meaningful information from it often requires multiple steps, computational resources and expert knowledge. Here, we collate a high quality and comprehensive set of over 10,000 E. coli genomes, isolated from human hosts, into a set of manageable files that offer an accessible and usable snapshot of the currently available genome data, linked to a minimal data quality standard. The data provided includes a detailed synopsis of the main lineages present, including their antimicrobial and virulence profiles, their complete gene content, and all the associated metadata for each genome. This includes a database which enables the user to compare newly sequenced isolates against the assembled genomes. Additionally, we provide a searchable index which allows the user to query any DNA sequence against the assemblies of the collection. This collection paves the path for many future studies, including those investigating the differences between E. coli lineages, following the evolution of different genes in the E. coli pan-genome and exploring the dynamics of horizontal gene transfer in this important organism.

Data Summary
The complete aggregated metadata of 10,146 high quality genomes isolated from human hosts (doi.org/10.6084/m9.figshare.12514883, File F1).
A PopPUNK database which can be used to query any genome and examine its context relative to this collection (Deposited to doi.org/10.6084/m9.figshare.12650834).
A BIGSI index of all the genomes which can be used to easily and quickly query the genomes for any DNA sequence of 61 bp or longer (Deposited to doi.org/10.6084/m9.figshare.12666497).
Description and complete profiling of the 50 largest lineages which represent the majority of publicly available human-isolated E. coli genomes (doi.org/10.6084/m9.figshare.12514883, File F2). Phylogenetic trees of representative genomes of these lineages, presented in this manuscript, are also provided (doi.org/10.6084/m9.figshare.12514883, Files tree_500.nwk and tree_50.nwk).
The complete pan-genome of the 50 largest lineages which includes:
  A FASTA file containing a single representative sequence of each gene of the gene pool (doi.org/10.6084/m9.figshare.12514883, File F3).
  Complete gene presence-absence across all isolates (doi.org/10.6084/m9.figshare.12514883, File F4).
  The frequency of each gene within each of the lineages (doi.org/10.6084/m9.figshare.12514883, File F5).
  The representative sequences from each lineage for all the genes (doi.org/10.6084/m9.figshare.12514883, File F6).
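
The relationship between a gene presence-absence table and per-lineage gene frequencies, both listed in the Data Summary, can be illustrated with a minimal sketch. It is not the authors' code and does not read the deposited files; the function name, the dictionary layout, and the toy isolates, genes, and lineages are all hypothetical, chosen only to show the calculation.

```python
from collections import defaultdict


def gene_frequency_by_lineage(presence, lineage_of):
    """Return, for each gene, the fraction of isolates in each lineage carrying it.

    presence   : dict mapping gene name -> set of isolate IDs that carry the gene
                 (hypothetical in-memory stand-in for a presence-absence table)
    lineage_of : dict mapping isolate ID -> lineage label
    """
    # Count how many isolates belong to each lineage.
    lineage_sizes = defaultdict(int)
    for lineage in lineage_of.values():
        lineage_sizes[lineage] += 1

    frequencies = {}
    for gene, isolates in presence.items():
        # Count carriers of this gene within each lineage.
        carriers = defaultdict(int)
        for isolate in isolates:
            carriers[lineage_of[isolate]] += 1
        frequencies[gene] = {
            lineage: carriers[lineage] / size
            for lineage, size in lineage_sizes.items()
        }
    return frequencies


# Toy data (hypothetical): four isolates in two lineages, three genes.
lineage_of = {"iso1": "L1", "iso2": "L1", "iso3": "L2", "iso4": "L2"}
presence = {
    "geneA": {"iso1", "iso2", "iso3", "iso4"},  # present everywhere
    "geneB": {"iso1"},                          # present in half of L1 only
    "geneC": {"iso3", "iso4"},                  # specific to L2
}
freqs = gene_frequency_by_lineage(presence, lineage_of)
```

In this toy case a gene found in every isolate has frequency 1.0 in both lineages, while a lineage-specific gene has frequency 1.0 in one lineage and 0.0 in the other, which is the kind of contrast a per-lineage frequency table makes easy to scan.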

2021 ◽  
Author(s):  
Gal Horesh ◽  
Grace A. Blackwell ◽  
Gerry Tonkin-Hill ◽  
Jukka Corander ◽  
Eva Heinz ◽  
...  

Escherichia coli is a highly diverse organism that includes a range of commensal and pathogenic variants found across a range of niches and worldwide. In addition to causing severe intestinal and extraintestinal disease, E. coli is considered a priority pathogen due to high levels of observed drug resistance. The diversity in the E. coli population is driven by high genome plasticity and a very large gene pool. All these have made E. coli one of the most well-studied organisms, as well as a commonly used laboratory strain. Today, there are thousands of sequenced E. coli genomes stored in public databases. While data is widely available, accessing the information in order to perform analyses can still be a challenge. Collecting relevant available data requires accessing different sources, where data may be stored in a range of formats, and often requires further manipulation and processing to apply various analyses and extract useful information. In this study, we collated and intensely curated a collection of over 10 000 E. coli and Shigella genomes to provide a single, uniform, high-quality dataset. Shigella were included as they are considered specialized pathovars of E. coli. We provide these data in a number of easily accessible formats that can be used as the foundation for future studies addressing the biological differences between E. coli lineages and the distribution and flow of genes in the E. coli population at a high resolution. The analysis we present emphasizes our lack of understanding of the true diversity of the E. coli species, and the biased nature of our current understanding of the genetic diversity of such a key pathogen.


Author(s):  
Erin Felton ◽  
Aszia Burrell ◽  
Hollis Chaney ◽  
Iman Sami ◽  
Anastassios C. Koumbourlis ◽  
...  

Abstract
Background: Cystic fibrosis (CF) affects >70,000 people worldwide, yet the microbiologic trigger for pulmonary exacerbations (PExs) remains unknown. The objective of this study was to identify changes in bacterial metabolic pathways associated with clinical status.
Methods: Respiratory samples were collected at hospital admission for PEx, end of intravenous (IV) antibiotic treatment, and follow-up from 27 hospitalized children with CF. Bacterial DNA was extracted and shotgun DNA sequencing was performed. MetaPhlAn2 and HUMAnN2 were used to evaluate bacterial taxonomic and pathway relative abundance, while DESeq2 was used to evaluate differential abundance based on clinical status.
Results: The mean age of study participants was 10 years; 85% received combination IV antibiotic therapy (beta-lactam plus a second agent). Long-chain fatty acid (LCFA) biosynthesis pathways were upregulated in follow-up samples compared to end of treatment: gondoate (p = 0.012), oleate (p = 0.048), palmitoleate (p = 0.043), and pathways of fatty acid elongation (p = 0.012). Achromobacter xylosoxidans and Escherichia sp. were also more prevalent in follow-up compared to PEx (p < 0.001).
Conclusions: LCFAs may be associated with persistent infection of opportunistic pathogens. Future studies should more closely investigate the role of LCFA production by lung bacteria in the transition from baseline wellness to PEx in persons with CF.
Impact: Increased levels of LCFAs are found after IV antibiotic treatment in persons with CF. LCFAs have previously been associated with increased lung inflammation in asthma. This is the first report of LCFAs in the airway of persons with CF. This research provides support that bacterial production of LCFAs may be a contributor to inflammation in persons with CF. Future studies should evaluate LCFAs as predictors of future PExs.


Author(s):  
Marta Margeta ◽  
Peter Gould ◽  
Lili-Naz Hazrati ◽  
Veronica Hirsch-Reinshagen ◽  
Werner Paulus

Scholarly communication faces increasing economic and ethical challenges, including pricing policies and overbearing behavior of commercial publishing houses. Based on the hypothesis that a diamond open access neuropathology journal of a high scientific and technical quality can be run entirely by neuropathologists, we launched Free Neuropathology (FNP; freeneuropathology.org) in January 2020. Classical publisher activities, such as copyediting, layout, website maintenance, and journal promotion, are undertaken by neuropathologists and neuroscientists using free open access software. The journal is free for both readers and authors, and papers are published under a Creative Commons BY SA licence, where copyright remains with the authors. Based on 26 articles published by August 2020, it takes FNP 11.1 days from submission to first, and 19.9 days to final, decision. High-quality copyediting, layout, and online publishing in the final format is accomplished in only 8 days. Absence of a commercial publisher enables prioritization of democratic and scientifically-driven decisions on editorial structure, website design, journal promotion, paper formatting, special article series, and number of accepted papers. This new model of journal publishing, which returns the control of scholarly communication to scientists, will be of interest to neuropathologists and the wider scientific community alike.

Learning Objectives
Summarize the current state and driving forces behind commercial and non-commercial scientific publishing in neuropathology.
Describe the advantages and challenges of a non-commercial publishing platform for neuropathology.


2021 ◽  
Vol 3 (8) ◽  
Author(s):  
Muhammad Yasir ◽  
Basit Zeshan ◽  
Nur Hardy A. Daud ◽  
Izzah Shahid ◽  
Hafza Khalid

Abstract There is a need for more efficient and eco-friendly approaches to overcome increasing microbial infections. Bacteriocins and chitinases from Bacillus spp. can be powerful alternatives to conventional antibiotics and antifungal drugs, respectively. The purpose of this study was to assess the inhibitory potential of bacteriocins and chitinase enzymes against multiple resistant bacterial and fungal pathogens. Bacterial isolates were selected by growth on minimal salts medium and were then morphologically and biochemically characterized. The physicochemical characterization of bacteriocins was carried out. The inhibitory potential of bacteriocins towards six pathogenic bacteria was determined by the well diffusion assay, while chitinase activity towards three fungal strains was determined by the dual plate culture assay. Two of nine bacterial strains (WW2P1 and WRE4P2) showed inhibition of K. pneumoniae, P. aeruginosa, E. coli and MRSA, while WW4P2 was positive against S. typhimurium and E. coli, and WRE10P2 against P. aeruginosa and S. pneumoniae. Two bacterial isolates (WW3P1 and WRE10P2) were chosen for further study on the basis of their antifungal activities. Of these, isolate WW3P1 was more effective against A. fumigatus as well as A. niger. The proteinaceous nature of the bacteriocins was confirmed by treatment of the crude extract with proteinase K. It was found that the inhibitory activity of strain WW3P1 against E. coli was highest at 20 °C, and against S. pneumoniae it was highest at 20 °C and pH 10 after treatment with EDTA. Inhibition by strain WRE10P2 against P. aeruginosa was highest at 20 °C and pH 14. It was found that EDTA increased the inhibitory activity of strain WW2P1 against P. aeruginosa, K. pneumoniae and E. coli by 2 ± 0.235, 3.5 ± 0.288, 2.5 ± 1.040 times, respectively, of strain WRE4P2 against P. aeruginosa and E. coli by 2.5 ± 0.763, 2.7 ± 0.5 times, respectively, and of strain WRE10P2 against S. pneumoniae by 3 ± 0.6236 times. The isolates have promising inhibitory activity, which should be further analyzed for the commercial production of antimicrobials.

Article highlights
The current study aimed to isolate the microbiome from wheat plant (Triticum aestivum L.), to screen for bacteriocin production and to assess its antimicrobial activity against human pathogens.
Forty-one phenotypically different bacterial colonies were subjected to bacteriocin purification, of which 25 colonies showed positive reactions.
These 25 bacterial isolates were screened against six different human bacterial pathogens using the well diffusion method to check the antimicrobial activity.
Out of nine bacterial isolates, WW3P1 and WRE10P2 were able to degrade chitin and utilize it as their sole energy source.
Strain WRE4P2 exhibited partial inactivation in its activity against MRSA after treatment with proteinase K.


2021 ◽  
Vol 55 (2) ◽  
pp. 25-34
Author(s):  
Jiawang Chen ◽  
Weitao He ◽  
Peng Zhou ◽  
Jiasong Fang ◽  
Dahai Zhang ◽  
...  

Abstract In order to obtain high-quality microbial samples from the hadal zone, which has a depth of over 6,000 m, a full-ocean-depth sampler with the function of in-situ filtration and preservation was developed. A flow pump and several membrane filters were used for in-situ filtration under the sea. With a multistage filtering structure, the microbes can be initially screened according to their sizes. To avoid the degradation of microbial ribonucleic acid (RNA), a special structure was designed to inject the RNAlater solution into the samples immediately after the filtration. The sampler was tested in our laboratory and deployed during Mariana TS-15 in 2019. It was installed on a hadal lander of Shanghai Ocean University and deployed at MBR02 (11.371°N, 142.587°E, 10,931 m) in the Mariana Trench. A total of 20 L of in-situ seawater was filtered, and membranes with pore sizes of 3 and 0.2 μm were preserved. The study is expected to provide important support for the establishment of a hadal microbial gene pool.


Russian vine ◽  
2020 ◽  
Vol 14 ◽  
pp. 30-36
Author(s):  
L.G. Naumova ◽  
V.A. Ganich ◽  

The article reflects the results of work on the mobilization, conservation, replenishment and study of genetic resources of grapevines of the Don ampelographic collection named after Ya.I. Potapenko (Novocherkassk, Rostov region) in 2019. Mobilization, conservation, replenishment and study of plant biodiversity, together with the identification of new species and the assessment of stocks of used species, are gaining theoretical, scientific and practical significance, and are currently relevant. Most of the native and sparsely distributed grapevine varieties are now preserved only through collections. The process of grape selection is closely related to the need to preserve and replenish collections, since this is the main base for large-scale ampelographic and genetic selection works, which are currently highly effective in science and production and thus practically significant for the Russian grape-growing industry. Currently, the preserved gene pool of grapevines in the collection includes 870 varieties, and the collection has been supplemented with 5 new grape varieties (Suholimanskij belyj, Traminer belyj, Granatovyj, Dostojnyj, Mriya). Two sources of economically valuable traits for high-quality winemaking were identified: the Lacukere and Neizvestnyj donskoj varieties.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0240953
Author(s):  
Christian Schulz ◽  
Eivind Almaas

Approaches for systematizing information on relatedness between organisms are important in biology. Phylogenetic analyses based on sets of highly conserved genes are currently the basis for the Tree of Life. Genome-scale metabolic reconstructions contain high-quality information regarding the metabolic capability of an organism and are typically restricted to metabolically active enzyme-encoding genes. While there are many tools available to generate draft reconstructions, expert-level knowledge is still required to generate and manually curate high-quality genome-scale metabolic models and to fill gaps in their reaction networks. Here, we use the tool AutoKEGGRec to construct 975 genome-scale metabolic draft reconstructions encoded in the KEGG database without further curation. The organisms are selected across all three domains, and their metabolic networks serve as the basis for generating phylogenetic trees. We find that using all reactions encoded, these metabolism-based comparisons give rise to a phylogenetic tree with close similarity to the Tree of Life. While this tree is quite robust to reasonable levels of noise in the metabolic reaction content of an organism, we find a significant heterogeneity in how much noise an organism may tolerate before it is incorrectly placed in the tree. Furthermore, by using the protein sequences for particular metabolic functions and pathway sets, such as central carbon-, nitrogen-, and sulfur-metabolism, as the basis for the organism comparisons, we generate highly specific phylogenetic trees. We believe the generation of phylogenetic trees based on metabolic reaction content, in particular when focused on specific functions and pathways, could aid the identification of functionally important metabolic enzymes and be of value for genome-scale metabolic modellers and enzyme-engineers.
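
One simple way to compare organisms by their metabolic reaction content is a set-based distance. The abstract does not state which distance the authors used, so the Jaccard distance below is an assumption made for illustration, and the organism names and reaction IDs are hypothetical. A distance table of this kind could then be passed to any standard tree-building method.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        # Two empty reaction sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(reactions):
    """Return {(org_x, org_y): distance} for every unordered pair of organisms.

    reactions : dict mapping organism name -> set of reaction IDs
                (hypothetical stand-in for a draft metabolic reconstruction)
    """
    return {
        (x, y): jaccard_distance(reactions[x], reactions[y])
        for x, y in combinations(sorted(reactions), 2)
    }


# Toy data (hypothetical): three organisms with small reaction sets.
reactions = {
    "orgA": {"R1", "R2", "R3", "R4"},
    "orgB": {"R1", "R2", "R3"},
    "orgC": {"R4", "R5"},
}
distances = pairwise_distances(reactions)
```

Here orgA and orgB share three of four reactions and end up close together, while orgB and orgC share none and sit at the maximum distance of 1.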


2021 ◽  
Vol 15 (8) ◽  
pp. e0009665
Author(s):  
Shuai Xu ◽  
Zhenpeng Li ◽  
Yuanming Huang ◽  
Lichao Han ◽  
Yanlin Che ◽  
...  

Nocardia is a complex and diverse genus of aerobic actinomycetes that cause complex clinical presentations, which are difficult to diagnose because they are poorly understood. To date, the genetic diversity, evolution, and taxonomic structure of the genus Nocardia are still unclear. In this study, we investigated the pan-genome of 86 Nocardia type strains to clarify their genetic diversity. Our study revealed an open pan-genome for Nocardia containing 265,836 gene families, with about 99.7% of the pan-genome being variable. Horizontal gene transfer appears to have been an important evolutionary driver of genetic diversity shaping the Nocardia genome and may have caused historical taxonomic confusion with other taxa (primarily Rhodococcus, Skermania, Aldersonia, and Mycobacterium). Based on single-copy gene families, we established a high-accuracy phylogenomic approach for Nocardia using 229 genome sequences. Furthermore, we found 28 potentially new species and reclassified 16 strains. Finally, by comparing the topology between a phylogenomic tree and 384 phylogenetic trees (from 384 single-copy genes from the core genome), we identified a novel locus for inferring the phylogeny of this genus. The dapb1 gene, which encodes dipeptidyl aminopeptidase BI, was far superior to commonly used markers for Nocardia and yielded a topology almost identical to that of genome-based phylogeny. In conclusion, the present study provides insights into the genetic diversity, contributes a robust framework for the taxonomic classification, and elucidates the evolutionary relationships of Nocardia. This framework should facilitate the development of rapid tests for the species identification of highly variable species and has given new insight into the behavior of this genus.
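
The "variable" share of a pan-genome quoted above can be made concrete with a small sketch. It assumes, purely for illustration, that a gene family counts as core when it is present in every genome and as variable otherwise; the abstract does not give the authors' exact definition, and the function name, threshold parameter, and toy strains are hypothetical.

```python
def split_pan_genome(genomes, core_threshold=1.0):
    """Split gene families into core and variable sets.

    genomes        : dict mapping strain name -> set of gene family IDs
    core_threshold : minimum fraction of genomes a family must occur in
                     to be called core (1.0 = present in all; an assumption)
    """
    n_genomes = len(genomes)
    # Count in how many genomes each family appears.
    counts = {}
    for families in genomes.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1

    core = {f for f, c in counts.items() if c / n_genomes >= core_threshold}
    variable = set(counts) - core
    return core, variable


# Toy data (hypothetical): three strains, five gene families in total.
genomes = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famE"},
}
core, variable = split_pan_genome(genomes)
variable_fraction = len(variable) / (len(core) + len(variable))
```

With only one family shared by all three toy strains, four of the five families are variable. The same calculation applied to many diverse genomes is how a variable fraction close to 100% can arise.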


2018 ◽  
Author(s):  
Meghana Natesh ◽  
Ryan W. Taylor ◽  
Nathan Truelove ◽  
Elizabeth A. Hadly ◽  
Stephen Palumbi ◽  
...  

Abstract Moderate to high density genotyping (100+ SNPs) is widely used to determine and measure individual identity, relatedness, fitness, population structure and migration in wild populations. However, these important tools are difficult to apply when high-quality genetic material is unavailable. Most genomic tools are developed for high quality DNA sources from lab or medical settings. As a result, most genetic data from market or field settings is limited to easily amplified mitochondrial DNA or a few microsatellites. To enable genotyping in conservation contexts, we used next-generation sequencing of multiplex PCR products from very low-quality DNA extracted from feces, hair, and cooked samples. We demonstrated utility and wide-ranging potential application in endangered wild tigers and tracking commercial trade in Caribbean queen conch. We genotyped 100 SNPs from degraded tiger samples to identify individuals, discern close relatives, and detect population differentiation. Co-occurring carnivores do not amplify (e.g. Indian wild dog/Dhole) or are monomorphic (e.g. leopard). Sixty-two SNPs from conch fritters and field-collected samples were used to test relatedness and detect population structure. We provide proof-of-concept for a rapid, simple, cost-effective, and scalable method (for both samples and number of loci), a framework that can be applied to other conservation scenarios previously limited by low quality DNA samples. These approaches provide a critical advance for wildlife monitoring and forensics, open the door to field-ready testing, and will strengthen the use of science in policy decisions and wildlife trade.


2021 ◽  
Vol 11 ◽  
Author(s):  
Steven J. Holochwost ◽  
Judith Hill Bose ◽  
Elizabeth Stuk ◽  
Eleanor D. Brown ◽  
Kate E. Anderson ◽  
...  

Growth mindset is an important aspect of children’s socioemotional development and is subject to change due to environmental influence. Orchestral music education may function as a fertile context in which to promote growth mindset; however, this education is not widely available to children facing economic hardship. This study examined whether participation in a program of orchestral music education was associated with higher levels of overall growth mindset and greater change in levels of musical growth mindset among children placed at risk by poverty. After at least 2 years of orchestral participation, students reported significantly higher levels of overall growth mindset than their peers; participating students also reported statistically significant increases in musical growth mindset regardless of the number of years that they were enrolled in orchestral music education. These findings have implications for future research into specific pedagogical practices that may promote growth mindset in the context of orchestral music education and more generally for future studies of the extra-musical benefits of high-quality music education.

