genome databases
Recently Published Documents



2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Margaret R. Woodhouse ◽  
Ethalinda K. Cannon ◽  
John L. Portwood ◽  
Lisa C. Harper ◽  
Jack M. Gardiner ◽  
...  

Abstract: Research in the past decade has demonstrated that a single reference genome is not representative of a species’ diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.


2021 ◽  
Author(s):  
Cristian Javier Caniu

Artificial intelligence-based predictions have emerged as a user-friendly and reliable tool for the surveillance of antimicrobial resistance (AMR) worldwide. Genome databases typically include whole-genome sequencing (WGS) data with AMR metadata that can be used to train machine learning (ML) models to predict phenotypic features from genome samples. In this study, using a neural network (NN) architecture and the SGD-ADAM algorithm, we build ML antibiotic resistance models that predict minimum inhibitory concentrations (MICs) and antimicrobial susceptibility profiles of Salmonella spp. Data analysis was based on 7,268 genomes publicly available in the PATRIC database, containing about 75,000 AMR annotations. ML models were built using reference-free k-mer analysis of whole-genome sequences, MIC measurements and susceptibility categories, obtaining robust and accurate results for 9 antibiotics belonging to the beta-lactam, fluoroquinolone, phenicol, aminoglycoside, tetracycline and sulphonamide classes. Although the accuracy of predicting the exact MIC is modest, the accuracy within ±1 two-fold dilution per antibiotic is high, varying from 85% to 95%, with narrow 95% CIs of about 5% and individual accuracies per MIC ≥ 80%. For differentiation between "susceptible" and "resistant" categories, the accuracy of the models' susceptibility predictions likewise ranges from 85% to 95%, with 95% CIs of about 5%; the recall ranges from 75% to 85% and the precision from 60% to 90%, whereas the very major error is ≤ 20%. In summary, these results show that NN-based models are able to learn and predict the AMR phenotype from bacterial genomes based on a gene-free k-mer analysis.
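The "within ±1 two-fold dilution" accuracy mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the computation; the code below assumes the conventional definition, in which a prediction counts as correct when it differs from the measured MIC by at most one doubling step. The function name and the MIC values are hypothetical and are not data from the study.

```python
import math

def within_one_dilution_accuracy(true_mics, predicted_mics):
    """Fraction of predictions within one two-fold dilution of the measured MIC.

    Assumed definition (not taken from the abstract): a prediction is a hit
    when |log2(predicted) - log2(true)| <= 1.
    """
    if len(true_mics) != len(predicted_mics) or not true_mics:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for true_mic, predicted_mic in zip(true_mics, predicted_mics):
        steps = abs(math.log2(predicted_mic) - math.log2(true_mic))
        # Small tolerance guards against floating-point noise in log2.
        if steps <= 1.0 + 1e-9:
            hits += 1
    return hits / len(true_mics)

# Hypothetical MIC values in mg/L, for illustration only.
measured = [0.5, 1, 2, 4, 8]
predicted = [1, 1, 8, 2, 8]
accuracy = within_one_dilution_accuracy(measured, predicted)
print(accuracy)
```

In this toy input, only the third prediction (8 against a measured 2) is two doubling steps away, so four of the five predictions count as hits.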


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Atsushi Kondo ◽  
China Nagano ◽  
Shinya Ishiko ◽  
Takashi Omori ◽  
Yuya Aoto ◽  
...  

Abstract: Gitelman syndrome is an autosomal recessive inherited salt-losing tubulopathy. It has a prevalence of around 1 in 40,000 people, and heterozygous carriers are estimated at approximately 1%, although the exact prevalence is unknown. We estimated the predicted prevalence of Gitelman syndrome based on multiple genome databases, HGVD and jMorp for the Japanese population and gnomAD for other ethnicities, and included all 274 pathogenic missense or nonsense variants registered in HGMD Professional. The frequencies of all these alleles were summed to calculate the total variant allele frequency in SLC12A3. The carrier frequency and the disease prevalence were assumed to be twice and the square of the total allele frequency, respectively, according to the Hardy–Weinberg principle. In the Japanese population, the total carrier frequencies were 0.0948 (9.5%) and 0.0868 (8.7%) and the calculated prevalence was 0.00225 (2.3 in 1000 people) and 0.00188 (1.9 in 1000 people) in HGVD and jMorp, respectively. Other ethnicities showed a prevalence varying from 0.000012 to 0.00083. These findings indicate that the prevalence of Gitelman syndrome in the Japanese population is higher than expected and that some other ethnicities also have a higher prevalence than has previously been considered.
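The Hardy–Weinberg arithmetic described in this abstract can be checked with a short sketch. It uses the abstract's stated assumptions (carrier frequency is twice the total allele frequency, prevalence is its square). The total allele frequencies are not reported in the abstract, so here they are back-calculated as half of each reported carrier frequency; the function name is hypothetical.

```python
from typing import Tuple

def hardy_weinberg_estimates(total_allele_frequency: float) -> Tuple[float, float]:
    """Return (carrier_frequency, disease_prevalence) for a recessive disorder.

    Follows the approximation stated in the abstract:
    carrier frequency = 2q and prevalence = q**2, where q is the summed
    frequency of all pathogenic alleles.
    """
    q = total_allele_frequency
    return 2 * q, q ** 2

# Carrier frequencies reported in the abstract; q is inferred as carrier / 2
# (an assumption, since the abstract does not list q itself).
reported_carrier = {"HGVD": 0.0948, "jMorp": 0.0868}

for database, carrier in reported_carrier.items():
    q = carrier / 2
    carrier_freq, prevalence = hardy_weinberg_estimates(q)
    print(f"{database}: q={q:.4f}, carrier={carrier_freq:.4f}, prevalence={prevalence:.5f}")
```

Under these assumptions, the squared allele frequencies round to the prevalence values quoted in the abstract (0.00225 for HGVD and 0.00188 for jMorp).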


2021 ◽  
Author(s):  
Hans-Joachim Ruscheweyh ◽  
Alessio Milanese ◽  
Lucas Paoli ◽  
Quentin Clayssen ◽  
Daniel Mende ◽  
...  

Abstract

Summary: Identifying species and estimating their relative abundance in metagenomic samples is a crucial task in microbiome research. However, for many environments, variable fractions of microbial species are not represented in reference genome databases and thus often remain unaccounted for in taxonomic composition analyses. Here, we present an updated version of the metagenomic profiling tool mOTUs. We extended its database of taxonomic marker genes by including information from ~600,000 prokaryotic draft genomes, >96% of which are metagenome-assembled genomes from 23 environments. This extension enables researchers to profile a previously underrepresented diversity of microbes not only in some of the most intensively studied environments such as the human, soil and ocean microbiomes, but also in freshwater systems, the air, and the gastrointestinal tract of mice, cattle and other animals. This update doubles the number of detectable prokaryotic species to 33,570 and includes the release of 11,164 taxonomic profiles as a community resource.

Availability and implementation: mOTUs 2.6 is implemented in Python and licensed under GPLv3. Source code, profiles and documentation are available at: https://motu-tool.org/.

Supplementary information: Supplementary information is available online.


Antibiotics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 403
Author(s):  
Fernando Román-Hurtado ◽  
Marina Sánchez-Hidalgo ◽  
Jesús Martín ◽  
Francisco Javier Ortiz-López ◽  
Olga Genilloud

Cacaoidin is produced by the strain Streptomyces cacaoi CA-170360 and represents the first member of the new lanthidin (class V lanthipeptides) RiPP family. In this work, we describe the complete identification, cloning and heterologous expression of the cacaoidin biosynthetic gene cluster, which contains unique RiPP genes whose functions were not predicted by any bioinformatic tool. We also show that, among the strains found in public genome databases, the cacaoidin pathway is restricted to the subspecies Streptomyces cacaoi subsp. cacaoi, and we identify other putative class V lanthipeptide pathways in these databases. This is the first report on the heterologous production of a class V lanthipeptide.


2021 ◽  
Vol 12 ◽  
Author(s):  
Suzanne Paley ◽  
Ingrid M. Keseler ◽  
Markus Krummenacker ◽  
Peter D. Karp

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K-12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.


2020 ◽  
Author(s):  
Hiroshi Yamaguchi ◽  
Hiroaki Nagase ◽  
Shoichi Tokumoto ◽  
Kazumi Tomioka ◽  
Masahiro Nishiyama ◽  
...  
