Xyntagma: a graphical query interface for the ACeDB genome databases

Author(s):  
W. Grajewski ◽  
Dong-Guk Shin ◽  
Lei Liu

Author(s):  
John Zobolas ◽  
Vasundra Touré ◽  
Martin Kuiper ◽  
Steven Vercruysse

Abstract

Summary: We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms.

Availability and implementation: The UBDs are available through npm and the code is available in the GitHub organisation UniBioDicts (https://github.com/UniBioDicts) under the Affero GPL license.

Supplementary information: Supplementary data are available at Bioinformatics online.
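The single query-interface idea described in this abstract can be illustrated with a minimal sketch. It is not the UniBioDicts API: the class names, the `unified_search` function, the source labels and the identifiers are all hypothetical, and in-memory lists stand in for the providers' online services.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One term/identifier/metadata unit (hypothetical shape)."""
    term: str
    identifier: str
    source: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Dictionary:
    """Stand-in for one vocabulary resource; a real one would call an online API."""
    source: str
    entries: list

    def search(self, query: str) -> list:
        # Case-insensitive prefix match, as an autocomplete field might use.
        q = query.lower()
        return [e for e in self.entries if e.term.lower().startswith(q)]


def unified_search(dictionaries: list, query: str, limit: int = 10) -> list:
    """Query every dictionary with the same string and merge the matches."""
    matches = [e for d in dictionaries for e in d.search(query)]
    matches.sort(key=lambda e: (e.term.lower(), e.source))
    return matches[:limit]


# Placeholder data: the source names and identifiers are invented for illustration.
db_a = Dictionary("DB-A", [
    Entry("apolipoprotein E", "DBA:0001", "DB-A"),
    Entry("kinase", "DBA:0002", "DB-A"),
])
ont_b = Dictionary("ONT-B", [
    Entry("apoptotic process", "ONTB:0001", "ONT-B"),
])

results = unified_search([db_a, ont_b], "apo")
for entry in results:
    print(entry.source, entry.identifier, entry.term)
```

Each result carries its source and identifier alongside the term, which is what lets a curator tell apart similarly named concepts from different resources.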


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Atsushi Kondo ◽  
China Nagano ◽  
Shinya Ishiko ◽  
Takashi Omori ◽  
Yuya Aoto ◽  
...  

Abstract: Gitelman syndrome is an autosomal recessive inherited salt-losing tubulopathy. It has a prevalence of around 1 in 40,000 people, and heterozygous carriers are estimated at approximately 1%, although the exact prevalence is unknown. We estimated the predicted prevalence of Gitelman syndrome based on multiple genome databases, HGVD and jMorp for the Japanese population and gnomAD for other ethnicities, and included all 274 pathogenic missense or nonsense variants registered in HGMD Professional. The frequencies of all these alleles were summed to calculate the total variant allele frequency in SLC12A3. The carrier frequency and the disease prevalence were assumed to be twice and the square of the total allele frequency, respectively, according to the Hardy–Weinberg principle. In the Japanese population, the total carrier frequencies were 0.0948 (9.5%) and 0.0868 (8.7%) and the calculated prevalence was 0.00225 (2.3 in 1000 people) and 0.00188 (1.9 in 1000 people) in HGVD and jMorp, respectively. Other ethnicities showed a prevalence varying from 0.000012 to 0.00083. These findings indicate that the prevalence of Gitelman syndrome in the Japanese population is higher than expected and that some other ethnicities also have a higher prevalence than has previously been considered.
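The arithmetic stated in this abstract can be checked with a short sketch. It follows the abstract's own assumptions (carrier frequency taken as twice the total allele frequency, prevalence as its square); the function names are illustrative, and the total allele frequency is back-calculated here from the reported carrier frequencies rather than taken from the paper's data.

```python
def carrier_frequency(total_allele_freq: float) -> float:
    """Carrier frequency, approximated as twice the total allele frequency."""
    return 2 * total_allele_freq


def disease_prevalence(total_allele_freq: float) -> float:
    """Prevalence of a recessive disease: the square of the total allele frequency."""
    return total_allele_freq ** 2


# Carrier frequencies reported in the abstract for the two Japanese databases.
reported_carriers = {"HGVD": 0.0948, "jMorp": 0.0868}

prevalence = {}
for database, carriers in reported_carriers.items():
    q = carriers / 2  # total allele frequency implied by the carrier frequency
    prevalence[database] = disease_prevalence(q)
    print(f"{database}: q = {q:.4f}, prevalence = {prevalence[database]:.5f}")
```

Squaring the implied allele frequencies gives values that round to the prevalences quoted in the abstract for HGVD and jMorp.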


1984 ◽  
Vol 14 (2) ◽  
pp. 100-106 ◽  
Author(s):  
Dennis Fogg

2017 ◽  
Vol 93 (3) ◽  
pp. 459-466 ◽  
Author(s):  
J. Ghouse ◽  
M.W. Skov ◽  
R.S. Bigseth ◽  
G. Ahlberg ◽  
J.K. Kanters ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Suzanne Paley ◽  
Ingrid M. Keseler ◽  
Markus Krummenacker ◽  
Peter D. Karp

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K–12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.

