The EcoCyc Database in 2021

2021 ◽  
Vol 12 ◽  
Author(s):  
Ingrid M. Keseler ◽  
Socorro Gama-Castro ◽  
Amanda Mackie ◽  
Richard Billington ◽  
César Bonavides-Martínez ◽  
...  

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.

2004 ◽  
Vol 5 (4) ◽  
pp. 362-369 ◽  
Author(s):  
Danforth Weems ◽  
Neil Miller ◽  
Margarita Garcia-Hernandez ◽  
Eva Huala ◽  
Seung Y. Rhee

The Arabidopsis Information Resource (TAIR) is a web-based community database for the model plant Arabidopsis thaliana. It provides an integrated view of genes, sequences, proteins, germplasms, clones, metabolic pathways, gene expression, ecotypes, polymorphisms, publications, maps and community information. TAIR is developed and maintained by collaboration between software developers and biologists. Biologists provide specification and use cases for the system, acquire, analyse and curate data, interact with users and test the software. Software developers design, implement and test the database and software. In this review, we briefly describe how TAIR was built and is being maintained.


Author(s):  
Olivier Arnaiz ◽  
Eric Meyer ◽  
Linda Sperling

ParameciumDB (https://paramecium.i2bc.paris-saclay.fr) is a community model organism database for the genome and genetics of the ciliate Paramecium. ParameciumDB development relies on the GMOD (www.gmod.org) toolkit. The ParameciumDB web site has been publicly available since 2006 when the P. tetraurelia somatic genome sequence was released, revealing that a series of whole genome duplications punctuated the evolutionary history of the species. The genome is linked to available genetic data and stocks. ParameciumDB has undergone major changes in its content and website since the last update published in 2011. Genomes from multiple Paramecium species, especially from the P. aurelia complex, are now included in ParameciumDB. A new modern web interface accompanies this transition to a database for the whole Paramecium genus. Gene pages have been enriched with orthology relationships, among the Paramecium species and with a panel of model organisms across the eukaryotic tree. This update also presents expert curation of Paramecium mitochondrial genomes.


Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Carlos-Francisco Méndez-Cruz ◽  
Antonio Blanchet ◽  
Alan Godínez ◽  
Ignacio Arroyo-Fernández ◽  
Socorro Gama-Castro ◽  
...  

Transcription factors (TFs) play a main role in transcriptional regulation of bacteria, as they regulate transcription of the genetic information encoded in DNA. Thus, the curation of the properties of these regulatory proteins is essential for a better understanding of transcriptional regulation. However, traditional manual curation of article collections to compile descriptions of TF properties takes significant time and effort due to the overwhelming amount of biomedical literature, which increases every day. The development of automatic approaches for knowledge extraction to assist curation is therefore critical. Here, we show an effective approach for knowledge extraction to assist curation of summaries describing bacterial TF properties based on an automatic text summarization strategy. We were able to recover automatically a median 77% of the knowledge contained in manual summaries describing properties of 177 TFs of Escherichia coli K-12 by processing 5961 scientific articles. For 71% of the TFs, our approach extracted new knowledge that can be used to expand manual descriptions. Furthermore, as we trained our predictive model with manual summaries of E. coli, we also generated summaries for 185 TFs of Salmonella enterica serovar Typhimurium from 3498 articles. According to the manual curation of 10 of these Salmonella typhimurium summaries, 96% of their sentences contained relevant knowledge. Our results demonstrate the feasibility to assist manual curation to expand manual summaries with new knowledge automatically extracted and to create new summaries of bacteria for which these curation efforts do not exist. Database URL: The automatic summaries of the TFs of E. coli and Salmonella and the automatic summarizer are available in GitHub (https://github.com/laigen-unam/tf-properties-summarizer.git).


2015 ◽  
Vol 44 (D1) ◽  
pp. D1195-D1201 ◽  
Author(s):  
Carson M. Andorf ◽  
Ethalinda K. Cannon ◽  
John L. Portwood ◽  
Jack M. Gardiner ◽  
Lisa C. Harper ◽  
...  

genesis ◽  
2015 ◽  
Vol 53 (8) ◽  
pp. 498-509 ◽  
Author(s):  
Leyla Ruzicka ◽  
Yvonne M. Bradford ◽  
Ken Frazer ◽  
Douglas G. Howe ◽  
Holly Paddock ◽  
...  

BMC Genomics ◽  
2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Chenna Swetha ◽  
Anushree Narjala ◽  
Awadhesh Pandit ◽  
Varsha Tirumalai ◽  
P. V. Shivaprasad

Background: Small non-coding (s)RNAs are involved in the negative regulation of gene expression, playing critical roles in genome integrity, development and metabolic pathways. Targeting of RNAs by ribonucleoprotein complexes of sRNAs bound to Argonaute (AGO) proteins results in cleaved RNAs having precise and predictable 5′ ends. While tools to study sliced bits of RNAs to confirm the efficiency of sRNA-mediated regulation are available, they are sub-optimal. In this study, we provide an improvised version of a tool with better efficiency to accurately validate sRNA targets. Results: Here, we improvised the CleaveLand tool to identify additional micro (mi)RNA targets that belong to the same family and also other targets within a specified free energy cut-off. These additional targets were otherwise excluded during the default run. We employed these tools to understand the sRNA targeting efficiency in wild and cultivated rice, sequenced degradome from two rice lines, O. nivara and O. sativa indica Pusa Basmati-1 and analyzed variations in sRNA targeting. Our results indicate the existence of multiple miRNA-mediated targeting differences between domesticated and wild species. For example, Os5NG4 was targeted only in wild rice that might be responsible for the poor secondary wall formation when compared to cultivated rice. We also identified differential mRNA targets of secondary sRNAs that were generated after miRNA-mediated cleavage of primary targets. Conclusions: We identified many differentially targeted mRNAs between wild and domesticated rice lines. In addition to providing a step-wise guide to generate and analyze degradome datasets, we showed how domestication altered sRNA-mediated cascade silencing during the evolution of indica rice.


2021 ◽  
Vol 12 ◽  
Author(s):  
Suzanne Paley ◽  
Ingrid M. Keseler ◽  
Markus Krummenacker ◽  
Peter D. Karp

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K–12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.


Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Hugo Mochão ◽  
Pedro Barahona ◽  
Rafael S Costa

KiMoSys (https://kimosys.org), launched in 2014, is a public repository of published experimental data, which contains concentration data of metabolites, protein abundances and flux data. It offers a web-based interface and upload facility to share data, making it accessible in structured formats, while also integrating associated kinetic models related to the data. It also supplies tools to simplify the construction process of ODE (Ordinary Differential Equations)-based models of metabolic networks. In this release, we present an update of KiMoSys with new data and several new features, including (i) an improved web interface, (ii) a new multi-filter mechanism, (iii) introduction of data visualization tools, (iv) the addition of downloadable data in machine-readable formats, (v) an improved data submission tool, (vi) the integration of a kinetic model simulation environment and (vii) the introduction of a unique persistent identifier system. We believe that this new version will improve its role as a valuable resource for the systems biology community. Database URL: www.kimosys.org


2008 ◽  
Vol 2008 ◽  
pp. 1-10 ◽  
Author(s):  
Carolyn J. Lawrence ◽  
Lisa C. Harper ◽  
Mary L. Schaeffer ◽  
Taner Z. Sen ◽  
Trent E. Seigfried ◽  
...  

In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize's unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others' investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community's central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities.


2019 ◽  
Vol 81 (9) ◽  
pp. 626-635
Author(s):  
Cory T. Forbes ◽  
Dante Cisterna ◽  
Devarati Bhattacharya ◽  
Ranu Roy

Learning about heredity is important across the K–12 continuum. However, these ideas may be challenging for students. We examined third-grade students' ideas about heredity in the context of a new, six-week, model-based science unit that uses corn as a model organism to support students' ideas about heredity. We analyzed data collected during implementation of the unit, including student artifacts and interviews. We compared these data to those from a pilot version of the curriculum – implemented in the prior year – that was focused on the same disciplinary concepts but was not designed around scientific modeling. Our findings illustrate levels of understanding in students' ideas about three target concepts underlying heredity: life cycles, trait inheritance, and trait variation. We also found that students experiencing the model-based version of the unit exhibited higher levels of understanding for two of the three target concepts than those experiencing the non-model-based curriculum. Analysis of student interviews also showed that students experiencing the model-based curriculum were better able to use key elements of life cycle, such as pollination and reproduction to support their explanations about inheritance. We discuss implications of this work for design and enactment of model-based curricula in elementary grades that can support students' learning about heredity.

