The EcoCyc Database in 2021

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.

Design, Implementation and Maintenance of a Model Organism Database for Arabidopsis thaliana

Comparative and Functional Genomics ◽

10.1002/cfg.408 ◽

2004 ◽

Vol 5 (4) ◽

pp. 362-369 ◽

Cited By ~ 12

Author(s):

Danforth Weems ◽

Neil Miller ◽

Margarita Garcia-Hernandez ◽

Eva Huala ◽

Seung Y. Rhee

Keyword(s):

Gene Expression ◽

Arabidopsis Thaliana ◽

Metabolic Pathways ◽

Model Organism ◽

Information Resource ◽

Model Organism Database ◽

Software Developers ◽

Web Based ◽

Model Plant ◽

Community Information

The Arabidopsis Information Resource (TAIR) is a web-based community database for the model plant Arabidopsis thaliana. It provides an integrated view of genes, sequences, proteins, germplasms, clones, metabolic pathways, gene expression, ecotypes, polymorphisms, publications, maps and community information. TAIR is developed and maintained by collaboration between software developers and biologists. Biologists provide specification and use cases for the system, acquire, analyse and curate data, interact with users and test the software. Software developers design, implement and test the database and software. In this review, we briefly describe how TAIR was built and is being maintained.

ParameciumDB 2019: integrating genomic data across the genus for functional and evolutionary biology

Nucleic Acids Research ◽

10.1093/nar/gkz948 ◽

2019 ◽

Cited By ~ 2

Author(s):

Olivier Arnaiz ◽

Eric Meyer ◽

Linda Sperling

Keyword(s):

Evolutionary Biology ◽

Model Organism ◽

Model Organisms ◽

Web Interface ◽

Model Organism Database ◽

Whole Genome Duplications ◽

Community Model ◽

Genome Duplications ◽

History Of ◽

Somatic Genome

ParameciumDB (https://paramecium.i2bc.paris-saclay.fr) is a community model organism database for the genome and genetics of the ciliate Paramecium. ParameciumDB development relies on the GMOD (www.gmod.org) toolkit. The ParameciumDB web site has been publicly available since 2006 when the P. tetraurelia somatic genome sequence was released, revealing that a series of whole genome duplications punctuated the evolutionary history of the species. The genome is linked to available genetic data and stocks. ParameciumDB has undergone major changes in its content and website since the last update published in 2011. Genomes from multiple Paramecium species, especially from the P. aurelia complex, are now included in ParameciumDB. A new modern web interface accompanies this transition to a database for the whole Paramecium genus. Gene pages have been enriched with orthology relationships, among the Paramecium species and with a panel of model organisms across the eukaryotic tree. This update also presents expert curation of Paramecium mitochondrial genomes.

Knowledge extraction for assisted curation of summaries of bacterial transcription factor properties

Database ◽

10.1093/database/baaa109 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Carlos-Francisco Méndez-Cruz ◽

Antonio Blanchet ◽

Alan Godínez ◽

Ignacio Arroyo-Fernández ◽

Socorro Gama-Castro ◽

...

Keyword(s):

Transcriptional Regulation ◽

Knowledge Extraction ◽

Biomedical Literature ◽

Main Role ◽

E Coli ◽

Bacterial Transcription ◽

Manual Curation ◽

New Knowledge ◽

Serovar Typhimurium ◽

K 12

Transcription factors (TFs) play a main role in transcriptional regulation of bacteria, as they regulate transcription of the genetic information encoded in DNA. Thus, the curation of the properties of these regulatory proteins is essential for a better understanding of transcriptional regulation. However, traditional manual curation of article collections to compile descriptions of TF properties takes significant time and effort due to the overwhelming amount of biomedical literature, which increases every day. The development of automatic approaches for knowledge extraction to assist curation is therefore critical. Here, we show an effective approach for knowledge extraction to assist curation of summaries describing bacterial TF properties based on an automatic text summarization strategy. We were able to recover automatically a median 77% of the knowledge contained in manual summaries describing properties of 177 TFs of Escherichia coli K-12 by processing 5961 scientific articles. For 71% of the TFs, our approach extracted new knowledge that can be used to expand manual descriptions. Furthermore, as we trained our predictive model with manual summaries of E. coli, we also generated summaries for 185 TFs of Salmonella enterica serovar Typhimurium from 3498 articles. According to the manual curation of 10 of these Salmonella typhimurium summaries, 96% of their sentences contained relevant knowledge. Our results demonstrate the feasibility to assist manual curation to expand manual summaries with new knowledge automatically extracted and to create new summaries of bacteria for which these curation efforts do not exist. Database URL: The automatic summaries of the TFs of E. coli and Salmonella and the automatic summarizer are available in GitHub (https://github.com/laigen-unam/tf-properties-summarizer.git).

MaizeGDB update: new tools, data and interface for the maize model organism database

Nucleic Acids Research ◽

10.1093/nar/gkv1007 ◽

2015 ◽

Vol 44 (D1) ◽

pp. D1195-D1201 ◽

Cited By ~ 113

Author(s):

Carson M. Andorf ◽

Ethalinda K. Cannon ◽

John L. Portwood ◽

Jack M. Gardiner ◽

Lisa C. Harper ◽

...

Keyword(s):

Model Organism ◽

Model Organism Database

ZFIN, The zebrafish model organism database: Updates and new directions

genesis ◽

10.1002/dvg.22868 ◽

2015 ◽

Vol 53 (8) ◽

pp. 498-509 ◽

Cited By ~ 46

Author(s):

Leyla Ruzicka ◽

Yvonne M. Bradford ◽

Ken Frazer ◽

Douglas G. Howe ◽

Holly Paddock ◽

...

Keyword(s):

Model Organism ◽

Model Organism Database ◽

Zebrafish Model ◽

Database Updates ◽

New Directions

Degradome comparison between wild and cultivated rice identifies differential targeting by miRNAs

BMC Genomics ◽

10.1186/s12864-021-08288-5 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Chenna Swetha ◽

Anushree Narjala ◽

Awadhesh Pandit ◽

Varsha Tirumalai ◽

P. V. Shivaprasad

Keyword(s):

Metabolic Pathways ◽

Wild Rice ◽

Regulation Of Gene Expression ◽

Genome Integrity ◽

Cultivated Rice ◽

Mrna Targets ◽

Ribonucleoprotein Complexes ◽

The Poor ◽

Rna Targets ◽

Secondary Wall Formation

Background: Small non-coding (s)RNAs are involved in the negative regulation of gene expression, playing critical roles in genome integrity, development and metabolic pathways. Targeting of RNAs by ribonucleoprotein complexes of sRNAs bound to Argonaute (AGO) proteins results in cleaved RNAs having precise and predictable 5′ ends. While tools to study sliced bits of RNAs to confirm the efficiency of sRNA-mediated regulation are available, they are sub-optimal. In this study, we provide an improvised version of a tool with better efficiency to accurately validate sRNA targets. Results: Here, we improvised the CleaveLand tool to identify additional micro (mi)RNA targets that belong to the same family and also other targets within a specified free energy cut-off. These additional targets were otherwise excluded during the default run. We employed these tools to understand the sRNA targeting efficiency in wild and cultivated rice, sequenced degradome from two rice lines, O. nivara and O. sativa indica Pusa Basmati-1 and analyzed variations in sRNA targeting. Our results indicate the existence of multiple miRNA-mediated targeting differences between domesticated and wild species. For example, Os5NG4 was targeted only in wild rice that might be responsible for the poor secondary wall formation when compared to cultivated rice. We also identified differential mRNA targets of secondary sRNAs that were generated after miRNA-mediated cleavage of primary targets. Conclusions: We identified many differentially targeted mRNAs between wild and domesticated rice lines. In addition to providing a step-wise guide to generate and analyze degradome datasets, we showed how domestication altered sRNA-mediated cascade silencing during the evolution of indica rice.

Leveraging Curation Among Escherichia coli Pathway/Genome Databases Using Ortholog-Based Annotation Propagation

Frontiers in Microbiology ◽

10.3389/fmicb.2021.614355 ◽

2021 ◽

Vol 12 ◽

Author(s):

Suzanne Paley ◽

Ingrid M. Keseler ◽

Markus Krummenacker ◽

Peter D. Karp

Keyword(s):

Escherichia Coli ◽

Protein Complexes ◽

Limited Resources ◽

Genome Database ◽

Single Strain ◽

Manual Curation ◽

Genome Databases ◽

New Knowledge ◽

K 12 ◽

New Protein

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K–12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.

KiMoSys 2.0: an upgraded database for submitting, storing and accessing experimental data for kinetic modeling

Database ◽

10.1093/database/baaa093 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Hugo Mochão ◽

Pedro Barahona ◽

Rafael S Costa

Keyword(s):

Experimental Data ◽

Metabolic Networks ◽

Model Simulation ◽

Web Interface ◽

Concentration Data ◽

Web Based ◽

Share Data ◽

Visualization Tools ◽

Filter Mechanism ◽

Machine Readable

KiMoSys (https://kimosys.org), launched in 2014, is a public repository of published experimental data, which contains concentration data of metabolites, protein abundances and flux data. It offers a web-based interface and upload facility to share data, making it accessible in structured formats, while also integrating associated kinetic models related to the data. It also supplies tools to simplify the construction of ODE (ordinary differential equation)-based models of metabolic networks. In this release, we present an update of KiMoSys with new data and several new features, including (i) an improved web interface, (ii) a new multi-filter mechanism, (iii) introduction of data visualization tools, (iv) the addition of downloadable data in machine-readable formats, (v) an improved data submission tool, (vi) the integration of a kinetic model simulation environment and (vii) the introduction of a unique persistent identifier system. We believe that this new version will improve its role as a valuable resource for the systems biology community. Database URL: www.kimosys.org

MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research

International Journal of Plant Genomics ◽

10.1155/2008/496957 ◽

2008 ◽

Vol 2008 ◽

pp. 1-10 ◽

Cited By ~ 53

Author(s):

Carolyn J. Lawrence ◽

Lisa C. Harper ◽

Mary L. Schaeffer ◽

Taner Z. Sen ◽

Trent E. Seigfried ◽

...

Keyword(s):

Model Organism ◽

Basic Research ◽

Model Organism Database ◽

High Productivity ◽

Group Activities ◽

Food And Agriculture ◽

Food And Agriculture Organization ◽

Genetics And Genomics ◽

Excellent Source ◽

By Products

In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize's unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others' investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community's central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities.

Modeling Elementary Students' Ideas about Heredity: A Comparison of Curricular Interventions

The American Biology Teacher ◽

10.1525/abt.2019.81.9.626 ◽

2019 ◽

Vol 81 (9) ◽

pp. 626-635

Author(s):

Cory T. Forbes ◽

Dante Cisterna ◽

Devarati Bhattacharya ◽

Ranu Roy

Keyword(s):

Model Organism ◽

Life Cycles ◽

Elementary Grades ◽

Trait Variation ◽

Model Based ◽

Third Grade Students ◽

K 12 ◽

Support Students ◽

Curricular Interventions ◽

Pilot Version

Learning about heredity is important across the K–12 continuum. However, these ideas may be challenging for students. We examined third-grade students' ideas about heredity in the context of a new, six-week, model-based science unit that uses corn as a model organism to support students' ideas about heredity. We analyzed data collected during implementation of the unit, including student artifacts and interviews. We compared these data to those from a pilot version of the curriculum – implemented in the prior year – that was focused on the same disciplinary concepts but was not designed around scientific modeling. Our findings illustrate levels of understanding in students' ideas about three target concepts underlying heredity: life cycles, trait inheritance, and trait variation. We also found that students experiencing the model-based version of the unit exhibited higher levels of understanding for two of the three target concepts than those experiencing the non-model-based curriculum. Analysis of student interviews also showed that students experiencing the model-based curriculum were better able to use key elements of life cycle, such as pollination and reproduction to support their explanations about inheritance. We discuss implications of this work for design and enactment of model-based curricula in elementary grades that can support students' learning about heredity.
