Hayai-Annotation Plants: an ultra-fast and comprehensive functional gene annotation system in plants

Summary. Hayai-Annotation Plants is a browser-based interface for an ultra-fast and accurate functional gene annotation system for plant species, implemented in R. The pipeline combines sequence-similarity searches, using USEARCH against UniProtKB (taxonomy Embryophyta), with a functional annotation step. Hayai-Annotation Plants provides five layers of annotation: i) protein name; ii) gene ontology terms consisting of its three main domains (Biological Process, Molecular Function and Cellular Component); iii) enzyme commission number; iv) protein existence level; and v) evidence type. It implements a new algorithm that gives priority to protein existence level to propagate GO and EC information, and it annotated Arabidopsis thaliana representative peptide sequences (Araport11) within 5 min on a PC. Availability and implementation. The software is implemented in R and runs on Macintosh and Linux systems. It is freely available at https://github.com/kdri-genomics/Hayai-Annotation-Plants under the GPLv3 license. Supplementary information. Supplementary data are available at Bioinformatics online.
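The abstract states only that priority is given to protein existence level when propagating GO and EC information; it does not spell out the rule. The Python sketch below is therefore an illustration under stated assumptions, not the tool's implementation (which is in R). The `Hit` record, the `pick_annotation_source` function, the `min_identity` cutoff and the example hits are all hypothetical. The only fact relied on is that UniProtKB protein existence levels run from 1 (evidence at protein level) to 5 (uncertain).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One similarity-search hit for a query sequence (hypothetical layout)."""
    subject_id: str
    identity: float        # percent identity reported by the search tool
    existence_level: int   # UniProtKB protein existence: 1 (best) .. 5 (uncertain)
    go_terms: frozenset = frozenset()
    ec_numbers: frozenset = frozenset()


def pick_annotation_source(hits, min_identity=40.0):
    """Return the hit whose GO/EC annotation would be transferred, or None.

    Assumed rule: drop hits below min_identity, then prefer the numerically
    lowest existence level, breaking ties by higher identity.
    """
    candidates = [h for h in hits if h.identity >= min_identity]
    if not candidates:
        return None
    return min(candidates, key=lambda h: (h.existence_level, -h.identity))


# Illustrative hits only; identifiers and values are made up.
hits = [
    Hit("P1", 92.0, 3, frozenset({"GO:0005524"})),
    Hit("P2", 85.0, 1, frozenset({"GO:0004672"}), frozenset({"2.7.11.1"})),
    Hit("P3", 30.0, 1),  # filtered out by the identity cutoff
]
best = pick_annotation_source(hits)
print(best.subject_id, sorted(best.go_terms), sorted(best.ec_numbers))
```

In this sketch the hit with protein-level evidence (P2) is chosen over the more similar but less well-supported hit (P1), which is one plausible reading of "priority to protein existence level".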

Hayai-Annotation Plants: an ultra-fast and comprehensive gene annotation system in plants

10.1101/473488 ◽

2018 ◽

Cited By ~ 2

Author(s):

Andrea Ghelfi ◽

Kenta Shirasawa ◽

Hideki Hirakawa ◽

Sachiko Isobe

Keyword(s):

Enzyme Commission ◽

Biological Process ◽

Gene Annotation ◽

Sequence Similarity ◽

Enzyme Commission Number ◽

Molecular Function ◽

Annotation System ◽

Evidence Type ◽

Similarity Searches ◽

Speed And Accuracy

Summary. Hayai-Annotation Plants is a browser-based interface for an ultra-fast and accurate gene annotation system for plant species, implemented in R. The pipeline combines sequence-similarity searches, using USEARCH against UniProtKB (taxonomy Embryophyta), with a functional annotation step. Hayai-Annotation Plants provides five layers of annotation: 1) gene name; 2) gene ontology terms consisting of its three main domains (Biological Process, Molecular Function, and Cellular Component); 3) enzyme commission number; 4) protein existence level; and 5) evidence type. In terms of speed and accuracy, Hayai-Annotation Plants annotated Arabidopsis thaliana (Araport11, representative peptide sequences) within five minutes with an accuracy of 96.4%. Availability and implementation. The software is implemented in R and runs on Macintosh and Linux systems. It is freely available at https://github.com/kdri-genomics/Hayai-Annotation-Plants under the GPLv3 license.

Comment on ‘Hayai-Annotation Plants: an ultrafast and comprehensive functional gene annotation system in plants’: the importance of taking the GO graph structure into account

Bioinformatics ◽

10.1093/bioinformatics/btaa1052 ◽

2020 ◽

Author(s):

Michiel Van Bel ◽

Klaas Vandepoele

Keyword(s):

Gene Annotation ◽

Functional Gene ◽

Graph Structure ◽

Annotation System ◽

Functional Gene Annotation

Genome, Functional Gene Annotation, and Nuclear Transformation of the Heterokont Oleaginous Alga Nannochloropsis oceanica CCMP1779

PLoS Genetics ◽

10.1371/journal.pgen.1003064 ◽

2012 ◽

Vol 8 (11) ◽

pp. e1003064 ◽

Cited By ~ 252

Author(s):

Astrid Vieler ◽

Guangxi Wu ◽

Chia-Hong Tsai ◽

Blair Bullard ◽

Adam J. Cornish ◽

...

Keyword(s):

Gene Annotation ◽

Functional Gene ◽

Nuclear Transformation ◽

Nannochloropsis Oceanica ◽

Functional Gene Annotation

Genome-wide profiling of 24 hr diel rhythmicity in the water flea, Daphnia pulex: network analysis reveals rhythmic gene expression and enhances functional gene annotation

BMC Genomics ◽

10.1186/s12864-016-2998-2 ◽

2016 ◽

Vol 17 (1) ◽

Cited By ~ 8

Author(s):

Samuel S. C. Rund ◽

Boyoung Yoo ◽

Camille Alam ◽

Taryn Green ◽

Melissa T. Stephens ◽

...

Keyword(s):

Gene Expression ◽

Network Analysis ◽

Gene Annotation ◽

Daphnia Pulex ◽

Functional Gene ◽

Water Flea ◽

Genome Wide ◽

Rhythmic Gene ◽

Functional Gene Annotation

A comparative analysis of heavy metal bioaccumulation and functional gene annotation towards multiple metal resistant potential by Ochrobactrum intermedium BPS-20 and Ochrobactrum ciceri BPS-26

Bioresource Technology ◽

10.1016/j.biortech.2020.124330 ◽

2021 ◽

Vol 320 ◽

pp. 124330

Author(s):

Babita Sharma ◽

Pratyoosh Shukla

Keyword(s):

Heavy Metal ◽

Comparative Analysis ◽

Gene Annotation ◽

Functional Gene ◽

Metal Bioaccumulation ◽

Ochrobactrum Intermedium ◽

Heavy Metal Bioaccumulation ◽

Functional Gene Annotation

Correction: Genome, Functional Gene Annotation, and Nuclear Transformation of the Heterokont Oleaginous Alga Nannochloropsis oceanica CCMP1779

PLoS Genetics ◽

10.1371/journal.pgen.1006802 ◽

2017 ◽

Vol 13 (5) ◽

pp. e1006802 ◽

Cited By ~ 3

Author(s):

Astrid Vieler ◽

Guangxi Wu ◽

Chia-Hong Tsai ◽

Blair Bullard ◽

Adam J. Cornish ◽

...

Keyword(s):

Gene Annotation ◽

Functional Gene ◽

Nuclear Transformation ◽

Nannochloropsis Oceanica ◽

Functional Gene Annotation

An integrated multi-level comparison highlights common aspects and specific features between distantly-related species: Tomato and Grapevine

10.7287/peerj.preprints.2208v1 ◽

2016 ◽

Author(s):

Luca Ambrosino ◽

Hamed Bostan ◽

Valentino Ruggieri ◽

Maria Luisa Chiusano

Keyword(s):

Comparative Genomics ◽

Developmental Stages ◽

Gene Annotation ◽

Sequence Similarity ◽

Gene Families ◽

Plant Evolution ◽

Loss Of Function ◽

Evolutionary Mechanisms ◽

Important Species ◽

Similarity Searches

Motivation. Years after the first genomes were completely sequenced, comparative genomics remains a challenge, one compounded by the availability of numerous draft genomes with poor annotation quality. The detection of orthologous genes between species is a key approach in comparative genomics. For example, ortholog detection can support investigations of the mechanisms that shaped genome organization, highlighting gain or loss of function, and can improve gene annotation. The detection of paralogous genes, in turn, is fundamental for understanding the evolutionary mechanisms that drove the innovation of gene function, and it supports gene family analyses. Here we report a gene comparison between two distantly related plants, Solanum lycopersicum (Tomato) (The Tomato Genome Consortium 2012) and Vitis vinifera (Grapevine) (Jaillon et al. 2007), economically important species from the asterid and rosid clades, respectively. The strategy integrates multilevel analyses, from domain investigations to expression profiling, to obtain the most reliable results and to offer resources for understanding aspects of plant evolution and physiology and for dissecting traits and molecular aspects that could provide novel tools for agricultural applications and biotechnology. Methods. To predict the best putative orthologs and paralogs between Tomato and Grapevine, and to overcome possible annotation issues, all-against-all sequence similarity searches were performed between the gene, mRNA and protein collections of both species. A Bidirectional Best Hit approach was implemented to detect the best orthologs between the two species. Moreover, we developed a dedicated algorithm in Python to define more extended alignments between mRNA sequences. The NetworkX package (Hagberg et al. 2008) was used to define networks of paralogs and orthologs.
Protein domain prediction was carried out on the entire Tomato and Grapevine protein collections using InterProScan (Jones et al. 2014). The enzyme classification was obtained by sequence similarity searches between the Tomato and Grapevine mRNA collections and the entire UniProt reviewed protein collection (UniProt consortium 2015). The metabolic pathways associated with the detected enzymes were identified using the KEGG database (Kanehisa and Goto 2000). Expression levels at three developmental stages of Tomato (2 cm fruit, breaker and mature red) and the corresponding stages of Grapevine (post-setting, veraison, mature berry) were defined on the basis of the iTAG loci (Shearer et al. 2014) and v1 vitis loci, respectively. Expression was normalized as Reads Per Kilobase per Million (RPKM) for each tissue/stage. Abstract truncated at 3,000 characters - the full version is available in the pdf file
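The Bidirectional Best Hit approach named in the Methods can be illustrated with a short Python sketch. It is a generic textbook version of the idea, not the authors' pipeline: the function names, the dictionary-of-scores input and the example gene identifiers and scores are all hypothetical, and real searches would supply the scores from a similarity-search tool.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of {(query, subject): similarity score}.
    On an exact tie the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Made-up scores between hypothetical tomato (t*) and grapevine (g*) genes.
tomato_vs_grape = {
    ("t1", "g1"): 250.0,
    ("t1", "g2"): 90.0,
    ("t2", "g2"): 180.0,
    ("t3", "g2"): 200.0,
}
grape_vs_tomato = {
    ("g1", "t1"): 245.0,
    ("g2", "t3"): 198.0,
    ("g2", "t2"): 175.0,
}

pairs = bidirectional_best_hits(tomato_vs_grape, grape_vs_tomato)
print(pairs)  # [('t1', 'g1'), ('t3', 'g2')]
```

Here t2 also has g2 as its best hit, but g2's best hit is t3, so only the reciprocal pairs are reported as putative orthologs.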

Ex Vivo Metabolomics: A Powerful Approach for Functional Gene Annotation

Trends in Plant Science ◽

10.1016/j.tplants.2020.03.012 ◽

2020 ◽

Vol 25 (8) ◽

pp. 829-830 ◽

Cited By ~ 1

Author(s):

Kirstin Feussner ◽

Ivo Feussner

Keyword(s):

Ex Vivo ◽

Gene Annotation ◽

Functional Gene ◽

Powerful Approach ◽

Functional Gene Annotation

WASPS: web-assisted symbolic plasmid synteny server

Bioinformatics ◽

10.1093/bioinformatics/btz745 ◽

2019 ◽

Author(s):

Catherine Badel ◽

Violette Da Cunha ◽

Ryan Catchpole ◽

Patrick Forterre ◽

Jacques Oberto

Keyword(s):

Web Service ◽

Dna Sequences ◽

Sequence Similarity ◽

Supplementary Information ◽

Orthologous Protein ◽

Large Numbers ◽

Daunting Task ◽

Internet Explorer ◽

Genome Analyses ◽

Similarity Searches

Motivation. Comparative plasmid genome analyses require complex tools and the manipulation of large numbers of sequences, and constitute a daunting task for the wet-bench experimentalist. Dedicated plasmid databases are sparse, comprise only bacterial plasmids and provide access exclusively to sequence similarity searches. Results. We have developed Web-Assisted Symbolic Plasmid Synteny (WASPS), a web service granting protein and DNA sequence similarity searches against a database comprising all completely sequenced natural plasmids of bacterial, archaeal and eukaryal origin. This database pre-calculates orthologous protein clustering and enables WASPS to generate fully resolved plasmid synteny maps in real time using internal and user-provided DNA sequences. Availability and implementation. WASPS queries work in all current browsers such as Firefox, Edge or Safari, while the best functionality is achieved with Chrome. Internet Explorer is not supported. WASPS is freely accessible at https://archaea.i2bc.paris-saclay.fr/wasps/. Supplementary information. Supplementary data are available at Bioinformatics online.

immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking

Bioinformatics ◽

10.1093/bioinformatics/btaa158 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3594-3596 ◽

Cited By ~ 5

Author(s):

Cédric R Weber ◽

Rahmad Akbar ◽

Alexander Yermanos ◽

Milena Pavlović ◽

Igor Snapkov ◽

...

Keyword(s):

T Cell ◽

T Cell Receptor ◽

Network Architecture ◽

Gene Annotation ◽

Sequence Similarity ◽

Cell Receptor ◽

Supplementary Information ◽

Germline Gene ◽

Immune Receptor ◽

Estimation Sequence

Summary. B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection. Availability and implementation. The package is available via https://github.com/GreiffLab/immuneSIM and on CRAN at https://cran.r-project.org/web/packages/immuneSIM. The documentation is hosted at https://immuneSIM.readthedocs.io. Supplementary information. Supplementary data are available at Bioinformatics online.