orthology assignment
Recently Published Documents


TOTAL DOCUMENTS: 8 (FIVE YEARS: 1)

H-INDEX: 4 (FIVE YEARS: 1)

2020 ◽  
Vol 37 (11) ◽  
pp. 3389-3396 ◽  
Author(s):  
Romain Derelle ◽  
Hervé Philippe ◽  
John K Colbourne

Abstract Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups, and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultrafast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark data sets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. Broccoli is freely available at https://github.com/rderelle/Broccoli.
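
The grouping step described in this abstract (orthologous groups read off a network of orthologous relationships by a parameter-free algorithm) can be illustrated with a small sketch. The abstract does not name the algorithm in this listing; the code below assumes label propagation as one example of a parameter-free method, and the function name, edge weights, and protein identifiers are hypothetical. It is not Broccoli's implementation.

```python
from collections import Counter


def label_propagation(edges, max_rounds=100):
    """Group nodes of an undirected weighted graph by label propagation.

    `edges` is an iterable of (node_a, node_b, weight) tuples.
    Assumption: this stands in for the 'parameter-free machine learning
    algorithm' of the abstract; it is an illustrative sketch only.
    """
    neighbours = {}
    for a, b, w in edges:
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w

    # Every node starts in its own group.
    labels = {node: node for node in neighbours}

    for _ in range(max_rounds):
        changed = False
        # Fixed visiting order keeps the sketch reproducible.
        for node in sorted(neighbours):
            weight_per_label = Counter()
            for other, w in neighbours[node].items():
                weight_per_label[labels[other]] += w
            # Adopt the heaviest neighbouring label; ties go to the
            # alphabetically smallest label.
            best = min(weight_per_label,
                       key=lambda lab: (-weight_per_label[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    groups = {}
    for node, lab in labels.items():
        groups.setdefault(lab, set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical network: two gene families in three species, joined by
# one weak (likely spurious) edge between the two zebrafish proteins.
example_edges = [
    ("hsa_A", "mmu_A", 3), ("hsa_A", "dre_A", 3), ("mmu_A", "dre_A", 3),
    ("hsa_B", "mmu_B", 3), ("hsa_B", "dre_B", 3), ("mmu_B", "dre_B", 3),
    ("dre_A", "dre_B", 1),
]
groups = label_propagation(example_edges)
```

On this toy network the weak cross-family edge is outweighed by the within-family edges, so the two families end up in separate groups without any threshold being set by the user.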


Author(s):  
Romain Derelle ◽  
Hervé Philippe ◽  
John K. Colbourne

Abstract Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artefacts. In this paper we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultra-fast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark datasets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. Broccoli is freely available at https://github.com/rderelle/Broccoli.


2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Paschalis Natsidis ◽  
Alexandros Tsakogiannis ◽  
Pavlos Pavlidis ◽  
Costas S. Tsigenopoulos ◽  
Tereza Manousaki

Abstract Sparidae (Teleostei: Spariformes) are a family of fish constituted by approximately 150 species with high popularity and commercial value, such as porgies and seabreams. Although the phylogeny of this family has been investigated multiple times, its position among other teleost groups remains ambiguous. Most studies have used a single or few genes to decipher the phylogenetic relationships of sparids. Here, we conducted a thorough phylogenomic analysis using five recently available Sparidae gene-sets and 26 high-quality, genome-predicted teleost proteomes. Our analysis suggested that Tetraodontiformes (puffer fish, sunfish) are closer relatives to sparids than all other groups used. By analytically comparing this result to our own previous contradicting finding, we show that this discordance is not due to different orthology assignment algorithms; on the contrary, we prove that it is caused by the increased taxon sampling of the present study, outlining the great importance of this aspect in phylogenomic analyses in general.


2019 ◽  
Author(s):  
P Natsidis ◽  
A Tsakogiannis ◽  
P Pavlidis ◽  
CS Tsigenopoulos ◽  
T Manousaki

ABSTRACT Sparidae (Teleostei: Spariformes) are a family of fish constituted by approximately 150 species with high popularity and commercial value, such as porgies and seabreams. Although the phylogeny of this family has been investigated multiple times, its position among other teleost groups remains ambiguous. Most studies have used a single or few genes to decipher the phylogenetic relationships of sparids. Here, we conducted a phylogenomic attempt to resolve the position of the family using five recently available Sparidae gene-sets and 26 available fish proteomes from species with a sequenced genome, to ensure higher quality of the predicted genes. A thorough phylogenomic analysis suggested that Tetraodontiformes (puffer fish, sunfish) are closer relatives to sparids than all other groups used, a finding that contradicts our previous phylogenomic analysis that proposed the yellow croaker and the European seabass as closest taxa of sparids. By analytically comparing the methodologies applied in both cases, we show that this discordance is not due to different orthology assignment algorithms; on the contrary, we prove that it is caused by the increased taxon sampling of the present study, outlining the great importance of this aspect in phylogenomic analyses in general.


2017 ◽  
Vol 34 (8) ◽  
pp. 2115-2122 ◽  
Author(s):  
Jaime Huerta-Cepas ◽  
Kristoffer Forslund ◽  
Luis Pedro Coelho ◽  
Damian Szklarczyk ◽  
Lars Juhl Jensen ◽  
...  

2016 ◽  
Author(s):  
Jaime Huerta-Cepas ◽  
Kristoffer Forslund ◽  
Damian Szklarczyk ◽  
Lars Juhl Jensen ◽  
Christian von Mering ◽  
...  

Abstract Orthology assignment is ideally suited for functional inference. However, because predicting orthology is computationally intensive at large scale, and most pipelines relatively inaccessible, less precise homology-based functional transfer is still the default for (meta-)genome annotation. We therefore developed eggNOG-mapper, a tool for functional annotation of large sets of sequences based on fast orthology assignments using precomputed clusters and phylogenies from eggNOG. To validate our method, we benchmarked Gene Ontology predictions against two widely used homology-based approaches: BLAST and InterProScan. Compared to BLAST, eggNOG-mapper reduced by 7% the rate of false positive assignments, and increased by 19% the ratio of curated terms recovered over all terms assigned per protein. Compared to InterProScan, eggNOG-mapper achieved similar proteome coverage and precision, while predicting on average 32 more terms per protein and increasing by 26% the rate of curated terms recovered over total term assignments per protein. Through strict orthology assignments, eggNOG-mapper further renders more specific annotations than possible from domain similarity only (e.g. predicting gene family names). eggNOG-mapper runs ~15x faster than BLAST and at least 2.5x faster than InterProScan. The tool is available standalone or as an online service at http://eggnog-mapper.embl.de.
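
One of the benchmark quantities quoted in this abstract, the ratio of curated terms recovered over all terms assigned per protein, can be written out as a short calculation. The sketch below assumes the ratio is computed per protein and then averaged, which the abstract does not state explicitly; the function name, protein names, and term labels are hypothetical placeholders, not real annotations or eggNOG-mapper output.

```python
def curated_recovery_ratio(predicted, curated):
    """Mean over proteins of |predicted AND curated| / |predicted|.

    `predicted` and `curated` map a protein name to a set of term labels.
    Assumption: per-protein ratios are averaged; proteins with no
    assigned terms are skipped.
    """
    ratios = []
    for protein, terms in predicted.items():
        if not terms:
            continue
        recovered = terms & curated.get(protein, set())
        ratios.append(len(recovered) / len(terms))
    return sum(ratios) / len(ratios) if ratios else 0.0


# Hypothetical annotations for two proteins.
predicted_terms = {
    "protein_1": {"t1", "t2", "t3", "t4"},  # 2 of 4 assigned terms are curated
    "protein_2": {"t5", "t6", "t7", "t8"},  # 3 of 4 assigned terms are curated
}
curated_terms = {
    "protein_1": {"t1", "t2", "t9"},
    "protein_2": {"t5", "t6", "t7"},
}
ratio = curated_recovery_ratio(predicted_terms, curated_terms)
```

A higher ratio means fewer of the assigned terms fall outside the curated set, which is the sense in which the abstract reports an improvement over BLAST and InterProScan.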


Author(s):  
Krister M. Swenson ◽  
Nicholas D. Pattengale ◽  
B. M. E. Moret
