AGP: A Multimethods Web Server for Alignment-Free Genome Phylogeny

2013 ◽  
Vol 30 (5) ◽  
pp. 1032-1037 ◽  
Author(s):  
Jinkui Cheng ◽  
Fuliang Cao ◽  
Zhihua Liu

Abstract Phylogenetic analysis based on alignment methods faces major challenges when dealing with whole-genome sequences, for example, recombination, shuffling, and rearrangement of sequences. Thus, various alignment-free methods for phylogeny construction have been proposed. However, most of these methods have not been implemented as tools or web servers, so researchers cannot easily apply them to their own data sets. To facilitate the use of alignment-free methods, we implemented most of the popular ones and constructed a user-friendly web server for alignment-free genome phylogeny (AGP). AGP integrates phylogenetic tree construction, visualization, and comparison functions. Both AGP and all source code of the methods are available at http://www.herbbol.org:8000/agp (last accessed February 26, 2013). AGP will facilitate research in the field of whole-genome phylogeny and comparison.

2019 ◽  
Vol 47 (W1) ◽  
pp. W52-W58 ◽  
Author(s):  
Ling Xu ◽  
Zhaobin Dong ◽  
Lu Fang ◽  
Yongjiang Luo ◽  
Zhaoyuan Wei ◽  
...  

Abstract OrthoVenn is a powerful web platform for the comparison and analysis of whole-genome orthologous clusters. Here we present an updated version, OrthoVenn2, which provides new features that facilitate the comparative analysis of orthologous clusters among up to 12 species. Additionally, this update offers improvements to data visualization and interpretation, including an occurrence pattern table for interrogating the overlap of each orthologous group for the queried species. Within the occurrence table, the functional annotations and summaries of the disjunctions and intersections of clusters between the chosen species can be displayed through an interactive Venn diagram. To facilitate a broader range of comparisons, a larger number of species, including vertebrates, metazoa, protists, fungi, plants and bacteria, have been added in OrthoVenn2. Finally, a stand-alone version is available to perform large dataset comparisons and to visualize results locally without limitation of species number. In summary, OrthoVenn2 is an efficient and user-friendly web server freely accessible at https://orthovenn2.bioinfotoolkits.net.


2020 ◽  
Vol 117 (7) ◽  
pp. 3678-3686 ◽  
Author(s):  
JaeJin Choi ◽  
Sung-Hou Kim

An organism tree of life (organism ToL) is a conceptual and metaphorical tree to capture a simplified narrative of the evolutionary course and kinship among the extant organisms. Such a tree cannot be experimentally validated but may be reconstructed based on characteristics associated with the organisms. Since the whole-genome sequence of an organism is, at present, the most comprehensive descriptor of the organism, a whole-genome sequence-based ToL can be an empirically derivable surrogate for the organism ToL. However, experimentally determining the whole-genome sequences of many diverse organisms was practically impossible until recently. We have constructed three types of ToLs for diversely sampled organisms using the sequences of whole genome, of whole transcriptome, and of whole proteome. Of the three, whole-proteome sequence-based ToL (whole-proteome ToL), constructed by applying information theory-based feature frequency profile method, an “alignment-free” method, gave the most topologically stable ToL. Here, we describe the main features of a whole-proteome ToL for 4,023 species with known complete or almost complete genome sequences on grouping and kinship among the groups at deep evolutionary levels. The ToL reveals 1) all extant organisms of this study can be grouped into 2 “Supergroups,” 6 “Major Groups,” or 35+ “Groups”; 2) the order of emergence of the “founders” of all of the groups may be assigned on an evolutionary progression scale; 3) all of the founders of the groups have emerged in a “deep burst” at the very beginning period near the root of the ToL—an explosive birth of life’s diversity.
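The feature frequency profile idea named in the abstract above can be illustrated with a minimal sketch. It assumes that a profile is the relative frequency of overlapping k-mers and that profiles are compared with the Jensen-Shannon divergence; the abstract does not state the divergence measure or the feature length, so both choices, the function names, and the toy sequences are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter
from math import log2


def feature_frequency_profile(seq, k):
    """Relative frequencies of all overlapping k-mers in seq (sparse dict)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two sparse profiles (0..1)."""
    jsd = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution
        if pi > 0:
            jsd += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            jsd += 0.5 * qi * log2(qi / mi)
    return jsd


def distance_matrix(proteomes, k=3):
    """Pairwise divergences for a dict {name: concatenated sequence}.

    k=3 is an arbitrary illustrative default, not a value from the paper.
    """
    names = list(proteomes)
    profiles = {n: feature_frequency_profile(proteomes[n], k) for n in names}
    matrix = [
        [jensen_shannon_divergence(profiles[a], profiles[b]) for b in names]
        for a in names
    ]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical toy "proteomes"; real inputs would be whole-proteome sequences.
    toy = {"org1": "MKVLAAGMKVLA", "org2": "MKVLSAGMKILA", "org3": "WWPPYYWWPPYY"}
    names, matrix = distance_matrix(toy, k=2)
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

The resulting distance matrix could then be passed to any distance-based tree-building method, such as neighbor joining; that step is omitted here.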


Author(s):  
Hsin-Hsiung Huang ◽  
Senthil Balaji Girimurugan

Abstract In recent years, alignment-free methods have been widely applied in comparing genome sequences, as these methods compute efficiently and provide desirable phylogenetic analysis results. These methods have been successfully combined with hierarchical clustering methods for finding phylogenetic trees. However, it may not be suitable to apply these alignment-free methods directly to existing statistical classification methods, because an appropriate statistical classification theory for integrating with the alignment-free representation methods is still lacking. In this article, we propose a discriminant analysis method which uses the discrete wavelet packet transform to classify whole genome sequences. The proposed alignment-free representation statistics of features follow a joint normal distribution asymptotically. The data analysis results indicate that the proposed method provides satisfactory classification results in real time.


2015 ◽  
Vol 32 (6) ◽  
pp. 929-931 ◽  
Author(s):  
Michael Richter ◽  
Ramon Rosselló-Móra ◽  
Frank Oliver Glöckner ◽  
Jörg Peplies

Abstract Summary: JSpecies Web Server (JSpeciesWS) is a user-friendly online service for calculating in silico the extent of identity between two genomes, a parameter routinely used in the process of polyphasic microbial species circumscription. The service measures the average nucleotide identity (ANI) based on BLAST+ (ANIb) and MUMmer (ANIm), as well as correlation indexes of tetra-nucleotide signatures (Tetra). In addition, it provides a Tetra Correlation Search function, which allows rapid comparison of selected genomes against a continuously updated reference database with currently about 32 000 published whole and draft genome sequences. For comparison, users can upload their own genomes and select references from the JSpeciesWS reference database. The service indicates whether two genomes share genomic identities above or below the species embracing thresholds, and serves as a fast way to allocate unknown genomes in the frame of the hitherto sequenced species. Availability and implementation: JSpeciesWS is available at http://jspecies.ribohost.com/jspeciesws. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: [email protected]
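As a rough illustration of a tetra-nucleotide signature correlation, the sketch below counts the 256 possible tetranucleotides in each sequence and computes the Pearson correlation of the two frequency vectors. It is a simplification under stated assumptions: it uses raw frequencies and a single strand, whereas the abstract does not describe the normalization JSpeciesWS applies, and the function names and toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so profiles are comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_profile(seq):
    """Relative frequency of each tetranucleotide; words with non-ACGT are skipped."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(vx * vy)


def tetra_correlation(seq_a, seq_b):
    """Correlation of the raw tetranucleotide profiles of two sequences."""
    return pearson(tetra_profile(seq_a), tetra_profile(seq_b))


if __name__ == "__main__":
    # Hypothetical toy sequences; real inputs would be whole or draft genomes.
    genome_a = "ATGCGTACGTTAGCATGCGTACGATCGATCGTAGCTAGCTAGGATCC"
    genome_b = "ATGCGTACGTTAGCATGCGTACGATCGATCGTAGCTAGCTAGGTTCC"
    print(round(tetra_correlation(genome_a, genome_b), 4))
```

Values close to 1 indicate similar tetranucleotide usage. Any species-level threshold would have to come from the service's own documentation, not from this sketch.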


2020 ◽  
Author(s):  
Tiansheng Zhu ◽  
Guo-Bo Chen ◽  
Chunhui Yuan ◽  
Rui Sun ◽  
Fangfei Zhang ◽  
...  

Abstract Batch effects are unwanted data variations that may obscure biological signals, leading to bias or errors in subsequent data analyses. Effective evaluation and elimination of batch effects are necessary for omics data analysis. To facilitate the evaluation and correction of batch effects, here we present BatchServer, an open-source, R/Shiny-based, user-friendly interactive graphical web platform for batch effects analysis. In BatchServer we introduced autoComBat, a modified version of ComBat, which is the most widely adopted tool for batch effect correction. BatchServer uses PVCA (Principal Variance Component Analysis) and UMAP (Uniform Manifold Approximation and Projection) for evaluation and visualization of batch effects. We demonstrate its application in multiple proteomics and transcriptomic data sets. BatchServer is provided at https://lifeinfo.shinyapps.io/batchserver/ as a web server. The source codes are freely available at https://github.com/guomics-lab/batch_server.


2020 ◽  
Author(s):  
Sung-Hou Kim ◽  
JaeJin Choi ◽  
Byung-Ju Kim

Abstract Background: An “organism tree” of a group of extant organisms can be considered as a conceptual tree to capture a simplified narrative of the evolutionary course among the organisms. Due to the difficulties of whole-genome sequencing for many organisms, the most common approach has been to construct a “gene tree” by selecting a group of genes common among the organisms, “aligning” each gene family, and estimating evolutionary distances. Despite broad acceptance of the gene trees as the surrogates for the organism trees, there are important limitations and confounding issues with the approach. During the last decades, whole-genome sequences of many extant arthropods became available, providing an opportunity to construct a “whole-proteome tree” of the arthropods, the largest and most species-diverse group of all living animals. Results: An “alignment-free” whole-proteome tree of the arthropods shows that (a) the demographic grouping pattern is similar to those in the gene trees, but there are notable differences in the branching orders of the groups and the sisterhood relationships between pairs of the groups; and (b) almost all the “founders” of the groups have emerged in an “explosive burst” near the root of the tree. Conclusion: Since the whole-proteome sequence of an organism can be considered as a “book” of amino-acid alphabets, a tree of the books can be constructed, without alignment of sequences, using a text analysis method of Information Theory, which allows comparing the information content of whole proteomes. Such a tree provides another viewpoint to consider in telling the narrative of kinship among the arthropods.


2020 ◽  
Vol 48 (W1) ◽  
pp. W529-W537 ◽  
Author(s):  
Long Tian ◽  
Chengjie Huang ◽  
Reza Mazloom ◽  
Lenwood S Heath ◽  
Boris A Vinatzer

Abstract High throughput DNA sequencing in combination with efficient algorithms could provide the basis for a highly resolved, genome phylogeny-based and digital prokaryotic taxonomy. However, current taxonomic practice continues to rely on cumbersome journal publications for the description of new species, which still constitute the smallest taxonomic units. In response, we introduce LINbase, a web server that allows users to genomically circumscribe any group of prokaryotes with measurable DNA similarity and that uses the individual isolate as smallest unit. Since LINbase leverages the concept of Life Identification Numbers (LINs), which are codes assigned to individual genomes based on reciprocal average nucleotide identity, we refer to groups circumscribed in LINbase as LINgroups. Users can associate with each LINgroup a name, a short description, and a URL to a peer-reviewed publication. As soon as a LINgroup is circumscribed, any user can immediately identify query genomes as members and submit comments about the LINgroup. Most genomes currently in LINbase were imported from GenBank, but users can upload their own genome sequences as well. In conclusion, LINbase combines the resolution of LINs with the power of crowdsourcing in support of a highly resolved, genome phylogeny-based digital taxonomy. LINbase is available at http://www.LINbase.org.

