VISTA Family of Computational Tools for Comparative Analysis of DNA Sequences and Whole Genomes

Author(s):  
Inna Dubchak ◽  
Dmitriy V. Ryaboy
2017 ◽  
Vol 110 (10) ◽  
pp. 1357-1371 ◽  
Author(s):  
Nitish Kumar Mahato ◽  
Vipin Gupta ◽  
Priya Singh ◽  
Rashmi Kumari ◽  
Helianthous Verma ◽  
...  

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6563
Author(s):  
Jianying Sun ◽  
Xiaofeng Dong ◽  
Qinghe Cao ◽  
Tao Xu ◽  
Mingku Zhu ◽  
...  

Background Ipomoea is the largest genus in the family Convolvulaceae. The species in this genus have been widely used in many fields, such as agriculture, nutrition, and medicine. With the development of next-generation sequencing, more than 50 chloroplast genomes of Ipomoea species have been sequenced. However, the repeats and divergence regions in Ipomoea have not been well investigated. In the present study, we sequenced and assembled eight chloroplast genomes from sweet potato’s close wild relatives. By combining these with 32 published chloroplast genomes, we conducted a detailed comparative analysis of a broad range of Ipomoea species.
Methods Eight chloroplast genomes were assembled using short DNA sequences generated by next-generation sequencing technology. By combining these chloroplast genomes with 32 other published Ipomoea chloroplast genomes downloaded from GenBank and the Oxford Research Archive, we conducted a comparative analysis of the repeat sequences and divergence regions across the Ipomoea genus. In addition, separate analyses of the Batatas group and Quamoclit group were also performed.
Results The eight newly sequenced chloroplast genomes ranged from 161,225 to 161,721 bp in length and displayed the typical circular quadripartite structure, consisting of a pair of inverted repeat (IR) regions (30,798–30,910 bp each) separated by a large single copy (LSC) region (87,575–88,004 bp) and a small single copy (SSC) region (12,018–12,051 bp). The average guanine-cytosine (GC) content was approximately 40.5% in the IR region, 36.1% in the LSC region, 32.2% in the SSC region, and 37.5% in the complete sequence for all the generated plastomes. The eight chloroplast genome sequences from this study included 80 protein-coding genes, four rRNAs (rrn23, rrn16, rrn5, and rrn4.5), and 37 tRNAs. The boundaries of single copy regions and IR regions were highly conserved in the eight chloroplast genomes. In Ipomoea, 57–89 pairs of repetitive sequences and 39–64 simple sequence repeats were found. By conducting a sliding window analysis, we found six relatively highly variable regions (ndhA intron, ndhH-ndhF, ndhF-rpl32, rpl32-trnL, rps16-trnQ, and ndhF) in the Ipomoea genus, eight (trnG, rpl32-trnL, ndhA intron, ndhF-rpl32, ndhH-ndhF, ccsA-ndhD, trnG-trnR, and psaA-ycf3) in the Batatas group, and eight (ndhA intron, petN-psbM, rpl32-trnL, trnG-trnR, trnK-rps16, ndhC-trnV, rps16-trnQ, and trnG) in the Quamoclit group. Our maximum-likelihood tree based on whole chloroplast genomes confirmed the phylogenetic topology reported in previous studies.
Conclusions The chloroplast genome sequence and structure were highly conserved in the eight newly sequenced Ipomoea species. Our comparative analysis included a broad range of Ipomoea chloroplast genomes, providing valuable information for Ipomoea species identification and enhancing the understanding of Ipomoea genetic resources.
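The sliding window analysis mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which software, window size, or step the authors used, so everything below is an assumption: the function names `pairwise_diversity` and `sliding_window_diversity` are hypothetical, the default window and step are arbitrary, and the statistic is simply the mean proportion of differing sites over all sequence pairs in each window.

```python
from itertools import combinations


def pairwise_diversity(window_seqs):
    """Mean proportion of differing sites over all sequence pairs.

    Columns where either sequence has a gap ('-') are skipped.
    """
    total, pairs = 0.0, 0
    for a, b in combinations(window_seqs, 2):
        compared = diffs = 0
        for x, y in zip(a, b):
            if x == "-" or y == "-":
                continue
            compared += 1
            diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    return total / pairs if pairs else 0.0


def sliding_window_diversity(alignment, window=600, step=200):
    """Return (window midpoint, diversity) pairs along an alignment.

    `alignment` is a list of equal-length aligned sequences; the
    window and step defaults are illustrative, not taken from the paper.
    """
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all sequences must be aligned to the same length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in alignment]
        results.append((start + window // 2, pairwise_diversity(chunk)))
    return results


# Toy alignment (made-up sequences, not real chloroplast data).
toy = ["ACGTACGTACGT", "ACGTACGTACGA", "ACGTTCGTACGA"]
profile = sliding_window_diversity(toy, window=4, step=4)
for midpoint, value in profile:
    print(midpoint, round(value, 4))
```

Windows whose value stands out from the rest of the profile would be candidates for highly variable regions; mapping a window back to a named region such as rpl32-trnL requires genome annotation, which this sketch does not handle.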


2019 ◽  
Author(s):  
Marc Manceau ◽  
Julie Marin ◽  
Hélène Morlon ◽  
Amaury Lambert

Abstract In standard models of molecular evolution, DNA sequences evolve through asynchronous substitutions according to Poisson processes with a constant rate (called the molecular clock) or a time-varying rate (relaxed clock). However, DNA sequences can also undergo episodes of fast divergence that will appear as synchronous substitutions affecting several sites simultaneously at the macroevolutionary time scale. Here, we develop a model combining basal, clock-like molecular evolution with episodes of fast divergence called spikes arising at speciation events. Given a multiple sequence alignment and its time-calibrated species phylogeny, our model is able to detect speciation events (including hidden ones) co-occurring with spike events and to estimate the probability and amplitude of these spikes on the phylogeny. We identify the conditions under which spikes can be distinguished from the natural variance of the clock-like component of molecular evolution and from temporal variations of the clock. We apply the method to genes underlying snake venom proteins and identify several spikes at gene-specific locations in the phylogeny. This work should pave the way for analyses relying on whole genomes to inform on modes of species diversification.
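The contrast between clock-like substitutions and spikes can be made concrete with a toy simulation of a single branch. This is a sketch under stated assumptions and is not the authors' model or inference method: the function names are hypothetical, the clock is a Poisson process with a constant rate, and a spike (when it occurs at the speciation event ending the branch) is assumed to hit each site independently with probability `amplitude`. Multiple hits at the same site are not tracked.

```python
import random


def clock_substitutions(rate, duration, rng):
    """Count events of a constant-rate Poisson process over a branch.

    Waiting times between events are exponential with the given rate.
    """
    count, t = 0, rng.expovariate(rate)
    while t < duration:
        count += 1
        t += rng.expovariate(rate)
    return count


def spike_substitutions(n_sites, spike_prob, amplitude, rng):
    """Number of sites substituted synchronously at a speciation event.

    Assumption: a spike occurs with probability `spike_prob`; if it does,
    each site changes independently with probability `amplitude`.
    """
    if rng.random() >= spike_prob:
        return 0
    return sum(rng.random() < amplitude for _ in range(n_sites))


def simulate_branch(n_sites, rate_per_site, duration,
                    spike_prob, amplitude, seed=None):
    """Return (clock-like count, spike count) for one branch."""
    rng = random.Random(seed)
    clock = clock_substitutions(rate_per_site * n_sites, duration, rng)
    spike = spike_substitutions(n_sites, spike_prob, amplitude, rng)
    return clock, spike


# Illustrative parameter values only.
clock, spike = simulate_branch(n_sites=1000, rate_per_site=0.001,
                               duration=10.0, spike_prob=0.3,
                               amplitude=0.02, seed=1)
print(clock, spike)
```

Repeating the simulation over many seeds gives a feel for the identifiability question raised in the abstract: when the spike contribution is small relative to the variance of the clock-like count, the two are hard to tell apart.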


Parasitology ◽  
2012 ◽  
Vol 139 (8) ◽  
pp. 1063-1073 ◽  
Author(s):  
K. CWIKLINSKI ◽  
F. N. J. KOOYMAN ◽  
D. C. K. VAN DOORN ◽  
J. B. MATTHEWS ◽  
J. E. HODGKINSON

SUMMARY Cyathostomins comprise a group of 50 species of parasitic nematodes that infect equids. Ribosomal DNA sequences, in particular the intergenic spacer (IGS) region, have been utilized via several methodologies to identify pre-parasitic stages of the commonest species that affect horses. These methods rely on the availability of accurate sequence information for each species, as well as detailed knowledge of the levels of intra- and inter-specific variation. Here, the IGS DNA region was amplified and sequenced from 10 cyathostomin species for which sequence was not previously available. Also, additional IGS DNA sequences were generated from individual worms of 8 species already studied. Comparative analysis of these sequences revealed a greater range of intra-specific variation than previously reported (up to 23%); whilst the level of inter-specific variation (3–62%) was similar to that identified in earlier studies. The reverse line blot (RLB) method has been used to exploit the cyathostomin IGS DNA region for species identification. Here, we report validation of novel and existing DNA probes for identification of cyathostomins using this method and highlight their application in differentiating life-cycle stages such as third-stage larvae that cannot be identified to species by morphological means.
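Intra- and inter-specific variation figures like those quoted above are commonly summarised as ranges of pairwise percentage differences. The sketch below shows one simple way to compute such ranges from aligned sequences labelled by species. It is an assumption-laden illustration: the abstract does not state how the percentages were calculated, the function names are hypothetical, and the toy sequences are invented rather than real IGS data.

```python
from itertools import combinations


def p_distance(a, b):
    """Percentage of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    compared = diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    return 100.0 * diffs / compared if compared else 0.0


def span(values):
    """(min, max) of a list, or None when the list is empty."""
    return (min(values), max(values)) if values else None


def variation_ranges(records):
    """Split pairwise distances into within- and between-species ranges.

    `records` is a list of (species_label, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return {"intra": span(intra), "inter": span(inter)}


# Made-up sequences with placeholder species labels.
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "TCGAACGGAT"),
]
ranges = variation_ranges(records)
print(ranges)
```

When the intra-specific range overlaps the inter-specific range, as the reported figures (up to 23% versus 3–62%) suggest can happen, a single distance threshold cannot separate species, which is one reason probe-based methods need empirical validation.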


2020 ◽  
Vol 16 ◽  
pp. 2448-2468
Author(s):  
Kanhaya Lal ◽  
Rafael Bermeo ◽  
Serge Perez

Drawing and visualisation of molecular structures are some of the most common tasks carried out in structural glycobiology, typically using a variety of software tools. In this perspective article, we outline developments in the computational tools for the sketching, visualisation and modelling of glycans. The article also provides details on the standard representation of glycans and glycoconjugates, which helps the communication of structural details within the scientific community. We highlight the comparative analysis of the available tools, which could help researchers to perform various tasks related to structure representation and model building of glycans. These tools can be useful for glycobiologists or any researcher looking for a ready-to-use, simple program for the sketching or building of glycans.


2020 ◽  
Vol 37 (11) ◽  
pp. 3308-3323 ◽  
Author(s):  
Marc Manceau ◽  
Julie Marin ◽  
Hélène Morlon ◽  
Amaury Lambert

Abstract In standard models of molecular evolution, DNA sequences evolve through asynchronous substitutions according to Poisson processes with a constant rate (called the molecular clock) or a rate that can vary (relaxed clock). However, DNA sequences can also undergo episodes of fast divergence that will appear as synchronous substitutions affecting several sites simultaneously at the macroevolutionary timescale. Here, we develop a model, which we call the Relaxed Clock with Spikes model, combining basal, clock-like molecular substitutions with episodes of fast divergence called spikes arising at speciation events. Given a multiple sequence alignment and its time-calibrated species phylogeny, our model is able to detect speciation events (including hidden ones) cooccurring with spike events and to estimate the probability and amplitude of these spikes on the phylogeny. We identify the conditions under which spikes can be distinguished from the natural variance of the clock-like component of molecular substitutions and from variations of the clock. We apply the method to genes underlying snake venom proteins and identify several spikes at gene-specific locations in the phylogeny. This work should pave the way for analyses relying on whole genomes to inform on modes of species diversification.


2004 ◽  
Vol 20 (9) ◽  
pp. 1468-1469 ◽  
Author(s):  
M. C. Frith ◽  
A. S. Halees ◽  
U. Hansen ◽  
Z. Weng