Tetranucleotide usage highlights genomic heterogeneity among mycobacteriophages

F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 36
Author(s):  
Benjamin Siranosian ◽  
Sudheesha Perera ◽  
Edward Williams ◽  
Chen Ye ◽  
Christopher de Graffenried ◽  
...  

Background: The genomic sequences of mycobacteriophages, phages infecting mycobacterial hosts, are diverse and mosaic. Mycobacteriophages often share little nucleotide similarity, but most of them have been grouped into lettered clusters and further into subclusters. Traditionally, mycobacteriophage genomes are analyzed based on sequence alignment or knowledge of gene content. However, these approaches are computationally expensive and can be ineffective for significantly diverged sequences. As an alternative to alignment-based genome analysis, we evaluated tetranucleotide usage in mycobacteriophage genomes. These methods make it easier to characterize features of the mycobacteriophage population at many scales. Description: We computed tetranucleotide usage deviation (TUD), the ratio of observed counts of 4-mers in a genome to the expected count under a null model. TUD values are comparable between members of a phage subcluster and distinct between subclusters. With few exceptions, neighbor joining phylogenetic trees and hierarchical clustering dendrograms constructed using TUD values place phages in a monophyletic clade with members of the same subcluster. Regions in a genome with exceptional TUD values can point to interesting features of genomic architecture. Finally, we found that subcluster B3 mycobacteriophages contain significantly overrepresented 4-mers and 6-mers that are atypical of phage genomes. Conclusions: Statistics based on tetranucleotide usage support established clustering of mycobacteriophages and can uncover interesting relationships within and between sequenced phage genomes. These methods are efficient to compute and do not require sequence alignment or knowledge of gene content. The code to download mycobacteriophage genome sequences and reproduce our analysis is freely available at https://github.com/bsiranosian/tango_final.
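
The observed-to-expected ratio described in this abstract can be illustrated with a minimal Python sketch. The abstract does not specify the null model, so the sketch assumes a zero-order model in which the expected count of a 4-mer is the number of windows multiplied by the product of its single-base frequencies. The function name `tetranucleotide_usage_deviation` is hypothetical and is not taken from the authors' repository.

```python
from collections import Counter
from itertools import product


def tetranucleotide_usage_deviation(seq, k=4):
    """Return {k-mer: observed / expected} for one strand of `seq`.

    Assumption (not from the abstract): the null model is zero-order,
    i.e. bases are independent with the genome's own base frequencies.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than k")

    # Single-base frequencies, ignoring ambiguous symbols such as N.
    base_counts = Counter(b for b in seq if b in "ACGT")
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    freq = {b: base_counts[b] / total for b in "ACGT"}

    # Observed counts of every overlapping window of length k.
    observed = Counter(seq[i:i + k] for i in range(n_windows))

    tud = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        expected = float(n_windows)
        for b in kmer:
            expected *= freq[b]
        # A k-mer with zero expected count has no defined ratio.
        tud[kmer] = observed[kmer] / expected if expected > 0 else float("nan")
    return tud
```

Calling the function on each genome yields a 256-element profile for k=4. Profiles of this kind could then be compared with a distance measure to build trees or dendrograms, although the distance used by the authors is not stated here.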


Viruses ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 764
Author(s):  
Bohu Pan ◽  
Zuowei Ji ◽  
Sugunadevi Sakkiah ◽  
Wenjing Guo ◽  
Jie Liu ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS−CoV−2) has caused the ongoing global COVID-19 pandemic that began in late December 2019. The rapid spread of SARS−CoV−2 is primarily due to person-to-person transmission. To understand the epidemiological traits of SARS−CoV−2 transmission, we conducted phylogenetic analysis on genome sequences from >54K SARS−CoV−2 cases obtained from two public databases. Hierarchical clustering analysis on geographic patterns in the resulting phylogenetic trees revealed a co-expansion tendency of the virus among neighboring countries with diverse sources and transmission routes for SARS−CoV−2. Pairwise sequence similarity analysis demonstrated that SARS−CoV−2 is transmitted locally and evolves during transmission. However, no significant differences were seen among SARS−CoV−2 genomes grouped by host age or sex. Here, our identified epidemiological traits provide information to better prevent transmission of SARS−CoV−2 and to facilitate the development of effective vaccines and therapeutics against the virus.


Viruses ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1460
Author(s):  
Irene Hoxie ◽  
John J. Dennehy

Reassortment of the Rotavirus A (RVA) 11-segment dsRNA genome may generate new genome constellations that allow RVA to expand its host range or evade immune responses. Reassortment may also produce phylogenetic incongruities and weakly linked evolutionary histories across the 11 segments, obscuring reassortment-specific epistasis and changes in substitution rates. To determine the co-segregation patterns of RVA segments, we generated time-scaled phylogenetic trees for each of the 11 segments of 789 complete RVA genomes isolated from mammalian hosts and compared the segments’ geodesic distances. We found that segments 4 (VP4) and 9 (VP7) occupied significantly different tree spaces from each other and from the rest of the genome. By contrast, segments 10 and 11 (NSP4 and NSP5/6) occupied nearly indistinguishable tree spaces, suggesting strong co-segregation. Host-species barriers appeared to vary by segment, with segment 9 (VP7) presenting the weakest association with host species. Bayesian Skyride plots were generated for each segment to compare relative genetic diversity among segments over time. All segments showed a dramatic decrease in diversity around 2007 coinciding with the introduction of RVA vaccines. To assess selection pressures, codon adaptation indices and relative codon deoptimization indices were calculated with respect to different host genomes. Codon usage varied by segment with segment 11 (NSP5) exhibiting significantly higher adaptation to host genomes. Furthermore, RVA codon usage patterns appeared optimized for expression in humans and birds relative to the other hosts examined, suggesting that translational efficiency is not a barrier in RVA zoonosis.


2021 ◽  
Author(s):  
Rania Jbir Koubaa ◽  
Mariem Ayadi ◽  
Mohamed Najib Saidi ◽  
Safa Charfeddine ◽  
Radhia Gargouri Bouzid ◽  
...  

Abstract As antioxidant enzymes, catalases (CATs) protect organisms from the oxidative stress that accompanies the production of reactive oxygen species (ROS). These enzymes play important roles in diverse biological processes. However, little is known about the CAT genes in potato despite the economic importance of this crop worldwide. Abiotic and biotic stresses severely hinder plant growth and development, which reduces crop yield and quality. To define the possible roles of CAT genes under various stresses, a genome-wide analysis of the CAT gene family was performed in potato. In this study, the StCAT gene structures, secondary and 3D protein structures, physicochemical properties, synteny, phylogenetic relationships, and expression profiles under various developmental and environmental cues were predicted using bioinformatics tools. Expression analysis by RT-PCR was performed using a commercial potato cultivar. Three StCAT genes, each interrupted by seven introns and encoding a protein of 492 aa, were identified in potato. StCAT proteins were found to be localized in the peroxisome, which is considered the main site of cellular H2O2 production during different processes. Many regulatory cis-elements related to stress responses and plant hormone signaling were found in the promoter sequence of each gene. Analysis of motifs and phylogenetic trees showed that StCATs are closest to their homologs in S. lycopersicum and share 41% to 95% identity with other plants' CATs. Expression profiling revealed that StCAT1 is constitutively expressed, while StCAT2 and StCAT3 are stress-responsive.


2019 ◽  
Author(s):  
Zachary L. Fuller ◽  
Veronique J.L. Mocellin ◽  
Luke Morris ◽  
Neal Cantin ◽  
Jihanne Shepherd ◽  
...  

Abstract Although reef-building corals are rapidly declining worldwide, responses to bleaching vary both within and among species. Because these inter-individual differences are partly heritable, they should in principle be predictable from genomic data. Towards that goal, we generated a chromosome-scale genome assembly for the coral Acropora millepora. We then obtained whole genome sequences for 237 phenotyped samples collected at 12 reefs distributed along the Great Barrier Reef, among which we inferred very little population structure. Scanning the genome for evidence of local adaptation, we detected signatures of long-term balancing selection in the heat-shock co-chaperone sacsin. We further used 213 of the samples to conduct a genome-wide association study of visual bleaching score, incorporating the polygenic score derived from it into a predictive model for bleaching in the wild. These results set the stage for the use of genomics-based approaches in conservation strategies.


2020 ◽  
Vol 11 ◽  
Author(s):  
Huimin Liu ◽  
Zhibin Shi ◽  
Chunguo Liu ◽  
Pengfei Wang ◽  
Ming Wang ◽  
...  

Pseudorabies viruses (PRVs) pose a great threat to the pig industry of many countries around the world. Human infections with PRV have also been reported occasionally in China. Therefore, understanding the epidemiology and evolution of PRVs is of great importance for disease control in the pig populations and humans as well. In this study, we isolated a PRV designated HLJ-2013 from PRV-positive samples that had been collected in Heilongjiang, China, in 2013. The full genome sequence of the virus was determined to be ∼143 kbp in length using high-throughput sequencing. The genomic sequence identities between this isolate and 21 other previous PRV isolates ranged from 92.4% (with Bartha) to 97.3% (with SC). Phylogenetic analysis based on the full-length genome sequences revealed that PRV HLJ-2013 clustered together with all the Chinese strains in one group belonging to Genotype II, but this virus occurred phylogenetically earlier than all the other Chinese PRV strains. Phylogenetic trees based on both protein-coding genes and non-coding regions revealed that HLJ-2013 probably obtained its genome sequences from three origins: a yet unknown parent virus, the European viruses, and the same ancestor of all Chinese PRVs. Recombination analysis showed that HLJ-2013-like virus possibly donated the main framework of the genome of the Chinese PRVs. HLJ-2013 exhibited cytopathic and growth characteristics similar to that of the Chinese PRV strains SC and HeN1, but its pathogenicity in mice was higher than that of SC and lower than that of HeN1. The identification of HLJ-2013 takes us one step closer to understanding the origin of PRVs in China and provides new knowledge about the evolution of PRVs worldwide.


2021 ◽  
Author(s):  
David Emms ◽  
Steven Kelly

Determining the evolutionary relationships between gene sequences is fundamental to comparative biological research. However, conducting such analyses requires a high degree of technical proficiency in several computational tools including gene family construction, multiple sequence alignment, and phylogenetic inference. Here we present SHOOT, an easy to use phylogenetic search engine for fast and accurate phylogenetic analysis of biological sequences. SHOOT searches a user-provided query sequence against a database of phylogenetic trees of gene sequences (gene trees) and returns a gene tree with the given query sequence correctly grafted within it. We show that SHOOT can perform this search and placement with comparable speed to a conventional BLAST search. We demonstrate that SHOOT phylogenetic placements are as accurate as conventional multiple sequence alignment and maximum likelihood tree inference approaches. We further show that SHOOT can be used to identify orthologs with equivalent accuracy to conventional orthology inference methods. In summary, SHOOT is an accurate and fast tool for complete phylogenetic analysis of novel query sequences. An easy to use webserver is available online at www.shoot.bio.


2020 ◽  
pp. 565-579 ◽  
Author(s):  
Mohamed Issa ◽  
Aboul Ella Hassanien

Sequence alignment is a vital step in many biological applications, such as phylogenetic tree construction, DNA fragment assembly, and structure/function prediction. There are two kinds of alignment: pairwise alignment, which aligns two sequences, and multiple sequence alignment (MSA), which aligns more than two. The most accurate alignment methods are based on dynamic programming (DP), whose running time grows exponentially as the length and number of aligned sequences increase. Stochastic or meta-heuristic techniques speed up alignment, but they yield near-optimal alignments that are less accurate than those of DP. This chapter therefore reviews recent developments in MSA using meta-heuristic algorithms. In addition, two recent techniques are examined in more depth: the first is fragmented protein sequence alignment using two-layer particle swarm optimization (FTLPSO); the second is multiple sequence alignment using a multi-objective bacterial foraging optimization algorithm (MO-BFO).
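
As an illustration of the dynamic programming approach this abstract refers to, the following Python sketch computes the optimal global alignment score of two sequences in the style of Needleman-Wunsch. It is not taken from the chapter: the function name `needleman_wunsch_score` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for the example. For two sequences the table has one cell per pair of positions, so the cost is proportional to the product of their lengths. Applying the same recurrence to many sequences adds one table dimension per sequence, which is what makes exact MSA expensive.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Assumed scoring scheme (illustrative only): +1 for a match,
    -1 for a mismatch, -1 per gap position (linear gap penalty).
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr

    return prev[-1]
```

Only two rows of the table are kept because the sketch returns the score alone. Recovering the alignment itself would require storing the full table, or traceback pointers, and walking back from the final cell.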

