sequence evolution
Recently Published Documents



Viruses ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 146
Author(s):  
Angelo Pavesi ◽  
Fabio Romerio

Gene overprinting occurs when point mutations within a genomic region with an existing coding sequence create a new coding sequence in another reading frame. This process is quite frequent in viral genomes, either to maximize the amount of information that they encode or in response to strong selective pressure. The most frequent scenario involves two different reading frames in the same DNA strand (sense overlap). Much less frequent are cases of overlapping genes that are encoded on opposite DNA strands (antisense overlap). One such example is the antisense ORF asp, located on the minus strand of the HIV-1 genome and overlapping the env gene. The asp gene is highly conserved in pandemic HIV-1 strains of group M, and it is absent in non-pandemic HIV-1 groups, HIV-2, and lentiviruses infecting non-human primates, suggesting that the ~190-amino acid protein that is expressed from this gene (ASP) may play a role in virus spread. While the function of ASP in the virus life cycle remains to be elucidated, mounting evidence from several research groups indicates that ASP is expressed in vivo. Two alternative hypotheses could explain the origin of the asp ORF. On the one hand, asp may have originally been present in the ancestor of contemporary lentiviruses, and subsequently lost in all descendants except for most HIV-1 strains of group M due to selective advantage. Alternatively, the asp ORF may have originated very recently with the emergence of group M HIV-1 strains from SIVcpz. Here, we used a combination of computational and statistical approaches to study the genomic region of env in primate lentiviruses to shed light on the origin, structure, and sequence evolution of the asp ORF. 
The results emerging from our studies support the hypothesis of a recent de novo addition of the antisense ORF to the HIV-1 genome through a process that entailed progressive removal of existing internal stop codons from SIV strains to HIV-1 strains of group M, and fine tuning of the codon sequence in env that reduced the chances of new stop codons occurring in asp. Altogether, the study supports the notion that the HIV-1 asp gene encodes an accessory protein, providing a selective advantage to the virus.
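
As a rough illustration of the stop-codon argument above, the following minimal Python sketch counts stop codons in the three antisense reading frames of a sequence. It is not the authors' pipeline: the function names and the toy sequence are assumptions made up for this example, and a real analysis would work on aligned env sequences.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_stop_counts(seq):
    """Count stop codons in each of the three antisense reading frames."""
    rc = reverse_complement(seq)
    counts = []
    for frame in range(3):
        # Walk the reverse-complement strand codon by codon from this offset.
        codons = (rc[i:i + 3] for i in range(frame, len(rc) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


if __name__ == "__main__":
    toy = "ATGTTAGCCTCATGA"  # made-up sequence, not from HIV-1
    print(antisense_stop_counts(toy))
```

An antisense frame with a count of zero over a long stretch is a candidate open reading frame; comparing such counts across related strains is one simple way to look for progressive loss of internal stop codons.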


Nature ◽  
2022 ◽  
Author(s):  
J. Grey Monroe ◽  
Thanvi Srikant ◽  
Pablo Carbonell-Bejerano ◽  
Claude Becker ◽  
Mariele Lensink ◽  
...  

Since the first half of the twentieth century, evolutionary theory has been dominated by the idea that mutations occur randomly with respect to their consequences¹. Here we test this assumption with large surveys of de novo mutations in the plant Arabidopsis thaliana. In contrast to expectations, we find that mutations occur less often in functionally constrained regions of the genome: mutation frequency is reduced by half inside gene bodies and by two-thirds in essential genes. With independent genomic mutation datasets, including from the largest Arabidopsis mutation accumulation experiment conducted to date, we demonstrate that epigenomic and physical features explain over 90% of variance in the genome-wide pattern of mutation bias surrounding genes. Observed mutation frequencies around genes in turn accurately predict patterns of genetic polymorphisms in natural Arabidopsis accessions (r = 0.96). That mutation bias is the primary force behind patterns of sequence evolution around genes in natural accessions is supported by analyses of allele frequencies. Finally, we find that genes subject to stronger purifying selection have a lower mutation rate. We conclude that epigenome-associated mutation bias² reduces the occurrence of deleterious mutations in Arabidopsis, challenging the prevailing paradigm that mutation is a directionless force in evolution.


2022 ◽  
Author(s):  
Kaichi Huang ◽  
Kate L Ostevik ◽  
Cassandra Elphinstone ◽  
Marco Todesco ◽  
Natalia Bercovich ◽  
...  

Recombination is critical both for accelerating adaptation and for the purging of deleterious mutations. Chromosomal inversions can act as recombination modifiers that suppress local recombination and, thus, are predicted to accumulate such mutations. In this study, we investigated patterns of recombination, transposable element abundance and coding sequence evolution across the genomes of 1,445 individuals from three sunflower species, as well as within nine inversions segregating within species. We also analyzed the effects of inversion genotypes on 87 phenotypic traits to test for overdominance. We found significant negative correlations of long terminal repeat retrotransposon abundance and deleterious mutations with recombination rates across the genome in all three species. However, we failed to detect an increase in these features in the inversions, except for a modest increase in the proportion of stop codon mutations in several very large or rare inversions. Moreover, there was little evidence of phenotypic overdominance in inversion heterozygotes, consistent with observations of minimal deleterious load. On the other hand, significantly greater load was observed for inversions in populations polymorphic for a given inversion compared to populations monomorphic for one of the arrangements, suggesting that the local state of inversion polymorphism affects deleterious load. These seemingly contradictory results can be explained by the geographic structuring and consequent excess homozygosity of inversions in wild sunflowers. Inversions contributing to local adaptation often exhibit geographic structure; such inversions represent ideal recombination modifiers, acting to facilitate adaptive divergence with gene flow, while largely averting the accumulation of deleterious mutations due to recombination suppression.


2021 ◽  
Author(s):  
Michael Terence Boswell ◽  
Jamirah Nazziwa ◽  
Kimiko Kuroki ◽  
Angelica Palm ◽  
Sara Karlson ◽  
...  

Background: HIV-2 infection will progress to AIDS in most patients without treatment, albeit at approximately half the rate of HIV-1 infection. HIV-2 p26 amino acid variations are associated with lower viral loads and enhanced processing of T cell epitopes, which may lead to protective Gag-specific CTL responses common in slower disease progressors. Lower virus evolutionary rates and positive selection on conserved residues in HIV-2 env have been associated with slower progression to AIDS. We therefore aimed to determine if intrahost evolution of HIV-2 p26 is associated with disease progression. Methods: Twelve treatment-naive, HIV-2 mono-infected participants from the Guinea-Bissau Police cohort with longitudinal CD4+ T cell data and clinical follow-up were included in the analysis. CD4% change over time was analysed via linear regression models to stratify participants into relatively faster and slower disease progressor groups. Gag amplicons of 735 nucleotides which spanned the p26 region were amplified by PCR and sequenced. We analysed p26 sequence diversity evolution, measured site-specific selection pressures and evolutionary rates, and determined if these evolutionary parameters were associated with progression status. Amino acid polymorphisms were mapped to existing p26 protein structures. Results: In total, 369 heterochronous HIV-2 p26 sequences from 12 male patients with a median age of 30 (IQR: 28-37) years at enrolment were analysed. Faster progressors had lower CD4% and faster CD4% decline rates. Median pairwise sequence diversity was higher in faster progressors (5.7 × 10⁻³ versus 1.4 × 10⁻³ base substitutions per site, P<0.001). p26 evolved under negative selection in both groups (dN/dS=0.12). Virus evolutionary rates were higher in faster than in slower progressors: synonymous rates were 4.6 × 10⁻³ vs. 2.3 × 10⁻³ and nonsynonymous rates were 6.9 × 10⁻⁴ vs. 2.7 × 10⁻⁴ substitutions/site/year, respectively. 
Virus evolutionary rates correlated negatively with CD4% change rates (rho = -0.8, P=0.02), but not CD4% level. However, Bayes factor (BF) testing indicated that the association between evolutionary rates and CD4% kinetics was supported by weak evidence (BF=0.5). The signature amino acid at p26 positions 6, 12 and 119 differed between faster (6A, 12I, 119A) and slower (6G, 12V, 119P) progressors. These amino acid positions clustered near the TRIM5α/p26 hexamer interface surface. Conclusions: Faster p26 evolutionary rates were associated with faster progression to AIDS and were mostly driven by synonymous substitutions. Nonsynonymous evolutionary rates were an order of magnitude lower than synonymous rates, with limited amino acid sequence evolution over time within hosts. These results indicate that HIV-2 p26 may be an attractive vaccine or therapeutic target.
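
The "order of magnitude" statement can be checked with simple arithmetic on the rates quoted in the abstract. The Python sketch below is only a back-of-the-envelope calculation: the dictionary layout and function name are assumptions made up for this example, and a ratio of per-year rates is not the same quantity as the codon-model dN/dS estimate reported by the authors.

```python
# Rates are the values quoted in the abstract (substitutions/site/year).
# The data layout and helper name are hypothetical, for illustration only.
RATES = {
    "faster": {"synonymous": 4.6e-3, "nonsynonymous": 6.9e-4},
    "slower": {"synonymous": 2.3e-3, "nonsynonymous": 2.7e-4},
}


def rate_ratio(nonsynonymous, synonymous):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    return nonsynonymous / synonymous


if __name__ == "__main__":
    for group, r in RATES.items():
        ratio = rate_ratio(r["nonsynonymous"], r["synonymous"])
        print(f"{group} progressors: nonsynonymous/synonymous = {ratio:.2f}")
```

Both ratios come out well below 1, which is in line with the negative (purifying) selection described in the results.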


2021 ◽  
Author(s):  
Peter D Price ◽  
Daniela H Palmer Droguett ◽  
Jessica A Taylor ◽  
Dong W Kim ◽  
Elsie S Place ◽  
...  

A substantial amount of phenotypic diversity results from changes in gene regulation. Understanding how regulatory diversity evolves is therefore a key priority in identifying mechanisms of adaptive change. However, in contrast to powerful models of sequence evolution, we lack a consensus model of regulatory evolution. Furthermore, recent work has shown that many of the comparative approaches used to study gene regulation are subject to biases that can lead to false signatures of selection. In this review, we first outline the main approaches for describing regulatory evolution and their inherent biases. Next, we bridge the gap between the fields of comparative phylogenetic methods and transcriptomics to reinforce the main pitfalls of inferring regulatory selection and use simulation studies to show that shifts in tissue composition can heavily bias inferences of selection. We close by highlighting the multi-dimensional nature of regulatory variation and identifying major, unanswered questions in disentangling how selection acts on the transcriptome.


2021 ◽  
Author(s):  
Sebastian Burgstaller-Muehlbacher ◽  
Stephen M Crotty ◽  
Heiko A Schmidt ◽  
Tamara Drucks ◽  
Arndt von Haeseler

Selecting the best model of sequence evolution for a multiple sequence alignment (MSA) constitutes the first step of phylogenetic tree reconstruction. Common approaches for inferring nucleotide models typically apply maximum likelihood (ML) methods, with discrimination between models determined by one of several information criteria. This requires tree reconstruction and optimisation which can be computationally expensive. We demonstrate that neural networks can be used to perform model selection, without the need to reconstruct trees, optimise parameters, or calculate likelihoods. We introduce ModelRevelator, a model selection tool underpinned by two deep neural networks. The first neural network, NNmodelfind, recommends one of six commonly used models of sequence evolution, ranging in complexity from JC to GTR. The second, NNalphafind, recommends whether or not a Γ-distributed rate heterogeneous model should be incorporated, and if so, provides an estimate of the shape parameter, α. Users can simply input an MSA into ModelRevelator, and swiftly receive output recommending the evolutionary model, inclusive of the presence or absence of rate heterogeneity, and an estimate of α. We show that ModelRevelator performs comparably with likelihood-based methods over a wide range of parameter settings, with significant potential savings in computational effort. Further, we show that this performance is not restricted to the alignments on which the networks were trained, but is maintained even on unseen empirical data. ModelRevelator will be made freely available in the forthcoming version of IQ-Tree (http://www.iqtree.org), and we expect it will provide a valuable alternative for phylogeneticists, especially where traditional methods of model selection are computationally prohibitive.
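
For context, the likelihood-based baseline mentioned above ranks candidate models with an information criterion once each model's maximised log-likelihood is known. The Python sketch below shows that ranking step with AIC and BIC. It is not ModelRevelator or IQ-Tree code: the log-likelihood values and alignment length are invented for illustration, and only the free substitution-model parameters are counted (branch lengths would add the same constant to every model on a fixed tree).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(candidates, n_sites, criterion="bic"):
    """Return the model name with the lowest criterion score."""
    def score(item):
        _, (log_likelihood, n_params) = item
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_sites)
    return min(candidates.items(), key=score)[0]


# Hypothetical maximised log-likelihoods; parameter counts are the usual
# free substitution parameters (JC: 0, HKY: 4, GTR: 8).
CANDIDATES = {
    "JC": (-5230.4, 0),
    "HKY": (-5101.7, 4),
    "GTR": (-5098.9, 8),
}

if __name__ == "__main__":
    for name in ("aic", "bic"):
        print(name.upper(), "selects", best_model(CANDIDATES, 1000, name))
```

The expensive part in practice is obtaining the log-likelihoods, which requires tree search and parameter optimisation for every candidate model; that cost is what a likelihood-free classifier aims to avoid.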


Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 2023
Author(s):  
Dirson Jian Li

Nirenberg’s genetic code chart shows a profound correspondence between codons and amino acids. The aim of this article is to explain the primordial formation of codon degeneracy. It remains a puzzle how informative molecules arose from the supposed prebiotic random sequences. If an initial driving force based on the relative stabilities of triplex base pairs is introduced, prebiotic sequence evolution becomes innately nonrandom. On this basis, the primordial assignment of the 64 codons to the 20 amino acids is explained in detail according to base substitutions during the coevolution of tRNAs with aaRSs; the classification of aaRSs is explained as well.
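
To make "codon degeneracy" concrete, the short Python sketch below builds the standard genetic code and counts how many codons map to each amino acid. It only describes the present-day code and says nothing about the primordial model proposed in the article; the variable names are assumptions made up for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 codons to its one-letter amino acid (or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: number of codons assigned to each amino acid.
DEGENERACY = Counter(CODON_TABLE.values())

if __name__ == "__main__":
    for aa, n_codons in sorted(DEGENERACY.items(), key=lambda kv: -kv[1]):
        print(aa, n_codons)
```

The counts show the uneven redundancy the article sets out to explain: leucine, serine and arginine each have six codons, while methionine and tryptophan have one each.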


2021 ◽  
Author(s):  
Samuel King ◽  
Xinyi E. Chen ◽  
Sarah W. S. Ng ◽  
Kimia Rostin ◽  
Tylo Roberts ◽  
...  

Viral vaccines can lose their efficacy as the genomes of targeted viruses rapidly evolve, resulting in new variants that may evade vaccine-induced immunity. This process is apparent in the emergence of new SARS-CoV-2 variants which have the potential to undermine vaccination efforts and cause further outbreaks. Predictive vaccinology points to a future of pandemic preparedness in which vaccines can be developed preemptively based in part on predictive models of viral evolution. Thus, modeling the trajectory of SARS-CoV-2 spike protein evolution could have value for mRNA vaccine development. Traditionally, in silico sequence evolution has been modeled discretely, while there has been limited investigation into continuous models. Here we present the Viral Predictor for mRNA Evolution (VPRE), an open-source software tool which learns from mutational patterns in viral proteins and models their most statistically likely evolutionary trajectories. We trained a variational autoencoder with real-time and simulated SARS-CoV-2 genome data from Australia to encode discrete spike protein sequences into continuous numerical variables. To simulate evolution along a phylogenetic path, we trained a Gaussian process model with the numerical variables to project spike protein evolution up to five months in advance. Our predictions mapped primarily to a sequence that differed by a single amino acid from the most reported spike protein in Australia within the prediction timeframe, indicating the utility of deep learning and continuous latent spaces for modeling viral protein evolution. VPRE can be readily adapted to investigate and predict the evolution of viruses other than SARS-CoV-2 in temporal, geographic, and lineage-specific pathways.


2021 ◽  
Author(s):  
Iulia Darolti ◽  
Pedro Almeida ◽  
Alison E Wright ◽  
Judith E Mank

Studies of sex chromosome systems at early stages of divergence are key to understanding the initial process and underlying causes of recombination suppression. However, identifying signatures of divergence in homomorphic sex chromosomes can be challenging due to high levels of sequence similarity between the X and the Y. Variations in methodological precision and underlying data can make all the difference between detecting subtle divergence patterns or missing them entirely. Recent efforts to test for X-Y sequence differentiation in the guppy have led to contradictory results. Here we apply different analytical methodologies to the same dataset to test for the accuracy of different approaches in identifying patterns of sex chromosome divergence in the guppy. Our comparative analysis reveals that the most substantial source of variation in the results of the different analyses lies in the reference genome used. Analyses using custom-made de novo genome assemblies for the focal species successfully recover a signal of divergence across different methodological approaches. By contrast, using the distantly related Xiphophorus reference genome results in variable patterns, due to both sequence evolution and structural variations on the sex chromosomes between the guppy and Xiphophorus. Changes in mapping and filtering parameters can additionally introduce noise and obscure the signal. Our results illustrate how analytical differences can alter perceived results and we highlight best practices for the study of nascent sex chromosomes.


2021 ◽  
Author(s):  
Ziwei Wang ◽  
Mathieu Rouard ◽  
Manosh Kumar Biswas ◽  
Gaetan Droc ◽  
Dongli Cui ◽  
...  

Background: Ensete glaucum (2n = 2x = 18) is a giant herbaceous monocotyledonous plant in the small Musaceae family along with banana (Musa). A high-quality reference genome sequence of E. glaucum offers a vital genomic resource for functional and evolutionary studies of Ensete, the Musaceae, and more widely in the Zingiberales. Findings: Using a combination of Illumina and Oxford Nanopore Technologies (ONT) sequencing, genome-wide chromosome conformation capture (Hi-C), and RNA survey sequence, we report a high-quality assembly of the 481.5 Mb genome with 9 pseudochromosomes and 36,836 genes (BUSCO 94.7%). A total of 55% of the genome is composed of repetitive sequences with LTR-retroelements (37%) and DNA transposons (7%) predominant. The 5S and 45S rDNA were each present at one locus, and the 5S rDNA had an exceptionally long monomer length of c. 1,056 bp, contrasting with the c. 450 bp monomer at multiple loci in Musa. A tandemly repeated c. 134 bp satellite, 1.1% of the genome (with no similar sequence in Musa), was present around all nine centromeres, with a LINE retroelement also found at Musa centromeres. The assembly, including centromeric positions, enabled us to characterize in detail the chromosomal rearrangements occurring between the x = 9 species and x = 11 species of Musa. Only one chromosome has the same gene content as M. acuminata (ma). Three ma chromosomes represent part of only one E. glaucum (eg) chromosome, while the remaining seven ma chromosomes are fusions of parts of two, three, or four eg chromosomes, demonstrating complex and multiple evolutionary rearrangements in the change between x = 9 and x = 11. Conclusions: The advance towards a Musaceae pangenome including E. glaucum, tolerant of extreme environments, makes a complete set of gene alleles available for crop breeding and understanding environmental responses. 
The chromosome-scale genome assembly shows how chromosome number evolves and reveals features of the rapid evolution of repetitive sequences.

