Recurrent Sequence Evolution After Independent Gene Duplication

2020 ◽  
Author(s):  
Samuel Hermann Alexander Von Der Dunk ◽  
Berend Snel

Abstract Background: Convergent and parallel evolution provide unique insights into the mechanisms of natural selection. Some of the most striking convergent and parallel (collectively recurrent) amino acid substitutions in proteins are adaptive, but there are also many that are selectively neutral. Accordingly, genome-wide assessment has shown that recurrent sequence evolution in orthologs is chiefly explained by nearly neutral evolution. For paralogs, more frequent functional change is expected because additional copies are generally not retained if they do not acquire their own niche. Yet, it is unknown to what extent recurrent sequence differentiation is discernible after independent gene duplications in different eukaryotic taxa. Results: We develop a framework that detects patterns of recurrent sequence evolution in duplicated genes. This is used to analyze the genomes of 90 diverse eukaryotes. We find a remarkable number of families with a potentially predictable functional differentiation following gene duplication. In some protein families, more than ten independent duplications show a similar sequence-level differentiation between paralogs. Based on further analysis, the sequence divergence is found to be generally asymmetric. Moreover, about 6% of the recurrent sequence evolution between paralog pairs can be attributed to recurrent differentiation of subcellular localization. Finally, we reveal the specific recurrent patterns for the gene families Hint1/Hint2, Sco1/Sco2 and vma11/vma3. Conclusions: The presented methodology provides a means to study the biochemical underpinning of functional differentiation between paralogs. For instance, two abundantly repeated substitutions are identified between independently derived Sco1 and Sco2 paralogs. Such identified substitutions allow direct experimental testing of the biological role of these residues for the repeated functional differentiation. We also uncover a diverse set of families with recurrent sequence evolution and reveal trends in the functional and evolutionary trajectories of this hitherto understudied phenomenon.

2020 ◽  
Author(s):  
Samuel Hermann Alexander Von Der Dunk ◽  
Berend Snel

Abstract Background: Convergent and parallel evolution provide unique insights into the mechanisms of natural selection. Some of the most striking convergent and parallel (collectively recurrent) amino acid substitutions in proteins are adaptive, but there are also many that are selectively neutral. Genome-wide assessment of recurrent substitutions has only been performed for orthologs. These studies have revealed that the pervasiveness of recurrent substitutions is for a large part explained by purifying selection. At any position in a protein, only a subset of amino acids is allowed, increasing the chance of the same substitution happening in different lineages. Results: We developed a framework that detects patterns of recurrent differentiation in paralogs across 90 divergent eukaryotic genomes. A skew in recurrent substitutions serves as a proxy for a recurrent trend in function. We find remarkable examples of recurrent sequence evolution after independent duplication, in some cases involving more than ten different lineages where duplicates show a similar differentiation. We reveal the implicated functional patterns for the gene families Hint1/Hint2, Sco1/Sco2 and vma11/vma3. Conclusions: The presented methodology provides a means to study the biochemical underpinning of functional differentiation between paralogs. For instance, two abundantly repeated substitutions are identified between independently derived Sco1 and Sco2 paralogs. Such identified substitutions allow direct experimental testing of the biological role of these residues for the repeated functional differentiation. The present study uncovers a diverse set of families with recurrent sequence evolution and reveals trends in the functional and evolutionary trajectories of this hitherto understudied phenomenon.


2021 ◽  
Author(s):  
Stefano Pascarelli ◽  
Paola Laurino

Connecting protein sequence to function is becoming increasingly relevant since high-throughput sequencing studies accumulate large amounts of genomic data. Protein database annotation helps to bridge this gap; however, it is fundamental to understand the mechanisms underlying functional inheritance and divergence. If the homology relationship between proteins is known, can we determine whether the function diverged? In this work, we analyze different possibilities of protein sequence evolution after gene duplication and identify "residue inversions", i.e., sites where the relationship between the ancestry and the functional signal is decoupled. Residues in these sites play a role in functional divergence and could indicate a shift in protein function. We develop a method to recognize residue inversions in a phylogeny and test it on real and simulated datasets. In a dataset built from the Epidermal Growth Factor Receptor (EGFR) sequences found in 88 fish species, we identify 19 positions that went through inversion after gene duplication, mostly located at the ligand-binding extracellular domain.


Genetics ◽  
2000 ◽  
Vol 154 (4) ◽  
pp. 1711-1720 ◽  
Author(s):  
Bryant F McAllister ◽  
Gilean A T McVean

Abstract The amino acid sequence of the transformer (tra) gene exhibits an extremely rapid rate of evolution among Drosophila species, although the gene performs a critical step in sex determination. These changes in amino acid sequence are the result of either natural selection or neutral evolution. To differentiate between selective and neutral causes of this evolutionary change, analyses of both intraspecific and interspecific patterns of molecular evolution of tra gene sequences are presented. Sequences of 31 tra alleles were obtained from Drosophila americana. Many replacement and silent nucleotide variants are present among the alleles; however, the distribution of this sequence variation is consistent with neutral evolution. Sequence evolution was also examined among six species representative of the genus Drosophila. For most lineages and most regions of the gene, both silent and replacement substitutions have accumulated in a constant, clock-like manner. In exon 3 of D. virilis and D. americana we find evidence for an elevated rate of nonsynonymous substitution, but no statistical support for a greater rate of nonsynonymous relative to synonymous substitutions. Both levels of analysis of the tra sequence suggest that, although the gene is evolving at a rapid pace, these changes are neutral in function.


2020 ◽  
Vol 117 (11) ◽  
pp. 5873-5882 ◽  
Author(s):  
Jose Alberto de la Paz ◽  
Charisse M. Nartey ◽  
Monisha Yuvaraj ◽  
Faruck Morcos

We introduce a model of amino acid sequence evolution that accounts for the statistical behavior of real sequences induced by epistatic interactions. We base the model dynamics on parameters derived from multiple sequence alignments analyzed by using direct coupling analysis methodology. Known statistical properties such as overdispersion, heterotachy, and gamma-distributed rate-across-sites are shown to be emergent properties of this model while being consistent with neutral evolution theory, thereby unifying observations from previously disjointed evolutionary models of sequences. The relationship between site restriction and heterotachy is characterized by tracking the effective alphabet dynamics of sites. We also observe an evolutionary Stokes shift in the fitness of sequences that have undergone evolution under our simulation. By analyzing the structural information of some proteins, we corroborate that the strongest Stokes shifts derive from sites that physically interact in networks near biochemically important regions. Perspectives on the implementation of our model in the context of the molecular clock are discussed.


2018 ◽  
Vol 115 (33) ◽  
pp. 8364-8369 ◽  
Author(s):  
Edward Tunnacliffe ◽  
Adam M. Corrigan ◽  
Jonathan R. Chubb

During the evolution of gene families, functional diversification of proteins often follows gene duplication. However, many gene families expand while preserving protein sequence. Why do cells maintain multiple copies of the same gene? Here we have addressed this question for an actin family with 17 genes encoding an identical protein. The genes have divergent flanking regions and are scattered throughout the genome. Surprisingly, almost the entire family showed similar developmental expression profiles, with their expression also strongly coupled in single cells. Using live cell imaging, we show that differences in gene expression were apparent over shorter timescales, with family members displaying different transcriptional bursting dynamics. Strong “bursty” behaviors contrasted with steady, more continuous activity, indicating different regulatory inputs to individual actin genes. To determine the sources of these different dynamic behaviors, we reciprocally exchanged the upstream regulatory regions of gene family members. This revealed that dynamic transcriptional behavior is directly instructed by upstream sequence, rather than features specific to genomic context. A residual minor contribution of genomic context modulates the gene OFF rate. Our data suggest promoter diversification following gene duplication could expand the range of stimuli that regulate the expression of essential genes. These observations contextualize the significance of transcriptional bursting.

