The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes

PeerJ, 2020, Vol 8, pp. e9554
Author(s): Patrick Evans, Nancy J. Cox, Eric R. Gamazon

The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann–Whitney U p = 1.4 × 10⁻⁴). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10⁻²⁸⁴) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.
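The cortex versus subcortical comparison above rests on a Mann–Whitney U test. As a rough illustration of what such a test computes, here is a minimal pure-Python sketch using midranks, a tie-corrected normal approximation, and no continuity correction. The function names and the toy values are hypothetical; they are not the GTEx data or the authors' pipeline.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value from a normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: subtract sum(t^3 - t) / (n * (n - 1)) over tied groups.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0  # all values identical, so no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Toy evolutionary-rate values, invented purely for illustration.
    cortex_rates = [0.05, 0.08, 0.10, 0.12]
    subcortical_rates = [0.15, 0.20, 0.22, 0.30]
    u, p = mann_whitney_u(cortex_rates, subcortical_rates)
    print(f"U = {u}, two-sided p = {p:.4f}")
```

The normal approximation is only reasonable for moderately large samples; for small groups an exact test from a statistics library would be the safer choice.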

2011, Vol 11 (1), pp. 361
Author(s): Johan A Grahnen, Priyanka Nandakumar, Jan Kubelka, David A Liberles

2009, Vol 37 (4), pp. 783-786
Author(s): Romain A. Studer, Marc Robinson-Rechavi

The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.


2019
Author(s): Daniel S. Carvalho, Sunil Kumar Kenchanmane Raju, Yang Zhang, James C. Schnable

The grass tribe Paniceae includes a monophyletic subclade of species, the MPC clade, which specialize in each of the three primary C4 sub-pathways NADP-ME, NAD-ME and PCK. The evolutionary history of C4 photosynthesis in this subclade remains ambiguous. Leveraging newly sequenced grass genomes and syntenic orthology data, we estimated rates of protein sequence evolution on ancestral branches for both core enzymes shared across different C4 sub-pathways and enzymes specific to C4 sub-pathways. While core enzymes show elevated rates of protein sequence evolution in ancestral branches consistent with a transition from C3 to C4 photosynthesis in the ancestor for this clade, no subtype-specific enzymes showed similar patterns. At least one protein involved in photorespiration also showed elevated rates of protein sequence evolution in the ancestral branch. The set of core C4 enzymes examined here combined with the photorespiratory pathway are necessary for the C2 photosynthetic cycle, a previously proposed intermediate between C3 and C4 photosynthesis. The patterns reported here are consistent with, but not conclusive proof of, the hypothesis that C4 photosynthesis in the MPC clade of the Paniceae evolved via a C2 intermediate.


2018
Author(s): Jeffrey I. Boucher, Troy W. Whitfield, Ann Dauphin, Gily Nachum, Carl Hollins, ...

The evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impacts of these factors on protein sequence evolution are interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available dataset of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple base mutations in HIV-1 protease is strongly influenced by mutational sampling.
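The distinction between amino acid changes reachable by a single nucleotide mutation and those requiring several can be made concrete with the standard genetic code. The sketch below is an illustration only: the helper names are hypothetical and the ATG example codon is arbitrary rather than taken from the HIV-1 protease sequence. It counts the minimum number of nucleotide substitutions needed to turn a given codon into any codon for a target amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def min_mutations(codon, target_aa):
    """Fewest nucleotide substitutions turning `codon` into a codon for `target_aa`."""
    return min(
        hamming(codon, other)
        for other, aa in CODON_TABLE.items()
        if aa == target_aa
    )


def single_mutation_neighbors(codon):
    """Amino acids (stops excluded) reachable from `codon` by exactly one substitution."""
    own = CODON_TABLE[codon]
    return {
        aa
        for other, aa in CODON_TABLE.items()
        if hamming(codon, other) == 1 and aa not in (own, "*")
    }


if __name__ == "__main__":
    ancestor = "ATG"  # arbitrary example codon (methionine)
    for target in "IWY":
        print(target, min_mutations(ancestor, target))
    print(sorted(single_mutation_neighbors(ancestor)))
```

This counts substitutions only and treats every nucleotide change as equally likely, whereas the abstract notes that mutational probabilities differ between nucleotides; a weighted version would need a mutation-rate matrix.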

