Pairs of compensatory frameshifting mutations contribute to evolution of protein-coding sequences in vertebrates and insects

2020
Author(s):
Dmitry Biba
Galya Klink
Georgii Bazykin

Abstract: Insertions and deletions of lengths not divisible by 3 in protein-coding sequences cause frameshifts that usually induce premature stop codons and may carry a high fitness cost. However, this cost can be circumvented by a second compensatory indel restoring the reading frame. The role of such compensatory frameshifting mutations (CFMs) in evolution has not been studied systematically. Here, we use whole-genome alignments of protein-coding genes of 100 vertebrate species, and of 122 insect species, studying the prevalence of CFMs in their divergence. After stringent filtering, we detect a total of 11 high-confidence genes carrying pairs of CFMs, including three human genes: RAB36, ARHGAP6 and NCR3LG1. CFMs tended to occur in genes under relaxed negative selection, indicating that they are typically prevented at functionally important genes. In some instances, mutations closely predating or following the CFMs restored the biochemical similarity of the frameshifted segment to the ancestral sequence, possibly reducing or negating the fitness cost of a CFM. Typically, however, the resulting sequence bore no similarity to the ancestral one, indicating that the CFMs can uncover radically novel regions of sequence space. In total, CFMs represent a potentially important and previously overlooked source of novel variation in amino acid sequences.
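The frame arithmetic behind a compensatory pair can be illustrated with a minimal sketch. This is not the authors' detection pipeline. The function names and the convention of encoding an indel as a signed length (positive for an insertion, negative for a deletion) are assumptions made for illustration only.

```python
def net_frame_shift(indel_lengths):
    """Return the net reading-frame offset (0, 1, or 2) of a list of indels.

    Each indel is a signed length (hypothetical convention):
    positive for an insertion, negative for a deletion.
    """
    return sum(indel_lengths) % 3


def is_compensatory_pair(first, second):
    """True if each indel alone shifts the frame but the pair restores it."""
    each_shifts = first % 3 != 0 and second % 3 != 0
    return each_shifts and net_frame_shift([first, second]) == 0


if __name__ == "__main__":
    # A 1-bp insertion followed by a 1-bp deletion restores the frame.
    print(is_compensatory_pair(1, -1))
    # Two 1-bp insertions leave the frame shifted by 2.
    print(is_compensatory_pair(1, 1))
```

Under this convention, a pair such as a 1-bp insertion and a 2-bp insertion also counts as compensatory, because the net change of 3 bp restores the frame while the segment between the two indels is still translated in a shifted frame.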

2019
Author(s):
Yatish Turakhia
Heidi I. Chen
Amir Marcovitz
Gill Bejerano

Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools and protein databases focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (deletion and non-synonymous substitution) as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence protein-coding gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using the hg38 human assembly as a reference, we discovered over 500 unique human genes affected by such high-confidence erosion events in different clades across 58 mammals. While most of these events likely have benign consequences, we also found dozens of clade-specific gene losses that result in early lethality in outgroup mammals or are associated with severe congenital diseases in humans. Our discoveries yield intriguing potential for translational medical genetics and for evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life.


1990
Vol 10 (3)
pp. 1153-1163
Author(s):
D H Lowenstein
D A Butler
D Westaway
M P McKinley
S J DeArmond
...

Given the critical role of the prion protein (PrP) in the transmission and pathogenesis of experimental scrapie, we investigated the PrP gene and its protein products in three hamster species, Chinese (CHa), Armenian (AHa), and Syrian (SHa), each of which was found to have a distinctive scrapie incubation time. Passaging studies demonstrated that the host species, and not the source of scrapie prions, determined the incubation time for each species, and histochemical studies of hamsters with clinical signs of scrapie revealed characteristic patterns of neuropathology. Northern (RNA) analysis showed the size of PrP mRNA from CHa, AHa, and SHa hamsters to be 2.5, 2.4, and 2.1 kilobases, respectively. Immunoblotting demonstrated that the PrP isoforms were of similar size (33 to 35 kilodaltons); however, the monoclonal antibody 13A5 raised against SHa PrP did not react with the CHa or AHa PrP molecules. Comparison of the three predicted amino acid sequences revealed that each is distinct. Furthermore, differences within the PrP open reading frame that uniquely distinguish the three hamster species are within a hydrophilic segment of 11 amino acids that includes polymorphisms linked to scrapie incubation times in inbred mice and an inherited prion disease of humans. Single polymorphisms in this region correlate with the presence or absence of amyloid plaques for a given hamster species or mouse inbred strain. Our findings demonstrate distinctive molecular, pathological, and clinical characteristics of scrapie in three related species and are consistent with the hypothesis that molecular properties of the host PrP play a pivotal role in determining the incubation time and neuropathological features of scrapie.


2001
Vol 47 (3)
pp. 269-275
Author(s):
Chien-Yuan Chen
Wen-Tung Wu
Chang-Jen Huang
Mei-Huei Lin
Chen-Kai Chang
...

A segment of DNA containing the L-glutamate oxidase (gox) gene from Streptomyces platensis NTU3304 was cloned. The entire nucleotide sequence of the protein-coding portion consisting of 2130 bp (710 codons, including AUG and UGA) of the cloned DNA fragment was determined. The gox gene contained only one open reading frame (ORF), which coded for a 78-kDa polypeptide, the precursor of active extracellular Gox. Mature Gox is composed of three subunits, designated as α, β, and γ, with molecular masses of 39, 19, and 16 kDa, respectively. Analyses of the N-terminal amino acid sequences of the subunits revealed that the order of subunits in the precursor polypeptide encoded by the ORF, from N-terminus to C-terminus, is α–γ–β. The presence of the flavin adenine dinucleotide (FAD)-binding motif places Gox as a member of the flavoenzyme family. Furthermore, a negative effect of glucose on the biosynthesis of Gox was observed when it was used as a carbon source.

Key words: L-glutamate oxidase, gox gene, signal peptide, DNA sequence, flavoenzyme, pIJ702 vector.


1997
Vol 17 (3)
pp. 1666-1673
Author(s):
R Bishop
A Musoke
S Morzaria
B Sohanpal
E Gobright

Concerted evolution of multicopy gene families in vertebrates is recognized as an important force in the generation of biological novelty but has not been documented for the multicopy genes of protozoa. A multicopy locus, Tpr, which consists of tandemly arrayed open reading frames (ORFs) containing several repeated elements has been described for Theileria parva. Herein we show that probes derived from the 5'/N-terminal ends of ORFs in the genomic DNAs of T. parva Uganda (1,108 codons) and Boleni (699 codons) hybridized with multicopy sequences in homologous DNA but did not detect similar sequences in the DNA of 14 heterologous T. parva stocks and clones. The probe sequences were, however, protein coding according to predictive algorithms and codon usage. The 3'/C-terminal ends of the Uganda and Boleni ORFs exhibited 75% similarity and identity, respectively, to the previously identified Tpr1 and Tpr2 repetitive elements of T. parva Muguga. Tpr1-homologous sequences were detected in two additional species of Theileria. Eight different Tpr1-homologous transcripts were present in piroplasm mRNA from a single T. parva Muguga-infected animal. The Tpr1 and Tpr2 amino acid sequences contained six predicted membrane-associated segments. The ratio of synonymous to nonsynonymous substitutions indicates that Tpr1 evolves like protein-encoding DNA. The previously determined nucleotide sequence of the gene encoding the p67 antigen is completely identical in T. parva Muguga, Boleni, and Uganda, including the third base in codons. The data suggest that concerted evolution can lead to the radical divergence of coding sequences and that this can be a mechanism for the generation of novel genes.


2020
Author(s):
László Bányai
Mária Trexler
Krisztina Kerekes
Orsolya Csuka
László Patthy

Abstract: A major goal of cancer genomics is to identify all genes that play critical roles in carcinogenesis. Most approaches focused on genes that are positively selected for mutations that drive carcinogenesis and neglected the role of negative selection. Some studies have actually concluded that negative selection has no role in cancer evolution. In the present work we have re-examined the role of negative selection in tumor evolution through the analysis of the patterns of somatic mutations affecting the coding sequences of human genes. Our analyses have confirmed that tumor suppressor genes are positively selected for inactivating mutations. Oncogenes, however, were found to display signals of both negative selection for inactivating mutations and positive selection for activating mutations. Significantly, we have identified numerous human genes that show signs of strong negative selection during tumor evolution, suggesting that their functional integrity is essential for the growth and survival of tumor cells.


2001
Vol 5 (3)
pp. 113-118
Author(s):
ANNELOOR L. M. A. TEN ASBROEK
JEFFREY OLSEN
DAVID HOUSMAN
FRANK BAAS
VINCE STANTON

The frequency and distribution of genetic polymorphism in the human genome is a question of major importance. We have studied this in highly conserved genes, which encode crucial functions such as DNA replication, mRNA transcription, and translation. Evolutionary comparisons suggest that these genes are under particularly strong selective pressure, and their frequency of nucleotide sequence polymorphism would be expected to represent a minimum estimate for sequence variation throughout the genome. We have analyzed the complete coding sequence and the 3′-untranslated region (3′-UTR) of 22 human genes, most of which have homologs in all cellular organisms and all of which are at least 25% amino acid identical to homologs in yeast. Comparisons with similar studies of less conserved human disease genes indicate that 1) evolutionarily conserved genes are, on average, less polymorphic than disease related genes; 2) the difference in polymorphism levels is attributable almost entirely to reduced levels of variation in protein coding sequences, whereas noncoding sequences have similar levels of polymorphism; and 3) the character of polymorphism, in terms of the spectrum and frequency of mutational changes, is similar.


2011
Vol 29 (3)
pp. 883-886
Author(s):
M. Toll-Riera
N. Rado-Trilla
F. Martys
M. M. Alba
