The Evolution of Isochores: Evidence From SNP Frequency Distributions

Genetics ◽  
2002 ◽  
Vol 162 (4) ◽  
pp. 1805-1810 ◽  
Author(s):  
Martin J Lercher ◽  
Nick G C Smith ◽  
Adam Eyre-Walker ◽  
Laurence D Hurst

Abstract The large-scale systematic variation in nucleotide composition along mammalian and avian genomes has been a focus of the debate between neutralist and selectionist views of molecular evolution. Here we test whether the compositional variation is due to mutation bias using two new tests, which do not assume compositional equilibrium. In the first test we assume a standard population genetics model, but in the second we make no assumptions about the underlying population genetics. We apply the tests to single-nucleotide polymorphism data from noncoding regions of the human genome. Both models of neutral mutation bias fit the frequency distributions of SNPs segregating in low- and medium-GC-content regions of the genome adequately, although both suggest compositional nonequilibrium. However, neither model fits the frequency distribution of SNPs from the high-GC-content regions. In contrast, a simple population genetics model that incorporates selection or biased gene conversion cannot be rejected. The results suggest that mutation biases are not solely responsible for the compositional biases found in noncoding regions.

Genome ◽  
2011 ◽  
Vol 54 (8) ◽  
pp. 663-673 ◽  
Author(s):  
Dai-Yong Kuang ◽  
Hong Wu ◽  
Ya-Ling Wang ◽  
Lian-Ming Gao ◽  
Shou-Zhou Zhang ◽  
...  

Here, we report a completely sequenced plastome using Illumina/Solexa sequencing-by-synthesis (SBS) technology. The plastome of Magnolia kwangsiensis Figlar & Noot. is 159 667 bp in length with a typical quadripartite structure: 88 030 bp large single-copy (LSC) and 18 669 bp small single-copy (SSC) regions, separated by two 26 484 bp inverted repeat (IR) regions. The overall predicted gene number is 129, among which 17 genes are duplicated in IR regions. The plastome of M. kwangsiensis is identical in its gene order to previously published plastomes of magnoliids. Furthermore, the C-to-U type RNA editing frequency of 114 seed plants is positively correlated with plastome GC content and plastome length, whereas plastome length is not correlated with GC content. A total of 16 potential markers for barcoding or low taxonomic level phylogenetic studies in Magnoliaceae were detected by comparing the coding and noncoding regions of the plastome of M. kwangsiensis with that of Liriodendron tulipifera L. At least eight markers might be applied not only to Magnoliaceae but also to other taxa. The 86 mononucleotide cpSSRs that are distributed in single-copy noncoding regions are highly valuable for studying the population genetics and conservation genetics of this endangered rare species.
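The region sizes quoted above can be checked against the reported total with a short arithmetic sketch. The function name `plastome_length` is a hypothetical helper introduced here for illustration; it only assumes that a quadripartite plastome consists of one LSC, one SSC, and two IR copies.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the abstract for M. kwangsiensis.
total = plastome_length(lsc=88_030, ssc=18_669, ir=26_484)
print(total)  # 159667, matching the reported plastome length
```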


2017 ◽  
Author(s):  
Prashant Mainali ◽  
Sobita Pathak

ABSTRACT Codon usage bias is the preferential use of a subset of synonymous codons during translation. In this paper, the comparisons of normalized entropy and GC content between the sequence of coding regions of Escherichia coli k12 and noncoding regions (ncRNA, rRNA) of various organisms were done to shed light on the origin of the codon usage bias. The normalized entropy of the coding regions was found to be significantly higher than that of the noncoding regions, suggesting a role of the translation process in shaping codon usage bias. Further, when the position-specific GC content of both coding and noncoding regions was analyzed, the GC2 content in coding regions was lower than GC1 and GC3, while in noncoding regions the GC1, GC2, and GC3 contents were approximately equal. This discrepancy is explained by the biased mutation coupled with the presence and absence of selection pressure. The accumulation of GC content occurs in the sequences due to mutation bias in DNA repair and recombination process. In noncoding regions, the mutation is harmful and thus, selected against while due to the degeneracy of codons in coding regions, a mutation in GC3 is neutral and hence, not selected. Thus, the accumulation of GC content occurs in coding regions, and thus codon usage bias occurs.
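The position-specific GC content compared in this abstract can be illustrated with a minimal Python sketch. The function name `gc_by_codon_position` and the handling of trailing partial codons are assumptions made here for illustration; this is not the authors' code, and it assumes the input is already in reading frame.

```python
def gc_by_codon_position(seq: str) -> tuple:
    """Return (GC1, GC2, GC3) as fractions for an in-frame sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    fractions = []
    for pos in range(3):
        # Every third base, starting at codon position 1, 2, or 3.
        bases = seq[pos:usable:3]
        gc_count = sum(base in "GC" for base in bases)
        fractions.append(gc_count / len(bases))
    return tuple(fractions)


# Toy example: codons ATG, GCC, AAA.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAA")
print(round(gc1, 3), round(gc2, 3), round(gc3, 3))
```

For a noncoding sequence there is no true reading frame, so the three values are expected to be similar whichever offset is used, which is the contrast the abstract draws.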


2006 ◽  
Vol 11 (3) ◽  
pp. 236-246 ◽  
Author(s):  
Laurence H. Lamarcq ◽  
Bradley J. Scherer ◽  
Michael L. Phelan ◽  
Nikolai N. Kalnine ◽  
Yen H. Nguyen ◽  
...  

A method for high-throughput cloning and analysis of short hairpin RNAs (shRNAs) is described. Using this approach, 464 shRNAs against 116 different genes were screened for knockdown efficacy, enabling rapid identification of effective shRNAs against 74 genes. Statistical analysis of the effects of various criteria on the activity of the shRNAs confirmed that some of the rules thought to govern small interfering RNA (siRNA) activity also apply to shRNAs. These include moderate GC content, absence of internal hairpins, and asymmetric thermal stability. However, the authors did not find strong support for position-specific rules. In addition, analysis of the data suggests that not all genes are equally susceptible to RNA interference (RNAi).


Author(s):  
Andreina I Castillo ◽  
Rodrigo P P Almeida

Abstract Nucleotide composition (GC content) varies across bacteria species, genome regions, and specific genes. In Xylella fastidiosa, a vector-borne fastidious plant pathogen infecting multiple crops, GC content ranges between ∼51-52%; however, these values were gathered using limited genomic data. We evaluated GC content variations across X. fastidiosa subspecies fastidiosa (N = 194), subsp. pauca (N = 107), and subsp. multiplex (N = 39). Genomes were classified based on plant host and geographic origin; individual genes within each genome were classified based on gene function, strand, length, ortholog group, Core vs. Accessory, and Recombinant vs. Non-recombinant. GC content was calculated for each gene within each evaluated genome. The effects of genome and gene level variables were evaluated with a mixed effect ANOVA, and the marginal-GC content was calculated for each gene. Also, the correlation between gene-specific GC content vs. natural selection (dN/dS) and recombination/mutation (r/m) was estimated. Our analyses show that intra-genomic changes in nucleotide composition in X. fastidiosa are small and influenced by multiple variables. Higher AT-richness is observed in genes involved in replication and translation, and genes in the leading strand. In addition, we observed a negative correlation between high-AT and dN/dS in subsp. pauca. The relationship between recombination and GC content varied between core and accessory genes. We hypothesize that distinct evolutionary forces and energetic constraints both drive and limit these small variations in nucleotide composition.


2006 ◽  
Vol 04 (03) ◽  
pp. 639-647 ◽  
Author(s):  
ELEAZAR ESKIN ◽  
RODED SHARAN ◽  
ERAN HALPERIN

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical length between neighboring single nucleotide polymorphisms. The approach is very efficient, has been applied to several large-scale datasets, and is shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .


2021 ◽  
Author(s):  
Neetu Tyagi ◽  
Rahila Sardar ◽  
Dinesh Gupta

Abstract The Coronavirus disease 2019 (COVID-19) outbreak caused by Severe Acute Respiratory Syndrome Coronavirus 2 virus (SARS-CoV-2) poses a worldwide human health crisis, causing respiratory illness with a high mortality rate. To investigate the factors governing codon usage bias in respiratory viruses, a codon usage bias (CUBs) analysis was performed on all the respiratory viruses, including SARS-CoV-2 isolates from different geographical locations (~62K) and two recently emerging strains: VUI202012/01 from the United Kingdom (UK) and 501.Y.V2 from South Africa (SA). The analysis includes RSCU analysis, GC content calculation, ENC analysis, dinucleotide frequency and neutrality plot analysis. We were motivated to conduct the study to fulfil two primary aims: first, to identify the difference in codon usage bias amongst all SARS-CoV-2 genomes and, secondly, to compare their CUBs properties with other respiratory viruses. A biased nucleotide composition was found as most of the highly preferred codons were A/U-ending in all the respiratory viruses studied here. Compared with the human host, the RSCU analysis led to the identification of 11 over-represented codons and 9 under-represented codons in SARS-CoV-2 genomes. Correlation analysis of ENC and GC3s revealed that mutational pressure is the leading force determining the CUBs. The present study results yield a better understanding of codon usage preferences for SARS-CoV-2 genomes and discover the possible evolutionary determinants responsible for the biases found among the respiratory viruses, thus unveiling a unique feature of the SARS-CoV-2 evolution and adaptation. To the best of our knowledge, this is the first attempt at comparative CUBs analysis on the worldwide genomes of SARS-CoV-2, including novel emerged strains and other respiratory viruses.
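The RSCU (relative synonymous codon usage) analysis mentioned above can be sketched as follows. RSCU is commonly defined as the observed count of a codon divided by the mean count across all synonymous codons for the same amino acid. The function name `rscu`, the use of the standard genetic code, and the exclusion of stop codons and single-codon amino acids are assumptions made here for illustration; the abstract does not specify the authors' implementation.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(seq: str) -> dict:
    """Return RSCU per codon for an in-frame coding sequence (DNA or RNA)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stops, single-codon amino acids (Met, Trp), and unseen families.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        expected = total / len(family)  # count under uniform synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Toy example: four alanine codons (GCT, GCT, GCC, GCA).
result = rscu("GCTGCTGCCGCA")
print(result["GCT"], result["GCC"], result["GCA"], result["GCG"])
```

A value above 1 indicates a codon used more often than expected under uniform synonymous usage, and a value below 1 indicates the opposite.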


2019 ◽  
Vol 14 (6) ◽  
pp. 711-717 ◽  
Author(s):  
Gustavo Monnerat ◽  
Alex S. Maior ◽  
Marcio Tannure ◽  
Lia K.F.C. Back ◽  
Caleb G.M. Santos

Purpose: Soccer is one of the most popular sports worldwide, a physical activity of great physiological demand and complexity. Currently, numerous trials involving physiological responses such as hypertrophy, energy expenditure, vasodilation, cardiac output, VO2max, and recovery have supported the possibility of genomic predictors affecting performance. In a complementary way to association studies with single nucleotide polymorphisms (SNPs), the objective was to evaluate if the use of population genetics data from human-genomics databases can provide information for a better understanding of the relationship between heritability and sport performance. Methods: The study included 25 healthy male professional soccer players (25.5 [4.3] y, 177.4 [6.4] cm, 76.4 [6.4] kg, body fat 10.5% [4.3%]) from the Brazilian first-division soccer club. Anthropometric measurements and field and isokinetic tests were performed to evaluate performance and physiologic parameters of subjects. Moreover, 10 genetic polymorphisms previously related to performance were genotyped. The genotypes of the same polymorphisms were obtained for 2504 individuals from the populations deposited in the 1000 Genomes database. A principal-component analysis and matrix genetic-distances approach (Fst) were evaluated. Results: As expected, the admixed Brazilian population has numerous genetic similarities with the European and American populations from genomic databases. Although the African component is absolutely recognized in genomes from the Brazilian population, using the specific performance-related SNPs, surprisingly the African population was one of the most genetically distant from the players (P < .00001). Conclusions: The early results suggest a selective pressure on genes of elite soccer players, possibly related simultaneously to physical-performance, environmental, cognitive, and sociocultural aspects.


2021 ◽  
Author(s):  
Alexander L Cope ◽  
Premal Shah

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely-related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.


Author(s):  
Sankar Subramanian

The worldwide outbreak of a novel coronavirus, SARS-CoV-2, has caused a pandemic of respiratory disease. Due to this emergency, researchers around the globe have been investigating the evolution of the genome of SARS-CoV-2 in order to design vaccines. Here I examined the evolution of GC content of SARS-CoV-2 by comparing the genomes of the members of the group Betacoronavirus. The results of this investigation revealed a highly significant positive correlation between the GC contents of betacoronaviruses and their divergence from SARS-CoV-2. The betacoronaviruses that are distantly related to SARS-CoV-2 have much higher GC contents than the latter. Conversely, the closely related ones have low GC contents, which are only slightly higher than that of SARS-CoV-2. This suggests a systematic reduction in the GC content in the SARS-CoV-2 lineage over time. The declining trend in this lineage predicts a much-reduced GC content in the coronaviruses that will descend/evolve from SARS-CoV-2 in the future. Due to the three consecutive outbreaks (MERS-CoV, SARS-CoV and SARS-CoV-2) caused by the members of the SARS-CoV-2, the scientific community is emphasizing the need for universal vaccines that are effective across many strains, including those that will inevitably emerge in the near future. The reduction in GC contents implies a higher rate of GC→AT mutations than of mutational changes in the reverse direction. Therefore, understanding the evolution of base composition and mutational patterns of SARS-CoV-2 could be useful in designing broad-spectrum vaccines that could identify and neutralize the present and future strains of this virus.

