Accounting for Background Nucleotide Composition When Measuring Codon Usage Bias

2002 ◽  
Vol 19 (8) ◽  
pp. 1390-1394 ◽  
Author(s):  
John A. Novembre
2021 ◽  
Author(s):  
Neetu Tyagi ◽  
Rahila Sardar ◽  
Dinesh Gupta

The coronavirus disease 2019 (COVID-19) outbreak, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), poses a worldwide human health crisis, causing respiratory illness with a high mortality rate. To investigate the factors governing codon usage bias (CUB) in respiratory viruses, a CUB analysis was performed on all the respiratory viruses, including SARS-CoV-2 isolates from different geographical locations (~62K) and two recently emerged strains: VUI202012/01 from the United Kingdom (UK) and 501.Y.V2 from South Africa (SA). The analysis includes RSCU analysis, GC content calculation, ENC analysis, dinucleotide frequency analysis and neutrality plot analysis. The study had two primary aims: first, to identify differences in codon usage bias among all SARS-CoV-2 genomes and, second, to compare their CUB properties with those of other respiratory viruses. A biased nucleotide composition was found, as most of the highly preferred codons were A/U-ending in all the respiratory viruses studied here. Compared with the human host, the RSCU analysis identified 11 over-represented codons and 9 under-represented codons in SARS-CoV-2 genomes. Correlation analysis of ENC and GC3s revealed that mutational pressure is the leading force determining the CUB. The results of the present study yield a better understanding of codon usage preferences in SARS-CoV-2 genomes and reveal the possible evolutionary determinants responsible for the biases found among the respiratory viruses, thus unveiling a unique feature of SARS-CoV-2 evolution and adaptation. To the best of our knowledge, this is the first attempt at a comparative CUB analysis of the worldwide genomes of SARS-CoV-2, including newly emerged strains, and other respiratory viruses.
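
For readers unfamiliar with the RSCU measure mentioned in this abstract, the following is a minimal sketch, not the authors' pipeline. It assumes the standard genetic code and the usual definition of relative synonymous codon usage: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The names `CODON_TABLE` and `rscu` are illustrative choices.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard table applies; '*' marks stop codons).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs in seq."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                           # amino acid absent from seq
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            result[c] = counts[c] * len(codons) / total
    return result


values = rscu("GCTGCTGCTGCC")                  # four alanine codons
print(values["GCT"], values["GCC"], values["GCA"])
```

An RSCU of 1 means a codon is used exactly as often as expected under uniform synonymous usage; values above 1 indicate over-use and values below 1 under-use.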


Viruses ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1087 ◽  
Author(s):  
Sheng-Lin Shi ◽  
Run-Xi Xia

All iflaviruses belong to a single genus, Iflavirus, in the family Iflaviridae. The host taxa and sequence identities of these viruses are diverse. Codon usage bias, maintained by a balance between selection, mutation, and genetic drift, exists in a wide variety of organisms. We characterized the codon usage patterns of 44 iflavirus genomes isolated from the classes Insecta, Arachnida, Mammalia, and Malacostraca. Iflaviruses lack a strong codon usage bias when evaluated using the effective number of codons. The odds ratios of the majority of dinucleotides are within the normal range. However, the dinucleotides at the 1st–2nd codon positions are more biased than those at the 2nd–3rd codon positions. Plots of the effective number of codons, relative neutrality analysis, and PR2 bias analysis all indicate that selection pressure dominates over mutation in shaping codon usage patterns in the family Iflaviridae. When these viruses were grouped by host taxon, the indices, including nucleotide composition, effective number of codons, relative synonymous codon usage, and the factors influencing the codon usage patterns, all showed no significant differences between the six host-taxa groups. Our results contradict our assumption that diverse viruses should possess diverse codon usage patterns, suggesting that nucleotide composition and codon usage in the family Iflaviridae are not host taxa-specific signatures.


Genetica ◽  
2017 ◽  
Vol 145 (3) ◽  
pp. 295-305 ◽  
Author(s):  
Monisha Nath Choudhury ◽  
Arif Uddin ◽  
Supriyo Chakraborty

2021 ◽  
Author(s):  
Zhihua Ou ◽  
Wei Liu ◽  
Junhua LI ◽  
Hongli Du

Human papillomavirus type 16 (HPV16) is the most prevalent HPV type causing cervical cancers. Herein, using 1,597 full genomes of HPV16, we systematically investigated the mutation profiles, surface protein glycosylation sites and codon usage bias of the eight open reading frames (ORFs) of HPV16 genomes from different lineages and sublineages. Multiple lineage- or sublineage-specific mutation sites were identified. Glycosylation analysis showed that HPV16 lineage D contained the highest number of unique potential glycosylation sites in both the L1 and L2 capsid proteins, which might account for its antigenic distance from other HPV16 lineages. The nucleotide composition of HPV16 showed that the overall AT content was higher than the GC content at the 3rd codon position. Relatively high ENC values suggested that the HPV16 ORFs did not have strong codon usage bias. Most of the HPV16 ORFs, except for L2, were mainly governed by natural selection pressure such as translational pressure. HPV16 shared only some of its preferred codons with humans, which might help reduce competition for translational resources. These findings may help increase our understanding of the heterogeneity between HPV16 lineages and sublineages, and of the adaptation mechanism of HPV in human cells, which might facilitate HPV classification and improve vaccine development and application.


Andrologia ◽  
2017 ◽  
Vol 50 (1) ◽  
pp. e12787 ◽  
Author(s):  
M. N. Choudhury ◽  
A. Uddin ◽  
S. Chakraborty

Author(s):  
Yicong Li ◽  
Rui Wang ◽  
Huihui Wang ◽  
Feiyang Pu ◽  
Xili Feng ◽  
...  

Synonymous codon usage bias is a universal characteristic of genomes across various organisms. Autophagy-related gene 13 (atg13) is an essential gene for autophagy initiation, yet the evolutionary trends of the atg13 gene in nucleotide and synonymous codon usage remain unexplored. Phylogenetic analyses of the atg13 gene of 226 eukaryotic organisms at the nucleotide and amino acid levels show that nucleotide usage carries more genetic information than amino acid usage. Specifically, the overall nucleotide usage bias, quantified by information entropy, showed that the usage biases at the first and second codon positions were stronger than those at the third position of the atg13 genes. Furthermore, the usage bias of nucleotide ‘G’ is the highest, while that of nucleotide ‘C’ is the lowest in the atg13 genes. In addition, the genetic features represented by synonymous codon usage exhibit, to some extent, a species-specific pattern in the evolution of the atg13 genes. Interestingly, the codon usages of atg13 genes in the ancestor animals (Latimeria chalumnae, Petromyzon marinus, and Rhinatrema bivittatum) are strongly influenced by mutation pressure from nucleotide composition constraint. However, the distributions of nucleotide composition at different codon positions in the atg13 gene indicate that natural selection still dominates atg13 codon usage during the evolution of organisms.
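
The abstract above quantifies nucleotide usage bias at each codon position with information entropy but does not state the exact formulation. The sketch below assumes plain Shannon entropy over the four nucleotides at each position, so it illustrates the idea and does not reproduce the authors' method. The function name `positional_entropy` is hypothetical.

```python
import math
from collections import Counter


def positional_entropy(seq):
    """Shannon entropy (bits) of nucleotide usage at codon positions 1, 2 and 3."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    entropies = []
    for pos in range(3):
        counts = Counter(seq[pos:usable:3])   # every third base from this offset
        total = sum(counts.values())
        h = -sum((n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


# Codons ACG, ACT, ACA, ACC: positions 1 and 2 are fixed, position 3 is uniform.
print(positional_entropy("ACGACTACAACC"))
```

With four nucleotides the entropy ranges from 0 bits (one base only, maximal bias) to 2 bits (uniform usage, no bias), so a lower value at a position corresponds to a stronger usage bias there.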


2016 ◽  
Author(s):  
Aakash Pandey

For heterologous gene expression systems, codon bias has to be optimized for the host to achieve efficient expression. Although DNA viruses show a correlation in codon bias with their hosts, HIV genes show low correlation with the human host in both nucleotide composition and codon usage bias, which limits the efficient expression of HIV genes. Despite this variation, HIV is efficient at infecting hosts and multiplying in large numbers. In this study, the degree of codon adaptation is first calculated as the codon adaptation index (CAI) and compared with the expected threshold value (eCAI) determined from sequences with the same nucleotide composition as the HIV-1 genome. Then, an information-theoretic analysis of nine HIV-1 genes is carried out, based on the codon statistics of the HIV-1 genome, of the individual genes, and of human genes. Comparison of the codon adaptation indices with their respective threshold values shows that the CAI lies very close to the threshold values. Although the genes are not well adapted to the codon usage bias of the human host, the Shannon entropies of the nine genes based on the overall codon statistics of the HIV-1 genome are very similar to the entropies calculated from the codon usage of human genes. Similarly, for the HIV-1 genome sequence analyzed, the codon statistics of the third reading frame have the highest bias, representing minimum entropy and hence maximum information.


2014 ◽  
Vol 6 (4) ◽  
pp. 417-421 ◽  
Author(s):  
Chakraborty SUPRIYO ◽  
Paul PROSENJIT ◽  
Tarikul Huda MAZUMDER

The base composition at the three codon positions in relation to codon usage bias and gene expressivity was studied in a sample of twenty-five essential genes from Haemophilus influenzae. ENC, CBI and Fop were used to quantify the variation in codon usage bias for the coding sequences (cds). CAI was used to estimate the level of gene expression of the cds selected in the present study. To determine the relationship between the extent of codon bias and nucleotide composition, the values of A, T, G, C and GC were compared with the A3, T3, G3, C3 and GC3 values, respectively. The results showed relatively weak codon usage bias among the cds of Haemophilus influenzae. This, in turn, implies that the essential genes prefer to use a restricted set of codons. However, the base compositional analysis of essential genes in Haemophilus influenzae revealed a preference for AT over GC bases within their coding sequences, and this preference might affect gene expression, as indicated by the relatively high CAI values of the coding sequences.
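
As an illustration of the comparison between overall GC and GC3 described above, here is a minimal sketch, assuming an in-frame coding sequence. It computes GC3 over all third codon positions; some tools instead report GC3s, which excludes Met, Trp and stop codons, and the abstract does not say which variant was used. The function name `gc_and_gc3` is hypothetical.

```python
def gc_and_gc3(seq):
    """Return (overall GC fraction, GC fraction at third codon positions)."""
    seq = seq.upper().replace("U", "T")
    coding = seq[:len(seq) - len(seq) % 3]    # ignore a trailing partial codon
    if not coding:
        raise ValueError("sequence must contain at least one full codon")

    gc = sum(base in "GC" for base in coding) / len(coding)

    third = coding[2::3]                      # third base of every codon
    gc3 = sum(base in "GC" for base in third) / len(third)
    return gc, gc3


gc, gc3 = gc_and_gc3("ATGGCC")                # codons ATG, GCC
print(round(gc, 3), gc3)
```

The same slicing with offsets 0 and 1 gives the composition at the first and second codon positions, and counting a single base instead of "GC" gives A3, T3, G3 or C3.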

