Codon usage bias controls mRNA and protein abundance in trypanosomatids

Protein abundance differs from a few to millions of copies per cell. Trypanosoma brucei presents an excellent model for studies on codon bias and differential gene expression because transcription is broadly unregulated and uniform across the genome. T. brucei is also a major human and animal protozoal pathogen. Here, an experimental assessment, using synthetic reporter genes, revealed that GC3 codons have a major positive impact on both mRNA and protein abundance. Our estimates of relative expression, based on coding sequences alone (codon usage and sequence length), are within 2-fold of the observed values for the majority of measured cellular mRNAs (n > 7000) and proteins (n > 2000). Our estimates also correspond with expression measures from published transcriptome and proteome datasets from other trypanosomatids. We conclude that codon usage is a key factor affecting global relative mRNA and protein expression in trypanosomatids and that relative abundance can be effectively estimated using only protein coding sequences.
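As a rough illustration of the quantity the abstract refers to, the sketch below computes the fraction of codons in a coding sequence that end in G or C (GC3). It is a minimal sketch, not the authors' pipeline: the function name `gc3_fraction` and the example sequence are hypothetical, and counting every codon (including the start and stop codons) is a simplifying assumption.

```python
def gc3_fraction(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Simplifying assumption: every codon in the sequence is counted,
    including the start and stop codons.
    """
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("CDS must be non-empty and a multiple of 3 in length")
    third_bases = seq[2::3]  # every third base, starting at index 2
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy sequence: ATG GCC AAG TTT TAA
print(gc3_fraction("ATGGCCAAGTTTTAA"))
```

In the toy sequence, three of the five codons end in G or C.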

Evolutionary Patterns of Sex-Biased Genes in Three Species of Haplodiploid Insects

Insects ◽

10.3390/insects11060326 ◽

2020 ◽

Vol 11 (6) ◽

pp. 326

Author(s):

Yu-Jun Wang ◽

Hua-Ling Wang ◽

Xiao-Wei Wang ◽

Shu-Sheng Liu

Keyword(s):

Gene Expression ◽

Slow Rate ◽

Species Complex ◽

Evolutionary Patterns ◽

Protein Coding ◽

Coding Sequences ◽

Species Comparisons ◽

Differential Gene ◽

Males And Females ◽

And Behavior

Females and males often differ obviously in morphology and behavior, and the differences between sexes are the result of natural selection and/or sexual selection. To a great extent, the differences between the two sexes are the result of differential gene expression. In haplodiploid insects, this phenomenon is obvious, since males develop from unfertilized zygotes and females develop from fertilized zygotes. Whiteflies of the Bemisia tabaci species complex are typical haplodiploid insects, and some species of this complex are important pests of many crops worldwide. Here, we report the transcriptome profiles of males and females in three species of this whitefly complex. Between-species comparisons revealed that non-sex-biased genes display higher variation than male-biased or female-biased genes. Sex-biased genes evolve at a slow rate in protein coding sequences and gene expression and have a pattern of evolution that differs from those of social haplodiploid insects and diploid animals. Genes with high evolutionary rates are more related to non-sex-biased traits—such as nutrition, immune system, and detoxification—than to sex-biased traits, indicating that the evolution of protein coding sequences and gene expression has been mainly driven by non-sex-biased traits.

A novel framework for evaluating the performance of codon usage bias metrics

Journal of The Royal Society Interface ◽

10.1098/rsif.2017.0667 ◽

2018 ◽

Vol 15 (138) ◽

pp. 20170667 ◽

Cited By ~ 3

Author(s):

Sophia S. Liu ◽

Adam J. Hockenberry ◽

Michael C. Jewett ◽

Luís A. N. Amaral

Keyword(s):

Codon Usage ◽

Dna Sequences ◽

Codon Usage Bias ◽

False Negative ◽

Gc Content ◽

Sequence Length ◽

Protein Coding ◽

Cellular Processes ◽

Negative Findings ◽

Measured Effect

The unequal utilization of synonymous codons affects numerous cellular processes including translation rates, protein folding and mRNA degradation. In order to understand the biological impact of variable codon usage bias (CUB) between genes and genomes, it is crucial to be able to accurately measure CUB for a given sequence. A large number of metrics have been developed for this purpose, but there is currently no way of systematically testing the accuracy of individual metrics or knowing whether metrics provide consistent results. This lack of standardization can result in false-positive and false-negative findings if underpowered or inaccurate metrics are applied as tools for discovery. Here, we show that the choice of CUB metric impacts both the significance and measured effect sizes in numerous empirical datasets, raising questions about the generality of findings in published research. To bring about standardization, we developed a novel method to create synthetic protein-coding DNA sequences according to different models of codon usage. We use these benchmark sequences to identify the most accurate and robust metrics with regard to sequence length, GC content and amino acid heterogeneity. Finally, we show how our benchmark can aid the development of new metrics by providing feedback on its performance compared to the state of the art.

Comprehensive codon usage analysis of rice black-streaked dwarf virus based on P8 and P10 protein coding sequences

Infection Genetics and Evolution ◽

10.1016/j.meegid.2020.104601 ◽

2020 ◽

Vol 86 ◽

pp. 104601

Author(s):

Zhen He ◽

Zhuozhuo Dong ◽

Haifeng Gan

Keyword(s):

Codon Usage ◽

Dwarf Virus ◽

Protein Coding ◽

Coding Sequences ◽

Usage Analysis

The relationship between base composition and codon usage in bacterial genes and its use for the simple and reliable identification of protein-coding sequences

Gene ◽

10.1016/0378-1119(84)90116-1 ◽

1984 ◽

Vol 30 (1-3) ◽

pp. 157-166 ◽

Cited By ~ 486

Author(s):

M.J. Bibb ◽

P.R. Findlay ◽

M.W. Johnson

Keyword(s):

Codon Usage ◽

Base Composition ◽

Protein Coding ◽

Coding Sequences ◽

Reliable Identification ◽

Bacterial Genes ◽

The Relationship

A Graphic Approach to Analyzing Codon Usage in 1562 Escherichia coli Protein Coding Sequences

Journal of Molecular Biology ◽

10.1006/jmbi.1994.1263 ◽

1994 ◽

Vol 238 (1) ◽

pp. 1-8 ◽

Cited By ~ 67

Author(s):

Chun-Ting Zhang ◽

Kuo-Chen Chou

Keyword(s):

Escherichia Coli ◽

Codon Usage ◽

Protein Coding ◽

Coding Sequences

Regulatory context drives conservation of glycine riboswitch aptamers

10.1101/766626 ◽

2019 ◽

Author(s):

Matt Crum ◽

Nikhil Ram-Mohan ◽

Michelle M. Meyer

Keyword(s):

Gene Expression ◽

Ligand Binding ◽

Graph Clustering ◽

Sequence Length ◽

Primary Sequence ◽

Protein Coding ◽

Regulate Gene Expression ◽

Coding Sequences ◽

The Relationship ◽

Regulate Gene

Abstract: In comparison to protein coding sequences, the impact of mutation and natural selection on the sequence and function of non-coding RNA (ncRNA) genes is not well understood. Many ncRNA genes are narrowly distributed to only a few organisms, and appear to be rapidly evolving. Compared to protein coding sequences, there are many challenges associated with assessment of ncRNAs that are not well addressed by conventional phylogenetic approaches, including: short sequence length, lack of primary sequence conservation, and the importance of secondary structure for biological function. Riboswitches are structured ncRNAs that directly interact with small molecules to regulate gene expression in bacteria. They typically consist of a ligand-binding domain (aptamer) whose folding changes drive changes in gene expression. The glycine riboswitch is among the most well-studied due to the widespread occurrence of a tandem aptamer arrangement (tandem), wherein two homologous aptamers interact with glycine and each other to regulate gene expression. However, a significant proportion of glycine riboswitches are comprised of single aptamers (singleton). Here we use graph clustering to circumvent the limitations of traditional phylogenetic analysis when studying the relationship between the tandem and singleton glycine aptamers. Graph clustering enables a broader range of pairwise comparison measures to be used to assess aptamer similarity. Using this approach, we show that one aptamer of the tandem glycine riboswitch pair is typically much more highly conserved, and that which aptamer is conserved depends on the regulated gene. Furthermore, our analysis also reveals that singleton aptamers are more similar to either the first or second tandem aptamer, again based on the regulated gene. Taken together, our findings suggest that tandem glycine riboswitches degrade into functional singletons, with the regulated gene(s) dictating which glycine-binding aptamer is conserved.

Author Summary: The glycine riboswitch is an ncRNA responsible for the regulation of several distinct gene sets in bacteria that is found with either one (singleton) or two (tandem) aptamers, each of which directly senses glycine. Which aptamer is more important for gene regulation, and the functional difference between tandem and singleton aptamers, are long-standing questions in the riboswitch field. Like many biologically functional RNAs, glycine aptamers require a specific 3D folded conformation. Thus, they have low primary sequence similarity across distantly related homologs, and large changes in sequence length that make creation and analysis of accurate multiple sequence alignments challenging. To better understand the relationship between tandem and singleton aptamers, we used a graph clustering approach that allows us to compare the similarity of aptamers using metrics that measure both sequence and structure similarity. Our investigation reveals that in tandem glycine riboswitches, one aptamer is more highly conserved than the other, and which aptamer is conserved depends on what gene(s) are regulated. Moreover, we find that many singleton glycine riboswitches likely originate from tandem riboswitches in which the ligand-binding site of the non-conserved aptamer has degraded over time.

Faculty Opinions recommendation of Role of low-complexity sequences in the formation of novel protein coding sequences.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718030532.793494763 ◽

2014 ◽

Author(s):

Erich Bornberg-Bauer ◽

Magdalena Heberlein

Keyword(s):

Low Complexity ◽

Protein Coding ◽

Coding Sequences ◽

Novel Protein

Draft Genome Sequence of Urease-Producing Pseudorhodobacter sp. Strain E13, Isolated from the Yellow Sea in Gunsan, South Korea

Microbiology Resource Announcements ◽

10.1128/mra.00189-19 ◽

2019 ◽

Vol 8 (23) ◽

Author(s):

Si Chul Kim ◽

Hyo Jung Lee

Keyword(s):

South Korea ◽

Genome Sequence ◽

Yellow Sea ◽

Draft Genome ◽

The Yellow Sea ◽

Draft Genome Sequence ◽

Protein Coding ◽

Coding Sequences ◽

Gram Negative ◽

Content Type

Here, we report the draft genome sequence of Pseudorhodobacter sp. strain E13, a Gram-negative, aerobic, nonflagellated, and rod-shaped bacterium which was isolated from the Yellow Sea in South Korea. The assembled genome sequence is 3,878,578 bp long with 3,646 protein-coding sequences in 159 contigs.

Massively parallel gene expression variation measurement of a synonymous codon library

BMC Genomics ◽

10.1186/s12864-021-07462-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Alexander Schmitz ◽

Fuzhong Zhang

Keyword(s):

Gene Expression ◽

Codon Usage ◽

Single Cells ◽

Massively Parallel ◽

Protein Abundance ◽

Translation Efficiency ◽

Gene Expression Variation ◽

Expression Variation ◽

Change In Mean ◽

Adaptation Index

Background: Cell-to-cell variation in gene expression strongly affects population behavior and is key to multiple biological processes. While codon usage is known to affect ensemble gene expression, how codon usage influences variation in gene expression between single cells is not well understood. Results: Here, we used a Sort-seq based massively parallel strategy to quantify gene expression variation from a green fluorescent protein (GFP) library containing synonymous codons in Escherichia coli. We found that sequences containing codons with higher tRNA Adaptation Index (TAI) scores, and higher codon adaptation index (CAI) scores, have higher GFP variance. This trend is not observed for codons with high Normalized Translation Efficiency Index (nTE) scores nor from the free energy of folding of the mRNA secondary structure. GFP noise, or squared coefficient of variance (CV²), scales with mean protein abundance for low-abundance proteins but does not change at high mean protein abundance. Conclusions: Our results suggest that the main source of noise for high-abundance proteins likely does not originate at translation elongation. Additionally, the drastic change in mean protein abundance with small changes in protein noise seen in our library implies that codon optimization can be performed without concern for gene expression noise in biotechnology applications.
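The noise measure named in the abstract, the squared coefficient of variation, is the variance divided by the squared mean. The sketch below shows that calculation on made-up numbers. It is only an illustration: the function name `noise_cv2` and the sample values are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, since the abstract does not say which estimator the authors used.

```python
from statistics import mean, pvariance


def noise_cv2(values: list[float]) -> float:
    """Return variance / mean**2 for a set of per-cell measurements.

    Assumption: population variance is used; the source abstract does
    not specify the estimator.
    """
    mu = mean(values)
    if mu == 0:
        raise ValueError("mean is zero; CV^2 is undefined")
    return pvariance(values, mu) / mu**2


# Hypothetical fluorescence readings (arbitrary units), not real data.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(noise_cv2(readings))
```

For these readings the mean is 5 and the population variance is 4, so the noise is 4 / 25.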

Cloning and Expression Analysis of Two Kdm Lysine Demethylases in the Testes of Mature Yaks and Their Sterile Hybrids

Animals ◽

10.3390/ani10030521 ◽

2020 ◽

Vol 10 (3) ◽

pp. 521

Author(s):

Zhenhua Shen ◽

Lin Huang ◽

Suyu Jin ◽

Yucai Zheng

Keyword(s):

Amino Acids ◽

Protein Expression ◽

Male Sterility ◽

Histone Methylation ◽

Real Time Pcr ◽

Cloning And Expression ◽

Rt Pcr ◽

Coding Sequences ◽

Mrna And Protein Expression ◽

Sterile Hybrids

The objective of this study was to explore the molecular mechanism for male sterility of yak hybrids based on two demethylases. Total RNA was extracted from the testes of adult yaks (n = 10) and yak hybrids (cattle–yaks, n = 10). The coding sequences (CDS) of two lysine demethylases (KDMs), KDM1A and KDM4B, were cloned by RT-PCR. The levels of KDM1A and KDM4B in yak and cattle–yak testes were detected using real-time PCR and Western blotting for mRNA and protein, respectively. In addition, the histone methylation modifications of H3K36me3 and H3K27me3 were compared between testes of yaks and cattle–yaks using ELISA. The CDS of KDM1A and KDM4B were obtained from yak testes. The results showed that the CDS of KDM1A exhibited two variants: variant 1 has a CDS of 2622 bp, encoding 873 amino acids, while variant 2 has a CDS of 2562 bp, encoding 853 amino acids. The CDS of the KDM4B gene was 3351 bp in length, encoding 1116 amino acids. The mRNA and protein expression of KDM1A and KDM4B, as well as the level of H3K36me3, were dramatically decreased in the testes of cattle–yaks compared with yaks. The present results suggest that the male sterility of cattle–yaks might be associated with reduced histone methylation modifications.
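The CDS lengths and amino acid counts quoted above are consistent with each CDS including a terminal stop codon, which encodes no residue. The short sketch below checks that arithmetic. The helper name `encoded_residues` is hypothetical, and the stop-codon convention is an assumption inferred from the numbers rather than stated in the abstract.

```python
def encoded_residues(cds_length_bp: int) -> int:
    """Number of amino acids encoded by a CDS of the given length.

    Assumption: the CDS length includes one terminal stop codon,
    which does not encode an amino acid.
    """
    if cds_length_bp < 6 or cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3 and at least 6 bp")
    return cds_length_bp // 3 - 1


# Lengths (bp) and residue counts as reported in the abstract.
reported = {
    "KDM1A variant 1": (2622, 873),
    "KDM1A variant 2": (2562, 853),
    "KDM4B": (3351, 1116),
}

for name, (length_bp, residues) in reported.items():
    print(name, encoded_residues(length_bp), encoded_residues(length_bp) == residues)
```

All three reported pairs agree with this convention.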
