Codon choice directs constitutive mRNA levels in trypanosomes

eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Janaina de Freitas Nascimento ◽  
Steven Kelly ◽  
Jack Sunter ◽  
Mark Carrington

Selective transcription of individual protein coding genes does not occur in trypanosomes and the cellular copy number of each mRNA must be determined post-transcriptionally. Here, we provide evidence that codon choice directs the levels of constitutively expressed mRNAs. First, a novel codon usage metric, the gene expression codon adaptation index (geCAI), was developed that maximised the relationship between codon choice and the measured abundance for a transcriptome. Second, geCAI predictions of mRNA levels were tested using differently coded GFP transgenes and were successful over a 25-fold range, similar to the variation in endogenous mRNAs. Third, translation was necessary for the accelerated mRNA turnover resulting from codon choice. Thus, in trypanosomes, the information determining the levels of most mRNAs resides in the open reading frame and translation is required to access this information.
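To make the idea of a codon usage metric concrete, the sketch below scores an open reading frame as the geometric mean of per-codon weights, which is the general form of a codon adaptation index. It is only an illustration: the abstract does not give the geCAI weights or the fitting procedure, so the function name `codon_index` and the weight values are hypothetical placeholders.

```python
import math


def codon_index(orf, weights):
    """Geometric mean of per-codon weights over an ORF (CAI-style score).

    Codons without a weight are skipped; a trailing partial codon is ignored.
    """
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights for illustration only (not fitted geCAI values).
weights = {"GCC": 1.0, "GCA": 0.25, "AAG": 1.0, "AAA": 0.5}

# Two synonymous codings of the same peptide (Ala-Lys-Ala-Lys).
high = codon_index("GCCAAGGCCAAG", weights)
low = codon_index("GCAAAAGCAAAA", weights)
print(high, low)
```

Under a metric of this form, two transgenes encoding the same protein can receive very different scores, which is the property the differently coded GFP transgenes described above rely on.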

2020 ◽  
Author(s):  
Julie D. Thompson ◽  
Raymond Ripp ◽  
Claudine Mayer ◽  
Olivier Poch ◽  
Christian J. Michel

Abstract The X circular code is a set of 20 trinucleotides (codons) that has been identified in the protein-coding genes of most organisms (bacteria, archaea, eukaryotes, plasmids, viruses). It has been shown previously that the X circular code has the important mathematical property of being an error-correcting code. Thus, motifs of the X circular code, i.e. a series of codons belonging to X, which are significantly enriched in the genes, allow identification and maintenance of the reading frame in genes. X motifs have also been identified in many transfer RNA (tRNA) genes and in important functional regions of the ribosomal RNA (rRNA), notably in the peptidyl transferase center and the decoding center. Here, we investigate the potential role of X motifs as functional elements in the regulation of gene expression. Surprisingly, the definition of a simple parameter identifies several relations between the X circular code and gene expression. First, we identify a correlation between the 20 codons of the X circular code and the optimal codons/dicodons that have been shown to influence translation efficiency. Using previously published experimental data, we then demonstrate that the presence of X motifs in genes can be used to predict the level of gene expression. Based on these observations, we propose the hypothesis that the X motifs represent a new genetic signal, contributing to the maintenance of the correct reading frame and the optimization and regulation of gene expression. Author Summary The standard genetic code is used by (quasi-) all organisms to translate information in genes into proteins. Recently, other codes have been identified in genomes that increase the versatility of gene decoding. Here, we focus on the circular codes, an important class of genome codes, that have the ability to detect and maintain the reading frame during translation. 
Motifs of the X circular code are enriched in protein-coding genes from most organisms from bacteria to eukaryotes, as well as in important molecules in the gene translation machinery, including transfer RNA (tRNA) and ribosomal RNA (rRNA). Based on these observations, it has been proposed that the X circular code represents an ancestor of the standard genetic code, that was used in primordial systems to simultaneously decode a smaller set of amino acids and synchronize the reading frame. Using previously published experimental data, we highlight several links between the presence of X motifs in genes and more efficient gene expression, supporting the hypothesis that the X circular code still contributes to the complex dynamics of gene regulation in extant genomes.
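An X motif, as described above, is a series of consecutive codons that all belong to X. The sketch below scans one reading frame for such runs. The 20 codons are not listed in the abstract; the set used here is the one commonly given in the circular code literature and should be treated as an assumption, as should the function name `x_motifs` and the minimum run length of four codons.

```python
# Assumed X codon set (not stated in the abstract above).
X_CODONS = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def x_motifs(seq, frame=0, min_codons=4, codes=X_CODONS):
    """Return (start, end) nucleotide spans of runs of at least
    min_codons consecutive codons from `codes` in the given frame."""
    seq = seq.upper()
    spans = []
    run_start = None
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in codes:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and (i - run_start) // 3 >= min_codons:
                spans.append((run_start, i))
            run_start = None
    # Close a run that reaches the last complete codon.
    end = frame + 3 * ((len(seq) - frame) // 3)
    if run_start is not None and (end - run_start) // 3 >= min_codons:
        spans.append((run_start, end))
    return spans


# ATG, then four X codons (GAA GAC GCC GGT), then TGA.
spans = x_motifs("ATGGAAGACGCCGGTTGA")
print(spans)
```

Counting or measuring the coverage of such spans per gene would give one simple parameter of the kind the abstract alludes to, although the authors' exact parameter is not specified here.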


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
T. M. Porter ◽  
M. Hajibabaei

Abstract Background Pseudogenes are non-functional copies of protein coding genes that typically follow a different molecular evolutionary path as compared to functional genes. The inclusion of pseudogene sequences in DNA barcoding and metabarcoding analysis can lead to misleading results. None of the most widely used bioinformatic pipelines for processing marker gene (metabarcode) high throughput sequencing data specifically accounts for the presence of pseudogenes in protein-coding marker genes. The purpose of this study is to develop a method to screen for nuclear mitochondrial DNA segments (nuMTs) in large COI datasets. We do this by: (1) describing gene and nuMT characteristics from an artificial COI barcode dataset, (2) showing the impact of two different pseudogene removal methods on perturbed community datasets with simulated nuMTs, and (3) incorporating a pseudogene filtering step in a bioinformatic pipeline that can be used to process Illumina paired-end COI metabarcode sequences. Open reading frame length and sequence bit scores from hidden Markov model (HMM) profile analysis were used to detect pseudogenes. Results Our simulations showed that it was more difficult to identify nuMTs from shorter amplicon sequences such as those typically used in metabarcoding compared with full length DNA barcodes that are used in the construction of barcode libraries. It was also more difficult to identify nuMTs in datasets where there is a high percentage of nuMTs. Existing bioinformatic pipelines used to process metabarcode sequences already remove some nuMTs, especially in the rare sequence removal step, but the addition of a pseudogene filtering step can remove up to 5% of sequences even when other filtering steps are in place. Conclusions Open reading frame length filtering alone or combined with hidden Markov model profile analysis can be used to effectively screen out apparent pseudogenes from large datasets. 
There is more to learn from COI nuMTs such as their frequency in DNA barcoding and metabarcoding studies, their taxonomic distribution, and evolution. Thus, we encourage the submission of verified COI nuMTs to public databases to facilitate future studies.
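The open reading frame length filter mentioned above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the 0.9 length fraction and the toy sequences are hypothetical, and the stop codons TAA and TAG assume the invertebrate mitochondrial genetic code (other codes need a different `stops` set). Only the three forward frames are examined.

```python
def longest_orf_nt(seq, stops=frozenset({"TAA", "TAG"})):
    """Length in nucleotides of the longest stop-free stretch
    across the three forward reading frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in stops:
                run = 0  # a stop codon interrupts the open frame
            else:
                run += 3
                best = max(best, run)
    return best


def filter_by_orf(seqs, min_fraction=0.9):
    """Keep sequences whose longest ORF covers at least
    min_fraction of the sequence length (hypothetical cutoff)."""
    return {
        name: s
        for name, s in seqs.items()
        if longest_orf_nt(s) >= min_fraction * len(s)
    }


# Toy sequences: one uninterrupted, one with in-frame stop codons.
seqs = {
    "intact": "ATGGCAGGATTCGGA",
    "broken": "ATGGCATAAGGATAG",
}
kept = filter_by_orf(seqs)
print(sorted(kept))
```

A length filter like this only catches nuMTs that have acquired stop codons or frameshifts, which is consistent with the abstract's observation that short amplicons are harder to screen.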


2017 ◽  
Author(s):  
Cristina Cruz ◽  
Monica Della Rosa ◽  
Christel Krueger ◽  
Qian Gao ◽  
Lucy Field ◽  
...  

Abstract Transcription of protein coding genes is accompanied by recruitment of COMPASS to promoter-proximal chromatin, which deposits di- and tri-methylation on histone H3 lysine 4 (H3K4) to form H3K4me2 and H3K4me3. Here we determine the importance of COMPASS in maintaining gene expression across lifespan in budding yeast. We find that COMPASS mutations dramatically reduce replicative lifespan and cause widespread gene expression defects. Known repressive functions of H3K4me2 are progressively lost with age, while hundreds of genes become dependent on H3K4me3 for full expression. Induction of these H3K4me3 dependent genes is also impacted in young cells lacking COMPASS components including the H3K4me3-specific factor Spp1. Remarkably, the genome-wide occurrence of H3K4me3 is progressively reduced with age despite widespread transcriptional induction, minimising the normal positive correlation between promoter H3K4me3 and gene expression. Our results provide clear evidence that H3K4me3 is required to attain normal expression levels of many genes across organismal lifespan.


1998 ◽  
Vol 72 (1) ◽  
pp. 857-861 ◽  
Author(s):  
Adrian Whitehouse ◽  
Matthew Cooper ◽  
David M. Meredith

ABSTRACT The herpesvirus saimiri (HVS) immediate-early gene product encoded by open reading frame (ORF) 57 shares limited amino acid homology with HSV-1 ICP27 and Epstein-Barr virus BMLF1, both regulatory proteins. The ORF 57 gene has been proposed to be spliced based on the genome sequence, and here we confirm the intron-exon structure of the gene. We also demonstrate that a cDNA construct of the ORF 57 gene product represses the transactivating capability of the ORF 50a gene product (which is produced from a spliced transcript), but activates that of ORF 50b (an unspliced transcript). Further analyses with cotransfection experiments show that ORF 57 can either activate or repress expression from a range of both early and late HVS promoters, depending on the target gene. These results indicate that repression of gene expression mediated by the ORF 57 gene product is dependent on the presence of an intron within the target gene encoding region. Furthermore, Northern blot analysis demonstrates that the levels of mRNA transcribed from genes not containing an intron are not significantly affected in the presence of the ORF 57 gene product. This suggests that it regulates gene expression through a posttranscriptional mechanism.


1990 ◽  
Vol 10 (7) ◽  
pp. 3727-3736
Author(s):  
B Leiting ◽  
I J Lindner ◽  
A A Noegel

Dictyostelium discoideum plasmid Ddp2 from the wild strain WS380B is a 5.8-kilobase (kb) supercoiled circle with a copy number of 300 per haploid genome. We previously described the construction of an extrachromosomally replicating transformation vector pnDeI carrying 4.7 kb of Ddp2 sequences (B. Leiting, and A. Noegel, Plasmid 20:241-248, 1988). In order to reduce the sequences required for extrachromosomal maintenance in D. discoideum, we characterized Ddp2 by sequence analysis, by deletion experiments, by transcription mapping, by electrophoretic mobility shift assays, and by expression of its single open reading frame in Escherichia coli. Two elements were involved in replication of Ddp2: a cis-acting sequence located on a 592-base-pair (bp) fragment that consisted of 220 bp of essential and 372 bp of auxiliary sequences, and a 2.7-kb open reading frame which most likely encodes a trans-acting factor. The cis- and trans-acting elements did not overlap and were shown to act independently from the location of the sequences encoding the trans-acting factor.


2020 ◽  
Vol 35 (5) ◽  
pp. 1230-1245 ◽  
Author(s):  
L C Poulsen ◽  
J A Bøtkjær ◽  
O Østrup ◽  
K B Petersen ◽  
C Yding Andersen ◽  
...  

Abstract STUDY QUESTION How does the human granulosa cell (GC) transcriptome change during ovulation? SUMMARY ANSWER Two transcriptional peaks were observed at 12 h and at 36 h after induction of ovulation, both dominated by genes and pathways known from the inflammatory system. WHAT IS KNOWN ALREADY The crosstalk between GCs and the oocyte, which is essential for ovulation and oocyte maturation, can be assessed through transcriptomic profiling of GCs. Detailed transcriptional changes during ovulation have not previously been assessed in humans. STUDY DESIGN, SIZE, DURATION This prospective cohort study comprised 50 women undergoing fertility treatment in a standard antagonist protocol at a university hospital-affiliated fertility clinic in 2016–2018. PARTICIPANTS/MATERIALS, SETTING, METHODS From each woman, one sample of GCs was collected by transvaginal ultrasound-guided follicle aspiration either before or 12 h, 17 h or 32 h after ovulation induction (OI). A second sample was collected at oocyte retrieval, 36 h after OI. Total RNA was isolated from GCs and analyzed by microarray. Gene expression differences between the five time points were assessed by ANOVA with a random factor accounting for the pairing of samples, and seven clusters of protein-coding genes representing distinct expression profiles were identified. These were used as input for subsequent bioinformatic analyses to identify enriched pathways and suggest upstream regulators. Subsets of genes were assessed to explore specific ovulatory functions. MAIN RESULTS AND THE ROLE OF CHANCE We identified 13 345 differentially expressed transcripts across the five time points (false discovery rate, <0.01) of which 58% were protein-coding genes. Two clusters of mainly downregulated genes represented cell cycle pathways and DNA repair. 
Upregulated genes showed one peak at 12 h that resembled the initiation of an inflammatory response, and one peak at 36 h that resembled the effector functions of inflammation such as vasodilation, angiogenesis, coagulation, chemotaxis and tissue remodelling. Genes involved in cell–matrix interactions as a part of cytoskeletal rearrangement and cell motility were also upregulated at 36 h. Predicted activated upstream regulators of ovulation included FSH, LH, transforming growth factor B1, tumour necrosis factor, nuclear factor kappa-light-chain-enhancer of activated B cells, coagulation factor 2, fibroblast growth factor 2, interleukin 1 and cortisol, among others. The results confirmed early regulation of several previously described factors in a cascade inducing meiotic resumption and suggested new factors involved in cumulus expansion and follicle rupture through co-regulation with previously described factors. LARGE SCALE DATA The microarray data were deposited to the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/gds/, accession number: GSE133868). LIMITATIONS, REASONS FOR CAUTION The study included women undergoing ovarian stimulation and the findings may therefore differ from a natural cycle. However, the results confirm significant regulation of many well-established ovulatory genes from a series of previous studies such as amphiregulin, epiregulin, tumour necrosis factor alfa induced protein 6, tissue inhibitor of metallopeptidases 1 and plasminogen activator inhibitor 1, which support the relevance of the results. WIDER IMPLICATIONS OF THE FINDINGS The study increases our understanding of human ovarian function during ovulation, and the publicly available dataset is a valuable resource for future investigations. Suggested upstream regulators and highly differentially expressed genes may be potential pharmaceutical targets in fertility treatment and gynaecology. 
STUDY FUNDING/COMPETING INTEREST(S) The study was funded by EU Interreg ÖKS V through ReproUnion (www.reprounion.eu) and by a grant from the Region Zealand Research Foundation. None of the authors have any conflicts of interest to declare.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mikhail Pomaznoy ◽  
Ashu Sethi ◽  
Jason Greenbaum ◽  
Bjoern Peters

Abstract RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology which can skew the gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information was not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
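The two-fold criterion described above can be illustrated with a small comparison of per-gene counts obtained with and without strand information. This sketch does not use the uslcount package or its API; the function name, the pseudocount and the example counts are hypothetical.

```python
def flag_strand_biased(stranded, unstranded, fold=2.0, pseudocount=1.0):
    """Return genes whose unstranded estimate differs from the
    stranded estimate by at least `fold` in either direction."""
    flagged = []
    for gene, s in stranded.items():
        u = unstranded[gene]
        # The pseudocount avoids division by zero for silent genes.
        ratio = (u + pseudocount) / (s + pseudocount)
        if ratio >= fold or ratio <= 1.0 / fold:
            flagged.append(gene)
    return flagged


# Hypothetical read counts for three genes.
stranded = {"geneA": 100, "geneB": 40, "geneC": 0}
unstranded = {"geneA": 105, "geneB": 130, "geneC": 9}

biased = flag_strand_biased(stranded, unstranded)
print(biased)
```

In this toy case geneB and geneC would be flagged, for example because reads from an overlapping antisense transcript are attributed to them once strand is ignored.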


2016 ◽  
Vol 113 (41) ◽  
pp. E6117-E6125 ◽  
Author(s):  
Zhipeng Zhou ◽  
Yunkun Dang ◽  
Mian Zhou ◽  
Lin Li ◽  
Chien-hung Yu ◽  
...  

Codon usage biases are found in all eukaryotic and prokaryotic genomes, and preferred codons are more frequently used in highly expressed genes. The effects of codon usage on gene expression were previously thought to be mainly mediated by its impacts on translation. Here, we show that codon usage strongly correlates with both protein and mRNA levels genome-wide in the filamentous fungus Neurospora. Gene codon optimization also results in strong up-regulation of protein and RNA levels, suggesting that codon usage is an important determinant of gene expression. Surprisingly, we found that the impact of codon usage on gene expression results mainly from effects on transcription and is largely independent of mRNA translation and mRNA stability. Furthermore, we show that histone H3 lysine 9 trimethylation is one of the mechanisms responsible for the codon usage-mediated transcriptional silencing of some genes with nonoptimal codons. Together, these results uncovered an unexpected important role of codon usage in ORF sequences in determining transcription levels and suggest that codon biases are an adaptation of protein coding sequences to both transcription and translation machineries. Therefore, synonymous codons not only specify protein sequences and translation dynamics, but also help determine gene expression levels.

