Genome-wide identification of Arabidopsis non-AUG-initiated upstream ORFs with evolutionarily conserved regulatory sequences that control protein expression levels

2021
Author(s): Yuta Hiragori, Hiro Takahashi, Noriya Hayashi, Shun Sasaki, Kodai Nakao, ...

Upstream open reading frames (uORFs) are short ORFs found in the 5′-UTRs of many eukaryotic transcripts and can influence the translation of protein-coding main ORFs (mORFs). Recent genome-wide ribosome profiling studies have revealed that thousands of uORFs initiate translation at non-AUG start codons. However, the physiological significance of these non-AUG uORFs has so far been demonstrated for only a few of them. It is conceivable that physiologically important non-AUG uORFs are evolutionarily conserved across species. In this study, using a combination of bioinformatics and experimental approaches, we searched the Arabidopsis genome for non-AUG-initiated uORFs with conserved sequences that control the expression of the mORF-encoded proteins. As a result, we identified four novel regulatory non-AUG uORFs. Among these, two exerted repressive effects on mORF expression in an amino acid sequence-dependent manner. These two non-AUG uORFs are likely to encode regulatory peptides that cause ribosome stalling, thereby enhancing their repressive effects. In contrast, one of the identified regulatory non-AUG uORFs promoted mORF expression by alleviating the inhibitory effect of a downstream AUG-initiated uORF. These findings provide insights into the mechanisms that enable non-AUG uORFs to play regulatory roles despite their low translation initiation efficiencies.
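To make the notion of a non-AUG uORF candidate concrete at the sequence level, the following Python sketch scans a 5′ leader for near-cognate start codons (the nine codons that differ from AUG at exactly one position) followed by an in-frame stop codon. It is not the pipeline used in the study above, which also relied on cross-species conservation and on experimental assays. The function name `find_non_aug_uorfs`, the `min_codons` parameter, and the requirement that the stop codon lie inside the supplied leader are assumptions made for this illustration.

```python
# Illustrative sketch only; names and thresholds are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}
# The nine codons that differ from ATG at exactly one nucleotide position.
NEAR_COGNATE_STARTS = {
    "CTG", "GTG", "TTG",  # first position changed
    "ACG", "AGG", "AAG",  # second position changed
    "ATA", "ATC", "ATT",  # third position changed
}


def find_non_aug_uorfs(leader, min_codons=2):
    """Return (start, end, sequence) for each candidate non-AUG uORF.

    start/end are 0-based, end-exclusive coordinates in the leader, and the
    returned sequence includes the stop codon. Candidates whose stop codon
    is not found inside the leader are skipped (a simplifying assumption).
    """
    leader = leader.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] not in NEAR_COGNATE_STARTS:
            continue
        # Walk downstream codon by codon, staying in the start codon's frame.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # sense codons, start included
                if n_codons >= min_codons:
                    hits.append((start, pos + 3, leader[start:pos + 3]))
                break
    return hits


print(find_non_aug_uorfs("GGCTGAAATAGCC"))  # [(2, 11, 'CTGAAATAG')]
```

A scan like this only enumerates candidates. Because near-cognate codons are very frequent, some additional filter (such as the conservation criterion described in the abstract) is needed to narrow them down.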

2015
Author(s): David E Weinberg, Premal Shah, Stephen W Eichhorn, Jeffrey A Hussmann, Joshua B Plotkin, ...

Ribosome-footprint profiling provides genome-wide snapshots of translation, but technical challenges can confound its analysis. Here, we use improved methods to obtain ribosome-footprint profiles and mRNA abundances that more faithfully reflect gene expression in Saccharomyces cerevisiae. Our results support proposals that both the beginning of coding regions and codons matching rare tRNAs are more slowly translated. They also indicate that emergent polypeptides with as few as three basic residues within a 10-residue window tend to slow translation. With the improved mRNA measurements, the variation attributable to translational control in exponentially growing yeast was less than previously reported, and most of this variation could be predicted with a simple model that considered mRNA abundance, upstream open reading frames, cap-proximal structure and nucleotide composition, and lengths of the coding and 5′-untranslated regions. Collectively, our results reveal key features of translational control in yeast and provide a framework for executing and interpreting ribosome-profiling studies.
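The "three basic residues within a 10-residue window" criterion mentioned above can be illustrated with a simple sliding-window count. This is a minimal sketch, not the authors' analysis: the function name `basic_rich_windows` is hypothetical, and treating only lysine (K) and arginine (R) as basic is an assumption (histidine could be added through the `basic` parameter).

```python
# Illustrative sketch only; the residue set and names are assumptions.
def basic_rich_windows(peptide, window=10, min_basic=3, basic="KR"):
    """Return 0-based start positions of windows holding >= min_basic basic residues."""
    peptide = peptide.upper()
    flags = [1 if aa in basic else 0 for aa in peptide]
    hits = []
    count = sum(flags[:window])  # basic residues in the first window
    for start in range(len(peptide) - window + 1):
        if start > 0:
            # Slide by one residue: add the entering one, drop the leaving one.
            count += flags[start + window - 1] - flags[start - 1]
        if count >= min_basic:
            hits.append(start)
    return hits


print(basic_rich_windows("AAAAKAKAKAAAAAA"))  # [0, 1, 2, 3, 4]
```

Peptides shorter than the window yield an empty list, since no full window fits.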


2018
Author(s): Anica Scholz, Florian Eggenhofer, Rick Gelhausen, Björn Grüning, Kathi Zarnack, ...

Ribosome profiling (ribo-seq) provides a means to analyze active translation by determining ribosome occupancy in a transcriptome-wide manner. The vast majority of ribosome-protected fragments (RPFs) reside within the protein-coding sequence of mRNAs. However, reads are also commonly found within the transcript leader sequence (TLS) (aka 5’ untranslated region) preceding the main open reading frame (ORF), indicating the translation of regulatory upstream ORFs (uORFs). Here, we present a workflow for the identification of translation-regulatory uORFs. Specifically, uORF-Tools identifies uORFs within a given dataset and generates a uORF annotation file. In addition, a comprehensive human uORF annotation file, based on 35 ribo-seq files, is provided, which can serve as an alternative input file for the workflow. To assess the translation-regulatory activity of the uORFs, stimulus-induced changes in the ratio of the RPFs residing in the main ORFs relative to those found in the associated uORFs are determined. The resulting output file allows for the easy identification of candidate uORFs that have translation-inhibitory effects on their associated main ORFs. uORF-Tools is available as a free and open Snakemake workflow at https://github.com/Biochemistry1-FFM/uORF-Tools. It is easily installed, and all necessary tools are provided in a version-controlled manner, which also ensures lasting usability. uORF-Tools is designed for intuitive use and requires only limited computing time and resources.
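The quantity described above, a stimulus-induced change in the ratio of main-ORF RPFs to uORF RPFs, can be sketched as a log2 ratio of ratios. This is only an illustration of the idea and not the uORF-Tools implementation: the function name `morf_uorf_log2_shift`, the pseudocount, and the omission of library-size normalization and replicate handling are all assumptions.

```python
import math


# Illustrative sketch only; not the uORF-Tools code or its normalization.
def morf_uorf_log2_shift(morf_ctrl, uorf_ctrl, morf_stim, uorf_stim,
                         pseudocount=1.0):
    """log2 change of the mORF/uORF read-count ratio, stimulus vs. control.

    Positive values mean relatively more reads in the main ORF after the
    stimulus; negative values mean relatively more reads in the uORF.
    The pseudocount (an assumption) avoids division by zero.
    """
    ratio_ctrl = (morf_ctrl + pseudocount) / (uorf_ctrl + pseudocount)
    ratio_stim = (morf_stim + pseudocount) / (uorf_stim + pseudocount)
    return math.log2(ratio_stim / ratio_ctrl)


# Main-ORF reads double while uORF reads stay constant.
print(morf_uorf_log2_shift(100, 50, 200, 50, pseudocount=0.0))  # 1.0
```

In practice the counts would come from a uORF annotation and aligned ribo-seq reads, and would need normalization across libraries before such a comparison is meaningful.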


2017
Author(s): Pierre Murat, Giovanni Marsico, Barbara Herdy, Avazeh Ghanbarian, Guillem Portella, ...

RNA secondary structures in the 5’ untranslated regions (UTRs) of mRNAs have been characterised as key determinants of translation initiation. However, the role of non-canonical secondary structures, such as RNA G-quadruplexes (rG4s), in modulating translation of human mRNAs and the associated mechanisms remain largely unappreciated. Here we use a ribosome profiling strategy to investigate the translational landscape of human mRNAs with structured 5’ untranslated regions (5’-UTR). We found that inefficiently translated mRNAs, containing rG4-forming sequences in their 5’-UTRs, have an accumulation of ribosome footprints in their 5’-UTRs. We show that rG4-forming sequences are determinants of 5’-UTR translation, suggesting that the folding of rG4 structures thwarts the translation of protein coding sequences (CDS) by stimulating the translation of repressive upstream open reading frames (uORFs). To support our model, we demonstrate that depletion of two rG4s-specialised DEAH-box helicases, DHX36 and DHX9, shifts translation towards rG4-containing uORFs, reducing the translation of selected transcripts comprising proto-oncogenes, transcription factors and epigenetic regulators. Transcriptome-wide identification of DHX9 binding sites using individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) demonstrates that translation regulation is mediated through direct physical interaction between the helicase and its rG4 substrate. Our findings unveil a previously unknown role for non-canonical structures in governing 5’-UTR translation and suggest that the interaction of helicases with rG4s could be considered as a target for future therapeutic intervention.


2020, Vol 10 (1)
Author(s): Hiro Takahashi, Shido Miyaki, Hitoshi Onouchi, Taichiro Motomura, Nobuo Idesako, ...

Upstream open reading frames (uORFs) are present in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.


2021, Vol 12
Author(s): Yan Liang, Wanchao Zhu, Sijia Chen, Jia Qian, Lin Li

Small peptides (sPeptides), <100 amino acids (aa) long, are encoded by small open reading frames (sORFs) often found in the 5′ and 3′ untranslated regions (or other parts) of mRNAs, in long non-coding RNAs, or transcripts from introns and intergenic regions; various sPeptides play important roles in multiple biological processes. In this study, we conducted a comprehensive study of maize (Zea mays) sPeptides using mRNA sequencing, ribosome profiling (Ribo-seq), and mass spectrometry (MS) on six tissues (each with at least two replicates). To identify maize sORFs and sPeptides from these data, we set up a robust bioinformatics pipeline and performed a genome-wide scan. This scan uncovered 9,388 sORFs encoding peptides of 2–100 aa. These sORFs showed distinct genomic features, such as different Kozak region sequences, higher specificity of translation, and high translational efficiency, compared with the canonical protein-coding genes. Furthermore, the MS data verified 2,695 sPeptides. These sPeptides perfectly discriminated all the tissues and were highly associated with their parental genes. Interestingly, the parental genes of sPeptides were significantly enriched in multiple functional gene ontology terms related to abiotic stress and development, suggesting the potential roles of sPeptides in the regulation of their parental genes. Overall, this study lays out the guidelines for genome-wide scans of sORFs and sPeptides in plants by integrating Ribo-seq and MS data and provides a more comprehensive resource of functional sPeptides in maize and gives a new perspective on the complex biological systems of plants.


2019
Author(s): Hiro Takahashi, Shido Miyaki, Hitoshi Onouchi, Taichiro Motomura, Nobuo Idesako, ...

Upstream open reading frames (uORFs) are present in the 5’-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1,517 (1,373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.


2021, Vol 22 (1)
Author(s): Håkon Tjeldnes, Kornel Labun, Yamila Torres Cleuren, Katarzyna Chyżyńska, Michał Świrski, ...

Background: With the rapid growth in the use of high-throughput methods for characterizing translation and the continued expansion of multi-omics, there is a need for back-end functions and streamlined tools for processing, analyzing, and characterizing data produced by these assays.
Results: Here, we introduce ORFik, a user-friendly R/Bioconductor API and toolbox for studying translation and its regulation. It extends GenomicRanges from the genome to the transcriptome and implements a framework that integrates data from several sources. ORFik streamlines the steps to process, analyze, and visualize the different steps of translation with a particular focus on initiation and elongation. It accepts high-throughput sequencing data from ribosome profiling to quantify ribosome elongation or RCP-seq/TCP-seq to also quantify ribosome scanning. In addition, ORFik can use CAGE data to accurately determine 5′UTRs and RNA-seq for determining translation relative to RNA abundance. ORFik supports and calculates over 30 different translation-related features and metrics from the literature and can annotate translated regions such as proteins or upstream open reading frames (uORFs). As a use-case, we demonstrate using ORFik to rapidly annotate the dynamics of 5′ UTRs across different tissues, detect their uORFs, and characterize their scanning and translation in the downstream protein-coding regions.
Conclusion: In summary, ORFik introduces hundreds of tested, documented and optimized methods. ORFik is designed to be easily customizable, enabling users to create complete workflows from raw data to publication-ready figures for several types of sequencing data. Finally, by improving speed and scope of many core Bioconductor functions, ORFik offers enhancement benefiting the entire Bioconductor environment.
Availability: http://bioconductor.org/packages/ORFik


2021
Author(s): Håkon Tjeldnes, Kornel Labun, Yamila Torres Cleuren, Katarzyna Chyżyńska, Michał Świrski, ...

Background: With the rapid growth in the use of high-throughput methods for characterizing translation and the continued expansion of multi-omics, there is a need for back-end functions and streamlined tools for processing, analyzing, and characterizing data produced by these assays.
Results: Here, we introduce ORFik, a user-friendly R/Bioconductor toolbox for studying translation and its regulation. It extends GenomicRanges from the genome to the transcriptome and implements a framework that integrates data from several sources. ORFik streamlines the steps to process, analyze, and visualize the different steps of translation with a particular focus on initiation and elongation. It accepts high-throughput sequencing data from ribosome profiling to quantify ribosome elongation or RCP-seq/TCP-seq to also quantify ribosome scanning. In addition, ORFik can use CAGE data to accurately determine 5’UTRs and RNA-seq for determining translation relative to RNA abundance. ORFik supports and calculates over 30 different translation-related features and metrics from the literature and can annotate translated regions such as proteins or upstream open reading frames. As a use-case, we demonstrate using ORFik to rapidly annotate the dynamics of 5’ UTRs across different tissues, detect their uORFs, and characterize their scanning and translation in the downstream protein-coding regions.
Availability: http://bioconductor.org/packages/ORFik


Plants, 2020, Vol 9 (5), pp. 608
Author(s): Yukio Kurihara

Upstream open reading frames (uORFs) are present in the 5’ leader sequences (or 5’ untranslated regions) upstream of the protein-coding main ORFs (mORFs) in eukaryotic polycistronic mRNA. It is well known that a uORF negatively affects translation of the mORF. Emerging ribosome profiling approaches have revealed that uORFs themselves, as well as downstream mORFs, can be translated. However, it has also been revealed that plants can fine-tune gene expression by modulating uORF-mediated regulation in some situations. This article reviews several proposed mechanisms that enable genes to escape from uORF-mediated negative regulation and gives insight into the application of uORF-mediated regulation for precisely controlling gene expression.


2021, Vol 22 (1)
Author(s): Audrey Montigny, Patrizia Tavormina, Carine Duboe, Hélène San Clémente, Marielle Aguilar, ...

Background: Recent genome-wide studies of many species reveal the existence of a myriad of RNAs differing in size, coding potential and function. Among these are the long non-coding RNAs, some of them producing functional small peptides via the translation of short ORFs. It now appears that any kind of RNA presumably has a potential to encode small peptides. Accordingly, our team recently discovered that plant primary transcripts of microRNAs (pri-miRs) produce small regulatory peptides (miPEPs) involved in auto-regulatory feedback loops enhancing their cognate microRNA expression which in turn controls plant development. Here we investigate whether this regulatory feedback loop is present in Drosophila melanogaster.
Results: We perform a survey of ribosome profiling data and reveal that many pri-miRNAs exhibit ribosome translation marks. Focusing on miR-8, we show that pri-miR-8 can produce a miPEP-8. Functional assays performed in Drosophila reveal that miPEP-8 affects development when overexpressed or knocked down. Combining genetic and molecular approaches as well as genome-wide transcriptomic analyses, we show that miR-8 expression is independent of miPEP-8 activity and that miPEP-8 acts in parallel to miR-8 to regulate the expression of hundreds of genes.
Conclusion: Taken together, these results reveal that several Drosophila pri-miRs exhibit translation potential. Contrasting with the mechanism described in plants, these data shed light on the function of yet undescribed primary-microRNA-encoded peptides in Drosophila and their regulatory potential on genome expression.

