Translational landscape in tomato revealed by transcriptome assembly and ribosome profiling

2019 ◽  
Author(s):  
Hsin-Yen Larry Wu ◽  
Gaoyuan Song ◽  
Justin W. Walley ◽  
Polly Yingshan Hsu

mRNA translation is a critical step in gene expression, but our understanding of the landscape and control of translation in diverse crops remains lacking. Here, we combined de novo transcriptome assembly and ribosome profiling to study global mRNA translation in tomato roots. Taking advantage of the 3-nucleotide periodicity displayed by translating ribosomes, we identified 354 novel small ORFs (sORFs) translated from previously unannotated transcripts, as well as 1329 upstream ORFs (uORFs) translated within the 5′ UTRs of annotated protein-coding genes. Proteomic analysis confirmed that some of these novel uORFs and sORFs generate stable proteins in planta. Compared with the annotated ORFs, the uORFs use more flexible Kozak sequences around translation start sites. Interestingly, uORF-containing genes are enriched for protein phosphorylation/dephosphorylation and signal transduction pathways, suggesting a regulatory role for uORFs in these processes. We also demonstrated that ribosome profiling can facilitate the annotation of translated ORFs and noncanonical translation initiation sites. In addition to defining the translatome, our results revealed the global control of mRNA translation by uORFs and microRNAs in tomato. In summary, our approach provides a high-throughput method to discover unannotated ORFs, elucidates evolutionarily conserved translational features, and identifies new regulatory mechanisms hidden in a crop genome.

2019 ◽  
Author(s):  
Thomas F. Martinez ◽  
Qian Chu ◽  
Cynthia Donaldson ◽  
Dan Tan ◽  
Maxim N. Shokhirev ◽  
...  

Protein-coding small open reading frames (smORFs) are emerging as an important class of genes; however, the coding capacity of smORFs in the human genome is unclear. By integrating de novo transcriptome assembly and Ribo-Seq, we confidently annotate thousands of novel translated smORFs in three human cell lines. We find that smORF translation prediction is noisier than for annotated coding sequences, underscoring the importance of analyzing multiple experiments and footprinting conditions. These smORFs are located within non-coding and antisense transcripts, the UTRs of mRNAs, and unannotated transcripts. Analysis of RNA levels and translation efficiency during cellular stress identifies regulated smORFs, providing an approach to select smORFs for further investigation. Sequence conservation and signatures of positive selection indicate that encoded microproteins are likely functional. Additionally, proteomics data from enriched human leukocyte antigen complexes validate the translation of hundreds of smORFs and position them as a source of novel antigens. Thus, smORFs represent a significant number of important, yet unexplored human genes.


2020 ◽  
Vol 295 (27) ◽  
pp. 8999-9011 ◽  
Author(s):  
Alina Glaub ◽  
Christopher Huptas ◽  
Klaus Neuhaus ◽  
Zachary Ardern

Ribosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of Escherichia coli K12 grown in lysogeny broth medium and particularly focused on the size-selection step. We show that for conventional expression analysis, a size range between 22 and 30 nucleotides is sufficient to obtain protein-coding fragments, which has the advantage of removing many unwanted rRNA and tRNA reads. More specific analyses may require longer reads and a corresponding improvement in rRNA/tRNA depletion. There is no consensus about the appropriate sequencing depth for RIBO-Seq experiments in prokaryotes, and studies vary significantly in total read number. Our analysis suggests that 20 million reads that are not mapping to rRNA/tRNA are required for global detection of translated annotated genes. We also highlight the influence of drug-induced ribosome stalling, which causes bias at translation start sites. The resulting accumulation of reads at the start site may be especially useful for detecting weakly expressed genes. As different methods suit different questions, it may not be possible to produce a “one-size-fits-all” ribosome profiling data set. Therefore, experiments should be carefully designed in light of the scientific questions of interest. We propose some basic characteristics that should be reported with any new RIBO-Seq data sets. Careful attention to the factors discussed should improve prokaryotic gene detection and the comparability of ribosome profiling data sets.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Chen Xie ◽  
Cemalettin Bekpen ◽  
Sven Künzel ◽  
Maryam Keshavarz ◽  
Rebecca Krebs-Wheaton ◽  
...  

The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.


2018 ◽  
Author(s):  
Nathan Ryder ◽  
Kevin M. Dorn ◽  
Mark Huitsing ◽  
Micah Adams ◽  
Jeff Ploegstra ◽  
...  

Rhizomes facilitate the wintering and vegetative propagation of many perennial grasses. Sorghum halepense (johnsongrass) is an aggressive perennial grass that relies on a robust rhizome system to persist through winters and reproduce asexually from its rootstock nodes. This study aimed to sequence and assemble expressed transcripts within the johnsongrass rhizome. A de novo transcriptome assembly was generated from a single johnsongrass rhizome meristem tissue sample. A total of 141,176 probable protein-coding sequences from the assembly were identified and assigned gene ontology terms using Blast2GO. The johnsongrass assembly was compared to Sorghum bicolor, a related non-rhizomatous species, along with an assembly of similar rhizome tissue from the perennial grain crop Thinopyrum intermedium. The presence/absence analysis yielded a set of 259 johnsongrass contigs that are likely associated with rhizome development.


Author(s):  
Meltem Kuruş ◽  
Soheil Akbari ◽  
Doğa Eskier ◽  
Ahmet Bursalı ◽  
Kemal Ergin ◽  
...  

The generation and use of induced pluripotent stem cells (iPSCs) to obtain all differentiated adult cell morphologies without requiring embryonic stem cells is one of the most important discoveries in molecular biology. Among the uses of iPSCs is the generation of neurons and organoids to study the biological cues underlying neuronal and brain development, in addition to neurological diseases. These iPSC-derived neuronal differentiation models allow us to examine the gene regulatory factors involved in such processes. Among these regulatory factors are long non-coding RNAs (lncRNAs), genes that are transcribed from the genome and have key biological functions in establishing phenotypes, but are frequently not included in studies focusing on protein-coding genes. Here, we provide a comprehensive analysis and overview of the coding and non-coding transcriptome during multiple stages of the iPSC-derived neuronal differentiation process using RNA-seq. We identify previously unannotated lncRNAs via genome-guided de novo transcriptome assembly, and the distinct characteristics of the transcriptome during each stage, including differentially expressed and stage-specific genes. Using coexpression network analysis, we further identify key genes of the human neuronal differentiation network, representing novel candidates likely to have critical roles in neurogenesis. Our findings provide a valuable resource for future studies on neuronal differentiation.


2016 ◽  
Vol 14 (02) ◽  
pp. 1641006 ◽  
Author(s):  
Oxana A. Volkova ◽  
Yury V. Kondrakhin ◽  
Ivan S. Yevshin ◽  
Tagir F. Valeev ◽  
Ruslan N. Sharipov

Ribosome profiling technology (Ribo-Seq) has made it possible to examine mRNA translation in the cell in greater detail and to obtain additional information on the importance of mRNA sequence features for this process. The application of translation inhibitors such as harringtonine and cycloheximide, together with the mRNA-Seq technique, has helped to assess an important characteristic: translation efficiency. We assessed the translational importance of mRNA sequence features through statistical analysis of Ribo-Seq and mRNA-Seq data. The analysis used translationally important features known from the literature as well as features proposed by the authors. We compared protein-coding versus non-coding RNAs and high- versus low-translated mRNAs. We identified a set of features that discriminate between the compared categories of RNA. Significant relationships between mRNA features and translation efficiency were also established.
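Translation efficiency, as referred to in the abstract above, is commonly estimated as the ratio of ribosome footprint density (Ribo-Seq) to transcript abundance (mRNA-Seq) for the same gene. The following is a minimal sketch under that assumption, not the authors' actual procedure; the function name `translation_efficiency`, the log2 scale, and the pseudocount of 1.0 are illustrative choices.

```python
import math


def translation_efficiency(ribo_density, mrna_density, pseudocount=1.0):
    """Return log2 of (Ribo-Seq density / mRNA-Seq density).

    Both inputs are assumed to be normalized densities (for example
    RPKM-like values) for the same gene. The pseudocount is an
    illustrative assumption that avoids division by zero.
    """
    if ribo_density < 0 or mrna_density < 0:
        raise ValueError("densities must be non-negative")
    return math.log2((ribo_density + pseudocount) / (mrna_density + pseudocount))
```

A positive value suggests that a transcript carries more ribosome footprints than its abundance alone would predict, and a negative value suggests the opposite.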


Insects ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 281
Author(s):  
Haixia Zhan ◽  
Youssef Dewer ◽  
Cheng Qu ◽  
Shiyong Yang ◽  
Chen Luo ◽  
...  

Donacia provosti (Fairmaire, 1885) is a major pest of aquatic crops. It is widely distributed around the world and causes extensive damage to lotus and rice plants. Changes in gene regulation may play an important role in adaptive evolution, particularly during adaptation to feeding and living habits. However, little is known about the evolution and molecular mechanisms underlying the adaptation of D. provosti to its lifestyle and living habits. To address this question, we generated the first larval transcriptome of D. provosti. A total of 20,692 unigenes were annotated from seven public databases, and around 18,536 protein-coding genes were predicted from the analysis of the D. provosti transcriptome. About 5036 orthologous clusters were identified among four species, and 494 unique clusters were identified from D. provosti larvae, including clusters related to visual perception. Furthermore, to reveal the molecular differences between D. provosti and the Colorado potato beetle Leptinotarsa decemlineata, the CDS of the two beetles were compared and 6627 orthologous gene pairs were identified. Based on the ratio of nonsynonymous to synonymous substitutions, 93 orthologous gene pairs were found to be evolving under positive selection. Interestingly, our results also show that 4 of these 93 orthologous gene pairs were associated with the “mTOR signaling pathway” and are predicted to be involved in the molecular mechanism of D. provosti adaptation to the underwater environment. This study provides an important scientific basis for building an effective prevention and control system for the aquatic leaf beetle Donacia provosti.


2017 ◽  
Author(s):  
Mariana B. Grizante ◽  
Marc Tollis ◽  
Juan J. Rodriguez ◽  
Ofir Levy ◽  
Michael J. Angilletta ◽  
...  

Background: The eastern fence lizard (Sceloporus undulatus) has been a model species for ecological and evolutionary research. Genomic and transcriptomic resources for this species would promote investigation of genetic mechanisms that underpin plastic responses to environmental stress, such as climate warming. Moreover, such resources would aid comparative studies of complex traits at the molecular level, such as the transition from oviparous to viviparous reproduction, which happened at least four times within Sceloporus. Findings: A de novo transcriptome assembly for Sceloporus undulatus, Sund_v1.0, was generated using over 179 million Illumina reads obtained from three tissues (whole brain, skeletal muscle, and embryo) as well as previously reported liver sequences. The Sund_v1.0 assembly had an average contig length of 782 nucleotides and an E90N50 statistic of 2,550 nucleotides. Comparing S. undulatus transcripts with the benchmarking universal single-copy orthologs (BUSCO) for tetrapod species yielded 97.2% gene representation. A total of 13,422 protein-coding orthologs were identified in comparison to the genome of the green anole lizard, Anolis carolinensis, which is the closest related species with genomic data available. Conclusions: The multi-tissue transcriptome of S. undulatus is the first for a member of the family Phrynosomatidae, offering an important resource to advance studies of adaptation in this species and genomic research in reptiles.


2015 ◽  
Author(s):  
Lorenzo Calviello ◽  
Neelanjan Mukherjee ◽  
Emanuel Wyler ◽  
Henrik Zauber ◽  
Antje Hirsekorn ◽  
...  

RNA sequencing protocols allow for quantifying gene expression regulation at each individual step, from transcription to protein synthesis. Ribosome Profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. Despite its great potential, a rigorous statistical approach to identify translated regions by means of the characteristic three-nucleotide periodicity of Ribo-seq data is not yet available. To fill this gap, we developed RiboTaper, which quantifies the significance of periodic Ribo-seq reads via spectral analysis methods. We applied RiboTaper to newly generated, deep Ribo-seq data in HEK293 cells, to derive an extensive map of translation that covers Open Reading Frame (ORF) annotations for more than 11,000 protein-coding genes. We also find distinct ribosomal signatures for several hundred detected upstream ORFs and ORFs in annotated non-coding genes (ncORFs). Mass spectrometry data confirm that RiboTaper achieves excellent coverage of the cellular proteome and validate dozens of novel peptide products. Collectively, RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/ ) is a powerful method for comprehensive de novo identification of actively used ORFs in the human genome.
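The three-nucleotide periodicity mentioned in the abstract above can be illustrated with a plain discrete Fourier transform. The sketch below is not RiboTaper and does not reproduce its statistical test; it only measures what fraction of the spectral power in a per-nucleotide footprint profile falls at a period of three. The function name `frame_periodicity_score` and the input format (a list of P-site counts along an ORF) are assumptions made for illustration.

```python
import cmath


def frame_periodicity_score(psite_counts):
    """Fraction of non-constant spectral power found at period 3.

    psite_counts: per-nucleotide read counts along a candidate ORF
    (hypothetical input format). Returns a value between 0 and 1.
    """
    # Trim to a multiple of 3 so that period 3 maps to an exact DFT bin.
    n = len(psite_counts) - len(psite_counts) % 3
    if n < 6:
        raise ValueError("need at least 6 positions")
    counts = psite_counts[:n]
    mean = sum(counts) / n
    centered = [c - mean for c in counts]  # remove the constant component

    # Naive DFT power for frequencies k/n, k = 1 .. n//2.
    powers = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        powers[k] = abs(coeff) ** 2

    total = sum(powers.values())
    if total == 0:
        return 0.0  # flat profile: no periodic signal at all
    return powers[n // 3] / total  # bin n/3 corresponds to period 3
```

A profile in which reads pile up on every third nucleotide scores close to 1, while a flat profile or one with a different period scores close to 0. Assessing statistical significance, as the abstract describes, would require an additional test that this sketch omits.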


2018 ◽  
Author(s):  
Federico Vita ◽  
Amedeo Alpi ◽  
Edoardo Bertolini

The Italian white truffle (Tuber magnatum Pico) is a gastronomic delicacy that dominates the worldwide truffle market. Despite its importance, the genomic resources currently available for this species are still limited. Here we present the first de novo transcriptome assembly of T. magnatum. Illumina RNA-seq data were assembled using a single-k-mer approach into 22,932 transcripts with an N50 of 1,524 bp. Our approach allowed us to predict and annotate 12,367 putative protein-coding sequences, grouped into 6,723 loci. In addition, we identified 2,581 gene-based SSR markers. This work provides the first publicly available reference transcriptome for genomics and genetic studies, providing insight into the molecular mechanisms underlying the biology of this important species.
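The N50 statistic reported in the abstract above is conventionally defined as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that conventional definition follows; the function name `n50` is illustrative, and the example lengths in the usage note are made up rather than taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length
```

For example, the lengths 2 through 10 sum to 54; the three longest (10, 9, 8) reach 27, so the N50 is 8.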

