The complexity of alternative splicing and landscape of tissue-specific expression in lotus (Nelumbo nucifera) unveiled by Illumina- and single-molecule real-time-based RNA-sequencing

DNA Research ◽  
2019 ◽  
Vol 26 (4) ◽  
pp. 301-311 ◽  
Author(s):  
Yue Zhang ◽  
Tonny Maraga Nyong'A ◽  
Tao Shi ◽  
Pingfang Yang

Abstract Alternative splicing (AS) plays a critical role in regulating different physiological and developmental processes in eukaryotes by dramatically increasing the diversity of the transcriptome and the proteome. However, the saturation and complexity of AS remain unclear in lotus because full-length, multiply spliced isoforms have rarely been obtained. In this study, we apply a hybrid assembly strategy combining single-molecule real-time sequencing and Illumina RNA-seq to gain comprehensive insight into the lotus transcriptomic landscape. We identified 211,802 high-quality full-length non-chimeric reads, with 192,690 non-redundant isoforms, and updated the lotus reference gene model. Moreover, our analysis identified a total of 104,288 AS events from 16,543 genes, with alternative 3ʹ splice-site being the predominant model, followed by intron retention. By exploring tissue datasets, 370 tissue-specific AS events were identified among 12 tissues. Both the tissue-specific genes and isoforms might play important roles in tissue or organ development, and partly fit the ‘ABCE’ model in floral tissues. The large number of AS events and isoform variants identified in our study enhances the understanding of transcriptional diversity in lotus and provides a valuable resource for further functional genomic studies.

2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Chong Tan ◽  
Hongxin Liu ◽  
Jie Ren ◽  
Xueling Ye ◽  
Hui Feng ◽  
...  

Abstract Background Anther development has been extensively studied at the transcriptional level, but a systematic analysis of full-length transcripts on a genome-wide scale has not yet been published. Here, the Pacific Biosciences (PacBio) Sequel platform and next-generation sequencing (NGS) technology were combined to generate full-length sequences and complete structures of transcripts in anthers of Chinese cabbage. Results Using single-molecule real-time sequencing (SMRT), a total of 1,098,119 circular consensus sequences (CCSs) were generated with a mean length of 2664 bp. More than 75% of the CCSs were considered full-length non-chimeric (FLNC) reads. After error correction, 725,731 high-quality FLNC reads were estimated to carry 51,501 isoforms from 19,503 loci, consisting of 38,992 novel isoforms from known genes and 3691 novel isoforms from novel genes. Of the novel isoforms, we identified 407 long non-coding RNAs (lncRNAs) and 37,549 open reading frames (ORFs). Furthermore, a total of 453,270 alternative splicing (AS) events were identified and the majority of AS models in anther were determined to be approximate exon skipping (XSKIP) events. Of the key genes regulated during anther development, AS events were mainly identified in the genes SERK1, CALS5, NEF1, and CESA1/3. Additionally, we identified 104 fusion transcripts and 5806 genes that had alternative polyadenylation (APA). Conclusions Our work demonstrated the transcriptome diversity and complexity of anther development in Chinese cabbage. The findings provide a basis for further genome annotation and transcriptome research in Chinese cabbage.


2021 ◽  
Vol 22 (19) ◽  
pp. 10443
Author(s):  
Yong Wang ◽  
Jialei Ji ◽  
Long Tong ◽  
Zhiyuan Fang ◽  
Limei Yang ◽  
...  

Cabbage (Brassica oleracea L. var. capitata L.) is an important vegetable crop cultivated around the world. Previous studies of cabbage gene transcripts were primarily based on next-generation sequencing (NGS) technology which cannot provide accurate information concerning transcript assembly and structure analysis. To overcome these issues and analyze the whole cabbage transcriptome at the isoform level, PacBio RS II Single-Molecule Real-Time (SMRT) sequencing technology was used for a global survey of the full-length transcriptomes of five cabbage tissue types (root, stem, leaf, flower, and silique). A total of 77,048 isoforms, capturing 18,183 annotated genes, were discovered from the sequencing data generated through SMRT. The patterns of both alternative splicing (AS) and alternative polyadenylation (APA) were comprehensively analyzed. In total, we detected 13,468 genes which had isoforms containing APA sites and 8978 genes which underwent AS events. Moreover, 5272 long non-coding RNAs (lncRNAs) were discovered, and most exhibited tissue-specific expression. In total, 3147 transcription factors (TFs) were detected and 10 significant gene co-expression network modules were identified. In addition, we found that Fusarium wilt, black rot and clubroot infection significantly influenced AS in resistant cabbage. In summary, this study provides abundant cabbage isoform transcriptome data, which promotes reannotation of the cabbage genome, deepens our understanding of their post-transcriptional regulation mechanisms, and can be used for future functional genomic research.


2018 ◽  
Author(s):  
Yuehui Chao ◽  
Jianbo Yuan ◽  
Sifeng Li ◽  
Siqiao Jia ◽  
Liebao Han ◽  
...  

Abstract Red clover (Trifolium pratense L.) is an important cool-season legume and the most widely planted forage legume after alfalfa. Although a draft genome sequence has already been published, the sequences and complete structures of mRNA transcripts remain unclear, which limits further research on red clover. In this study, the red clover transcriptome was sequenced using single-molecule long-read sequencing to identify full-length splice isoforms, and 29,730 novel isoforms from known genes and 2,194 novel isoforms from novel genes were identified. A total of 5,492 alternative splicing events were identified, and the majority of alternative splicing events in red clover were classified as intron retention. In addition, of the 15,229 genes detected by SMRT, 8,719 including 1,86,517 transcripts have at least one poly(A) site. Furthermore, we identified 4,333 long non-coding RNAs and 3,762 fusion transcripts. Our results show the feasibility of deep sequencing full-length RNA from the red clover transcriptome at the single-molecule level.


2016 ◽  
Author(s):  
Zheng Kuang ◽  
Jef D. Boeke ◽  
Stefan Canzar

Abstract Alternative splicing increases the diversity of transcriptomes and proteomes in metazoans. The extent to which alternative splicing is active and functional in unicellular organisms is less understood. Here we exploit a single-molecule long-read sequencing technique and develop an open-source software program called SpliceHunter, to characterize the transcriptome in the meiosis of fission yeast. We reveal 17017 alternative splicing events in 19741 novel isoforms at different stages of meiosis, including antisense and read-through transcripts. Intron retention is the major type of alternative splicing, followed by “alternate intron in exon”. 887 novel transcription units are detected; 60 of the predicted proteins show homology in other species and form theoretical stable structures. We compare the dynamics of novel isoforms based on the number of supporting full-length reads with those of annotated isoforms and explore the translational capacity and quality of novel isoforms. The evaluation of these factors indicates that the majority of novel isoforms are unlikely to be both condition-specific and translatable but the possibility of functional novel isoforms is not excluded. Moreover, the co-option of these unusual transcripts into newly born genes seems likely. Together, this study highlights the diversity and dynamics at the isoform level in the sexual development of fission yeast.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7933 ◽  
Author(s):  
Na Ding ◽  
Huihui Cui ◽  
Ying Miao ◽  
Jun Tang ◽  
Qinghe Cao ◽  
...  

Background Sweet potato (Ipomoea batatas (L.) Lam.) is one of the most important crops in many developing countries and provides a candidate source of bioenergy. However, neither a complete reference genome nor large-scale full-length cDNA sequences for this outcrossing hexaploid crop are available, which in turn impedes progress in research studies in I. batatas functional genomics and molecular breeding. Methods In this study, we sequenced full-length transcriptomes in I. batatas and its diploid ancestor I. trifida by single-molecule real-time sequencing and Illumina second-generation sequencing technologies. With the generated datasets, we conducted comprehensive intraspecific and interspecific sequence analyses and experimental characterization. Results A total of 53,861/51,184 high-quality long-read transcripts were obtained, which covered about 10,439/10,452 loci in the I. batatas/I. trifida genome. These datasets enabled us to predict open reading frames successfully in 96.83%/96.82% of transcripts and identify 34,963/33,637 full-length cDNA sequences, 1,401/1,457 transcription factors, 25,315/27,090 simple sequence repeats, 1,656/1,389 long non-coding RNAs, and 5,251/8,901 alternative splicing events. Approximately, 32.34%/38.54% of transcripts and 46.22%/51.18% multi-exon transcripts underwent alternative splicing in I. batatas/I. trifida. Moreover, we validated one alternative splicing event in each of 10 genes and identified tuberous-root-specific expressed isoforms from a starch-branching enzyme, an alpha-glucan phosphorylase, a neutral invertase, and several ABC transporters. Overall, the collection and analysis of large-scale long-read transcripts generated in this study will serve as a valuable resource for the I. batatas research community, which may accelerate the progress in its structural, functional, and comparative genomics studies.


2019 ◽  
Vol 14 (7) ◽  
pp. 566-573 ◽  
Author(s):  
Yubang Gao ◽  
Feihu Xi ◽  
Hangxiao Zhang ◽  
Xuqing Liu ◽  
Huiyuan Wang ◽  
...  

Background: The advent of Single-Molecule Real-Time (SMRT) Isoform Sequencing (Iso-Seq) has paved the way to obtaining longer full-length transcripts. This method was found to be far superior to Next Generation Sequencing (NGS)-based short-read sequencing (RNA-Seq) in identifying full-length splice variants and other post-transcriptional events. Several different bioinformatics tools to analyze Iso-Seq data have been developed, and some of them are still being refined to address different aspects of transcriptome complexity. However, a comprehensive summary of the available tools and their utility is still lacking. Objective: Here, we summarized the existing Iso-Seq analysis tools and presented an integrated bioinformatics pipeline for Iso-Seq analysis, which overcomes the limitations of NGS and generates long contiguous Full-Length Non-Chimeric (FLNC) reads for the analysis of post-transcriptional events. Results: In this review, we summarized recent applications of Iso-Seq in plants, which include improved genome annotations, identification of novel genes and lncRNAs, identification of full-length splice isoforms, and detection of novel Alternative Splicing (AS) and Alternative Polyadenylation (APA) events. In addition, we discussed the bioinformatics pipeline for comprehensive Iso-Seq data analysis, including how to reduce the error rate in the reads and how to identify and quantify post-transcriptional events. Furthermore, the visualization approach of Iso-Seq was discussed as well. Finally, we discussed methods to combine Iso-Seq data with RNA-Seq for transcriptome quantification. Conclusion: Overall, this review demonstrates that Iso-Seq is pivotal for analyzing transcriptome complexity and that this new method offers unprecedented opportunities to comprehensively understand transcript diversity.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7062 ◽  
Author(s):  
Jun Ma ◽  
Yixuan Xiang ◽  
Yingyuan Xiong ◽  
Zhen Lin ◽  
Yanbin Xue ◽  
...  

Background Ananas comosus var. bracteatus is an herbaceous perennial monocot cultivated as an ornamental plant for its chimeric leaves. Because of its genomic complexity, and because no genomic information is available in the public GenBank database, the complete structure of the mRNA transcript is unclear and there are limited molecular mechanism studies for Ananas comosus var. bracteatus. Methods Three size fractionated full-length cDNA libraries (1–2 kb, 2–3 kb, and 3–6 kb) were constructed and subsequently sequenced in five single-molecule real-time (SMRT) cells (2 cells, 2 cells, and 1 cell, respectively). Results In total, 19,838 transcripts were identified for alternative splicing (AS) analysis. Among them, 19,185 (96.7%) transcripts were functionally annotated. A total of 9,921 genes were identified by mapping the non-redundant isoforms to the reference genome. A total of 10,649 AS events were identified, the majority of which were intron retention events. The alternatively spliced genes had functions in the basic metabolism processes of the plant such as carbon metabolism, amino acid biosynthesis, and glycolysis. Fourteen genes related to chlorophyll biosynthesis were identified as having AS events. The distribution of the splicing sites and the percentage of conventional and non-canonical AS sites of the genes categorized in pathways related to the albino leaf phenotype (ko00860, ko00195, ko00196, and ko00710) varied greatly. The present results showed that there were 8,316 genes carrying at least one poly (A) site, which generated 21,873 poly (A) sites. These findings indicated that the quality of the gene structure and functional information of the obtained genome was greatly improved, which may facilitate further genetic study of Ananas comosus var. bracteatus.


2021 ◽  
Vol 12 ◽  
Author(s):  
Tingyu Ma ◽  
Han Gao ◽  
Dong Zhang ◽  
Wei Sun ◽  
Qinggang Yin ◽  
...  

Artemisinin is currently the most effective ingredient in the treatment of malaria, so studying the genetic regulation of Artemisia annua is of great significance. Alternative splicing (AS) is a regulatory process that increases the complexity of the transcriptome and proteome. The most common mechanism of AS in plants is intron retention (IR). However, little is known about whether the IR isoforms produced by light play roles in regulating biosynthetic pathways. In this work, we explore how the level of AS in A. annua responds to light regulation. We obtained a new dataset of AS by analyzing full-length transcripts using both Illumina- and single-molecule real-time (SMRT)-based RNA-seq as well as analyzing AS in various tissues. A total of 5,854 IR isoforms were identified, with IR accounting for the highest proportion (48.48%), affirming that IR is the most common mechanism of AS. We found that the number of up-regulated IR isoforms (1534/1378, blue and red light, respectively) was more than twice that of down-regulated isoforms (636/682) after treatment with blue or red light. In the artemisinin biosynthetic pathway, 10 genes produced 16 differentially expressed IR isoforms. This work demonstrated that the differential expression of IR isoforms induced by light has the potential to regulate sesquiterpenoid biosynthesis. This study also provides high-accuracy full-length transcripts, which can be a valuable genetic resource for further research on A. annua, including areas of development, breeding, and biosynthesis of active compounds.


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
D Oehler ◽  
A Goedecke ◽  
A Spychala ◽  
K Lu ◽  
N Gerdes ◽  
...  

Abstract Background Alternative splicing is a process by which exons within a pre-mRNA are joined or skipped, resulting in multiple isoforms being encoded by a single gene. Alternative splicing affecting transcription factors may have a substantial impact on cellular dynamics. The PPARG Coactivator 1 Alpha (PGC1-α) is a major modulator in energy metabolism. Data from murine skeletal muscle revealed distinctive isoform patterns giving rise to different phenotypes, i.e. mitogenesis and hypertrophy. Here, we aimed to establish a complete dataset of isoforms in murine and human heart, applying single-molecule real-time (SMRT) sequencing as a novel approach to identify transcripts without the need for assembly, resulting in true full-length sequences. Moreover, we aimed to unravel the functional relevance of the various isoforms during experimental ischemia reperfusion (I/R). Methods RNA isolation was performed in murine (C57Bl/6J) or human heart tissue (obtained during LVAD surgery), followed by library preparation and SMRT sequencing. Bioinformatic analysis was done using a modified IsoSeq3-Pipeline and OS-tools. Identification of PGC1-α isoforms was performed by similarity search against exonic sequences within the full-length, non-concatemer (FLNC) reads. Isoforms with an open reading frame (ORF) were manually curated and validated by PCR and Sanger sequencing. I/R was induced by ligature of the LAD for 45 min in mice on standard chow as well as on a high-fat-high-sucrose diet. Area At Risk (AAR) and remote tissue were collected three and 16 days after I/R or sham surgery (n=4 per time point). Promoter patterns were analyzed by qPCR. Results Deciphering the full-length transcriptome of murine and human heart resulted in ∼60000 isoforms with 99% accuracy on mRNA sequence. Focusing on murine PGC1-α isoforms, we discovered and verified 15 novel transcripts generated by hitherto unknown splicing events. Additionally, we identified a novel Exon 1 originating between the known promoters followed by a valid ORF, suggesting the discovery of a novel promoter. Remarkably, we found a homologous novel Exon 1 in human heart, suggesting conservation of the postulated promoter. In I/R, the AAR exhibited significantly lower expression of established and novel promoters compared to remote tissue under standard chow 3 d post I/R. At 16 d post I/R, the difference between AAR and remote tissue equalized under standard chow while persisting under high-fat diet. Conclusion Applying the SMRT technique, we generated for the first time a complete full-length transcriptome of the murine and human heart, identifying 15 novel potentially coding transcripts of PGC1-α and a novel Exon 1. These transcripts are differentially regulated in experimental I/R in AAR and remote myocardium, suggesting that transcriptional regulation and alternative splicing modulate PGC1-α function in the heart. Differences between standard chow and high-fat diet suggest an impact of impaired glucose metabolism on regulatory processes after myocardial infarction. Funding Acknowledgement Type of funding source: Public grant(s) – National budget only. Main funding source(s): Collaborative Research Centre 1116 (German Research Foundation)

