Long-read sequencing of Chrysanthemum morifolium cv. ‘Hangju’ transcriptome reveals flavonoid biosynthesis and regulation

Abstract Background: The inflorescence of Chrysanthemum morifolium cv. ‘Hangju’ has been widely used in China due to its antioxidant and anti-inflammatory properties. The biosynthesis and regulation of flavonoids, a group of bioactive components, in C. morifolium are poorly understood. Transcriptome sequencing is an effective method for obtaining transcript information. Therefore, single-molecule real-time (SMRT) sequencing was performed to obtain the full-length genes involved in flavonoid biosynthesis and regulation in C. morifolium. Results: High-quality RNA was extracted from the inflorescence of C. morifolium at different developmental stages and used to construct two libraries (0-5 kb and 4.5-10 kb) for sequencing. Finally, 125,532 non-redundant isoforms with a mean length of 2,009 bp were obtained. Of these, 2,083 transcripts were annotated to pathways related to flavonoid biosynthesis, and 56 isoforms were annotated as CHS, CHI, F3H, F3’H, FNS Ⅱ, FLS, DFR and ANS genes. Based on gene expression levels at different stages, we predicted the major genes involved in flavonoid biosynthesis. By phylogenetic analysis, we found two candidate MYB transcription factors (CmMYBF1 and CmMYBF2) activating flavonol biosynthesis. Conclusions: Based on the full-length transcriptomic data and further quantitative analysis, the major genes involved in flavonoid biosynthesis and regulation in C. morifolium were predicted in our study. The results provide a valuable theoretical basis for the introduction and cultivation of C. morifolium cv. ‘Hangju’.

Long-read sequencing of Chrysanthemum morifolium cv. ‘Hangju’ transcriptome reveals flavonoid biosynthesis and regulation

10.21203/rs.2.19942/v1 ◽

2020 ◽

Author(s):

Tao Wang ◽

Feng Yang ◽

Qiaosheng Guo ◽

Qingjun Zou ◽

Wenyan Zhang ◽

...

Keyword(s):

Single Molecule ◽

Developmental Stages ◽

Chrysanthemum Morifolium ◽

Flavonoid Biosynthesis ◽

Full Length ◽

Bioactive Components ◽

SMRT Sequencing ◽

Major Genes ◽

Long Read ◽

Gene Expression Levels

Long-read sequencing of Chrysanthemum morifolium cv. ‘Hangju’ transcriptome reveals flavonoid biosynthesis and regulation

10.21203/rs.2.15455/v1 ◽

2019 ◽

Author(s):

Tao Wang ◽

Feng Yang ◽

Qiaosheng Guo ◽

Qingjun Zou ◽

Wenyan Zhang ◽

...

Keyword(s):

Single Molecule ◽

Chrysanthemum Morifolium ◽

Flavonoid Biosynthesis ◽

Full Length ◽

Transcriptome Data ◽

Bioactive Components ◽

SMRT Sequencing ◽

Major Genes ◽

Poor Understanding ◽

Long Read

Abstract Background: The capitulum of Chrysanthemum morifolium cv. ‘Hangju’ has been widely used in China for its antioxidant and anti-inflammatory properties. The biosynthesis and regulation of flavonoids, one of the bioactive components of C. morifolium, are poorly understood. Transcriptome sequencing is an effective method for capturing transcript information, so single-molecule real-time (SMRT) sequencing was performed to obtain the full-length genes involved in flavonoid biosynthesis and regulation in C. morifolium. Results: High-quality RNA was extracted from the capitulum of C. morifolium at different developmental stages and used to construct two libraries (0-5 kb and 4.5-10 kb) for sequencing. In total, 125,532 non-redundant isoforms with a mean length of 2,009 bp were captured. Of these, 2,083 transcripts were annotated to pathways related to flavonoid biosynthesis, and 56 isoforms were annotated as CHS, CHI, F3H, F3’H, FNS Ⅱ, FLS, DFR and ANS genes. Based on gene expression levels at different stages, we predicted the major genes involved in flavonoid biosynthesis. By phylogenetic analysis, we also found two candidate MYB factors (CmMYBF1 and CmMYBF2) that may activate flavonol biosynthesis. Conclusions: Based on the full-length transcriptome data and further quantitative analysis, the major genes involved in flavonoid biosynthesis and regulation in C. morifolium were predicted in our study. The results provide a valuable theoretical basis for the introduction and cultivation of C. morifolium cv. ‘Hangju’.

Single-Molecule Long-Read Sequencing Reveals the Diversity of Full-Length Transcripts in Leaves of Gnetum (Gnetales)

International Journal of Molecular Sciences ◽

10.3390/ijms20246350 ◽

2019 ◽

Vol 20 (24) ◽

pp. 6350 ◽

Cited By ~ 2

Author(s):

Nan Deng ◽

Chen Hou ◽

Fengfeng Ma ◽

Caixia Liu ◽

Yuxin Tian

Keyword(s):

Single Molecule ◽

Developmental Stages ◽

Alternative Polyadenylation ◽

Full Length ◽

Stomatal Development ◽

RNA Seq ◽

Leaf Transcriptome ◽

Long Read ◽

Non Coding RNAs ◽

A Site

The limitations of RNA sequencing make it difficult to accurately predict alternative splicing (AS) and alternative polyadenylation (APA) events and long non-coding RNAs (lncRNAs), all of which reveal transcriptomic diversity and the complexity of gene regulation. Gnetum, a genus with ambiguous phylogenetic placement in seed plants, has a distinct stomatal structure and photosynthetic characteristics. In this study, a full-length transcriptome of Gnetum luofuense leaves at different developmental stages was sequenced with the latest PacBio Sequel platform. After correction with short reads generated by Illumina RNA-Seq, 80,496 full-length transcripts were obtained, of which 5269 reads were identified as isoforms of novel genes. Additionally, 1660 lncRNAs and 12,998 AS events were detected. In total, 5647 genes in the G. luofuense leaves exhibited APA, with at least one poly(A) site. Moreover, 67 and 30 genes from the bHLH gene family, which plays an important role in stomatal development and photosynthesis, were identified from the G. luofuense genome and leaf transcripts, respectively. This leaf transcriptome supplements the reference genome of G. luofuense, and the AS events and lncRNAs detected provide valuable resources for future studies investigating the low photosynthetic capacity of Gnetum.

Single-Molecule Real-Time Sequencing of the Madhuca pasquieri (Dubard) Lam. Transcriptome Reveals the Diversity of Full-Length Transcripts

Forests ◽

10.3390/f11080866 ◽

2020 ◽

Vol 11 (8) ◽

pp. 866

Author(s):

Lei Kan ◽

Qicong Liao ◽

Zhiyao Su ◽

Yushan Tan ◽

Shuyu Wang ◽

...

Keyword(s):

Seed Germination ◽

Single Molecule ◽

Developmental Stages ◽

De Novo ◽

Full Length ◽

Wild Plant ◽

Transcript Isoforms ◽

Long Read ◽

Full Length Transcript ◽

Generation Sequencing

Madhuca pasquieri (Dubard) Lam. is a tree on the International Union for Conservation of Nature Red List and a national key protected wild plant (II) of China, known for its seed oil and timber. However, the lack of genomic and transcriptome data for this species hampers the study of its reproduction, utilization, and conservation. Here, single-molecule long-read sequencing (PacBio) and next-generation sequencing (Illumina) were combined to obtain the transcriptome from five developmental stages of M. pasquieri. Overall, 25,339 transcript isoforms were detected by PacBio, including 24,492 coding sequences (CDSs), 9440 simple sequence repeats (SSRs), 149 long non-coding RNAs (lncRNAs), and 182 alternative splicing (AS) events, most of which were retained introns (RI). A further 1058 transcripts were identified as transcription factors (TFs) from 51 TF families. PacBio recovered more full-length transcript isoforms, with greater length and higher expression levels, whereas a larger number of transcripts (124,405) was captured by de novo assembly of the Illumina data. Using the Nr, Swissprot, KOG, and KEGG databases, 24,405 PacBio transcripts (96.31%) were annotated. Functional annotation revealed a role for the auxin, abscisic acid, gibberellin, and cytokinin metabolic pathways in seed germination and post-germination. These findings support further studies on the seed germination mechanism and genome of M. pasquieri, and better protection of this endangered species.

Single-Molecule Real-Time Transcript Sequencing of Turnips Unveiling the Complexity of the Turnip Transcriptome

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401434 ◽

2020 ◽

Vol 10 (10) ◽

pp. 3505-3514

Author(s):

Hongmei Zhuang ◽

Qiang Wang ◽

Hongwei Han ◽

Huifang Liu ◽

Hao Wang

Keyword(s):

Real Time ◽

Brassica Rapa ◽

Single Molecule ◽

Developmental Stages ◽

Full Length ◽

Sequencing Data ◽

SMRT Sequencing ◽

High Quality ◽

Transcript Structure ◽

Novel Transcripts

This study aimed to generate the full-length transcriptome of Xinjiang green and purple turnips, Brassica rapa var. rapa, using single-molecule real-time (SMRT) sequencing. Samples of two varieties of Brassica rapa var. rapa at five developmental stages were collected and combined for SMRT sequencing. Meanwhile, next-generation sequencing was performed to correct the SMRT sequencing data. A series of analyses were performed to investigate the transcript structure. Finally, the obtained transcripts were mapped to the genome of Brassica rapa ssp. pekinensis Chiifu to identify potential novel transcripts. For green turnip (F01), a total of 19.54 Gb of clean data were obtained from 8 cells. The numbers of reads of insert (ROI) and full-length non-chimeric (FLNC) reads were 510,137 and 267,666, respectively. In addition, 82,640 consensus isoforms were obtained by isoform sequence clustering, of which 69,480 were high-quality, and the 13,160 low-quality sequences were corrected using Illumina RNA-seq data. For purple turnip (F02), there were 20.41 Gb of clean data, 552,829 ROIs, and 274,915 FLNC sequences. A total of 93,775 consensus isoforms were obtained, of which 78,798 were high-quality, and the 14,977 low-quality sequences were corrected. Following the removal of redundant sequences, there were 46,516 and 49,429 non-redundant transcripts for F01 and F02, respectively; 7,774 and 9,385 alternative splicing events were predicted for F01 and F02; 63,890 simple sequence repeats, 59,460 complete coding sequences, and 535 long non-coding RNAs were predicted. Moreover, 5,194 and 5,369 novel transcripts were identified by mapping to Brassica rapa ssp. pekinensis Chiifu. The obtained transcriptome data may improve turnip genome annotation and facilitate further study of the Brassica rapa var. rapa genome and transcriptome.

SMRT sequencing of the full-length transcriptome of the white-backed planthopper Sogatella furcifera

PeerJ ◽

10.7717/peerj.9320 ◽

2020 ◽

Vol 8 ◽

pp. e9320

Author(s):

Jing Chen ◽

Yaya Yu ◽

Kui Kang ◽

Daowei Zhang

Keyword(s):

Single Molecule ◽

Developmental Stages ◽

Virus Transmission ◽

Full Length ◽

Transcriptome Data ◽

SMRT Sequencing ◽

Sogatella Furcifera ◽

Consensus Sequences ◽

Host Interactions ◽

SSR Analysis

The white-backed planthopper Sogatella furcifera is an economically important rice pest distributed throughout Asia. It damages rice crops by sucking phloem sap, resulting in stunted growth and plant virus transmission. We aimed to obtain the full-length transcriptome data of S. furcifera using PacBio single-molecule real-time (SMRT) sequencing. Total RNA extracted from S. furcifera at various developmental stages (egg, larval, and adult stages) was mixed and used to generate a full-length transcriptome for SMRT sequencing. Long non-coding RNA (lncRNA) identification, full-length coding sequence prediction, full-length non-chimeric (FLNC) read detection, simple sequence repeat (SSR) analysis, transcription factor detection, and transcript functional annotation were performed. A total of 12,514,449 subreads (15.64 Gbp, clean reads) were generated, including 630,447 circular consensus sequences and 388,348 FLNC reads. Transcript cluster analysis of the FLNC reads revealed 251,109 consensus reads including 29,700 high-quality reads. Additionally, 100,360 SSRs and 121,395 coding sequences were identified using SSR analysis and ANGEL software, respectively. Furthermore, 44,324 lncRNAs were annotated using four tools and 1,288 transcription factors were identified. In total, 95,495 transcripts were functionally annotated based on searches of seven different databases. To the best of our knowledge, this is the first study of the full-length transcriptome of the white-backed planthopper obtained using SMRT sequencing. The acquired transcriptome data can facilitate further studies on the ecological and viral-host interactions of this agricultural pest.

Comparative Transcriptome Analysis Combining SMRT- and Illumina-Based RNA-Seq Identifies Potential Candidate Genes Involved in Betalain Biosynthesis in Pitaya Fruit

International Journal of Molecular Sciences ◽

10.3390/ijms21093288 ◽

2020 ◽

Vol 21 (9) ◽

pp. 3288

Author(s):

Yawei Wu ◽

Juan Xu ◽

Xiumei Han ◽

Guang Qiao ◽

Kun Yang ◽

...

Keyword(s):

Candidate Genes ◽

Single Molecule ◽

Developmental Stages ◽

Full Length ◽

Reference Database ◽

White Pulp ◽

Potential Candidate ◽

RNA Seq ◽

SMRT Sequencing ◽

Novel Genes

To gain more valuable genomic information about betalain biosynthesis, the full-length transcriptome of pitaya pulp from ‘Zihonglong’ (red pulp) and ‘Jinghonglong’ (white pulp) at four fruit developmental stages was analyzed using Single-Molecule Real-Time (SMRT) sequencing corrected by Illumina RNA sequencing (Illumina RNA-Seq). A total of 65,317 and 91,638 genes were identified in ‘Zihonglong’ and ‘Jinghonglong’, respectively. A total of 11,377 and 15,551 genes with more than two isoforms were investigated from ‘Zihonglong’ and ‘Jinghonglong’, respectively. In total, 156,955 genes were acquired after elimination of redundancy, of which 120,604 genes (79.63%) were annotated, and 30,875 (20.37%) sequences without hits to the reference database were probably novel genes in pitaya. A total of 31,169 and 53,024 simple sequence repeats (SSRs) were uncovered from the genes of ‘Zihonglong’ and ‘Jinghonglong’, and 11,650 long non-coding RNAs (lncRNAs) in ‘Zihonglong’ and 11,113 lncRNAs in ‘Jinghonglong’ were obtained. qRT-PCR was conducted on ten candidate genes; the expression levels of six novel genes were consistent with the Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values. In conclusion, we undertook the first SMRT sequencing of the full-length transcriptome of pitaya, and the valuable resource acquired through this sequencing facilitated the identification of additional betalain-related genes. Notably, a list of novel putative genes related to the synthesis of betalain in pitaya fruits was assembled. This may provide new insights into betalain synthesis in pitaya.

PacBio single-molecule long-read sequencing shed new light on the complexity of the Carex breviculmis transcriptome

BMC Genomics ◽

10.1186/s12864-019-6163-6 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Ke Teng ◽

Wenjun Teng ◽

Haifeng Wen ◽

Yuesen Yue ◽

Weier Guo ◽

...

Keyword(s):

Single Molecule ◽

Average Length ◽

Transcriptome Profiling ◽

Full Length ◽

Regulation Mechanism ◽

Forage Production ◽

SMRT Sequencing ◽

Reference Transcriptome ◽

Long Read ◽

Ecological Conservation

Abstract Background: Carex L., a grass genus commonly known as sedges, is distributed worldwide and contributes constructively to turf management, forage production, and ecological conservation. The development of next-generation sequencing (NGS) technologies has considerably improved our understanding of the transcriptome complexity of Carex L. and provided a valuable genetic reference. However, the current transcriptome is not satisfactory, mainly because of the enormous difficulty in obtaining full-length transcripts. Results: In this study, we employed PacBio single-molecule long-read sequencing (SMRT) technology for whole-transcriptome profiling in Carex breviculmis. We generated 60,353 high-confidence non-redundant transcripts with an average length of 2302 bp. A total of 3588 alternative splicing events and 1273 long non-coding RNAs were identified. Furthermore, 40,347 complete coding sequences were predicted, providing an informative reference transcriptome. In addition, the transcriptional regulation mechanism of C. breviculmis in response to shade stress was further explored by mapping the NGS data to the reference transcriptome constructed by SMRT sequencing. Conclusions: This study provided a full-length reference transcriptome of C. breviculmis using the SMRT sequencing method for the first time. The transcriptome atlas obtained will not only facilitate future functional genomics studies but also pave the way for further selective and genetic engineering breeding projects for C. breviculmis.

Comparative analysis of transcriptional regulation of betalain biosynthesis based on SMRT sequencing of full-length transcriptome in two pitaya cultivars (red pulp and white pulp)

10.21203/rs.2.14828/v1 ◽

2019 ◽

Author(s):

Yawei Wu ◽

Juan Xu ◽

Xiumei Han ◽

Guang Qiao ◽

Kun Yang ◽

...

Keyword(s):

Single Molecule ◽

Developmental Stages ◽

Full Length ◽

Reference Database ◽

White Pulp ◽

RNA Seq ◽

Genomic Information ◽

SMRT Sequencing ◽

And Function ◽

Red Pulp

Abstract Background: To gain more valuable genomic information related to betalain biosynthesis, the full-length transcriptome of pitaya was analyzed in the present study using Single-Molecule Real-Time (SMRT) sequencing corrected by RNA-seq. Two pitaya cultivars, ‘Zihonglong’ (red pulp) and ‘Jinghonglong’ (white pulp), were selected for transcriptome analysis of betalain biosynthesis at four fruit developmental stages. Results: A total of 65,317 and 91,638 protein-coding genes were identified in ‘Zihonglong’ and ‘Jinghonglong’, respectively. A total of 11,377 and 15,551 genes with more than two isoforms were investigated from ‘Zihonglong’ and ‘Jinghonglong’, respectively. Also, 156,955 genes were acquired after elimination of redundancy, of which 120,604 genes (79.63%) were annotated, and 30,875 (20.37%) sequences without hits to the reference database were probably novel genes in pitaya. In total, 31,169 and 53,024 SSRs were uncovered from the genes of ‘Zihonglong’ and ‘Jinghonglong’, and 11,650 lncRNAs in ‘Zihonglong’ and 11,113 lncRNAs in ‘Jinghonglong’ were obtained. Further, 104 genes involved in betalain metabolism were identified, and HpCYP76AD4 and HpDODA probably responded to betalain biosynthesis. Conclusions: This is the first study to perform SMRT sequencing of the full-length transcriptome of pitaya, which provides a useful genomic clue for exploring the structure and function of genes in pitaya, particularly for betalain biosynthesis.

Full-length-transcriptomic analysis in mice and human heart using Single-Molecule Real-time Sequencing (SMRT) identified 15 novel isoforms and a novel promoter region of PGC1-alpha

European Heart Journal ◽

10.1093/ehjci/ehaa946.3582 ◽

2020 ◽

Vol 41 (Supplement_2) ◽

Author(s):

D Oehler ◽

A Goedecke ◽

A Spychala ◽

K Lu ◽

N Gerdes ◽

...

Keyword(s):

Alternative Splicing ◽

Single Molecule ◽

High Fat Diet ◽

Human Heart ◽

Full Length ◽

Funding Source ◽

SMRT Sequencing ◽

High Fat ◽

Exon 1 ◽

Novel Exon

Abstract Background: Alternative splicing is a process by which exons within a pre-mRNA are joined or skipped, resulting in multiple isoforms being encoded by a single gene. Alternative splicing affecting transcription factors may have a substantial impact on cellular dynamics. The PPARG Coactivator 1 Alpha (PGC1-α) is a major modulator of energy metabolism. Data from murine skeletal muscle revealed distinctive isoform patterns giving rise to different phenotypes, i.e. mitogenesis and hypertrophy. Here, we aimed to establish a complete dataset of isoforms in murine and human heart by applying single-molecule real-time (SMRT) sequencing as a novel approach to identify transcripts without the need for assembly, resulting in true full-length sequences. Moreover, we aimed to unravel the functional relevance of the various isoforms during experimental ischemia reperfusion (I/R). Methods: RNA isolation was performed in murine (C57Bl/6J) or human heart tissue (obtained during LVAD surgery), followed by library preparation and SMRT sequencing. Bioinformatic analysis was done using a modified IsoSeq3 pipeline and OS-tools. PGC1-α isoforms were identified by similarity search against exonic sequences within the full-length, non-concatemer (FLNC) reads. Isoforms with an open reading frame (ORF) were manually curated and validated by PCR and Sanger sequencing. I/R was induced by ligation of the LAD for 45 min in mice on standard chow as well as on a high-fat, high-sucrose diet. Area at risk (AAR) and remote tissue were collected three and 16 days after I/R or sham surgery (n=4 per time point). Promoter patterns were analyzed by qPCR. Results: Deciphering the full-length transcriptome of murine and human heart resulted in ∼60000 isoforms with 99% accuracy at the mRNA sequence level. Focusing on murine PGC1-α isoforms, we discovered and verified 15 novel transcripts generated by hitherto unknown splicing events. Additionally, we identified a novel exon 1 originating between the known promoters and followed by a valid ORF, suggesting the discovery of a novel promoter. Remarkably, we found a homologous novel exon 1 in human heart, suggesting conservation of the postulated promoter. In I/R, the AAR exhibited significantly lower expression of established and novel promoters compared to remote tissue under standard chow 3 d post I/R. At 16 d post I/R, the difference between AAR and remote tissue had equalized under standard chow but persisted under the high-fat diet. Conclusion: Applying the SMRT technique, we generated for the first time a complete full-length transcriptome of the murine and human heart, identifying 15 novel potentially coding transcripts of PGC1-α and a novel exon 1. These transcripts are differentially regulated in experimental I/R in AAR and remote myocardium, suggesting that transcriptional regulation and alternative splicing modulate PGC1-α function in the heart. Differences between standard chow and high-fat diet suggest an impact of impaired glucose metabolism on regulatory processes after myocardial infarction. Funding Acknowledgement: Type of funding source: Public grant(s) – National budget only. Main funding source(s): Collaborative Research Centre 1116 (German Research Foundation)
