PacBio single-molecule long-read sequencing shed new light on the complexity of the Carex breviculmis transcriptome

BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Ke Teng ◽  
Wenjun Teng ◽  
Haifeng Wen ◽  
Yuesen Yue ◽  
Weier Guo ◽  
...  

Abstract Background: Carex L., a grass genus commonly known as sedges, is distributed worldwide and contributes to turf management, forage production, and ecological conservation. The development of next-generation sequencing (NGS) technologies has considerably improved our understanding of the transcriptome complexity of Carex L. and provided a valuable genetic reference. However, the current transcriptome is unsatisfactory, mainly because full-length transcripts are very difficult to obtain. Results: In this study, we employed PacBio single-molecule long-read sequencing (SMRT) technology for whole-transcriptome profiling in Carex breviculmis. We generated 60,353 high-confidence non-redundant transcripts with an average length of 2302 bp. A total of 3588 alternative splicing events and 1273 long non-coding RNAs were identified. Furthermore, 40,347 complete coding sequences were predicted, providing an informative reference transcriptome. In addition, the transcriptional regulation mechanism of C. breviculmis in response to shade stress was explored by mapping the NGS data to the reference transcriptome constructed by SMRT sequencing. Conclusions: This study provides the first full-length reference transcriptome of C. breviculmis obtained with the SMRT sequencing method. The transcriptome atlas will not only facilitate future functional genomics studies but also pave the way for further selective and genetic engineering breeding projects for C. breviculmis.

2019 ◽  
Vol 20 (17) ◽  
pp. 4117 ◽  
Author(s):  
Yu Ge ◽  
Zhihao Cheng ◽  
Xiongyuan Si ◽  
Weihong Ma ◽  
Lin Tan ◽  
...  

Avocado (Persea americana Mill.) is an economically important crop because of its high nutritional value. However, the absence of a sequenced avocado reference genome has hindered investigations of secondary metabolism. Using next-generation high-throughput transcriptome sequencing, we obtained 365,615,152 and 348,623,402 clean reads, corresponding to 109.13 and 104.10 Gb of sequencing data, for avocado mesocarp and seed, respectively, across five developmental stages. High-quality reads were assembled into 100,837 unigenes with an average length of 847.40 bp (N50 = 1725 bp). Additionally, 16,903 differentially expressed genes (DEGs) were detected, 17 of which were related to carotenoid biosynthesis. The expression levels of most of these 17 DEGs were higher in the mesocarp than in the seed across the five developmental stages. The avocado mesocarp and seed transcriptomes were also sequenced using single-molecule long-read sequencing to acquire 25.79 and 17.67 Gb of clean data, respectively. We identified 233,014 and 238,219 consensus isoforms in avocado mesocarp and seed, respectively. Furthermore, 104 and 59 isoforms were found to correspond to the 11 putative carotenoid biosynthesis-related genes in the avocado mesocarp and seed, respectively. For 10 of the 11 putative genes involved in the carotenoid biosynthetic pathway, the number of isoforms was higher in the mesocarp than in the seed. In addition, alpha- and beta-carotene contents in the avocado mesocarp and seed were measured across the five developmental stages; they were higher in the mesocarp than in the seed, which validated the results of transcriptome profiling. Gene expression changes and the associated variations in gene dosage could influence carotenoid biosynthesis. These results will help to further elucidate carotenoid biosynthesis in avocado.


2019 ◽  
Author(s):  
Tao Wang ◽  
Feng Yang ◽  
Qiaosheng Guo ◽  
Qingjun Zou ◽  
Wenyan Zhang ◽  
...  

Abstract Background: The capitulum of Chrysanthemum morifolium cv. ‘Hangju’ has been widely used in China for its antioxidant and anti-inflammatory properties. The biosynthesis and regulation of flavonoids, one of the bioactive components in C. morifolium, are poorly understood. Transcriptome sequencing is an effective method for capturing transcript information. Therefore, single-molecule real-time (SMRT) sequencing was performed to obtain the full-length genes involved in flavonoid biosynthesis and regulation in C. morifolium. Results: High-quality RNA was extracted from the capitulum of C. morifolium at different developmental stages and used to construct two libraries (0-5 kb and 4.5-10 kb) for sequencing. Finally, 125,532 non-redundant isoforms with a mean length of 2,009 bp were captured. Of these, 2,083 transcripts were annotated to pathways related to flavonoid biosynthesis, and 56 isoforms were annotated as CHS, CHI, F3H, F3’H, FNS Ⅱ, FLS, DFR and ANS genes. Based on gene expression levels at different stages, we predicted the major genes involved in flavonoid biosynthesis. By phylogenetic analysis, we also found two candidate MYB factors (CmMYBF1 and CmMYBF2) activating flavonol biosynthesis. Conclusions: Based on the full-length transcriptome data and further quantitative analysis, the major genes involved in flavonoid biosynthesis and regulation in C. morifolium were predicted in our study. The results provide a valuable theoretical basis for the introduction and cultivation of C. morifolium cv. ‘Hangju’.


2021 ◽  
Author(s):  
Mingwei Sun ◽  
Yilian Zhao ◽  
Xiaobin Shao ◽  
Jintao Ge ◽  
Xueyan Tang ◽  
...  

Abstract It is well known that transcriptional diversity plays important roles in plant biological regulation. However, because full-length transcripts are difficult to obtain, the available characterization of the tiger lily (Lilium lancifolium Thunb) transcriptome is still incomplete. To improve the completeness of tiger lily transcriptome information, PacBio single-molecule long-read sequencing (SMRT) technology was employed for whole-transcriptome profiling. A total of 815,624 CCS (Circular Consensus Sequence) reads with a mean length of 1,295 bp were obtained. Of these, 61,744 were full-length reads containing the 5’ primer, the 3’ primer and the poly(A) tail, and 3,319 EST-derived SSRs were developed from 2,968 unigenes. With the resulting reference transcriptome, 768 transcription factors and 6,852 long non-coding RNAs were identified, providing a comprehensive framework of the transcriptional regulation network. Of all the annotated transcripts, 15,608 were distributed into 25 Clusters of euKaryotic Orthologous Groups (KOG) categories, and 10,706 unigenes were categorized into 52 functional groups, which were divided into three categories. These results provide a comprehensive set of reference transcripts and further improve our understanding of the tiger lily transcriptome.


2020 ◽  
Author(s):  
Tao Wang ◽  
Feng Yang ◽  
Qiaosheng Guo ◽  
Qingjun Zou ◽  
Wenyan Zhang ◽  
...  

Abstract Background: The inflorescence of Chrysanthemum morifolium cv. ‘Hangju’ has been widely used in China due to its antioxidant and anti-inflammatory properties. The biosynthesis and regulation of flavonoids, a group of bioactive components, in C. morifolium are poorly understood. Transcriptome sequencing is an effective method for obtaining transcript information. Therefore, single-molecule real-time (SMRT) sequencing was performed to obtain the full-length genes involved in flavonoid biosynthesis and regulation in C. morifolium. Results: High-quality RNA was extracted from the inflorescence of C. morifolium at different developmental stages and used to construct two libraries (0-5 kb and 4.5-10 kb) for sequencing. Finally, 125,532 non-redundant isoforms with a mean length of 2,009 bp were obtained. Of these, 2,083 transcripts were annotated to pathways related to flavonoid biosynthesis, and 56 isoforms were annotated as CHS, CHI, F3H, F3’H, FNS Ⅱ, FLS, DFR and ANS genes. Based on gene expression levels at different stages, we predicted the major genes involved in flavonoid biosynthesis. By phylogenetic analysis, we found two candidate MYB transcription factors (CmMYBF1 and CmMYBF2) activating flavonol biosynthesis. Conclusions: Based on the full-length transcriptomic data and further quantitative analysis, the major genes involved in flavonoid biosynthesis and regulation in C. morifolium were predicted in our study. The results provide a valuable theoretical basis for the introduction and cultivation of C. morifolium cv. ‘Hangju’.


2021 ◽  
Vol 12 ◽  
Author(s):  
Aiping Deng ◽  
Jinpeng Li ◽  
Zebin Yao ◽  
Gyamfua Afriyie ◽  
Ziyang Chen ◽  
...  

Coelomactra antiquata is an economically important aquatic shellfish with high medicinal value. However, because C. antiquata has no reference genome, much molecular biology research cannot be carried out, so analysis of its transcripts is an important step toward studying the genes that regulate various substances in C. antiquata. In the present study, we conducted the first full-length transcriptome analysis of C. antiquata using PacBio single-molecule real-time (SMRT) sequencing technology. We identified a total of 39,209 unigenes with an average length of 2,732 bp, 23,338 CDSs, 251 AS events, 9,881 lncRNAs, 20,106 SSRs, and 2,316 TFs. Subsequently, 59.22% (23,220) of the unigenes were successfully annotated, of which 23,164, 18,711, 15,840, 13,534, and 13,474 unigenes could be annotated using the NR, Swiss-Prot, KOG, GO, and KEGG databases, respectively. This study lays the foundation for follow-up molecular biology research and provides a reference for further study of the medicinal value of C. antiquata.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12069
Author(s):  
Lei Yang ◽  
Binglin Xing ◽  
Fen Li ◽  
Li Kui Wang ◽  
Linlin Yuan ◽  
...  

Background: Spodoptera frugiperda (J. E. Smith), commonly known as fall armyworm (FAW), is one of the most destructive agricultural pests in the world and poses a great threat to crops. The improper use of insecticides has led to the rapid development of resistance. However, the genetic data available for uncovering insecticide resistance mechanisms are scarce. Methods: In this study, we used PacBio single-molecule real-time (SMRT) sequencing to profile the full-length transcriptome of the FAW larval brain and obtain detoxification genes. Results: A total of 18,642 high-quality transcripts were obtained with an average length of 2,371 bp, of which 11,230 were successfully annotated in six public databases. Among these, 5,692 alternative splicing events were identified.


2021 ◽  
Vol 12 ◽  
Author(s):  
Tianpeng Chang ◽  
Bingxing An ◽  
Mang Liang ◽  
Xinghai Duan ◽  
Lili Du ◽  
...  

Cattle (Bos taurus) are one of the most widely distributed livestock species in the world and provide high-quality milk and meat, which have a huge impact on the quality of human life. Therefore, accurate and complete transcriptome and genome annotation is of great value to cattle breeding research. In this study, we used error-corrected PacBio single-molecule real-time (SMRT) data to perform whole-transcriptome profiling in cattle. In total, 22.5 Gb of subreads were generated, including 381,423 circular consensus sequences (CCSs), among which 276,295 full-length non-chimeric (FLNC) sequences were identified. After correction with Illumina short reads, we obtained 22,353 error-corrected isoforms. A total of 305 alternative splicing (AS) events and 3,795 alternative polyadenylation (APA) sites were detected by transcriptome structural analysis. Furthermore, we identified 457 novel genes, 120 putative transcription factors (TFs), and 569 novel long non-coding RNAs (lncRNAs). Taken together, this research improves our understanding of, and provides new insights into, the complexity of full-length transcripts in cattle.


2019 ◽  
Author(s):  
Dafu Chen ◽  
Yu Du ◽  
Xiaoxue Fan ◽  
Zhiwei Zhu ◽  
Haibin Jiang ◽  
...  

Abstract Ascosphaera apis is a widespread fungal pathogen of honeybee larvae that causes chalkbrood disease, leading to heavy losses for the beekeeping industry in China and many other countries. This work was aimed at generating a full-length transcriptome of A. apis using PacBio single-molecule real-time (SMRT) sequencing. Here, more than 23.97 Gb of clean reads were generated from long-read sequencing of A. apis mycelia, including 464,043 circular consensus sequences (CCS) and 394,142 full-length non-chimeric (FLNC) reads. In total, we identified 174,095 high-confidence transcripts covering 5141 known genes with an average length of 2728 bp. We also discovered 2405 genic loci and 11,623 isoforms that have not yet been annotated within the current reference genome. Additionally, 16,049, 10,682, 4520 and 7253 of the discovered transcripts have annotations in the Non-redundant protein (Nr), Clusters of Eukaryotic Orthologous Groups (KOG), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. Moreover, 1205 long non-coding RNAs (lncRNAs) were identified, which have fewer exons, shorter exon and intron lengths, shorter transcript lengths, lower GC percent, lower expression levels, and fewer alternative splicing (AS) events compared with protein-coding transcripts. A total of 253 members from 17 transcription factor (TF) families were identified from our transcript datasets. Finally, the expression of A. apis isoforms was validated using a molecular approach. Overall, this is the first report of a full-length transcriptome of entomogenous fungi including A. apis. Our data offer a comprehensive set of reference transcripts and hence contribute to improving the genome annotation and transcriptomic study of A. apis.


Pathogens ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 919
Author(s):  
Dóra Tombácz ◽  
István Prazsák ◽  
Gábor Torma ◽  
Zsolt Csabai ◽  
Zsolt Balázs ◽  
...  

Viral transcriptomes that are determined using first- and second-generation sequencing techniques are incomplete. Due to the short read length, these methods are inefficient or fail to distinguish between transcript isoforms, polycistronic RNAs, and transcriptional overlaps and readthroughs. Additionally, these approaches are insensitive for the identification of splice and transcriptional start sites (TSSs) and, in most cases, transcriptional end sites (TESs), especially in transcript isoforms with varying transcript ends, and in multi-spliced transcripts. Long-read sequencing is able to read full-length nucleic acids and can therefore be used to assemble complete transcriptome atlases. Although vaccinia virus (VACV) does not produce spliced RNAs, its transcriptome has a high diversity of TSSs and TESs, and a high degree of polycistronism that leads to enormous complexity. We applied single-molecule, real-time, and nanopore-based sequencing methods to investigate the time-lapse transcriptome patterns of VACV gene expression.


2019 ◽  
Author(s):  
Tao Wang ◽  
Feng Yang ◽  
Qiaosheng Guo ◽  
Qingjun Zou ◽  
Wenyan Zhang ◽  
...  

Abstract Background: The inflorescence of Chrysanthemum morifolium cv. ‘Hangju’ has been widely used in China due to its antioxidant and anti-inflammatory properties. The biosynthesis and regulation of flavonoids, a group of bioactive components, in C. morifolium are poorly understood. Transcriptome sequencing is an effective method for obtaining transcript information. Therefore, single-molecule real-time (SMRT) sequencing was performed to obtain the full-length genes involved in flavonoid biosynthesis and regulation in C. morifolium. Results: High-quality RNA was extracted from the inflorescence of C. morifolium at different developmental stages and used to construct two libraries (0-5 kb and 4.5-10 kb) for sequencing. Finally, 125,532 non-redundant isoforms with a mean length of 2,009 bp were obtained. Of these, 2,083 transcripts were annotated to pathways related to flavonoid biosynthesis, and 56 isoforms were annotated as CHS, CHI, F3H, F3’H, FNS Ⅱ, FLS, DFR and ANS genes. Based on gene expression levels at different stages, we predicted the major genes involved in flavonoid biosynthesis. By phylogenetic analysis, we found two candidate MYB transcription factors (CmMYBF1 and CmMYBF2) activating flavonol biosynthesis. Conclusions: Based on the full-length transcriptomic data and further quantitative analysis, the major genes involved in flavonoid biosynthesis and regulation in C. morifolium were predicted in our study. The results provide a valuable theoretical basis for the introduction and cultivation of C. morifolium cv. ‘Hangju’.

