PacBio Single-Molecule Long-Read Sequencing Reveals Genes Tolerating Manganese Stress in Schima superba Saplings

2021, Vol 12
Author(s): Fiza Liaquat, Muhammad Farooq Hussain Munis, Samiah Arif, Urooj Haroon, Jianxin Shi, ...

Schima superba (Theaceae) is a subtropical evergreen tree that is widely used for forest firebreaks and gardening. It tolerates salt and typically accumulates elevated amounts of manganese in its leaves. The species has a large ecological amplitude and grows quickly, and its substantial biomass gives it great potential for soil remediation. To characterize its mRNA comprehensively, we used PacBio sequencing technology to generate the S. superba transcriptome for the first time. In total, 511,759 full-length non-chimeric reads and 163,834 high-quality full-length reads were obtained. Of the 93,362 open reading frames identified, 78,255 were complete. In gene annotation analyses, 91,082, 71,839, 38,914, and 38,376 transcripts were assigned to the Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Genes (COG), Gene Ontology (GO), and Non-Redundant (Nr) databases, respectively. To identify long non-coding RNAs (lncRNAs), we used four computational methods, the protein families database (Pfam), Coding Potential Calculator (CPC), Coding Potential Assessment Tool (CPAT), and Coding-Non-Coding Index (CNCI), which predicted 8,551, 9,174, 20,720, and 18,669 lncRNAs, respectively. Moreover, nine genes were randomly selected for expression analysis; Gene 6 (Na_Ca_ex gene) showed the highest expression, and expression of CAX (CAX-interacting protein 4) was higher in the manganese (Mn)-treated group. This work provides a significant number of full-length transcripts and refines the annotation of the reference genome, which will facilitate advanced genetic analyses of S. superba.
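The abstract reports a separate lncRNA count for each of the four prediction methods but does not say how the calls were combined. A common convention, assumed here purely for illustration, is to keep only transcripts that every method flags as non-coding. The minimal Python sketch below shows that intersection step; the function name and the toy transcript IDs are hypothetical and are not taken from the study.

```python
def consensus_noncoding(calls):
    """Return the transcript IDs flagged as non-coding by every tool.

    `calls` maps a tool name to the set of transcript IDs that tool
    predicted to be non-coding. Taking the intersection is an assumed
    convention, not a procedure stated in the abstract.
    """
    sets = [set(ids) for ids in calls.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical toy predictions (IDs invented for illustration only).
calls = {
    "Pfam": {"t1", "t2", "t3", "t5"},
    "CPC": {"t1", "t2", "t4", "t5"},
    "CPAT": {"t1", "t2", "t3", "t4", "t5"},
    "CNCI": {"t2", "t5", "t6"},
}

consensus = consensus_noncoding(calls)
print(sorted(consensus))  # ['t2', 't5']
```

Under this assumption the consensus set can never be larger than the smallest per-tool set, which is one reason per-tool counts alone do not determine the final number of lncRNAs.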

Plants, 2021, Vol 10 (4), pp. 649
Author(s): Qunlu Liu, Fiza Liaquat, Yefeng He, Muhammad Farooq Hussain Munis, Chunying Zhang

Rhododendron simsii is one of the top ten famous flowers in China. Owing to its historical and aesthetic value, it is widely popular among Chinese people. Flower color diversity is an important breeding objective in Rhododendron L., and understanding the molecular mechanism of flower color formation can provide a theoretical basis for improving flower color in the genus. PacBio sequencing technology was used to generate the R. simsii transcriptome. A total of 833,137 full-length non-chimeric reads and 726,846 high-quality full-length transcripts were obtained. Moreover, 40,556 open reading frames were obtained, of which 36,018 were complete. In gene annotation analyses, 39,411, 18,565, 16,102 and 17,450 transcripts were assigned to the GO, Nr, KEGG and COG databases, respectively. To identify long non-coding RNAs (lncRNAs), we used four computational methods, the protein families database (Pfam), Coding Potential Calculator (CPC), Coding Potential Assessment Tool (CPAT) and Coding-Non-Coding Index (CNCI), which predicted 6170, 2265, 4084 and 1240 lncRNAs, respectively. Most genes were enriched in the flavonoid biosynthetic pathway. Eight key genes of the anthocyanin biosynthetic pathway were further selected and analyzed by qRT-PCR. F3′H and ANS showed an upward trend across the developmental stages of R. simsii, and F3′5′H and FLS showed the highest expression during petal color formation. This research provides a large number of full-length transcripts, which will facilitate genetic analyses of R. simsii.


Genes, 2019, Vol 10 (6), pp. 481
Author(s): Chen, Lin, Xie, Zhong, Zhang, ...

The damage caused by Bradysia odoriphaga is the main factor threatening the production of vegetables in the Liliaceae family. However, few genetic studies of B. odoriphaga have been conducted because of a lack of genomic resources. Many long-read sequencing technologies have been developed in the last decade; therefore, in this study, the transcriptome covering all developmental stages of B. odoriphaga was sequenced for the first time by Pacific Biosciences (PacBio) single-molecule long-read sequencing. Here, 39,129 isoforms were generated, and 35,645 were annotated when checked against sequences available in different databases. Overall, 18,473 isoforms were distributed among 25 Clusters of Orthologous Groups categories, and 11,880 isoforms were categorized into 60 functional groups belonging to the three main Gene Ontology classifications. Moreover, 30,610 isoforms were assigned to 44 functional categories belonging to six main Kyoto Encyclopedia of Genes and Genomes functional categories. Coding DNA sequence (CDS) prediction showed that 36,419 of the 39,129 isoforms were predicted to have a CDS, and 4319 simple sequence repeats were detected in total. Finally, 266 insecticide resistance and metabolism-related isoforms were identified as candidate genes for further investigation of insecticide resistance and metabolism in B. odoriphaga.


2019, Vol 19 (1)
Author(s): Tao Xue, Han Zhang, Yuanyuan Zhang, Shuqin Wei, Qiujie Chao, ...

Abstract
Background: Pinellia ternata is native to China and has been used as a traditional herb due to its antiemetic, antitussive, analgesic, and anxiolytic effects. When exposed to strong light intensity and high temperature during the reproductive growth process, P. ternata withers in a phenomenon known as “sprout tumble”, which largely limits tuber production. Shade was previously found to delay sprout tumble formation (STF); however, no information exists regarding this process at the molecular level. Hence, we determined the genes involved in tuber development and STF in P. ternata.
Results: Compared with natural sunlight (control), shade significantly induced chlorophyll accumulation, increased chlorophyll fluorescence parameters including initial fluorescence, maximal fluorescence, and qP, and dramatically repressed chlorophyll a:b and NPQ. Catalase (CAT) activity was largely induced by shade, and tuber production was largely increased in this environment. Transcriptome profiles of P. ternata grown in natural sunlight and shaded environments were analyzed by a combination of next-generation sequencing (NGS) and third-generation single-molecule real-time (SMRT) sequencing. Correction of SMRT long reads based on NGS short reads yielded 136,163 non-redundant transcripts, with an average N50 length of 2578 bp. In total, 6738 differentially expressed genes (DEGs) were obtained from the comparisons D5S vs D5CK, D20S vs D20CK, D20S vs D5S, and D20CK vs D5CK, of which 6384 DEGs (94.8%) were generated from the D20S vs D20CK comparison. Gene annotation and functional analyses revealed that these genes were related to auxin signal transduction, polysaccharide and sugar metabolism, phenylpropanoid biosynthesis, and photosynthesis. Moreover, the expression of genes enriched in photosynthesis appeared to be significantly altered by shade. The expression patterns of 16 candidate genes were consistent with changes in their transcript abundance as identified by RNA-Seq, and these might contribute to STF and tuber production.
Conclusion: The full-length transcripts identified in this study have provided a more accurate depiction of P. ternata gene transcription. Further, we identified potential genes involved in STF and tuber growth. Such data could serve as a genetic resource and a foundation for further research on this important traditional herb.
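The N50 length quoted in this abstract is a standard summary statistic: the length L such that sequences of length L or longer together contain at least half of all bases. The Python sketch below shows one way to compute it from a list of transcript lengths; the function name and the toy lengths are illustrative assumptions and do not reproduce the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total = 1500, half = 750; 500 + 400 = 900 reaches it.
print(n50([100, 200, 300, 400, 500]))  # 400
```

Because the statistic is weighted by bases rather than by sequence count, a few long transcripts can raise N50 well above the median length.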


2019, Vol 20 (24), pp. 6350
Author(s): Nan Deng, Chen Hou, Fengfeng Ma, Caixia Liu, Yuxin Tian

The limitations of RNA sequencing make it difficult to accurately predict alternative splicing (AS) and alternative polyadenylation (APA) events and long non-coding RNAs (lncRNAs), all of which reveal transcriptomic diversity and the complexity of gene regulation. Gnetum, a genus with ambiguous phylogenetic placement in seed plants, has a distinct stomatal structure and photosynthetic characteristics. In this study, a full-length transcriptome of Gnetum luofuense leaves at different developmental stages was sequenced with the latest PacBio Sequel platform. After correction with short reads generated by Illumina RNA-Seq, 80,496 full-length transcripts were obtained, of which 5269 reads were identified as isoforms of novel genes. Additionally, 1660 lncRNAs and 12,998 AS events were detected. In total, 5647 genes in the G. luofuense leaves had APA characterized by at least one poly(A) site. Moreover, 67 and 30 genes from the bHLH gene family, which play an important role in stomatal development and photosynthesis, were identified from the G. luofuense genome and leaf transcripts, respectively. This leaf transcriptome supplements the reference genome of G. luofuense, and the AS events and lncRNAs detected provide valuable resources for future studies investigating the low photosynthetic capacity of Gnetum.


Forests, 2020, Vol 11 (8), pp. 866
Author(s): Lei Kan, Qicong Liao, Zhiyao Su, Yushan Tan, Shuyu Wang, ...

Madhuca pasquieri (Dubard) Lam. is a tree on the International Union for Conservation of Nature Red List and a national key protected wild plant (II) of China, known for its seed oil and timber. However, the lack of genomic and transcriptomic data for this species hampers study of its reproduction, utilization, and conservation. Here, single-molecule long-read sequencing (PacBio) and next-generation sequencing (Illumina) were combined to obtain the transcriptome from five developmental stages of M. pasquieri. Overall, 25,339 transcript isoforms were detected by PacBio, including 24,492 coding sequences (CDSs), 9440 simple sequence repeats (SSRs), 149 long non-coding RNAs (lncRNAs), and 182 alternative splicing (AS) events, the majority of which were retained introns (RI). A further 1058 transcripts were identified as transcription factors (TFs) from 51 TF families. PacBio recovered more full-length transcript isoforms, with longer lengths and higher expression levels, whereas a larger number of transcripts (124,405) was captured by de novo assembly of the Illumina reads. Using the Nr, Swiss-Prot, KOG, and KEGG databases, 24,405 of the PacBio transcripts (96.31%) were annotated. Functional annotation revealed a role for the auxin, abscisic acid, gibberellin, and cytokinin metabolic pathways in seed germination and post-germination. These findings support further studies on the seed germination mechanism and genome of M. pasquieri, and better protection of this endangered species.


2019
Author(s): Bo Wang, Elizabeth Tseng, Primo Baybayan, Kevin Eng, Michael Regulski, ...

Abstract Haplotype phasing of genetic variants in maize is important for interpretation of the genome, population genetic analysis and functional genomic analysis of allelic activity. Accordingly, accurate methods for phasing full-length isoforms are essential for functional genomics studies. We performed an isoform-level phasing study in maize, using two inbred lines and their reciprocal crosses, based on single-molecule full-length cDNA sequencing. To phase and analyze the full-length transcripts between hybrids and parents, we developed a tool called IsoPhase. Using this tool, we validated the majority of SNPs called against matching short-read data and identified cases of allele-specific, gene-level and isoform-level expression. Our results revealed that maize parental lines and hybrid lines exhibit different splicing activities. After phasing 6,907 genes in two reciprocal hybrids using embryo, endosperm and root tissues, we annotated the SNPs and identified large-effect genes. In addition, based on single-molecule sequencing, we identified parent-of-origin isoforms in maize hybrids, distinct novel isoforms in maize parent and hybrid lines, and imprinted genes from different tissues. Finally, we characterized variation in cis- and trans-regulatory effects. Our study provides measures of haplotypic expression that could increase accuracy in studies of allelic expression.


Author(s): Chengcai Zhang, Huadong Ren, Xiaohua Yao, Kailiang Wang, Jun Chang

Abstract Pecan is rich in bioactive components such as fatty acids and flavonoids and is an important nut type worldwide. Therefore, the molecular mechanisms of phytochemical biosynthesis in pecan are a focus of research. Recently, a draft genome and several transcriptomes have been published. However, the full-length mRNA transcripts remain unclear, and the regulatory mechanisms behind the biosynthesis and accumulation of quality components have not been fully investigated. In this study, single-molecule long-read sequencing technology was used to obtain full-length transcripts of pecan kernels. In total, 37 504 isoforms of 16 702 genes were mapped to the reference genome. The numbers of known isoforms, new isoforms, and novel isoforms were 9013 (24.03%), 26 080 (69.54%), and 2411 (6.51%), respectively. Over 80% of the transcripts (30 751, 81.99%) had functional annotations. A total of 15 465 alternative splicing (AS) events and 65 761 alternative polyadenylation events were detected, with retained intron the predominant type of AS (5652, 36.55%). Furthermore, 1894 long non-coding RNAs and 1643 transcription factors were predicted using bioinformatics methods. Finally, the structural genes associated with fatty acid (FA) and flavonoid biosynthesis were characterized. A high frequency of AS accuracy (70.31%) was observed in FA synthesis-associated genes. The present study provides a full-length transcriptome dataset of pecan kernels, which will significantly enhance the understanding of the regulatory basis of phytochemical biosynthesis during pecan kernel maturation.


2019
Author(s): Mitchell R. Vollger, Glennis A. Logsdon, Peter A. Audano, Arvis Sulovari, David Porubsky, ...

Abstract The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of tandem repeats, as validated with orthogonal analyses. As a result, an additional 5 Mbp of pericentromeric sequences are recovered in the HiFi assembly, resulting in a 2.5-fold increase in the NG50 within 1 Mbp of the centromere (HiFi 480.6 kbp, CLR 191.5 kbp). Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Despite these shortcomings, our results suggest that HiFi may be the most effective stand-alone technology for de novo assembly of human genomes.
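NG50, the continuity metric compared in this abstract, is conventionally defined like N50 except that the halfway threshold is taken from an estimated genome (or region) size rather than from the assembly's own total length. The Python sketch below illustrates that definition and recomputes the fold change from the two NG50 values quoted in the abstract. The function name and the toy contig lengths are assumptions for illustration, and returning 0 when the assembly covers less than half of the target is a choice made for this sketch.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly relative to `genome_size`.

    Contigs are visited from longest to shortest; NG50 is the length of
    the contig at which the running total first reaches half of the
    estimated genome size. Returns 0 if the assembly never reaches that
    threshold (a convention assumed for this sketch).
    """
    half_genome = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_genome:
            return length
    return 0


# Toy contigs against a hypothetical 2000-base target region.
print(ng50([500, 400, 300], genome_size=2000))  # 300

# Fold change between the NG50 values quoted in the abstract (kbp).
hifi_ng50_kbp, clr_ng50_kbp = 480.6, 191.5
print(round(hifi_ng50_kbp / clr_ng50_kbp, 1))  # 2.5
```

Fixing the denominator to the genome size makes NG50 values comparable between assemblies of the same genome even when their total assembled lengths differ.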


2020, Vol 3 (1)
Author(s): Yueming Hu, Xing-Sheng Shu, Jiaxian Yu, Ming-an Sun, Zewei Chen, ...

Abstract Human genes form a large variety of isoforms after transcription, encoding distinct transcripts to exert different functions. Single-molecule RNA sequencing facilitates accurate identification of the isoforms by extending nucleotide read length significantly. However, gene and isoform diversity is poorly represented by the mRNA molecules captured by single-molecule RNA sequencing. Here, we show that a cDNA normalization procedure before library preparation for PacBio RS II sequencing captures 3.2–6.0 fold more full-length high-quality isoform species for different human samples, as compared to the non-normalized capture procedure. Many lowly expressed, functionally important isoforms can be detected. In addition, normalized PacBio RNA sequencing also resolves more allele-specific haplotype transcripts. Finally, we apply the cDNA normalization-based long-read RNA sequencing method to profile the transcriptome of human gastric signet-ring cell carcinomas, identify new cancer-specific transcriptome signatures, and thus demonstrate the utility of the improved protocols in gene expression studies.

