De novo transcriptome assembly of Premnotrypes vorax (Coleoptera: Curculionidae)

2020 ◽  
Author(s):  
Luisa-Fernanda Velásquez C. ◽  
Pablo Emiliano Canton ◽  
Alejandro Sánchez-Flores ◽  
Alejandra Bravo ◽  
Jairo Cerón

Abstract Objective: Premnotrypes vorax (P. vorax) is an insect pest that causes significant losses to potato crops in Colombia. Currently, the insect is controlled mainly with highly toxic chemical insecticides, and there are no reports of any commercial biological control strategy against this pest. Hence, the objective of this study was to characterize the insect's gene expression to search for genes that could code for Bacillus thuringiensis Cry toxin receptors. Using an RNA-seq approach, we sequenced the mRNA from the insect tissue, performed a de novo assembly, and analyzed the reconstructed transcriptome of P. vorax. To our knowledge, this is the first genetic report on this endemic insect, and it sets the basis for a possible biological control strategy. Results: The transcriptome data were obtained from dissected midgut tissue samples of P. vorax larvae. RNA was isolated and sequenced on the Illumina HiSeq platform with a 2×150 bp read configuration. A total of 383,552,246 reads were obtained and subsequently subjected to quality assessment with FastQC and cleaning with Trimmomatic. A de novo assembly was done using the Trinity software, yielding a transcriptome assembly with 25,631 genes that showed at least one annotation record, resulting in 74,984 transcript isoforms.

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
D. N. U. Naranpanawa ◽  
C. H. W. M. R. B. Chandrasekara ◽  
P. C. G. Bandaranayake ◽  
A. U. Bandaranayake

Abstract Recent advances in next-generation sequencing technologies have made it possible to generate a considerable amount of sequencing data at a relatively low cost. This has revolutionized genomics and transcriptomics studies. However, handling such data with the available bioinformatics platforms now poses challenges, both in assembly and in the downstream analysis needed to infer correct biological meaning. Though there are a handful of commercial software tools for some of the procedures, their cost has made them prohibitive for most research laboratories. While individual open-source or free software tools are available for most bioinformatics applications, those components usually operate standalone and are not combined into a user-friendly workflow. Therefore, beginners in bioinformatics might find analysis procedures starting from raw sequence data too complicated and time-consuming, given the associated learning curve. Here, we outline a procedure for de novo transcriptome assembly and Simple Sequence Repeats (SSR) primer design based solely on tools that are available online for free use. To validate the developed workflow, we used Illumina HiSeq reads of different tissue samples of Santalum album (sandalwood), generated from a previous transcriptomics project. A portion of the designed primers were tested in the lab with relevant samples, and all of them successfully amplified the targeted regions. The presented bioinformatics workflow can accurately assemble quality transcriptomes and develop gene-specific SSRs. Beginner biologists and researchers in bioinformatics can easily utilize this workflow for research purposes.
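The abstract does not name the SSR detection tool used, so the following is only a minimal Python sketch of the underlying idea: scanning assembled transcripts for perfect tandem repeats of short motifs. The function names and the minimum repeat thresholds in `MIN_REPEATS` are illustrative assumptions, not values taken from the paper.

```python
import re

# Minimum number of tandem copies per motif length (assumed thresholds).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit (e.g. 'AGAG')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs, sorted by start."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTC" + "AG" * 7 + "CCT"  # toy sequence with one (AG)7 repeat
    print(find_ssrs(example))
```

Primer design would then target the sequence flanking each reported repeat; that step is left out here because it depends on the primer design tool chosen.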


Author(s):  
Boyun Yang ◽  
Huolin Luo ◽  
Yuan Tao ◽  
Wenjing Yu ◽  
Liping Luo

Cymbidium kanran is an important commercially grown member of the Chinese orchid family. However, little information regarding the molecular biology of this species is available. In this study, the C. kanran root, shoot, stem, leaf, and flower transcriptomes were sequenced with the Illumina HiSeq 4000 system, which resulted in 8.9 Gb of clean reads that were assembled into 74,620 unigenes, with an average length and N50 of 983 bp and 1,640 bp, respectively. The screening of seven databases (NR, NT, GO, KOG, KEGG, Swiss-Prot, and InterPro) for similar sequences resulted in the functional annotation of 49,813 unigenes. Additionally, 173 MADS-box genes, which help to control major aspects of plant development, were identified and their codon usage bias was analyzed. Only 26 genes had a low ENC (less than or equal to 35), suggesting the codon usage bias was weak. Base mutations were the major determinants of codon usage, although natural selection pressure also influenced codon usage bias. Moreover, 22 optimal codons were identified based on ΔRSCU, and 20 codons ended with A/U. The results of this study provide the foundation for the molecular breeding of new varieties
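The optimal codons above are derived from relative synonymous codon usage (RSCU). As a rough illustration, RSCU for one coding sequence can be computed as the observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally. This is a sketch using the standard genetic code; the function name `rscu` is an assumption, and the ΔRSCU comparison between gene sets is not shown.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    usage = rscu("GCTGCTGCTGCC")  # four alanine codons: GCT x3, GCC x1
    print(usage["GCT"], usage["GCC"], usage["GCA"])
```

An RSCU above 1 means a codon is used more often than expected under equal usage, and below 1 means less often.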


2018 ◽  
Vol 5 (12) ◽  
pp. 181247 ◽  
Author(s):  
Tengfei Liu ◽  
Ziyao Liu ◽  
Xueyan Yao ◽  
Ying Huang ◽  
Qingsong Qu ◽  
...  

Cordyceps cicadae (Chanhua) is a parasitic fungus that grows on Cicada flammata larvae and is used to relieve exhaustion and treat numerous diseases, in part through its active constituent, cordycepin. We used de novo Illumina HiSeq 4000 sequencing to obtain transcriptomes of C. cicadae mycelium, fruiting body, and sclerotium, and identify differentially expressed genes. In the mycelium versus sclerotium libraries, 1576 upregulated and 2300 downregulated genes were identified. In the mycelium versus fruiting body and fruiting body versus sclerotium body libraries, 1604 and 1474 upregulated and 1365 and 1320 downregulated genes, respectively, were identified. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses identified 19 genes differentially expressed in mycelium versus fruiting body as related to the purine pathway, along with 28 and 16 genes differentially expressed in the mycelium versus sclerotium and fruiting body versus sclerotium groups, respectively. Gene expression of six key enzymes was validated by quantitative polymerase chain reaction. Specifically, 5′-nucleotidase (c62060g1) and adenosine deaminase (c35629g1) in purine nucleotide metabolism, which are involved in cordycepin biosynthesis, were significantly upregulated in the sclerotium group. These findings improved our understanding of genes involved in the biosynthesis of cordycepin and other characteristic secondary metabolites in C. cicadae.


2018 ◽  
Vol 54 (No. 1) ◽  
pp. 17-25 ◽  
Author(s):  
D.-D. Vu ◽  
T.T.-X. Bui ◽  
T.H.-N. Nguyen ◽  
S.N.M. Shah ◽  
N.-H. Vu ◽  
...  

A total of 20 074 230 sequencing reads were generated by Illumina HiSeq™ 2500 from three different Toxicodendron vernicifluum tissue samples. In total, 48 693 unigenes with an average length of 703.34 bp were obtained by de novo assembly. From unigenes with lengths exceeding 1 kb, 3392 EST-SSRs (expressed sequence tag-simple sequence repeats) were identified as potential molecular markers. A total of 80 pairs of PCR primers were randomly selected to validate the assembly quality and develop EST-SSR markers from genomic DNA. Of these, 14 primer pairs successfully amplified DNA fragments and detected significant polymorphism within the lacquer tree population in Langao, Shaanxi province, China. Genetic diversity was high in the natural lacquer tree population (number of alleles per locus (A) = 2.93, polymorphic information content (PIC) = 0.53, observed heterozygosity (Ho) = 0.62 and expected heterozygosity (He) = 0.85). Four loci deviated significantly from Hardy-Weinberg equilibrium. These results suggested high homozygosity and a heterozygosity deficiency in the population (inbreeding coefficient (Fis) = 0.27). These polymorphic EST-SSR markers will provide a basis for further studies of genetic structure and breeding in T. vernicifluum.
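The reported inbreeding coefficient is consistent with the usual definition Fis = 1 − Ho/He. The abstract does not state how Fis was computed, so the short Python check below is a sketch that assumes this definition; the function name is illustrative.

```python
def inbreeding_coefficient(h_obs, h_exp):
    """Fis = 1 - Ho/He, assuming the standard heterozygosity-based definition."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


if __name__ == "__main__":
    # Ho = 0.62 and He = 0.85 are the values given in the abstract.
    print(round(inbreeding_coefficient(0.62, 0.85), 2))
```

A positive value indicates fewer heterozygotes than expected under Hardy-Weinberg equilibrium, which matches the heterozygosity deficiency described above.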


2020 ◽  
Author(s):  
Guolin Zhou ◽  
Ping Zhu

Abstract Background: Rhododendron molle (Ericaceae) is a traditional Chinese medicinal plant whose flower and root have been widely used to treat rheumatism and relieve pain for thousands of years in China. Chemical studies have revealed that R. molle contains abundant secondary metabolites such as terpenoids, flavonoids and lignans, some of which have exhibited various bioactivities including antioxidant, hypotensive and analgesic activity. In spite of its immense pharmaceutical importance, the mechanism underlying the biosynthesis of secondary metabolites remains unknown and genomic information is unavailable. Results: To gain molecular insight into this plant, especially into pharmaceutically important secondary metabolites including grayanane diterpenoids, we conducted deep transcriptome sequencing of R. molle flower and root using the Illumina HiSeq platform. In total, 100,603 unigenes with a mean length of 778 bp were generated through de novo assembly; 57.1% of these unigenes were annotated in public databases, and 17,906 of those unigenes showed a significant match in the KEGG database. Unigenes involved in the biosynthesis of secondary metabolites were annotated, including the TPSs and CYPs that were potentially responsible for the biosynthesis of grayanoids. Moreover, 3,376 transcription factors and 10,828 simple sequence repeats (SSRs) were also identified. Additionally, we performed differential gene expression (DEG) analysis of the flower and root transcriptome libraries and identified numerous genes that were specifically expressed or up-regulated in flower. Conclusions: To the best of our knowledge, this is the first study to generate and thoroughly analyze transcriptome data of both R. molle flower and root. This study provided an important genetic resource that will help elucidate various secondary metabolite biosynthetic pathways in R. molle, especially those with medicinal value, and allow for drug development in this plant.


2019 ◽  
Author(s):  
Shifeng Cheng ◽  
Yuan Fu ◽  
Yaolei Zhang ◽  
Wenfei Xian ◽  
Hongli Wang ◽  
...  

Abstract BACKGROUND: The Mongolian gerbil (Meriones unguiculatus) has been used as a model organism for research on the auditory and visual systems, stroke/ischemia, epilepsy and aging since 1935, when laboratory gerbils were separated from their wild counterparts. In this study we report genome sequencing, assembly, and annotation, further supported by transcriptome sequencing and assembly from 27 different tissue samples. RESULTS: The genome was sequenced using Illumina HiSeq 2000, and the assembly resulted in a final genome size of 2.54 Gbp with contig and scaffold N50 values of 31.4 Kbp and 500.0 Kbp, respectively. Based on the k-mer estimated genome size of 2.48 Gbp, the assembly appears to be complete. The genome annotation was supported by transcriptome data that identified 31 769 (>2000 bp) predicted protein-coding genes across 27 tissue samples. A BUSCO search of 3023 mammalian groups resulted in 86% of curated single-copy orthologs present among predicted genes, indicating a high level of completeness of the genome. CONCLUSIONS: We report a de novo assembly of the Mongolian gerbil genome that was further enhanced by assembly of transcriptome data from several tissues. Sequencing of this genome increases the utility of the gerbil as a model organism, opening the way to the use of now widely used genetic tools.
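For readers unfamiliar with the N50 statistic quoted for contigs and scaffolds, the following Python sketch shows one common way to compute it: the length of the sequence at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name and toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings us to half the total length.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # total = 20, so half = 10
    print(n50(toy_contigs))
```

In the toy input, the two longest contigs (6 and 5) together cover 11 of 20 bases, so the N50 is 5.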


2020 ◽  
Author(s):  
Guolin Zhou ◽  
Ping Zhu

Abstract Background: Rhododendron molle (Ericaceae) is a traditional Chinese medicinal plant whose flower and root have been widely used to treat rheumatism and relieve pain for thousands of years in China. Chemical studies have revealed that R. molle contains abundant secondary metabolites such as terpenoids, flavonoids and lignans, some of which have exhibited various bioactivities including antioxidant, hypotensive and analgesic activity. In spite of its immense pharmaceutical importance, the mechanism underlying the biosynthesis of secondary metabolites remains unknown and genomic information is unavailable. Results: To gain molecular insight into this plant, especially into its pharmaceutically important secondary metabolism, we conducted deep transcriptome sequencing of R. molle flower and root using the Illumina HiSeq platform. In total, 100,603 unigenes with a mean length of 778 bp were generated through de novo assembly; 57.1% of these unigenes were annotated in public databases, and 17,906 of those unigenes showed a significant match in the KEGG database. Unigenes involved in the biosynthesis of secondary metabolites were annotated, including the TPSs and CYPs that were potentially responsible for the biosynthesis of grayanoids. Moreover, 3,376 transcription factors and 10,828 simple sequence repeats (SSRs) were also identified. Additionally, we performed differential gene expression (DEG) analysis of the flower and root transcriptome libraries and identified numerous genes that were specifically expressed or up-regulated in flower. Conclusions: To the best of our knowledge, this is the first study to generate and thoroughly analyze transcriptome data of both R. molle flower and root. This study provided an important genetic resource that will help elucidate various secondary metabolite biosynthetic pathways in R. molle, especially those with medicinal value, and allow for drug development in this plant.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Narender K. Dhania ◽  
Vinod K. Chauhan ◽  
R. K. Chaitanya ◽  
Aparna Dutta-Gupta

2019 ◽  
Vol 19 (6) ◽  
Author(s):  
Xiao-Rong Zhou ◽  
Yan-Min Shan ◽  
Yao Tan ◽  
Zhuo-Ran Zhang ◽  
Bao-Ping Pang

Abstract Galeruca daurica (Joannis) has become a new insect pest in the Inner Mongolia grasslands since 2009, and its larvae and eggs have strong cold tolerance. To get a deeper insight into its molecular mechanisms of cold stress responses, we performed de novo transcriptome assembly for G. daurica by RNA-Seq and compared the transcriptomes of its larvae exposed to five different temperature treatments (−10, −5, 0, 5, and 25°C for 1 h and then recovered at 25°C for 1 h), respectively. Compared with the control (25°C), the numbers of differentially expressed genes (DEGs) decreased from 1,821 to 882, with the temperature declining from 5 to −10°C. Moreover, we obtained 323 coregulated DEGs under different low temperatures. Under four low temperatures (−10, −5, 0, and 5°C), a large number of genes were commonly upregulated during recovery from cold stresses, including those related to cuticle protein, followed by cytochrome P450, clock protein, fatty acid synthase, and fatty acyl-CoA reductase; meanwhile, lots of genes encoding cuticle protein, RNA replication protein, RNA-directed DNA polymerase, and glucose dehydrogenase were commonly downregulated. Our findings provide important clues for further investigations of key genes and molecular mechanisms involved in the adaptation of G. daurica to harsh environments.


2019 ◽  
Vol 2019 ◽  
pp. 1-14
Author(s):  
Fahad Al-Qurainy ◽  
Aref Alshameri ◽  
Abdel-Rhman Gaafar ◽  
Salim Khan ◽  
Mohammad Nadeem ◽  
...  

The forage crop guar (Cyamopsis tetragonoloba (L.) Taub.) can endure heat, drought, and mild salinity. A complete picture of its genic architecture will advance our understanding of gene expression networks and different tolerance mechanisms at the molecular level. Therefore, whole-mRNA sequencing of the guar plant was conducted to provide a snapshot of the mRNA content of the cell under salinity, heat, and drought stresses, to be integrated with previous transcriptomic studies. RNA-Seq technology was employed to perform 2×100 paired-end sequencing on an Illumina HiSeq 2500 platform for the transcriptome of leaves of C. tetragonoloba under normal, heat, drought, and salinity conditions. Trinity was used for de novo assembly, followed by gene annotation, functional classification, metabolic pathway analysis, and identification of SSR markers. A total of 218.2 million paired-end raw reads (~44 Gbp) were generated. Of those, 193.5 million high-quality paired-end reads were used to reconstruct a total of 161,058 transcripts (~266 Mbp) with an N50 of 2552 bp and 61,508 putative genes. There were 6463 proteins with >90% full-length coverage against the Swiss-Prot database and 94% complete orthologs against Embryophyta. Approximately 62.87% of transcripts had BLAST hits, 50.46% were mapped, and 43.50% were annotated. A total of 4715 InterProScan families, 3441 domains, 74 repeats, and 490 sites were detected. Biological processes, molecular functions, and cellular components comprised 64.12%, 25.42%, and 10.4%, respectively. The transcriptome was associated with 985 enzymes and 156 KEGG pathways. A total of 27,066 SSRs were identified, with an average frequency of one SSR per 9.825 kb in the assembled transcripts. The resulting data will be helpful for further analysis of multi-stress tolerance in guar.

