Isolation of Microsatellite Markers from De Novo Whole Genome Sequences of Coptotermes gestroi (Wasmann) (Blattodea: Rhinotermitidae)

Data ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 40
Author(s):  
Li Yang Lim ◽  
Shawn Cheng ◽  
Abdul Hafiz Ab Majid

Coptotermes gestroi (Wasmann) (Blattodea: Rhinotermitidae) is a subterranean termite species from Southeast Asia that has been unintentionally introduced to many parts of the world through commerce and modern transportation. Known for causing extensive damage to timber used in the built environment, the termite also builds carton nests in wood and wooden structures in buildings. Because so little is known of its breeding system, colony structure, and genetic structure, we initiated work to sequence its genome with an Illumina HiSeq™ 2000 sequencer. In this publication, we announce our paired-end sequencing data and report the isolation of 119,190 microsatellite markers from our DNA assembly. The microsatellite markers reported in this publication can be used to elucidate the mating system and genetic structure of this highly invasive termite species. Additionally, in this announcement we make the BioProject sequence accession number SRR13105492 accessible from the Sequence Read Archive database.
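
As an illustration of what isolating microsatellites from assembled sequence can involve, the following is a minimal sketch of a perfect-repeat scan using regular expressions. It is not the authors' pipeline: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, and dedicated SSR-mining tools handle compound and imperfect repeats that this sketch ignores.

```python
import re

# Assumed thresholds (motif length -> minimum number of repeats);
# these values are illustrative and not taken from the publication.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # since the shorter motif length already reports them.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    contig = "GGG" + "AT" * 7 + "CCC" + "GAT" * 5 + "TT"
    for start, motif, count in find_ssrs(contig):
        print(start, motif, count)
```

Positions are 0-based offsets into the input sequence. In practice the flanking sequence around each hit would then be passed to a primer-design step.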

2016 ◽  
Author(s):  
Ying Wang ◽  
Kun Liu ◽  
De Bi ◽  
Biao Shou Zhou ◽  
Wen Jian Shao

Background. Resurrection plants constitute a unique cadre within angiosperms. Boea clarkeana Hemsl. (Boea, Gesneriaceae) is a desiccation-tolerant dicotyledonous herb that is endemic to China. Although research on angiosperms with desiccation tolerance (DT) could be instructive for crops, genomic resources for B. clarkeana remain scarce. In addition, transcriptome sequencing could be an effective way to study desiccation-tolerant plants. Methods. In the present study, we used the Illumina HiSeq™ 2000 platform and de novo assembly technology to obtain leaf transcriptomes of B. clarkeana and conducted a BLASTX alignment of the sequencing data against protein databases for sequence classification and annotation. Then, based on the sequence information obtained, we developed EST-SSR markers by means of EST-SSR mining, primer design and polymorphism identification. Results. A total of 91,449 unigenes were generated from the leaf cDNA library of B. clarkeana in this study. Based on a sequence similarity search with a known protein database, 72,087 unigenes were annotated. Among the annotated unigenes, a total of 71,170 unigenes showed significant similarity to known proteins of 463 popular model species in the Nr database, and 59,962 unigenes and 32,336 unigenes were assigned to GO classifications and COG, respectively. In addition, 44,924 unigenes were mapped in 128 KEGG pathways. Furthermore, a total of 7,610 unigenes with 8,563 microsatellites were found. Seventy-four primer pairs were selected from 436 primer pairs designed for polymorphism validation. SSRs with higher polymorphism rates were concentrated on dinucleotides, pentanucleotides and hexanucleotides. Finally, 17 pairs with highly polymorphic and stable loci were selected for polymorphism screening. There were a total of 65 alleles, with 2–6 alleles at each locus. Mainly due to the unique biological characteristics of plants, the expected heterozygosity (HE), observed heterozygosity (HO) and polymorphism information content (PIC) per locus were very low, ranging from 0 to 0.196, 0.082 to 0.14 and 0 to 0.155, respectively. Discussion. A substantial fraction of the transcriptome sequences of B. clarkeana was generated in this study, which is the first molecular-level analysis of this plant. These sequences are valuable resources for gene annotation and discovery and molecular marker development. These sequences could also provide a valuable basis for the future molecular study of B. clarkeana.
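
For readers unfamiliar with the per-locus statistics quoted above, the sketch below shows how HE and PIC are commonly computed from allele frequencies at a single locus. The abstract does not say which software or estimator the authors used, so the standard textbook definitions are assumed here, and the function names are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """HE = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = HE - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross_term


if __name__ == "__main__":
    # Two equally frequent alleles: HE = 0.5, PIC = 0.375.
    print(expected_heterozygosity([0.5, 0.5]))
    print(polymorphism_information_content([0.5, 0.5]))
    # A monomorphic locus gives 0 for both, the lower bound reported above.
    print(expected_heterozygosity([1.0]))
```

Both statistics fall toward zero when one allele dominates a locus, which is consistent with the low values the abstract reports.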


2020 ◽  
Author(s):  
Huiyan Wang ◽  
Ning Wang ◽  
Yixin Huo

Abstract Background: Azadirachtin A is a triterpenoid from the neem tree that exhibits excellent activity against over 600 insect species in agriculture. The manufacture of azadirachtin A depends on extraction from neem tissues, which is neither ecofriendly nor sustainable. The low yield and discontinuous supply have impeded its further application. The biosynthetic pathway of azadirachtin A is still unknown. Results: We attempted to explore the azadirachtin A biosynthetic pathway and identified the key genes involved by analyzing transcriptome data of five neem tissues through hybrid-seq (Illumina HiSeq and Pacific Biosciences Single Molecule Real Time (PacBio SMRT)) technology. A total of 219 and 397 differentially expressed genes (DEGs) that were up-regulated in leaf and fruit tissues relative to the other tissues (root, stem and flower) were isolated. After phylogenetic analysis and domain prediction, 22 candidates encoding 2,3-oxidosqualene cyclase (OSC), alcohol dehydrogenase (ADH), cytochrome P450 (CYP450), acyltransferase (ACT) and esterase (EST), proposed to be involved in azadirachtin A biosynthesis, were finally selected. De novo assembled sequences were verified by Quantitative Real-Time PCR (qRT-PCR) analysis. Conclusions: By integrating and analyzing data from the Illumina HiSeq and PacBio SMRT platforms, 22 DEGs were finally selected as candidates involved in azadirachtin A biosynthesis. The reliable and accurate sequencing data obtained provide important novel information for understanding the neem genome. Our data shed new light on the biosynthesis of other triterpenoids in neem trees and provide a reference for exploring the biosynthesis of other valuable natural products in plants.


2020 ◽  
Author(s):  
Duy Dinh Vu ◽  
Syed Noor Muhammad Shah ◽  
Mai Phuong Pham ◽  
Van Thang Bui ◽  
Minh Tam Nguyen ◽  
...  

Abstract Background: Understanding the genetic diversity of threatened species that occur in forest remnants is necessary to establish efficient strategies for species conservation, restoration and management. Panax vietnamensis Ha et Grushv. is a medicinally important, endemic and endangered species of Vietnam. However, the genetic diversity and population structure of the species are unknown due to a lack of efficient molecular markers. Results: In this study, we employed Illumina HiSeq™ 4000 sequencing to analyze the transcriptomes of P. vietnamensis (roots, leaves and stems). A total of 23,741,783 raw reads were obtained and assembled, from which 89,271 unigenes with an average length of 598.3191 nt were generated. During functional annotation, 31,686 unigenes were annotated in Gene Ontology categories, Kyoto Encyclopedia of Genes and Genomes pathways, the Swiss-Prot database, and the Nucleotide Collection (NR/NT) database. In addition, 11,343 expressed sequence tag-simple sequence repeats (EST-SSRs) were detected. From 7,774 primer pairs, 101 were selected for polymorphism validation, of which 20 primer pairs successfully amplified DNA fragments, and a significant amount of polymorphism was observed within populations. The nine polymorphic microsatellite loci were used to analyze the genetic diversity and structure of the natural populations. The results revealed high levels of genetic diversity within populations; the average observed and expected heterozygosity were HO = 0.422 and HE = 0.479. Bottleneck analysis using the TPM and SMM models (p < 0.01) showed that the targeted populations are significantly heterozygote deficient, which suggests a sign of a bottleneck in all populations. Genetic differentiation among populations was moderate (FST = 0.133), indicating limited gene flow (Nm = 1.63). Analysis of molecular variance (AMOVA) showed 63.17% of variation within individuals and 12.45% among populations. These results showed a moderate genetic structure of P. vietnamensis. STRUCTURE analysis and the unweighted pair-group method with arithmetic means (UPGMA) tree also revealed strong genetic structure and two genetic clusters related to geographical distances. Conclusion: Our study will assist conservators in future conservation management, breeding, production and habitat restoration of the species.
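
The reported gene-flow estimate can be related to the differentiation statistic with a short calculation. The sketch below assumes Wright's island-model approximation, Nm = (1 - FST) / (4 FST); the abstract does not state which formula the authors used, so this is an assumption that happens to be consistent with the reported figures. The function name is hypothetical.

```python
def gene_flow_from_fst(fst):
    """Estimate Nm (migrants per generation) from FST.

    Assumes Wright's island-model approximation Nm = (1 - FST) / (4 * FST),
    which may differ from the estimator used in the study.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


if __name__ == "__main__":
    # FST = 0.133 as reported in the abstract.
    print(round(gene_flow_from_fst(0.133), 2))  # prints 1.63
```

Under this approximation, FST = 0.133 gives Nm of about 1.63, matching the value quoted in the abstract.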


Genes ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 383 ◽  
Author(s):  
Hyun-Oh Lee ◽  
Ji-Weon Choi ◽  
Jeong-Ho Baek ◽  
Jae-Hyeon Oh ◽  
Sang-Choon Lee ◽  
...  

Platycodon grandiflorus (balloon flower) and Codonopsis lanceolata (bonnet bellflower) are important herbs used in Asian traditional medicine, and both belong to the botanical family Campanulaceae. In this study, we designed and implemented a de novo DNA sequencing and assembly strategy to map the complete mitochondrial genomes of the first two members of the Campanulaceae using low-coverage Illumina DNA sequencing data. We produced a total of 28.9 Gb of paired-end sequencing data from the genomic DNA of P. grandiflorus (20.9 Gb) and C. lanceolata (8.0 Gb). The assembled mitochondrial genome of P. grandiflorus was found to consist of two circular chromosomes; the master circle contains 56 genes, and the minor circle contains 42 genes. The C. lanceolata mitochondrial genome consists of a single circle harboring 54 genes. Using a comparative genome structure and a pattern of repeated sequences, we show that the P. grandiflorus minor circle resulted from a recombination event involving the direct repeats of the master circle. Our dataset will be useful for comparative genomics and for evolutionary studies, and will facilitate further biological and phylogenetic characterization of species in the Campanulaceae.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
D. N. U. Naranpanawa ◽  
C. H. W. M. R. B. Chandrasekara ◽  
P. C. G. Bandaranayake ◽  
A. U. Bandaranayake

Abstract Recent advances in next-generation sequencing technologies have paved the path for a considerable amount of sequencing data at a relatively low cost. This has revolutionized the genomics and transcriptomics studies. However, different challenges are now created in handling such data with available bioinformatics platforms both in assembly and downstream analysis performed in order to infer correct biological meaning. Though there are a handful of commercial software and tools for some of the procedures, cost of such tools has made them prohibitive for most research laboratories. While individual open-source or free software tools are available for most of the bioinformatics applications, those components usually operate standalone and are not combined for a user-friendly workflow. Therefore, beginners in bioinformatics might find analysis procedures starting from raw sequence data too complicated and time-consuming with the associated learning-curve. Here, we outline a procedure for de novo transcriptome assembly and Simple Sequence Repeats (SSR) primer design solely based on tools that are available online for free use. For validation of the developed workflow, we used Illumina HiSeq reads of different tissue samples of Santalum album (sandalwood), generated from a previous transcriptomics project. A portion of the designed primers were tested in the lab with relevant samples and all of them successfully amplified the targeted regions. The presented bioinformatics workflow can accurately assemble quality transcriptomes and develop gene specific SSRs. Beginner biologists and researchers in bioinformatics can easily utilize this workflow for research purposes.


2019 ◽  
Vol 12 (2) ◽  
pp. 49-58
Author(s):  
Adam D Miller ◽  
Inka Veltheim ◽  
Timothy Nevard ◽  
Han Ming Gan ◽  
Martin Haase

The Brolga (Antigone rubicunda) is a large Australian crane species with a broad distribution spanning from the tropical north to the south-eastern regions of the continent. Brolga populations throughout New South Wales, Victoria and South Australia have been in decline since the early twentieth century, with the species being listed as vulnerable in each state. To aid future conservation of the species, its taxonomic status needs to be validated, and patterns of gene flow and population connectivity across the species distribution need to be understood. To assist future genetic studies, we developed a suite of polymorphic microsatellite markers and the complete mitochondrial genome sequence by next-generation sequencing. A total of 18 polymorphic loci were characterised using DNA extractions from 47 individuals, comprising 30 and 17 individuals from Victoria and northern Australia, respectively. We observed moderate genetic variation across loci, with only a single locus deviating significantly from Hardy–Weinberg equilibrium. De novo and reference-based genome assemblies were used to assemble the A. rubicunda mitochondrial genome sequence, which consists of 16,700 base pairs and has a typical metazoan mitochondrial gene content and arrangement. We tested these new markers by conducting a preliminary analysis of genetic structure between south-eastern and northern Australian Brolga populations. Mitochondrial analyses provided evidence of shared haplotypes across the species range, supporting the conspecific status of extant populations, while microsatellite markers indicated weak but significant genetic differentiation, suggesting gene flow is limited. We discuss the implications of these findings and the benefits that these genetic markers will provide for future population genetic research on this iconic Australian bird species.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Monica M Sheffer ◽  
Anica Hoppe ◽  
Henrik Krehenwinkel ◽  
Gabriele Uhl ◽  
Andreas W Kuss ◽  
...  

Abstract Background: Argiope bruennichi, the European wasp spider, has been investigated intensively as a focal species for studies on sexual selection, chemical communication, and the dynamics of rapid range expansion at a behavioral and genetic level. However, the lack of a reference genome has limited insights into the genetic basis for these phenomena. Therefore, we assembled a high-quality chromosome-level reference genome of the European wasp spider as a tool for more in-depth future studies. Findings: We generated, de novo, a 1.67 Gb genome assembly of A. bruennichi using 21.8× Pacific Biosciences sequencing, polished with 19.8× Illumina paired-end sequencing data, and proximity ligation (Hi-C)-based scaffolding. This resulted in an N50 scaffold size of 124 Mb and an N50 contig size of 288 kb. We found 98.4% of the genome to be contained in 13 scaffolds, fitting the expected number of chromosomes (n = 13). Analyses showed the presence of 91.1% of complete arthropod BUSCOs, indicating a high-quality assembly. Conclusions: We present the first chromosome-level genome assembly in the order Araneae. With this genomic resource, we open the door for more precise and informative studies on evolution and adaptation not only in A. bruennichi but also in arachnids overall, shedding light on questions such as the genomic architecture of traits, whole-genome duplication, and the genomic mechanisms behind silk and venom evolution.
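
The N50 figures quoted above summarise assembly contiguity. As a reminder of how the statistic is usually defined, here is a minimal sketch: N50 is the length of the sequence at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size. The function name is hypothetical, and the example lengths are made up for illustration rather than drawn from the A. bruennichi assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum covers half the assembly.
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


if __name__ == "__main__":
    # Toy example: total = 20, half = 10; 6 + 5 = 11 reaches it, so N50 = 5.
    print(n50([2, 3, 4, 5, 6]))
```

The same function applies to contigs and scaffolds alike; only the input list of lengths changes.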


2018 ◽  
Vol 16 (4) ◽  
pp. 334-342 ◽  
Author(s):  
Tantri Dyah Ayu Anggraeni ◽  
Dani Satyawan ◽  
Yang Jae Kang ◽  
Jungmin Ha ◽  
Moon Young Kim ◽  
...  

Abstract Jatropha curcas L. is a potential bioenergy crop but lacks improved cultivars with high yields and oil content. Therefore, increasing our understanding of J. curcas germplasm is important for designing breeding strategies. This study was performed to investigate the genetic diversity and population structure of Indonesian J. curcas populations from six different islands. To construct a reference, we de novo assembled the scaffolds (N50 = 355.5 kb) using 182 Gb of Illumina HiSeq sequencing data from the Thai J. curcas variety Chai Nat. Genetic diversity analysis among 52 Indonesian J. curcas accessions was conducted based on yield traits and single nucleotide polymorphism (SNP) markers detected by mapping genotyping-by-sequencing reads from the Indonesian population to the Chai Nat scaffolds. Strong variation in yield traits was detected among accessions. Using J. integerrima as an outgroup, 13,916 SNPs were detected. Among J. curcas accessions, including accessions from other countries (Thailand, the Philippines and China), 856 SNPs were detected, but only 297 SNPs were detected among Indonesian J. curcas populations, representing low genetic diversity. Through phylogenetic and structural analysis, the populations were clustered into two major groups. Group one consists of populations from Bangka and Sulawesi in the northern part of Indonesia, which are located at a distance of 1572.59 km. Group two contains populations from islands in the southern part: Java, Lombok-Sumbawa, Flores and Timor. These results indicate that the introduction of diverse J. curcas germplasms is necessary to improve the genetic variation in the Indonesian collections.


2021 ◽  
Vol 6 ◽  
pp. 141
Author(s):  
Oscar G Wilkins ◽  
Charlotte Capitanchik ◽  
Nicholas M. Luscombe ◽  
Jernej Ule

Background: The first step of virtually all next generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as “demultiplexing”. However, we found that existing software for this purpose was either too inflexible or too computationally intensive for fast, streamlined processing of raw, single-end FASTQ files containing combinatorial barcodes. Results: Here, we introduce a fast and uniquely flexible demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both 5’ and 3’ ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex is able to perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than 20 minutes. Conclusions: Ultraplex greatly reduces computational burden and pipeline complexity for the demultiplexing of complex sequencing libraries, such as those produced by various CLIP and ribosome profiling protocols, and is also very user friendly, enabling streamlined, robust data processing. Ultraplex is available on PyPI and Conda and via GitHub.
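
To make the idea of demultiplexing with UMI extraction concrete, the following is a minimal sketch of the general technique for 5’ barcodes only. It is not Ultraplex's implementation or interface: the function name, the assumed read layout (a fixed-length UMI followed by a fixed-length sample barcode), the exact-match rule, and the `_umi:` header suffix are all assumptions made for illustration.

```python
from collections import defaultdict


def demultiplex(records, barcodes, umi_len=4):
    """Assign reads to samples by an exact 5' barcode match.

    records:  iterable of (header, sequence, quality) tuples.
    barcodes: dict mapping barcode string -> sample name; all barcodes are
              assumed to have the same length.
    Assumed layout: [UMI of umi_len bases][barcode][insert].
    """
    barcode_len = len(next(iter(barcodes)))
    cut = umi_len + barcode_len
    by_sample = defaultdict(list)
    for header, seq, qual in records:
        umi = seq[:umi_len]
        sample = barcodes.get(seq[umi_len:cut])
        if sample is None:
            # No barcode match: keep the read untouched for inspection.
            by_sample["unassigned"].append((header, seq, qual))
            continue
        # Move the UMI into the header and trim UMI + barcode from the read.
        by_sample[sample].append((f"{header}_umi:{umi}", seq[cut:], qual[cut:]))
    return dict(by_sample)


if __name__ == "__main__":
    reads = [
        ("@r1", "ACGTAAAATTTTGGGG", "I" * 16),
        ("@r2", "TTTTCCCCGATTACA", "I" * 15),
        ("@r3", "GGGGTTTTACGT", "I" * 12),
    ]
    for sample, recs in demultiplex(reads, {"AAAA": "s1", "CCCC": "s2"}).items():
        print(sample, recs)
```

Keeping the UMI in the header lets a later step collapse PCR duplicates by comparing UMIs among reads that map to the same position. A production tool would also tolerate barcode mismatches, handle 3’ barcodes, and trim adaptors and low-quality bases.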

