Next-Generation Sequencing Techniques for Eukaryotic Microorganisms: Sequencing-Based Solutions to Biological Problems

ABSTRACT Over the past 5 years, large-scale sequencing has been revolutionized by the development of several so-called next-generation sequencing (NGS) technologies. These have drastically increased the number of bases obtained per sequencing run while at the same time decreasing the costs per base. Compared to Sanger sequencing, NGS technologies yield shorter read lengths; however, despite this drawback, they have greatly facilitated genome sequencing, first for prokaryotic genomes and within the last year also for eukaryotic ones. This advance was possible due to a concomitant development of software that allows the de novo assembly of draft genomes from large numbers of short reads. In addition, NGS can be used for metagenomics studies as well as for the detection of sequence variations within individual genomes, e.g., single-nucleotide polymorphisms (SNPs), insertions/deletions (indels), or structural variants. Furthermore, NGS technologies have quickly been adopted for other high-throughput studies that were previously performed mostly by hybridization-based methods like microarrays. This includes the use of NGS for transcriptomics (RNA-seq) or the genome-wide analysis of DNA/protein interactions (ChIP-seq). This review provides an overview of NGS technologies that are currently available and the bioinformatics analyses that are necessary to obtain information from the flood of sequencing data as well as applications of NGS to address biological questions in eukaryotic microorganisms.

Molecular marker information from de novo assembled transcriptomes of chilli pepper (Capsicum annuum L.) varieties based on next-generation sequencing technology

Plant Genetic Resources ◽

10.1017/s147926211400032x ◽

2014 ◽

Vol 12 (S1) ◽

pp. S83-S86 ◽

Cited By ~ 1

Author(s):

Yul-Kyun Ahn ◽

Swati Tripathi ◽

Young-Il Cho ◽

Jeong-Ho Kim ◽

Hye-Eun Lee ◽

...

Keyword(s):

Molecular Markers ◽

Next Generation Sequencing ◽

De Novo ◽

Transcriptome Assembly ◽

Sequence Variant ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Chilli Pepper ◽

Next Generation Sequencing Technology ◽

Generation Sequencing

Next-generation sequencing is a useful tool for de novo transcriptome assembly, functional annotation of genes and identification of molecular markers. This study was carried out to mine molecular markers from de novo assembled transcriptomes of four chilli pepper varieties: the highly pungent ‘Saengryeg 211’, the non-pungent ‘Saengryeg 213’, and the variably pigmented ‘Mandarin’ and ‘Blackcluster’. Pyrosequencing of the complementary DNA library yielded 361,671, 274,269, 279,221 and 316,357 raw reads, which were assembled into 23,607, 19,894, 18,340 and 20,357 contigs for the four varieties, respectively. Detailed sequence variant analysis identified numerous potential single-nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) in all the varieties, for which primers were designed. The transcriptome information and SNP/SSR markers generated in this study provide valuable resources for high-density molecular genetic mapping in chilli pepper and for quantitative trait loci analysis related to fruit qualities. These markers will be highly valuable for marker-assisted breeding and other genetic studies in pepper.
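
As an illustration of the SSR mining step mentioned in this abstract, the following is a minimal Python sketch of how perfect simple sequence repeats could be located in an assembled contig. It is not the pipeline used in the study: the function name `find_ssrs`, the motif lengths (2 to 6 bases) and the minimum repeat counts are all assumptions chosen for the example.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
# These values are illustrative and are not taken from the study.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # Group 2 is the motif; it must be followed by at least n-1 more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


# Hypothetical contig with an (AT)7 and an (AGC)5 repeat.
contig = "GGC" + "AT" * 7 + "CCG" + "AGC" * 5 + "T"
print(find_ssrs(contig))
```

Primer design would then target the flanking sequence on either side of each reported position; that step is outside the scope of this sketch.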

De Novo Genome Assembly of Next-Generation Sequencing Data

Compendium of Plant Genomes - The Brassica rapa Genome ◽

10.1007/978-3-662-47901-8_4 ◽

2015 ◽

pp. 41-51

Author(s):

Min Liu ◽

Dongyuan Liu ◽

Hongkun Zheng

Keyword(s):

Next Generation Sequencing ◽

Genome Assembly ◽

De Novo ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

De Novo Genome Assembly ◽

Generation Sequencing

Speeding Up Large-Scale Next Generation Sequencing Data Analysis with pBWA

Journal of Applied Bioinformatics & Computational Biology ◽

10.4172/2329-9533.1000101 ◽

2017 ◽

Vol 01 (01) ◽

Cited By ~ 4

Author(s):

Darren Peters ◽

Xuemei Luo ◽

Ke Qiu ◽

Ping Liang

Keyword(s):

Data Analysis ◽

Next Generation Sequencing ◽

Large Scale ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing ◽

Sequencing Data Analysis

Validation of variants using cost effective high-resolution melting (HRM) analysis predicted from target re-sequencing in Eucalyptus

Acta Botanica Croatica ◽

10.37427/botcro-2020-019 ◽

2020 ◽

Vol 79 (2) ◽

pp. 105-113

Author(s):

Abdul Bari Muneera Parveen ◽

Divya Lakshmanan ◽

Modhumita Ghosh Dasgupta

Keyword(s):

Next Generation Sequencing ◽

Large Scale ◽

Sequence Data ◽

Cost Effective ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Time Saving ◽

HRM Analysis ◽

The Cost ◽

Generation Sequencing

The advent of next-generation sequencing (NGS) has facilitated large-scale discovery and mapping of genomic variants for high-throughput genotyping. Several research groups working on tree species presently employ NGS platforms for marker discovery, since it is a cost-effective and time-saving strategy. However, most trees lack a chromosome-level genome map, so validation of variants for downstream application becomes obligatory. The cost associated with identifying potential variants from the enormous amount of sequence data is a major limitation. In the present study, high-resolution melting (HRM) analysis was optimized for rapid validation of single-nucleotide polymorphisms (SNPs), insertions or deletions (InDels) and simple sequence repeats (SSRs) predicted from exome sequencing of parents and hybrids of Eucalyptus tereticornis Sm. × Eucalyptus grandis Hill ex Maiden generated from controlled hybridization. The cost per data point was less than 0.5 USD, providing great flexibility in terms of cost and sensitivity when compared to other validation methods. The sensitivity of this technology in variant detection can be extended to other applications, including Bar-HRM for species authentication and TILLING for detection of mutants.

Next generation sequencing allows deeper analysis and understanding of genomes and transcriptomes including aspects to fertility

Reproduction Fertility and Development ◽

10.1071/rd10247 ◽

2011 ◽

Vol 23 (1) ◽

pp. 75 ◽

Cited By ~ 7

Author(s):

Thomas Werner

Keyword(s):

Next Generation Sequencing ◽

Transcriptional Control ◽

Target Genes ◽

De Novo ◽

Alternative Promoters ◽

Next Generation ◽

Sequencing Data ◽

Genome Wide ◽

A Genome ◽

Generation Sequencing

Reproduction and fertility are controlled by specific events naturally linked to oocytes, testes and early embryonal tissues. A significant part of these events involves gene expression, especially transcriptional control and alternative transcription (alternative promoters and alternative splicing). While methods to analyse such events for carefully predetermined target genes are well established, until recently no methodology existed to extend such analyses into a genome-wide de novo discovery process. With the arrival of next generation sequencing (NGS), it has become possible to attempt genome-wide discovery in genomic sequences as well as whole transcriptomes at single-nucleotide resolution. This not only allows identification of the primary changes (e.g. alternative transcripts) but also helps to elucidate the regulatory context that leads to the induction of transcriptional changes. This review discusses the basics of the new technological and scientific concepts arising from NGS, prominent differences from microarray-based approaches and several aspects of its application to reproduction and fertility research. These concepts are then illustrated in an application example of NGS data analysis involving postimplantation endometrium tissue from cows.

A support vector machine for identification of single-nucleotide polymorphisms from next-generation sequencing data

Bioinformatics ◽

10.1093/bioinformatics/btt172 ◽

2013 ◽

Vol 29 (11) ◽

pp. 1361-1366 ◽

Cited By ~ 26

Author(s):

B. D. O'Fallon ◽

W. Wooderchak-Donahue ◽

D. K. Crockett

Keyword(s):

Support Vector Machine ◽

Next Generation Sequencing ◽

Single Nucleotide Polymorphisms ◽

Next Generation Sequencing Data ◽

Support Vector ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Sequencing Data ◽

Single Nucleotide ◽

Generation Sequencing

NGSPERL: a semi-automated framework for large scale next generation sequencing data analysis

International Journal of Computational Biology and Drug Design ◽

10.1504/ijcbdd.2015.072082 ◽

2015 ◽

Vol 8 (3) ◽

pp. 203

Author(s):

Quanhu Sheng ◽

Shilin Zhao ◽

Mingsheng Guo ◽

Yu Shyr

Keyword(s):

Data Analysis ◽

Next Generation Sequencing ◽

Large Scale ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing ◽

Sequencing Data Analysis

An ensemble strategy that significantly improves de novo assembly of microbial genomes from metagenomic next-generation sequencing data

Nucleic Acids Research ◽

10.1093/nar/gkv002 ◽

2015 ◽

Vol 43 (7) ◽

pp. e46-e46 ◽

Cited By ~ 125

Author(s):

Xutao Deng ◽

Samia N. Naccache ◽

Terry Ng ◽

Scot Federman ◽

Linlin Li ◽

...

Keyword(s):

Next Generation Sequencing ◽

De Novo Assembly ◽

De Novo ◽

Next Generation Sequencing Data ◽

De Bruijn Graph ◽

Next Generation ◽

Sequencing Data ◽

Short Reads ◽

Ensemble Strategy ◽

Generation Sequencing

Next-generation sequencing (NGS) approaches rapidly produce millions to billions of short reads, which allow pathogen detection and discovery in human clinical, animal and environmental samples. A major limitation of sequence homology-based identification for highly divergent microorganisms is the short length of reads generated by most highly parallel sequencing technologies. Short reads require a high level of sequence similarity to annotated genes to confidently predict gene function or homology. Such recognition of highly divergent homologues can be improved by reference-free (de novo) assembly of short overlapping sequence reads into larger contigs. We describe an ensemble strategy that integrates the sequential use of various de Bruijn graph and overlap-layout-consensus assemblers with a novel partitioned sub-assembly approach. We also propose new quality metrics that are suitable for evaluating metagenome de novo assembly. We demonstrate that this new ensemble strategy, tested using in silico spike-in, clinical and environmental NGS datasets, achieved significantly better contigs than current approaches.
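
To make the de Bruijn graph idea referred to in this abstract concrete, here is a toy Python sketch. It is not the ensemble strategy described by the authors and omits everything a real assembler needs (error correction, reverse complements, coverage, in-degree checks); the function names `build_de_bruijn` and `extend_contig`, the reads and the value of `k` are all hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Walk forward from start while exactly one unvisited successor exists."""
    contig = start
    node = start
    seen = {start}
    # graph.get avoids inserting empty entries into the defaultdict.
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Three hypothetical overlapping short reads from the same region.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = build_de_bruijn(reads, k=4)
print(extend_contig(graph, "ATG"))
```

The walk joins the three 7-base reads into a single longer contig, which is the property the abstract relies on: a longer contig gives homology searches more sequence to work with than any individual short read.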

GenoPheno: cataloging large-scale phenotypic and next-generation sequencing data within human datasets

Briefings in Bioinformatics ◽

10.1093/bib/bbaa033 ◽

2020 ◽

Author(s):

Alba Gutiérrez-Sacristán ◽

Carlos De Niz ◽

Cartik Kothari ◽

Sek Won Kong ◽

Kenneth D Mandl ◽

...

Keyword(s):

Next Generation Sequencing ◽

Web Application ◽

Large Scale ◽

Human Subjects ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Phenotypic Data ◽

Data Repositories ◽

Generation Sequencing

Precision medicine promises to revolutionize treatment, shifting therapeutic approaches from the classical one-size-fits-all to those more tailored to the patient’s individual genomic profile, lifestyle and environmental exposures. Yet, to advance precision medicine’s main objective (ensuring the optimum diagnosis, treatment and prognosis for each individual), investigators need access to large-scale clinical and genomic data repositories. Despite the vast proliferation of these datasets, locating and obtaining access to many remains a challenge. We sought to provide an overview of available patient-level datasets that contain both genotypic data, obtained by next-generation sequencing, and phenotypic data, and to create a dynamic, online catalog for consultation, contribution and revision by the research community. Datasets included in this review meet six inclusion criteria: they (i) contain data from more than 500 human subjects; (ii) contain both genotypic and phenotypic data from the same subjects; (iii) include whole genome sequencing or whole exome sequencing data; (iv) include at least 100 recorded phenotypic variables per subject; (v) are accessible through a website or collaboration with investigators; and (vi) make access information available in English. Using these criteria, we identified 30 datasets, reviewed them and provided results in the release version of a catalog, which is publicly available through a dynamic Web application and on GitHub. Users can review as well as contribute new datasets for inclusion (Web: https://avillachlab.shinyapps.io/genophenocatalog/; GitHub: https://github.com/hms-dbmi/GenoPheno-CatalogShiny).
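
The six inclusion criteria listed in this abstract can be read as a simple filter over dataset descriptions. The Python sketch below shows one way such a filter might look; it is not the catalog's actual code, and the record type `DatasetRecord`, its field names, the `"WGS"`/`"WES"` labels and the two example datasets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical description of one candidate dataset."""
    name: str
    n_subjects: int
    has_genotypic: bool
    has_phenotypic: bool
    sequencing_types: frozenset   # e.g. frozenset({"WGS"})
    n_phenotypic_variables: int
    access_route: str             # "website", "collaboration" or "none"
    access_info_in_english: bool


def meets_inclusion_criteria(d):
    """Apply criteria (i) to (vi) from the abstract to one record."""
    return (
        d.n_subjects > 500                                   # (i)
        and d.has_genotypic and d.has_phenotypic             # (ii)
        and bool(d.sequencing_types & {"WGS", "WES"})        # (iii)
        and d.n_phenotypic_variables >= 100                  # (iv)
        and d.access_route in {"website", "collaboration"}   # (v)
        and d.access_info_in_english                         # (vi)
    )


# Two made-up records: the second fails criteria (i) and (iii).
cohort_a = DatasetRecord("Cohort A", 12000, True, True,
                         frozenset({"WES"}), 350, "website", True)
cohort_b = DatasetRecord("Cohort B", 400, True, True,
                         frozenset({"array"}), 150, "website", True)
included = [d.name for d in (cohort_a, cohort_b) if meets_inclusion_criteria(d)]
print(included)
```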

Bioinformatics for Clinical Next Generation Sequencing

Clinical Chemistry ◽

10.1373/clinchem.2014.224360 ◽

2015 ◽

Vol 61 (1) ◽

pp. 124-135 ◽

Cited By ~ 56

Author(s):

Gavin R Oliver ◽

Steven N Hart ◽

Eric W Klee

Keyword(s):

Next Generation Sequencing ◽

Service Providers ◽

Clinical Laboratory ◽

Work Flow ◽

Next Generation ◽

Sequencing Data ◽

Regulatory Requirements ◽

Bioinformatics Analyses ◽

Next Generation Sequencing NGS ◽

Generation Sequencing

BACKGROUND: Next generation sequencing (NGS)-based assays continue to redefine the field of genetic testing. Owing to the complexity of the data, bioinformatics has become a necessary component in any laboratory implementing a clinical NGS test. CONTENT: The computational components of an NGS-based work flow can be conceptualized as primary, secondary, and tertiary analytics. Each of these components addresses a necessary step in the transformation of raw data into clinically actionable knowledge. Understanding the basic concepts of these analysis steps is important in assessing and addressing the informatics needs of a molecular diagnostics laboratory. Equally critical is a familiarity with the regulatory requirements addressing the bioinformatics analyses. These and other topics are covered in this review article. SUMMARY: Bioinformatics has become an important component in clinical laboratories generating, analyzing, maintaining, and interpreting data from molecular genetics testing. Given the rapid adoption of NGS-based clinical testing, service providers must develop informatics work flows that adhere to the rigor of clinical laboratory standards, yet are flexible to changes as the chemistry and software for analyzing sequencing data mature.