Nematode.net update 2011: addition of data sets and tools featuring next-generation sequencing data

J. Martin; S. Abubucker; E. Heizer; C. M. Taylor; M. Mitreva

doi:10.1093/nar/gkr1194

VikNGS: A C++ Variant Integration Kit for Next Generation Sequencing Association Analysis

Bioinformatics, 2019. doi:10.1093/bioinformatics/btz716. Cited by ~1.

Author(s): Zeynep Baskurt; Scott Mastromatteo; Jiafen Gong; Richard F Wintle; Stephen W Scherer; ...

Keyword(s): Next Generation Sequencing; Genetic Association; Association Analysis; Supplementary Information; Next Generation Sequencing Data; Data Sets; Next Generation; Sequencing Data; Combining Data; Generation Sequencing

Abstract: Integration of next generation sequencing data (NGS) across different research studies can improve the power of genetic association testing by increasing sample size and can obviate the need for sequencing controls. If differential genotype uncertainty across studies is not accounted for, combining data sets can produce spurious association results. We developed the Variant Integration Kit for NGS (VikNGS), a fast cross-platform software package, to enable aggregation of several data sets for rare and common variant genetic association analysis of quantitative and binary traits with covariate adjustment. VikNGS also includes a graphical user interface, power simulation functionality and data visualization tools.

Availability: The VikNGS package can be downloaded at http://www.tcag.ca/tools/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.

Application of SNPViz v2.0 using next-generation sequencing data sets in the discovery of potential causative mutations in candidate genes associated with phenotypes

International Journal of Data Mining and Bioinformatics, 2021, Vol 25 (1/2), pp. 65. doi:10.1504/ijdmb.2021.116886.

Author(s): Shuai Zeng; Mária Škrabišová; Zhen Lyu; Yen On Chan; Nicholas Dietz; ...

Keyword(s): Next Generation Sequencing; Candidate Genes; Next Generation Sequencing Data; Data Sets; Next Generation; Sequencing Data; Generation Sequencing

Adaptation and Validation of E-Probe Diagnostic Nucleic Acid Analysis for Detection of Escherichia coli O157:H7 in Metagenomic Data from Complex Food Matrices

Journal of Food Protection, 2016, Vol 79 (4), pp. 574-581. doi:10.4315/0362-028x.jfp-15-440. Cited by ~5.

Author(s): Trenna Blagden; William Schneider; Ulrich Melcher; Jon Daniels; Jacqueline Fletcher

Keyword(s): Escherichia Coli; Next Generation Sequencing; Nucleic Acid; Next Generation Sequencing Data; Data Sets; Escherichia Coli O157; Next Generation; Sequencing Data; Nucleic Acid Analysis; Generation Sequencing

Abstract: The Centers for Disease Control and Prevention recently emphasized the need for enhanced technologies to use in investigations of outbreaks of foodborne illnesses. To address this need, e-probe diagnostic nucleic acid analysis (EDNA) was adapted and validated as a tool for the rapid, effective identification and characterization of multiple pathogens in a food matrix. In EDNA, unassembled next generation sequencing data sets from food sample metagenomes are queried using pathogen-specific sequences known as electronic probes (e-probes). In this study, the query of mock sequence databases demonstrated the potential of EDNA for the detection of foodborne pathogens. The method was then validated using next generation sequencing data sets created by sequencing the metagenome of alfalfa sprouts inoculated with Escherichia coli O157:H7. Nonspecific hits in the negative control sample indicated the need for additional filtration of the e-probes to enhance specificity. There was no significant difference in the ability of an e-probe to detect the target pathogen based upon the length of the probe set oligonucleotides. The results from the queries of the sample database using E. coli e-probe sets were significantly different from those obtained using random decoy probe sets and exhibited 100% precision. The results support the use of EDNA as a rapid response methodology in foodborne outbreaks and investigations for establishing comprehensive microbial profiles of complex food samples.
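The e-probe query idea in the abstract above can be illustrated with a minimal Python sketch. It assumes exact substring matching on both strands, which stands in for the alignment-based search a real pipeline would likely use. The function names, toy reads, and probes are hypothetical and are not taken from the EDNA software.

```python
from typing import Dict, Iterable, List

# Complement table for DNA bases; N is kept as N (assumption: no other codes occur).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def count_probe_hits(reads: Iterable[str], probes: List[str]) -> Dict[str, int]:
    """Count, for each probe, the reads that contain it on either strand.

    Exact matching is a simplification chosen for this sketch.
    """
    reads_upper = [read.upper() for read in reads]
    hits: Dict[str, int] = {}
    for probe in probes:
        forward = probe.upper()
        reverse = reverse_complement(forward)
        hits[probe] = sum(
            1 for read in reads_upper if forward in read or reverse in read
        )
    return hits


def fraction_of_probes_hit(hits: Dict[str, int]) -> float:
    """Fraction of probes with at least one matching read."""
    if not hits:
        return 0.0
    return sum(1 for n in hits.values() if n > 0) / len(hits)


if __name__ == "__main__":
    # Toy, made-up unassembled reads and probes.
    toy_reads = ["ACGTACGTTTGA", "GGGCCCAAATTT", "TCAAACGTACGT"]
    toy_probes = ["ACGTTTGA", "CCCCCCCC"]
    toy_hits = count_probe_hits(toy_reads, toy_probes)
    print(toy_hits, fraction_of_probes_hit(toy_hits))
```

Running the same counts with a random decoy probe set of equal size would give a baseline to compare against, in the spirit of the decoy comparison the abstract mentions.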

SMaSH: Sample matching using SNPs in humans

BMC Genomics, 2019, Vol 20 (S12). doi:10.1186/s12864-019-6332-7. Cited by ~2.

Author(s): Maximillian Westphal; David Frankhouser; Carmine Sonzone; Peter G. Shields; Pearlly Yan; ...

Keyword(s): Next Generation Sequencing; Next Generation Sequencing Data; Data Sets; Omics Data; RNA Seq; Next Generation; Sequencing Data; Data Types; Two Samples; Generation Sequencing

Abstract

Background: Inadvertent sample swaps are a real threat to data quality in any medium- to large-scale omics study. While matches between samples from the same individual can in principle be identified from a few well characterized single nucleotide polymorphisms (SNPs), omics data types often only provide low to moderate coverage, thus requiring integration of evidence from a large number of SNPs to determine whether two samples derive from the same individual.

Methods: We select about six thousand SNPs in the human genome and develop a Bayesian framework that is able to robustly identify sample matches between next generation sequencing data sets.

Results: We validate our approach on a variety of data sets. Most importantly, we show that our approach can establish identity between different omics data types such as Exome, RNA-Seq, and MethylCap-Seq. We demonstrate how identity detection degrades with sample quality and read coverage, but show that twenty million reads of a fairly low quality RNA-Seq sample are still sufficient for reliable sample identification.

Conclusion: Our tool, SMaSH, is able to identify sample mismatches in next generation sequencing data sets between different sequencing modalities and for low quality sequencing data.
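The general idea of pooling weak evidence from many SNPs can be sketched as follows. This is a generic Bayesian comparison written for illustration, not SMaSH's actual model: it assumes Hardy-Weinberg genotype priors, a binomial read model with a fixed error rate, and independent SNPs. All function names, the error rate, and the toy counts are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One SNP observation: (alt allele frequency, (ref, alt) reads in sample 1,
# (ref, alt) reads in sample 2).
SnpObs = Tuple[float, Tuple[int, int], Tuple[int, int]]


def genotype_priors(alt_freq: float) -> List[float]:
    """Hardy-Weinberg priors for 0, 1, or 2 copies of the alt allele."""
    ref_freq = 1.0 - alt_freq
    return [ref_freq ** 2, 2.0 * alt_freq * ref_freq, alt_freq ** 2]


def read_likelihoods(ref: int, alt: int, err: float = 0.01) -> List[float]:
    """Binomial likelihood of the read counts under each genotype."""
    n_choose_k = math.comb(ref + alt, alt)
    return [
        n_choose_k * (p_alt ** alt) * ((1.0 - p_alt) ** ref)
        for p_alt in (err, 0.5, 1.0 - err)
    ]


def log_lr_same_vs_different(snps: Sequence[SnpObs], err: float = 0.01) -> float:
    """Sum over SNPs of log P(data | same person) / P(data | different people)."""
    total = 0.0
    for alt_freq, (ref1, alt1), (ref2, alt2) in snps:
        prior = genotype_priors(alt_freq)
        lik1 = read_likelihoods(ref1, alt1, err)
        lik2 = read_likelihoods(ref2, alt2, err)
        # Same individual: both samples share one unknown genotype.
        same = sum(p * a * b for p, a, b in zip(prior, lik1, lik2))
        # Different individuals: genotypes are drawn independently.
        diff = sum(p * a for p, a in zip(prior, lik1)) * sum(
            p * b for p, b in zip(prior, lik2)
        )
        total += math.log(same / diff)
    return total


def posterior_same(log_lr: float, prior_same: float = 0.5) -> float:
    """Posterior probability of a match, computed without overflow."""
    log_odds = log_lr + math.log(prior_same / (1.0 - prior_same))
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    concordant = [(0.3, (0, 4), (0, 5)), (0.5, (3, 3), (2, 2)), (0.2, (6, 0), (5, 0))]
    print(posterior_same(log_lr_same_vs_different(concordant)))
```

With only a handful of reads per site, each SNP shifts the log ratio slightly, which is why low-coverage data need evidence from many SNPs before the posterior becomes decisive.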

The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process

Nucleic Acids Research, 2011, Vol 40 (6), pp. 2426-2431. doi:10.1093/nar/gkr1073. Cited by ~25.

Author(s): Verena Heinrich; Jens Stange; Thorsten Dickhaus; Peter Imkeller; Ulrike Krüger; ...

Keyword(s): Next Generation Sequencing; Branching Process; Next Generation Sequencing Data; Data Sets; Next Generation; Sequencing Data; Allele Distribution; Generation Sequencing

Application of SNPViz v2.0 using next-generation sequencing data sets in the discovery of potential causative mutations in candidate genes associated with phenotypes

International Journal of Data Mining and Bioinformatics, 2021, Vol 25 (1/2), pp. 65. doi:10.1504/ijdmb.2021.10039922.

Author(s): Nicholas Dietz; Trupti Joshi; Kristin Bilyeu; Shuai Zeng; Zhen Lyu; ...

Keyword(s): Next Generation Sequencing; Candidate Genes; Next Generation Sequencing Data; Data Sets; Next Generation; Sequencing Data; Generation Sequencing

Faculty Opinions recommendation of VarWalker: personalized mutation network analysis of putative cancer genes from next-generation sequencing data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, 2014. doi:10.3410/f.718272765.793499663.

Author(s): Gary Bader; Mohamed Helmy

Keyword(s): Next Generation Sequencing; Network Analysis; Next Generation Sequencing Data; Cancer Genes; Next Generation; Sequencing Data; Generation Sequencing

Faculty Opinions recommendation of Bioinformatory-assisted analysis of next-generation sequencing data for precision medicine in pancreatic cancer.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, 2017. doi:10.3410/f.727775566.793536095.

Author(s): Steve Pereira

Keyword(s): Pancreatic Cancer; Next Generation Sequencing; Precision Medicine; Next Generation Sequencing Data; Next Generation; Sequencing Data; Assisted Analysis; Generation Sequencing

NGSremix: A software tool for estimating pairwise relatedness between admixed individuals from next-generation sequencing data

G3 Genes|Genomes|Genetics, 2021. doi:10.1093/g3journal/jkab174.

Author(s): Anne Krogh Nøhr; Kristian Hanghøj; Genis Garcia Erill; Zilong Li; Ida Moltke; ...

Keyword(s): Next Generation Sequencing; Genetic Research; Likelihood Estimation; Software Tool; Estimation Methods; Next Generation Sequencing Data; Next Generation; Sequencing Data; NGS Data; Generation Sequencing

Abstract: Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.

recoup: flexible and versatile signal visualization from next generation sequencing

BMC Bioinformatics, 2021, Vol 22 (1). doi:10.1186/s12859-020-03902-x.

Author(s): Panagiotis Moulos

Keyword(s): Next Generation Sequencing; Next Generation Sequencing Data; Special Focus; Next Generation; Sequencing Data; User Friendliness; Computational Environment; Level Data; Data Signal; Generation Sequencing

Abstract

Background: The relentless continuing emergence of new genomic sequencing protocols and the resulting generation of ever larger datasets continue to challenge the meaningful summarization and visualization of the underlying signal generated to answer important qualitative and quantitative biological questions. As a result, the need for novel software able to reliably produce quick, comprehensive, and easily repeatable genomic signal visualizations in a user-friendly manner is rapidly re-emerging.

Results: recoup is a Bioconductor package for quick, flexible, versatile, and accurate visualization of genomic coverage profiles generated from Next Generation Sequencing data. Coupled with a database of precalculated genomic regions for multiple organisms, recoup offers processing mechanisms for quick, efficient, and multi-level data interrogation with minimal effort, while at the same time creating publication-quality visualizations. Special focus is given to plot reusability, reproducibility, and real-time exploration and formatting options, operations rarely supported in similar visualization tools in a profound way. recoup was assessed using several qualitative user metrics and found to balance the tradeoff between important package features, including speed, visualization quality, overall friendliness, and the reusability of the results with minimal additional calculations.

Conclusion: While some existing solutions for the comprehensive visualization of NGS data signal offer satisfying results, they are often compromised regarding issues such as effortless tracking of processing and preparation steps under a common computational environment, visualization quality, and user friendliness. recoup is a unique package presenting a balanced tradeoff for a combination of assessment criteria while remaining fast and friendly.
