Fast P(RMNE): Fast Forensic DNA Probability of Random Man Not Excluded Calculation

2017 ◽  
Author(s):  
Darrell O. Ricke ◽  
Steven Schwartz

High throughput sequencing (HTS) of DNA forensic samples is expanding from the sizing of short tandem repeats (STRs) to massively parallel sequencing (MPS). HTS panels are expanding from the FBI 20 core Combined DNA Index System (CODIS) loci to include single nucleotide polymorphisms (SNPs). The calculation of random man not excluded, P(RMNE), is used in DNA mixture analysis to estimate the probability that a person is present in a DNA mixture. This calculation produces artifacts as panels expand to larger sizes. Increasing the floating-point precision of the calculations allows for larger panel sizes, but with a corresponding increase in computation time. The Taylor series higher-precision libraries used fail on some input data sets, making the algorithm unreliable. Herein, a new formula is introduced for calculating P(RMNE) that scales to larger SNP panel sizes while being computationally efficient (patent pending)[1].
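The abstract does not state the new formula, so the following is only a minimal sketch of the kind of numerical problem it describes. It assumes the textbook random-man-not-excluded model, in which a random person is not excluded at a locus with probability equal to the squared sum of the frequencies of the alleles observed in the mixture (Hardy-Weinberg proportions, no mismatches allowed), and in which loci are treated as independent. The function names and allele frequencies are hypothetical. With thousands of SNPs the direct product underflows double precision to zero, whereas summing logarithms keeps the result representable without a higher-precision library.

```python
import math


def locus_inclusion_probability(observed_allele_freqs):
    """Probability that a random person carries only alleles observed at one locus.

    Assumes Hardy-Weinberg proportions: (sum of observed allele frequencies) squared.
    """
    total = sum(observed_allele_freqs)
    return total * total


def rmne_direct(loci):
    """Multiply per-locus probabilities directly (underflows for large panels)."""
    p = 1.0
    for freqs in loci:
        p *= locus_inclusion_probability(freqs)
    return p


def rmne_log10(loci):
    """Return log10 of the same product by summing per-locus logarithms.

    Every locus must have at least one observed allele with nonzero frequency.
    """
    return sum(math.log10(locus_inclusion_probability(freqs)) for freqs in loci)


# Hypothetical panel: 5000 SNPs, each with a single observed allele of frequency 0.3.
panel = [[0.3]] * 5000

print(rmne_direct(panel))  # the product is too small for a double
print(rmne_log10(panel))   # the exponent is still easy to represent
```

Working in log space is a generic remedy for underflow and is shown here only to illustrate the scaling issue. It is not presented as the authors' method.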

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2154 ◽  
Author(s):  
Darrell O. Ricke ◽  
Steven Schwartz

High throughput sequencing (HTS) of DNA forensic samples is expanding from the sizing of short tandem repeats (STRs) to massively parallel sequencing (MPS). HTS panels are expanding from the FBI 20 core Combined DNA Index System (CODIS) loci to include single nucleotide polymorphisms (SNPs). The calculation of random man not excluded, P(RMNE), is used in DNA mixture analysis to estimate the probability that a person is present in a DNA mixture. This calculation produces artifacts as panels expand to larger sizes. Increasing the floating-point precision of the calculations allows for larger panel sizes, but with a corresponding increase in computation time. The Taylor series higher-precision libraries used fail on some input data sets, making the algorithm unreliable. Herein, a new formula is introduced for calculating P(RMNE) that scales to larger SNP panel sizes while being computationally efficient (patent pending).



2017 ◽  
Author(s):  
Darrell O. Ricke ◽  
Joe Isaacson ◽  
James Watkins ◽  
Philip Fremont-Smith ◽  
Tara Boettcher ◽  
...  

Identification of individuals in complex DNA mixtures remains a challenge for forensic analysts. Recent advances in high throughput sequencing (HTS) are enabling analysis of DNA mixtures with expanded panels of short tandem repeats (STRs) and/or single nucleotide polymorphisms (SNPs). We present the Plateau method for direct SNP DNA mixture deconvolution into sub-profiles, based on differences in contributors' DNA concentrations in the mixtures, in the absence of matching reference profiles. The Plateau method can detect profiles of individuals whose contribution is as low as 1/200 in a DNA mixture (patent pending)[1].


MycoKeys ◽  
2018 ◽  
Vol 39 ◽  
pp. 29-40 ◽  
Author(s):  
Sten Anslan ◽  
R. Henrik Nilsson ◽  
Christian Wurzbacher ◽  
Petr Baldrian ◽  
Leho Tedersoo ◽  
...  

Along with recent developments in high-throughput sequencing (HTS) technologies and thus fast accumulation of HTS data, there has been a growing need and interest for developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of applying different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors and generate Operational Taxonomic Units (OTUs), although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon data set. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.


2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13890 ◽  
Author(s):  
Changjin Hong ◽  
Solaiappan Manimaran ◽  
William Evan Johnson

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/ .


2018 ◽  
Author(s):  
Sten Anslan ◽  
Henrik Nilsson ◽  
Christian Wurzbacher ◽  
Petr Baldrian ◽  
Leho Tedersoo ◽  
...  

Along with recent developments in high-throughput sequencing (HTS) technologies and thus fast accumulation of HTS data, there has been a growing need and interest for developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of applying different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors and generate Operational Taxonomic Units (OTUs), although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon data set. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.


2018 ◽  
Author(s):  
Darrell O. Ricke ◽  
James Watkins ◽  
Philip Fremont-Smith ◽  
Tara Boettcher ◽  
Eric Schwoebel

High throughput sequencing (HTS) of complex DNA mixtures with single nucleotide polymorphism (SNP) panels can identify multiple individuals in forensic DNA mixture samples. SNP mixture analysis relies upon the exclusion of non-contributing individuals using the subset of SNP loci with no detected minor alleles in the mixture. Few, if any, individuals are anticipated to be detectable in saturated mixtures by this mixture analysis approach because of the increased probability of matching random individuals. Being able to identify a subset of the contributors in saturated HTS SNP mixtures is valuable for forensic investigations. A desaturated mixture can be created by treating a set of SNPs with the lowest minor allele ratios as having no minor alleles. Leveraging differences in DNA contributor concentrations in saturated mixtures, we introduce TranslucentID, which desaturates saturated mixtures to identify, with high confidence, a subset of the individuals who contributed DNA to them.
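The desaturation step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the TranslucentID implementation: the function names, the dictionary layout, the SNP labels, and the `n_lowest` parameter are all hypothetical, since the abstract does not say how the set of SNPs to clear is chosen.

```python
def desaturate(minor_allele_ratios, n_lowest):
    """Return a copy in which the n_lowest nonzero minor allele ratios are set to 0.

    minor_allele_ratios maps a SNP label to its minor allele ratio in the mixture.
    n_lowest is an assumed tuning parameter; the source does not specify it.
    """
    detected = sorted(
        (ratio, snp) for snp, ratio in minor_allele_ratios.items() if ratio > 0
    )
    to_clear = {snp for _, snp in detected[:n_lowest]}
    return {
        snp: (0.0 if snp in to_clear else ratio)
        for snp, ratio in minor_allele_ratios.items()
    }


def loci_without_minor_alleles(minor_allele_ratios):
    """SNPs usable for exclusion: those with no detected minor allele."""
    return sorted(snp for snp, ratio in minor_allele_ratios.items() if ratio == 0)


# Hypothetical mixture with five SNPs.
mixture = {"snp1": 0.02, "snp2": 0.30, "snp3": 0.01, "snp4": 0.0, "snp5": 0.12}
desaturated = desaturate(mixture, n_lowest=2)

print(loci_without_minor_alleles(mixture))
print(loci_without_minor_alleles(desaturated))
```

In this toy input, clearing the two lowest ratios increases the number of loci available for exclusion from one to three, which is the effect the abstract attributes to desaturation.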


2018 ◽  
Author(s):  
Darrell O. Ricke ◽  
Philip Fremont-Smith ◽  
James Watkins ◽  
Tara Boettcher ◽  
Eric Schwoebel

Mixture analysis and deconvolution methods can identify both known and unknown individuals contributing to DNA mixtures. These methods may not identify all DNA contributors, leaving a remaining fraction of the mixture contributed by one or more unknown individuals. The proportion of DNA contributed by individuals to a forensic sample can be estimated using their quantified mixture alleles. For short tandem repeats (STRs), methods to estimate individual contribution concentrations compare capillary electrophoresis peak heights and/or peak areas within a mixture. For single nucleotide polymorphisms (SNPs), the major:minor allele ratios or counts, unique to each contributor, can be compared to estimate contributor proportion within the mixture. This article introduces three approaches (mean, median, and slope methods) for estimating individual DNA contributions to forensic mixtures for high throughput sequencing (HTS)/massively parallel sequencing (MPS) SNP panels.
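The abstract names the mean, median, and slope methods without defining them. As a rough sketch of the first two only, and assuming minor allele ratios have already been collected at SNPs where the minor allele is unique to one contributor, the ratios could be summarized as shown here. The function name and the input values are hypothetical, and turning the summary into a contribution proportion would depend on genotype details that the abstract does not give.

```python
from statistics import mean, median


def summarize_unique_allele_ratios(ratios):
    """Summarize minor allele ratios observed at SNPs unique to one contributor.

    Returns the mean and the median of the ratios. How these map to a
    contribution proportion is not specified in the source.
    """
    if not ratios:
        raise ValueError("at least one ratio is required")
    return {"mean": mean(ratios), "median": median(ratios)}


# Hypothetical ratios; the last value stands in for a noisy locus.
ratios = [0.04, 0.05, 0.06, 0.05, 0.30]
summary = summarize_unique_allele_ratios(ratios)
print(summary)
```

In this toy input the median stays at the typical ratio while the mean is pulled upward by the single outlying locus, which is one reason to compare more than one summary.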

