Sentieon DNA pipeline for variant detection - Software-only solution, over 20× faster than GATK 3.3 with identical results

Author(s):  
Jessica A. Weber ◽  
Rafael Aldana ◽  
Brendan D. Gallagher ◽  
Jeremy S. Edwards

Sentieon DNAseq Software is a suite of tools for running DNA sequencing secondary analysis pipelines. It produces results identical to the Genome Analysis Toolkit (GATK) Best Practices Workflow using HaplotypeCaller, with a more than 20-fold increase in processing speed on the same hardware. This paper presents a benchmark analysis of the GATK and Sentieon DNAseq software packages using publicly available datasets from the 1000 Genomes database, including speed comparisons and output concordance analyses.
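As a rough illustration of the concordance analysis described above, the following is a minimal sketch (not the authors' evaluation code) that compares the variant sites called by two pipelines; the file names are hypothetical, and a site counts as concordant when chromosome, position, REF, and ALT match exactly.

```python
import gzip

def load_variants(path):
    """Collect (chrom, pos, ref, alt) keys from a possibly gzipped VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    variants = set()
    with opener(path, "rt") as fh:
        for line in fh:
            if line.startswith("#"):  # skip meta-information and header lines
                continue
            chrom, pos, _vid, ref, alt = line.rstrip("\n").split("\t")[:5]
            variants.add((chrom, pos, ref, alt))
    return variants

# Hypothetical output files from the two pipelines under comparison.
gatk = load_variants("gatk_haplotypecaller.vcf.gz")
sentieon = load_variants("sentieon_haplotyper.vcf.gz")

concordance = len(gatk & sentieon) / len(gatk | sentieon)
print(f"GATK-only sites: {len(gatk - sentieon)}")
print(f"Sentieon-only sites: {len(sentieon - gatk)}")
print(f"Site-level concordance: {concordance:.4%}")
```

Real concordance evaluations normalize variant representation (multi-allelic splitting, left-alignment) before comparing; this sketch skips that step for brevity.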


2020 ◽  
Author(s):  
Marcus H. Hansen ◽  
Anita T. Simonsen ◽  
Hans B. Ommen ◽  
Charlotte G. Nyvold

Abstract
Background: Rapid and practical DNA-sequencing processing has become essential for modern biomedical laboratories, especially in the fields of cancer, pathology, and genetics. While sequencing turnaround time has been, and still is, a bottleneck in research and diagnostics, the field of bioinformatics is moving at a rapid pace, both in terms of hardware and software development. Here, we benchmarked the local performance of three of the most important Spark-enabled Genome Analysis Toolkit 4 (GATK4) tools in a targeted sequencing workflow: duplicate marking, base quality score recalibration (BQSR), and variant calling on targeted DNA sequencing, using a modest hyperthreaded 12-core single CPU and a high-speed PCI Express solid-state drive.
Results: Compared to the previous GATK version, the Spark-enabled BQSR and HaplotypeCaller make more efficient use of the available CPU cores and outperform the earlier GATK 3.8 version with an order-of-magnitude reduction in processing time to analysis-ready variants, whereas MarkDuplicatesSpark was found to be three times as fast. Furthermore, HaplotypeCallerSpark and BQSRPipelineSpark were significantly faster than the equivalent GATK4 standard tools, with a combined ∼86% reduction in execution time, reaching a median rate of ten million processed bases per second, and duplicate-marking time was reduced by ∼42%. The called variants were in close agreement between the Spark and non-Spark versions, with an overall concordance of 98%. In this setup, the tools were also highly efficient when compared with execution on a small 72-virtual-CPU/18-node Google Cloud cluster.
Conclusion: GATK4 offers practical parallelization possibilities for DNA sequence processing, and the Spark-enabled tools optimize performance and utilization of local CPUs. Spark-enabled GATK variant calling is several times faster than the previous GATK 3.8 multithreading on the same multi-core, single-CPU configuration. The improved opportunities for parallel computation hold implications not only for high-performance clusters, but also for modest laboratory or research workstations performing targeted sequencing analysis, such as exome, panel, or amplicon sequencing.
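As a hedged sketch of the kind of local wall-clock benchmark the study describes (not the authors' harness), the snippet below times a standard GATK4 tool against its Spark-enabled counterpart; the input paths are hypothetical, and the Spark tools are pinned to 12 local cores to mirror the single-CPU setup.

```python
import subprocess
import time

# Hypothetical inputs; the invocations follow the standard `gatk` CLI,
# with Spark tools restricted to 12 local cores via --spark-master.
COMMANDS = {
    "MarkDuplicates (standard)": [
        "gatk", "MarkDuplicates",
        "-I", "sample.bam", "-O", "dedup_std.bam", "-M", "dup_metrics.txt",
    ],
    "MarkDuplicatesSpark": [
        "gatk", "MarkDuplicatesSpark",
        "-I", "sample.bam", "-O", "dedup_spark.bam",
        "--spark-master", "local[12]",
    ],
    "HaplotypeCallerSpark": [
        "gatk", "HaplotypeCallerSpark",
        "-R", "ref.fasta", "-I", "dedup_spark.bam", "-O", "variants.vcf.gz",
        "--spark-master", "local[12]",
    ],
}

for name, cmd in COMMANDS.items():
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # run the tool; raise if it fails
    print(f"{name}: {time.perf_counter() - start:.1f} s")
```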


2010 ◽  
Vol 20 (9) ◽  
pp. 1297-1303 ◽  
Author(s):  
A. McKenna ◽  
M. Hanna ◽  
E. Banks ◽  
A. Sivachenko ◽  
K. Cibulskis ◽  
...  

2017 ◽  
Vol 26 (01) ◽  
pp. 59-67 ◽  
Author(s):  
P. J. Scott ◽  
M. Rigby ◽  
E. Ammenwerth ◽  
J. McNair ◽  
A. Georgiou ◽  
...  

Summary
Objectives: To set the scientific context and then suggest principles for an evidence-based approach to secondary uses of clinical data, covering both evaluation of the secondary uses of data and evaluation of health systems and services based upon secondary uses of data.
Method: Working Group review of selected literature and policy approaches.
Results: We present important considerations in the evaluation of secondary uses of clinical data from the angles of governance and trust, theory, semantics, and policy. We make the case for a multi-level and multi-factorial approach to the evaluation of secondary uses of clinical data and describe a methodological framework for best practice. We emphasise the importance of evaluating the governance of secondary uses of health data in maintaining trust, which is essential for such uses. We also offer examples of the re-use of routine health data to demonstrate how it can support evaluation of clinical performance and optimize health IT system design.
Conclusions: Great expectations are resting upon “Big Data” and innovative analytics. However, to build and maintain public trust, improve data reliability, and assure the validity of analytic inferences, there must be independent and transparent evaluation. A mature and evidence-based approach needs not merely data science, but must be guided by the broader concerns of applied health informatics.


Author(s):  
Geraldine A. Van der Auwera ◽  
Mauricio O. Carneiro ◽  
Christopher Hartl ◽  
Ryan Poplin ◽  
Guillermo del Angel ◽  
...  

2020 ◽  
Vol 12 (18) ◽  
pp. 7804
Author(s):  
Dominika Šulyová ◽  
Gabriel Koman

Unlike the automotive sector, the wood-processing industry currently makes insufficient use of modern technologies. The primary motive for writing this article was a cooperation with a Slovak wood-processing company that wanted to improve its logistics processes and increase its competitiveness in the wood-processing sector through the implementation of new technologies. The aim of this article was to identify the positives and limitations of implementing Internet of Things (IoT) technology in the wood-processing industry, based on a secondary analysis of case studies and the best practice of North American wood-processing companies such as West Fraser Timber in Canada and Weyerhaeuser in the USA. The selection of case studies was conditioned on criteria of temporal relevance, sawmill size, and production volume in m3; these criteria reflected the conditions for the introduction of similar concepts in wood-processing enterprises in Slovakia. The implementation of the IoT can reduce operating costs by up to 20%, increase added value for customers, and collect real-time data that can serve as the basis for management and decision-making support at the operational, tactical, and strategic levels. In addition to the secondary analysis, the methods of comparison of global wood-processing companies, synthesis of knowledge, summarization of the positives and limitations of IoT implementation, and deduction were used to reach our conclusions. The results were used as the basis for the design of a general model for the implementation of IoT technology in Slovak wood-processing enterprises. This model may represent best practice for the selected locality and industry. The verification and implications of the designed model in practice will form part of further research activities, already underway in the form of a primary survey.


2020 ◽  
pp. medethics-2019-105819
Author(s):  
Dexter Penn ◽  
Anne Lanceley ◽  
Aviva Petrie ◽  
Jacqueline Nicholls

Background: The Mental Capacity Act (MCA) (2005) was enacted in 2007 in England and Wales, but the assessment of mental capacity still remains an area of professional concern. Doctors' compliance with legal and professional standards is inconsistent, but the reasons for poor compliance are not well understood. This preliminary study investigates doctors' experiences of, and attitudes toward, mental capacity assessment (MCAx).
Methods: This is a descriptive, cross-sectional study in which a two-domain, study-specific structured questionnaire was developed, piloted, and digitally disseminated to doctors at differing career stages employed in a large, multi-site National Health Service Trust in London over 4 months in 2018. Descriptive statistics and frequency tables adjusted for missing data were generated, and secondary analysis was conducted.
Results: Participants (n=92) were predominantly UK-trained (82%), female (58%), and between the ages of 30 and 44 years (45%). Less than half (45%) of the participants reported receiving formal MCAx training. Only one-third (32%) of the participants rated themselves as very competent (29%) or extremely competent (4%). Self-reported MCA confidence was significantly affected by career stage, with consultants with over 10 years of experience reporting the lowest confidence (p=0.001).
Conclusions: This study describes significant variation in doctors' practice and low self-confidence in performing MCAx. These results raise concerns that MCAx continues to be inconsistently performed by doctors despite appropriate awareness of the law and professional guidance on best practice.
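The descriptive step mentioned in the methods ("descriptive statistics and frequency tables adjusted for missing data") can be sketched in a few lines of pandas; the example below is hypothetical, with a made-up responses file and illustrative column names not taken from the study instrument.

```python
import pandas as pd

# Hypothetical export of the questionnaire responses; column names are
# illustrative only.
df = pd.read_csv("mcax_survey.csv")

for col in ["career_stage", "formal_training", "self_rated_competence"]:
    # dropna=False keeps missing answers as an explicit category, so the
    # frequency table accounts for (rather than silently hiding) them.
    counts = df[col].value_counts(dropna=False)
    pct = (counts / counts.sum() * 100).round(1)
    print(f"\n{col}")
    print(pd.DataFrame({"n": counts, "percent": pct}))
```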

