FaStore – a space-saving solution for raw sequencing data

2017 ◽  
Author(s):  
Łukasz Roguski ◽  
Idoia Ochoa ◽  
Mikel Hernaez ◽  
Sebastian Deorowicz

Abstract: The affordability of DNA sequencing has led to the generation of unprecedented volumes of raw sequencing data. These data must be stored, processed, and transmitted, which poses significant challenges. To facilitate this effort, we introduce FaStore, a specialized compressor for FASTQ files. The proposed algorithm does not use any reference sequences for compression, and permits the user to choose from several lossy modes to improve the overall compression ratio, depending on the specific needs. We demonstrate through extensive simulations that FaStore achieves a significant improvement in compression ratio with respect to previously proposed algorithms for this task. In addition, we perform an analysis on the effect that the different lossy modes have on variant calling, the most widely used application for clinical decision making, especially important in the era of precision medicine. We show that lossy compression can offer significant compression gains, while preserving the essential genomic information and without affecting the variant calling performance.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jeong Hoon Lee ◽  
Solbi Kweon ◽  
Yu Rang Park

Abstract: Genetic variants causing underlying pharmacogenetic and disease phenotypes have been used as the basis for clinical decision-making. However, due to the lack of standards for next-generation sequencing (NGS) pipelines, reproducing genetic variants among institutions is still difficult. The aim of this study is to show how many important variants for clinical decisions can be individually detected using different pipelines. Genetic variants were derived from 105 breast cancer patient target DNA sequences via three different variant-calling pipelines. HaplotypeCaller, Mutect2 tumor-only mode in the Genome Analysis ToolKit (GATK), and VarScan were used in variant calling from the sequence read data processed by the same NGS preprocessing tools using Variant Effect Predictor. GATK HaplotypeCaller, VarScan, and Mutect2 found 25,130, 16,972, and 4,232 variants, comprising 1,491, 1,400, and 321 annotated variants with ClinVar significance, respectively. The average number of ClinVar significant variants in the patients was 769.43, and 16.50% of the variants were detected by only one variant caller. Although these variants have a significant impact on clinical decision-making, the detected variants differ from one algorithm to another. To utilize genetic variants in the clinical field, a strict standard for NGS pipelines is essential.


2011 ◽  
Vol 20 (4) ◽  
pp. 121-123
Author(s):  
Jeri A. Logemann

Evidence-based practice requires astute clinicians to blend our best clinical judgment with the best available external evidence and the patient's own values and expectations. Sometimes, we value one more than another during clinical decision-making, though it is never wise to do so, and sometimes other factors that we are unaware of produce unanticipated clinical outcomes. Sometimes, we feel very strongly about one clinical method or another, and hopefully that belief is founded in evidence. Some beliefs, however, are not founded in evidence. The sound use of evidence is the best way to navigate the debates within our field of practice.

