KATK: fast genotyping of rare variants directly from unmapped sequencing reads

Abstract. Motivation: KATK is a fast and accurate software tool for calling variants directly from raw NGS reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning retrieved reads locally. KATK does not use data about known polymorphisms and has NC (No Call) as default genotype. The reference or variant allele is called only if there is sufficient evidence for its presence in the data. Thus it is not biased against rare variants or de novo mutations. Results: With simulated datasets, we achieved a false negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage. Availability: KATK is distributed under the terms of GNU GPL v3. The k-mer databases are distributed under the Creative Commons CC BY-NC-SA license. The source code is available at GitHub as part of Genometester4 package (https://github.com/bioinfo-ut/GenomeTester4/). The binaries of KATK package and k-mer databases described in the current paper are available on http://bioinfo.ut.ee/KATK/.


An Effective Method for Controlling False Discovery and False Nondiscovery Rates in Genome-Scale RNAi Screens

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057110381783 ◽

2010 ◽

Vol 15 (9) ◽

pp. 1116-1122 ◽

Cited By ~ 15

Author(s):

Xiaohua Douglas Zhang

Keyword(s):

False Negative ◽

False Negative Rate ◽

False Positives ◽

P Value ◽

Statistical Control ◽

Small Interfering Rnas ◽

False Negatives ◽

Activation Effect ◽

False Discovery ◽

Genome Scale

In most genome-scale RNA interference (RNAi) screens, the ultimate goal is to select siRNAs with a large inhibition or activation effect. The selection of hits typically requires statistical control of 2 errors: false positives and false negatives. Traditional methods of controlling false positives and false negatives do not take into account an important feature of RNAi screens, namely that many small interfering RNAs (siRNAs) may have very small but real nonzero average effects on the measured response, and thus do not allow effective control of false positives and false negatives. To address deficiencies in the application of traditional approaches in RNAi screening, the author proposes a new method for controlling false positives and false negatives in RNAi high-throughput screens. The false negatives are statistically controlled through a false-negative rate (FNR) or false nondiscovery rate (FNDR). FNR is the proportion of false negatives among all siRNAs examined, whereas FNDR is the proportion of false negatives among declared nonhits. The author also proposes new concepts, q*-value and p*-value, to control FNR and FNDR, respectively. The proposed method should have broad utility for hit selection in which one needs to control both false discovery and false nondiscovery rates in genome-scale RNAi screens in a robust manner.
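The difference between the two rates defined above can be illustrated with a minimal sketch. It uses the definitions exactly as quoted in the abstract (FNR over all siRNAs examined, FNDR over declared nonhits); the function name and the toy labels are assumptions made for illustration and are not taken from the paper, and the sketch does not implement the q*-value or p*-value.

```python
def false_negative_rates(true_hit, declared_hit):
    """Return (FNR, FNDR) using the definitions quoted in the abstract above.

    true_hit and declared_hit are equal-length sequences of booleans,
    one entry per siRNA (hypothetical toy input, not real screen data).
    """
    if len(true_hit) != len(declared_hit):
        raise ValueError("inputs must have the same length")
    total = len(true_hit)
    # A false negative is a true hit that was declared a nonhit.
    false_negatives = sum(1 for t, d in zip(true_hit, declared_hit) if t and not d)
    declared_nonhits = sum(1 for d in declared_hit if not d)
    fnr = false_negatives / total if total else 0.0
    fndr = false_negatives / declared_nonhits if declared_nonhits else 0.0
    return fnr, fndr


# Toy screen: 10 siRNAs, 4 true hits, 3 declared hits.
true_hit = [True, True, True, True, False, False, False, False, False, False]
declared_hit = [True, True, False, False, True, False, False, False, False, False]
fnr, fndr = false_negative_rates(true_hit, declared_hit)
print(f"FNR = {fnr:.3f}, FNDR = {fndr:.3f}")
```

In this toy case two true hits are missed, so the two rates differ only in their denominators: 10 siRNAs examined versus 7 declared nonhits.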


Tuning IMS station processing parameters and detection thresholds to increase detection precision and decrease detection miss rate

10.5194/egusphere-egu2020-8949 ◽

2020 ◽

Author(s):

Christos Saragiotis ◽

Ivan Kitov

Keyword(s):

False Discovery Rate ◽

Binary Classification ◽

False Negative ◽

Detection Threshold ◽

False Negative Rate ◽

Processing Parameters ◽

Detection Algorithm ◽

Detection Scheme ◽

Statistical Measures ◽

False Discovery

Two principal performance measures of the International Monitoring System (IMS) stations' detection capability are the rate of automatic detections associated with events in the Reviewed Event Bulletin (REB) and the rate of detections manually added to the REB. These two metrics roughly correspond to the precision (which is the complement of the false-discovery rate) and miss rate or false-negative rate statistical measures of a binary classification test, respectively. The false-discovery and miss rates are significantly influenced by the number of phases detected by the detection algorithm, which in turn depends on prespecified slowness-, frequency- and azimuth-dependent threshold values used in the short-term average over long-term average ratio detection scheme of the IMS stations. In particular, the lower the threshold, the more detections and therefore the lower the miss rate but the higher the false-discovery rate; the higher the threshold, the fewer detections and therefore the higher the miss rate but also the lower the false-discovery rate. In that sense, decreasing the false-discovery rate and decreasing the miss rate are conflicting goals that need to be balanced. On one hand, it is essential that the miss rate is as low as possible since no nuclear explosion should go unnoticed by the IMS. On the other hand, a high false-discovery rate compromises the quality of the automatically generated event lists and adds heavy and unnecessary workload to the seismic analysts during the interactive processing stage.

A previous study concluded that a way to decrease both the miss and false-discovery rates as well as the analyst workload is to increase the retiming interval, i.e., the maximum allowable time that an analyst is allowed to move an arrival pick without having to declare a new arrival. Indeed, when a detection needs to be moved by an interval larger than the retiming interval, not only is this a much more time-consuming task for the analyst than just retiming it, but it also negatively affects both the associated rate (the automatic detection is deleted and therefore not associated with an event) and the added rate (a new arrival has to be added to the arrival list). The International Data Centre has increased the retiming interval from 4 s to 10 s since October 2018. We show how this change affected the associated-detections and added-detections rates and how the values of these metrics can be further improved by tuning the detection threshold levels.
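The trade-off between miss rate and false-discovery rate described above can be sketched on toy data. The scores, labels, and thresholds below are invented for illustration and merely stand in for a detection ratio such as STA/LTA; they are not IMS data, and the function name is an assumption.

```python
def miss_and_false_discovery(scores, is_signal, threshold):
    """Return (miss rate, false-discovery rate) for a simple threshold detector.

    A sample counts as detected when its score is at or above the threshold.
    """
    detected = [s >= threshold for s in scores]
    tp = sum(1 for d, y in zip(detected, is_signal) if d and y)
    fp = sum(1 for d, y in zip(detected, is_signal) if d and not y)
    fn = sum(1 for d, y in zip(detected, is_signal) if not d and y)
    miss_rate = fn / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return miss_rate, fdr


# Hypothetical detector scores; True marks a real arrival, False marks noise.
scores = [1.2, 1.8, 2.1, 2.5, 2.9, 3.4, 3.8, 4.5, 5.0, 6.2]
is_signal = [False, False, True, False, True, False, True, True, True, True]

results = {t: miss_and_false_discovery(scores, is_signal, t) for t in (2.0, 3.0, 4.0)}
for threshold, (miss_rate, fdr) in results.items():
    print(f"threshold={threshold}: miss rate={miss_rate:.2f}, FDR={fdr:.2f}")
```

On this toy input, raising the threshold from 2.0 to 4.0 moves the miss rate from 0 to 0.5 while the false-discovery rate falls from 0.25 to 0, which is the conflict the abstract describes.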


Using mixture models to detect differentially expressed genes

Australian Journal of Experimental Agriculture ◽

10.1071/ea05051 ◽

2005 ◽

Vol 45 (8) ◽

pp. 859 ◽

Cited By ~ 13

Author(s):

G. J. McLachlan ◽

R. W. Bean ◽

L. Ben-Tovim Jones ◽

J. X. Zhu

Keyword(s):

False Discovery Rate ◽

Mixture Models ◽

False Negative ◽

False Negative Rate ◽

Multiple Hypothesis Testing ◽

Differentially Expressed ◽

False Discovery ◽

Mixture Model Approach ◽

Number Of Classes ◽

Selection Of

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
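As a rough sketch of the idea, the local false discovery rate of a gene under a two-component mixture is the posterior probability that the gene belongs to the null component. The example below assumes, purely for illustration, that gene-level z-scores follow a mixture of two normal densities with known parameters; the component means, the prior `pi0`, the cutoff, and all function names are hypothetical and are not taken from the paper, which estimates these quantities from the data.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of a normal distribution at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def local_fdr(z, pi0, null=(0.0, 1.0), alt=(3.0, 1.0)):
    """Posterior probability that a gene with score z is not differentially expressed.

    pi0 is the prior probability of the null component; null and alt are
    (mean, sd) pairs for the assumed normal components.
    """
    f0 = pi0 * normal_pdf(z, *null)
    f1 = (1.0 - pi0) * normal_pdf(z, *alt)
    return f0 / (f0 + f1)


def select_genes(z_scores, pi0, cutoff):
    """Select genes whose local FDR is at most cutoff.

    Returns the selected indices and the mean local FDR over the selection,
    which serves here as the implied global FDR of the rule.
    """
    fdrs = [local_fdr(z, pi0) for z in z_scores]
    selected = [i for i, f in enumerate(fdrs) if f <= cutoff]
    implied = sum(fdrs[i] for i in selected) / len(selected) if selected else 0.0
    return selected, implied


z_scores = [-0.4, 0.3, 1.5, 2.8, 3.6, 4.1]  # hypothetical gene-level scores
selected, implied_fdr = select_genes(z_scores, pi0=0.9, cutoff=0.2)
print(selected, round(implied_fdr, 3))
```

Because every selected gene has a local FDR no larger than the cutoff, the mean over the selection cannot exceed the cutoff either, which is one simple way a rule based on local FDR bounds the implied global rate.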


MIXTURE MODELS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS

International Journal of Neural Systems ◽

10.1142/s0129065706000755 ◽

2006 ◽

Vol 16 (05) ◽

pp. 353-362 ◽

Cited By ~ 2

Author(s):

LIAT BEN-TOVIM JONES ◽

RICHARD BEAN ◽

GEOFFREY J. MCLACHLAN ◽

JUSTIN XI ZHU

Keyword(s):

Mixture Models ◽

False Negative ◽

False Negative Rate ◽

Multiple Hypothesis Testing ◽

Differentially Expressed ◽

Data Set ◽

False Discovery ◽

Mixture Model Approach ◽

Number Of Classes ◽

Selection Of

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.


GRIDSS2: comprehensive characterisation of somatic structural variation using single breakend variants and structural variant phasing

Genome Biology ◽

10.1186/s13059-021-02423-x ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Daniel L. Cameron ◽

Jonathan Baber ◽

Charles Shale ◽

Jose Espejo Valle-Inclan ◽

Nicolle Besselink ◽

...

Keyword(s):

Copy Number ◽

Structural Variation ◽

False Negative ◽

False Negative Rate ◽

Structural Variants ◽

Structural Variant ◽

Complex Rearrangement ◽

False Discovery ◽

Copy Number Changes ◽

Paired End Sequencing

Abstract. GRIDSS2 is the first structural variant caller to explicitly report single breakends (breakpoints in which only one side can be unambiguously determined). By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32–100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants with 16% of somatic calls phasable using paired-end sequencing.


No evidence for phylostratigraphic bias impacting inferences on patterns of gene emergence and evolution

10.1101/060756 ◽

2016 ◽

Cited By ~ 2

Author(s):

Tomislav Domazet-Lošo ◽

Anne-Ruxandra Carvunis ◽

M.Mar Albà ◽

Martin Sebastijan Šestak ◽

Robert Bakarić ◽

...

Keyword(s):

De Novo ◽

Gene Evolution ◽

False Negative ◽

False Negative Rate ◽

Computational Framework ◽

Protein Coding ◽

Partial Representation ◽

Sensitive Tool ◽

Cast Doubt ◽

Origin Of Eukaryotes

Abstract. Phylostratigraphy is a computational framework for dating the emergence of sequences (usually genes) in a phylogeny. It has been extensively applied to make inferences on patterns of genome evolution, including patterns of disease gene evolution, ontogeny and de novo gene origination. Phylostratigraphy typically relies on BLAST searches along a species tree, but new simulation studies have raised concerns about the ability of BLAST to detect remote homologues and its impact on phylostratigraphic inferences. These simulations called into question some of our previously published work on patterns of gene emergence and evolution inferred from phylostratigraphy. Here, we re-assessed these simulations and found major problems including unrealistic parameter choices, irreproducibility, statistical flaws and partial representation of results. We found that, even with a possible overall BLAST false negative rate between 5-15%, the large majority (>74%) of sequences assigned to a recent evolutionary origin by phylostratigraphy is unaffected by technical concerns about BLAST. Where the results of the simulations did cast doubt on our previous findings, we repeated our analyses but now excluded all questionable sequences. The originally described patterns remained essentially unchanged. These new analyses strongly support our published inferences, including: genes that emerged after the origin of eukaryotes are more likely to be expressed in the ectoderm than in the endoderm or mesoderm in Drosophila, and the de novo emergence of protein-coding genes from non-genic sequences occurs through proto-gene intermediates in yeast. We conclude that BLAST is an appropriate and sufficiently sensitive tool in phylostratigraphic analysis.


De Novo Mutation Rate Estimation in Wolves of Known Pedigree

Molecular Biology and Evolution ◽

10.1093/molbev/msz159 ◽

2019 ◽

Vol 36 (11) ◽

pp. 2536-2547 ◽

Cited By ~ 4

Author(s):

Evan M Koch ◽

Rena M Schweizer ◽

Teia M Schweizer ◽

Daniel R Stahler ◽

Douglas W Smith ◽

...

Keyword(s):

Mutation Rate ◽

De Novo ◽

Demographic History ◽

Full Range ◽

False Negative ◽

False Negative Rate ◽

De Novo Mutation ◽

Mutation Rates ◽

Point Estimate ◽

De Novo Mutations

Abstract. Knowledge of mutation rates is crucial for calibrating population genetics models of demographic history in units of years. However, mutation rates remain challenging to estimate because of the need to identify extremely rare events. We estimated the nuclear mutation rate in wolves by identifying de novo mutations in a pedigree of seven wolves. Putative de novo mutations were discovered by whole-genome sequencing and were verified by Sanger sequencing of parents and offspring. Using stringent filters and an estimate of the false negative rate in the remaining observable genome, we obtain an estimate of ∼4.5 × 10^−9 per base pair per generation and provide conservative bounds between 2.6 × 10^−9 and 7.1 × 10^−9. Although our estimate is consistent with recent mutation rate estimates from ancient DNA (4.0 × 10^−9 and 3.0–4.5 × 10^−9), it suggests a wider possible range. We also examined the consequences of our rate and the accompanying interval for dating several critical events in canid demographic history. For example, applying our full range of rates to coalescent models of dog and wolf demographic history implies a wide set of possible divergence times between the ancestral populations of dogs and extant Eurasian wolves (16,000–64,000 years ago) although our point estimate indicates a date between 25,000 and 33,000 years ago. Aside from one study in mice, ours provides the only direct mammalian mutation rate outside of primates and is likely to be vital to future investigations of mutation rate evolution.


Clinically Meaningful Change

Methodology ◽

10.1027/1614-2241/a000168 ◽

2019 ◽

Vol 15 (3) ◽

pp. 97-105

Author(s):

Rodrigo Ferrer ◽

Antonio Pardo

Keyword(s):

Effect Size ◽

False Negative ◽

False Negative Rate ◽

Point Of View ◽

Skewed Distribution ◽

Effect Sizes ◽

False Negatives ◽

Large Size ◽

Before And After ◽

Post Test

Abstract. In a recent paper, Ferrer and Pardo (2014) tested several distribution-based methods designed to assess when test scores obtained before and after an intervention reflect a statistically reliable change. However, we still do not know how these methods perform from the point of view of false negatives. For this purpose, we have simulated change scenarios (different effect sizes in a pre-post-test design) with distributions of different shapes and with different sample sizes. For each simulated scenario, we generated 1,000 samples. In each sample, we recorded the false-negative rate of the five distribution-based methods with the best performance from the point of view of the false positives. Our results have revealed unacceptable rates of false negatives even with effects of very large size, starting from 31.8% in an optimistic scenario (effect size of 2.0 and a normal distribution) to 99.9% in the worst scenario (effect size of 0.2 and a highly skewed distribution). Therefore, our results suggest that the widely used distribution-based methods must be applied with caution in a clinical context, because they need huge effect sizes to detect a true change. However, we made some considerations regarding the effect size and the cut-off points commonly used which allow us to be more precise in our estimates.


Diagnostic Value of a Carpal Tunnel Corticosteroid Injection in Patients with Negative Electrodiagnostic Studies

Journal of Hand and Microsurgery ◽

10.1055/s-0040-1717830 ◽

2020 ◽

Author(s):

Brian M. Katt ◽

Casey Imbergamo ◽

Fortunato Padua ◽

Joseph Leider ◽

Daniel Fletcher ◽

...

Keyword(s):

Carpal Tunnel ◽

Corticosteroid Injection ◽

Carpal Tunnel Release ◽

False Negative ◽

False Negative Rate ◽

Signs And Symptoms ◽

Diagnostic Value ◽

Symptom Improvement ◽

Study Cohort ◽

Chi Square Analysis

Abstract. Introduction: There is a known false negative rate when using electrodiagnostic studies (EDS) to diagnose carpal tunnel syndrome (CTS). This can pose a management dilemma for patients with signs and symptoms that correlate with CTS but normal EDS. While corticosteroid injection into the carpal tunnel has been used in this setting for diagnostic purposes, there is little data in the literature supporting this practice. The purpose of this study is to evaluate the prognostic value of a carpal tunnel corticosteroid injection in patients with a normal electrodiagnostic study but exhibiting signs and symptoms suggestive of carpal tunnel, who proceed with a carpal tunnel release. Materials and Methods: The group included 34 patients presenting to an academic orthopedic practice over the years 2010 to 2019 who had negative EDS, a carpal tunnel corticosteroid injection, and a carpal tunnel release. One patient (2.9%), where the response to the corticosteroid injection was not documented, was excluded from the study, yielding a study cohort of 33 patients. Three patients had bilateral disease, yielding 36 hands for evaluation. Statistical analysis was performed using Chi-square analysis for nonparametric data. Results: Thirty-two hands (88.9%) demonstrated complete or partial relief of neuropathic symptoms after the corticosteroid injection, while four (11.1%) did not experience any improvement. Thirty-one hands (86.1%) had symptom improvement following surgery, compared with five (13.9%) which did not. Of the 32 hands that demonstrated relief following the injection, 29 hands (90.6%) improved after surgery. Of the four hands that did not demonstrate relief after the injection, two (50%) improved after surgery. This difference was statistically significant (p = 0.03). Conclusion: Patients diagnosed with a high index of suspicion for CTS do well with operative intervention despite a normal electrodiagnostic test if they have had a positive response to a preoperative injection. The injection can provide reassurance to both the patient and surgeon before proceeding to surgery. Although patients with a normal electrodiagnostic test and no response to cortisone can still do well with surgical intervention, the surgeon should carefully review both the history and physical examination as surgical success may decrease when both diagnostic tests are negative. Performing a corticosteroid injection is an additional diagnostic tool to consider in the management of patients with CTS and normal electrodiagnostic testing.


Comparative study of ultrasonographic findings with the operative findings of biliary surgery

Journal of Surgical Sciences ◽

10.3329/jss.v22i1.44011 ◽

2020 ◽

Vol 22 (1) ◽

pp. 25-29

Author(s):

Zubayer Ahmad ◽

Mohammad Ali ◽

Kazi Israt Jahan ◽

ABM Khurshid Alam ◽

G M Morshed

Keyword(s):

Bile Duct ◽

Common Bile Duct ◽

Imaging Modality ◽

False Positive Rate ◽

Gallstone Disease ◽

False Negative ◽

False Negative Rate ◽

Biliary Disease ◽

Biliary Surgery ◽

Operative Findings

Background: Biliary disease is one of the most common surgical problems encountered all over the world. Ultrasound is widely accepted for the diagnosis of biliary system disease. However, it is a highly operator-dependent imaging modality, and its diagnostic success is also influenced by circumstances such as non-fasting, obesity, and intestinal gas. Objective: To compare the ultrasonographic findings with the peroperative findings in biliary surgery. Methods: This prospective study was conducted in General Hospital, Comilla, between July 2006 and June 2008 among 300 patients with biliary diseases for which operative treatment was planned. Sonographic findings were compared with operative findings. Results: Right hypochondriac pain and jaundice were two significant symptoms (93% and 15%). Right hypochondriac tenderness, jaundice and palpable gallbladder were the most valuable physical findings (respectively, 40%, 15% and 5%). Out of 252 ultrasonically positive gallbladders, stones were confirmed in 249 cases preoperatively. Sensitivity of USG in diagnosis of gallstone disease was 100%. There was, however, a 25% false positive rate, and specificity was 75% in this case. USG could demonstrate stone in common bile duct in only 12 out of 30 cases. Sensitivity of the test in diagnosing common bile duct stone was 40%, false negative rate 60%. In the series, ultrasonography sensitivity was 100% in diagnosing stone in cystic duct. USG detected, with relatively good but lower sensitivity, the presence of chronic cholecystitis (92.3%) and worm inside gallbladder (50%). Conclusion: Ultrasonography is the most important investigation in the diagnosis of biliary disease and a useful test for patients undergoing operative management for planning and anticipating technical difficulties. Journal of Surgical Sciences (2018) Vol. 22 (1): 25-29
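The common bile duct figures reported above follow directly from the counts given (12 stones detected out of 30 cases), since the false negative rate is the complement of sensitivity. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def sensitivity_and_fnr(detected, total_with_condition):
    """Return (sensitivity, false negative rate) from raw counts.

    detected: cases with the condition that the test identified.
    total_with_condition: all cases confirmed to have the condition.
    """
    if total_with_condition <= 0:
        raise ValueError("total_with_condition must be positive")
    sensitivity = detected / total_with_condition
    return sensitivity, 1.0 - sensitivity


# Counts taken from the abstract: 12 of 30 common bile duct stones were seen on USG.
sens, fnr = sensitivity_and_fnr(12, 30)
print(f"sensitivity = {sens:.0%}, false negative rate = {fnr:.0%}")
```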
