We Ran 9 Billion Regressions: Eliminating False Positives through Computational Model Robustness

2018 ◽  
Vol 48 (1) ◽  
pp. 1-33 ◽  
Author(s):  
John Muñoz ◽  
Cristobal Young

False positive findings are a growing problem in many research literatures. We argue that excessive false positives often stem from model uncertainty. There are many plausible ways of specifying a regression model, but researchers typically report only a few preferred estimates. This raises the concern that such research reveals only a small fraction of the possible results and may easily lead to nonrobust, false positive conclusions. It is often unclear how much the results are driven by model specification and how much the results would change if a different plausible model were used. Computational model robustness analysis addresses this challenge by estimating all possible models from a theoretically informed model space. We use large-scale random noise simulations to show (1) the problem of excess false positive errors under model uncertainty and (2) that computational robustness analysis can identify and eliminate false positives caused by model uncertainty. We also draw on a series of empirical applications to further illustrate issues of model uncertainty and estimate instability. Computational robustness analysis offers a method for relaxing modeling assumptions and improving the transparency of applied research.
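The core procedure the abstract describes can be sketched in a few lines: enumerate every subset of a set of candidate control variables, fit the regression under each specification, and inspect the full distribution of treatment estimates. A minimal illustration on pure random noise (this is not the authors' software; the data and variable names are hypothetical):

```python
from itertools import combinations
import numpy as np

def all_model_estimates(y, treatment, controls):
    """Fit OLS of y on the treatment under every subset of candidate
    controls; return the treatment coefficient from each model."""
    n = len(y)
    k = controls.shape[1]
    estimates = []
    for r in range(k + 1):
        for subset in combinations(range(k), r):
            X = np.column_stack([np.ones(n), treatment] +
                                [controls[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            estimates.append(beta[1])  # coefficient on the treatment
    return np.array(estimates)

# Toy data: pure noise -- the treatment has no true effect on y.
rng = np.random.default_rng(0)
n = 500
treatment = rng.normal(size=n)
controls = rng.normal(size=(n, 4))
y = rng.normal(size=n)

est = all_model_estimates(y, treatment, controls)
print(len(est))  # 16 models: every subset of 4 candidate controls
```

With 4 candidate controls the model space has 2^4 = 16 specifications; reporting the whole distribution of `est`, rather than one preferred value, is the transparency step the abstract argues for.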

2018 ◽  
Vol 48 (2) ◽  
pp. 431-447 ◽  
Author(s):  
Cristobal Young

The commenter’s proposal may be a reasonable method for addressing uncertainty in predictive modeling, where the goal is to predict y. In a treatment effects framework, where the goal is causal inference by conditioning-on-observables, the commenter’s proposal is deeply flawed. The proposal (1) ignores the definition of omitted-variable bias, thus systematically omitting critical kinds of controls; (2) assumes for convenience there are no bad controls in the model space, thus waving off the premise of model uncertainty; and (3) deletes virtually all alternative models to select a single model with the highest R². Rather than showing what model assumptions are necessary to support one’s preferred results, this proposal favors biased parameter estimates and deletes alternative results before anyone has a chance to see them. In a treatment effects framework, this is not model robustness analysis but simply biased model selection.


2022 ◽  
Vol 10 (1) ◽  
Author(s):  
K. Nebiolo ◽  
T. Castro-Santos

Introduction: Radio telemetry, one of the most widely used techniques for tracking wildlife and fisheries populations, has a false-positive problem. Bias from false-positive detections can affect many important derived metrics, such as home range estimation, site occupation, survival, and migration timing. False-positive removal processes have relied upon simple filters and personal opinion. To overcome these shortcomings, we have developed BIOTAS (BIOTelemetry Analysis Software) to assist with false-positive identification, removal, and data management for large-scale radio telemetry projects.
Methods: BIOTAS uses a naïve Bayes classifier to identify and remove false-positive detections from radio telemetry data. The semi-supervised classifier uses spurious detections from unknown tags and study tags as training data. We tested BIOTAS on four scenarios: a wide-band receiver with a single Yagi antenna, a wide-band receiver that switched between two Yagi antennas, a wide-band receiver with a single dipole antenna, and a single-band receiver that switched between five frequencies. BIOTAS has built-in k-fold cross-validation and assesses model quality with sensitivity, specificity, positive and negative predictive value, false-positive rate, and precision-recall area under the curve. BIOTAS also assesses concordance with a traditional consecutive-detection filter using Cohen’s κ.
Results: Overall, BIOTAS performed equally well in all scenarios and was able to discriminate between known false-positive detections and valid study tag detections with low false-positive rates (< 0.001) as determined through cross-validation, even as receivers switched between antennas and frequencies. BIOTAS classified between 94% and 99% of study tag detections as valid.
Conclusion: As part of a robust data management plan, BIOTAS is able to discriminate between detections from study tags and known false positives. BIOTAS works with equipment from multiple manufacturers and accounts for receivers that switch between antennas and frequencies. BIOTAS provides the framework for transparent, objective, and repeatable telemetry projects for wildlife conservation surveys, and increases the efficiency of data processing.
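The classification idea is standard and easy to sketch. Below is a minimal Gaussian naïve Bayes classifier in the same spirit; this is not BIOTAS's actual code, and the per-detection features (consecutive-run length, signal power, noise ratio) are hypothetical stand-ins for whatever the software extracts:

```python
import numpy as np

def fit_gnb(X, y):
    """Estimate per-class feature means, variances, and priors."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return params

def predict_gnb(params, X):
    """Pick the class with the highest naive-Bayes log posterior."""
    scores = []
    for c in (0, 1):
        mu, var, prior = params[c]
        ll = -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)
        scores.append(ll + np.log(prior))
    return (scores[1] > scores[0]).astype(int)

# Training data in the paper's semi-supervised spirit: detections from
# unknown tags serve as known false positives (class 0); study-tag
# detections are putative valid detections (class 1). Toy features:
# [run length, power (dB), noise ratio].
rng = np.random.default_rng(1)
false_pos = rng.normal([1.0, -90.0, 0.8], 0.5, size=(200, 3))
valid = rng.normal([5.0, -60.0, 0.2], 0.5, size=(200, 3))
X = np.vstack([false_pos, valid])
y = np.repeat([0, 1], 200)

params = fit_gnb(X, y)
pred = predict_gnb(params, X)
print((pred == y).mean())  # training accuracy on well-separated toy data
```

The naive-Bayes independence assumption keeps the model cheap to fit and score, which matters when filtering millions of raw detections.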


2018 ◽  
Vol 4 ◽  
pp. 237802311773720 ◽  
Author(s):  
Cristobal Young

The “crisis in science” today is rooted in genuine problems of model uncertainty and lack of transparency. Researchers estimate a large number of models in the course of their research but only publish a small number of preferred results. Authors have much influence on the results of an empirical study through their choices about model specification. I advance methods to quantify the influence of the author—or at least demonstrate the scope an author has to choose a preferred result. Multimodel analysis, combined with modern computational power, allows authors to present their preferred estimate alongside a distribution of estimates from many other plausible models. I demonstrate the method using new software and applied empirical examples. When evaluating research results, accounting for model uncertainty and model robustness is at least as important as statistical significance.
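One way to summarize a distribution of estimates from many plausible models is sketched below. The sign-stability and significance-rate summaries are illustrative choices, not necessarily the paper's exact statistics:

```python
import numpy as np

def multimodel_summary(estimates, std_errors):
    """Summarize treatment estimates from many plausible specifications:
    mean estimate, sign stability (share of models agreeing on the sign),
    and the share of models significant at the 5% level."""
    estimates = np.asarray(estimates)
    t = estimates / np.asarray(std_errors)
    return {
        "mean_estimate": estimates.mean(),
        "sign_stability": max((estimates > 0).mean(), (estimates < 0).mean()),
        "share_significant": (np.abs(t) > 1.96).mean(),
    }

# Hypothetical estimates and standard errors from 8 plausible models.
summary = multimodel_summary(
    [0.21, 0.18, 0.25, 0.02, 0.19, -0.03, 0.22, 0.20],
    [0.05, 0.06, 0.05, 0.05, 0.07, 0.06, 0.05, 0.05],
)
print(summary)
```

Here a reader sees at a glance that 7 of 8 specifications agree on the sign but only 6 of 8 are significant, which is exactly the kind of robustness information a single preferred estimate hides.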


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple-labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


Geomatics ◽  
2021 ◽  
Vol 1 (1) ◽  
pp. 34-49
Author(s):  
Mael Moreni ◽  
Jerome Theau ◽  
Samuel Foucher

The combination of unmanned aerial vehicles (UAV) with deep learning models has the capacity to replace manned aircraft for wildlife surveys. However, the scarcity of animals in the wild often leads to highly unbalanced, large datasets for which even a good detection method can return a large number of false detections. Our objectives in this paper were to design a training method that would reduce training time, decrease the number of false positives, and alleviate the fine-tuning effort of an image classifier in the context of animal surveys. We acquired two highly unbalanced datasets of deer images with a UAV and trained a Resnet-18 classifier using hard-negative mining and a series of recent techniques. Our method achieved very low false positive rates on two test sets (1 false positive per 19,162 and 213,312 negatives, respectively), while training on small but relevant fractions of the data. The resulting training times were therefore significantly shorter than they would have been using the whole datasets. This high level of efficiency was achieved with little tuning effort and using simple techniques. We believe this parsimonious approach to dealing with highly unbalanced, large datasets could be particularly useful to projects with either limited resources or extremely large datasets.
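Hard-negative mining itself is framework-agnostic and can be sketched without a neural network. The scoring function below is a stand-in for the Resnet-18 classifier, and the training-step callback is a placeholder for an actual fine-tuning pass:

```python
import random

def mine_hard_negatives(score, negatives, k):
    """Return the k negatives the current model scores most animal-like."""
    return sorted(negatives, key=score, reverse=True)[:k]

def train_with_hard_negative_mining(train_step, score, positives, negatives,
                                    rounds=3, k=100):
    """Alternate training with only the currently hardest negatives, so the
    model never has to train on the full, highly unbalanced negative set."""
    pool = random.sample(negatives, min(k, len(negatives)))  # initial random negatives
    for _ in range(rounds):
        train_step(positives, pool)
        pool = mine_hard_negatives(score, negatives, k)
    return pool

# Toy usage: negatives are plain numbers and the "classifier" scores a
# negative by its value, so the hardest negatives are simply the largest.
negatives = list(range(1000))
hard = mine_hard_negatives(lambda v: v, negatives, 5)
print(hard)  # [999, 998, 997, 996, 995]
```

Because each round trains on at most `k` negatives rather than the whole negative set, training time scales with the mined pool, which is the source of the speedup the abstract reports.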


2019 ◽  
Vol 152 (Supplement_1) ◽  
pp. S35-S36
Author(s):  
Hadrian Mendoza ◽  
Christopher Tormey ◽  
Alexa Siddon

Abstract In the evaluation of bone marrow (BM) and peripheral blood (PB) for hematologic malignancy, positive immunoglobulin heavy chain (IG) or T-cell receptor (TCR) gene rearrangement results may be detected despite unrevealing results from morphologic, flow cytometric, immunohistochemical (IHC), and/or cytogenetic studies. The significance of positive rearrangement studies in the context of otherwise normal ancillary findings is unknown, and as such, we hypothesized that gene rearrangement studies may be predictive of an emerging B- or T-cell clone in the absence of other abnormal laboratory tests. Data from all patients who underwent IG or TCR gene rearrangement testing at the authors’ affiliated VA hospital between January 1, 2013, and July 6, 2018, were extracted from the electronic medical record. Date of testing; specimen source; and morphologic, flow cytometric, IHC, and cytogenetic characterization of the tissue source were recorded from pathology reports. Gene rearrangement results were categorized as true positive, false positive, false negative, or true negative. Lastly, patient records were reviewed for subsequent diagnosis of hematologic malignancy in patients with positive gene rearrangement results with negative ancillary testing. A total of 136 patients, who had 203 gene rearrangement studies (50 PB and 153 BM), were analyzed. In TCR studies, there were 2 false positives and 1 false negative in 47 PB assays, as well as 7 false positives and 1 false negative in 54 BM assays. Regarding IG studies, 3 false positives and 12 false negatives in 99 BM studies were identified. Sensitivity and specificity, respectively, were calculated for PB TCR studies (94% and 93%), BM IG studies (71% and 95%), and BM TCR studies (92% and 83%). Analysis of PB IG gene rearrangement studies was not performed due to the small number of tests (3; all true negative). 
None of the 12 patients with false-positive IG/TCR gene rearrangement studies later developed a lymphoproliferative disorder, although 2 patients were later diagnosed with acute myeloid leukemia. Of the 14 false negatives, 10 (71%) were related to a diagnosis of plasma cell neoplasms. Results from the present study suggest that positive IG/TCR gene rearrangement studies are not predictive of lymphoproliferative disorders in the context of otherwise negative BM or PB findings. As such, when faced with equivocal pathology reports, clinicians can be practically advised that isolated positive IG/TCR gene rearrangement results may not indicate the need for closer surveillance.
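The reported sensitivities and specificities follow from the standard confusion-matrix formulas. The counts below are hypothetical illustrations chosen to show the arithmetic, not the study's actual decomposition:

```python
# Standard confusion-matrix metrics used in the study's validity analysis.

def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical counts: 16 true positives, 1 false negative,
# 2 false positives, 28 true negatives.
print(round(100 * sensitivity(tp=16, fn=1)))   # 94
print(round(100 * specificity(tn=28, fp=2)))   # 93
```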


Geophysics ◽  
2008 ◽  
Vol 73 (2) ◽  
pp. S47-S61 ◽  
Author(s):  
Paul Sava ◽  
Oleg Poliannikov

The fidelity of depth seismic imaging depends on the accuracy of the velocity models used for wavefield reconstruction. Models can be decomposed into two components, corresponding to large-scale and small-scale variations. In practice, the large-scale velocity model component can be estimated with high accuracy using repeated migration/tomography cycles, but the small-scale component cannot. When the earth has significant small-scale velocity components, wavefield reconstruction does not completely describe the recorded data, and migrated images are perturbed by artifacts. There are two possible ways to address this problem: (1) improve wavefield reconstruction by estimating more accurate velocity models and image using conventional techniques (e.g., wavefield crosscorrelation) or (2) reconstruct wavefields with conventional methods using the known background velocity model but improve the imaging condition to alleviate the artifacts caused by the imprecise reconstruction. We describe the unknown component of the velocity model as a random function with local spatial correlations. Imaging data perturbed by such random variations is characterized by statistical instability, i.e., various wavefield components image at wrong locations that depend on the actual realization of the random model. Statistical stability can be achieved by preprocessing the reconstructed wavefields prior to the imaging condition. We use Wigner distribution functions to attenuate the random noise present in the reconstructed wavefields, parameterized as a function of image coordinates. Wavefield filtering using Wigner distribution functions and conventional imaging can be lumped together into a new form of imaging condition that we call an interferometric imaging condition because of its similarity to concepts from recent work on interferometry.
The interferometric imaging condition can be formulated both for zero-offset and for multioffset data, leading to robust, efficient imaging procedures that effectively attenuate imaging artifacts caused by unknown velocity models.
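The stabilization idea can be sketched on a toy 1D example, assuming a zero-wavenumber pseudo-Wigner smoothing and a plain crosscorrelation imaging condition. This is a deliberate simplification of the paper's formulation; the wavefields, boundary handling, and noise model are all hypothetical:

```python
import numpy as np

def pseudo_wigner(u, half_width):
    """Zero-wavenumber pseudo-Wigner: local lag average of
    u(x+h) * conj(u(x-h)), which cancels rapidly varying random phase."""
    n = len(u)
    w = np.zeros(n, dtype=complex)
    for h in range(-half_width, half_width + 1):
        w += np.roll(u, -h) * np.conj(np.roll(u, h))  # periodic boundary for simplicity
    return w / (2 * half_width + 1)

def imaging_condition(us, ur):
    """Conventional zero-lag crosscorrelation of source/receiver wavefields."""
    return np.real(us * np.conj(ur))

# Toy wavefields: a coherent plane-wave component plus random-phase
# perturbations standing in for small-scale velocity errors.
rng = np.random.default_rng(2)
n = 256
x = np.arange(n)
clean = np.exp(1j * 0.1 * x)
us = clean + 0.5 * np.exp(1j * rng.uniform(-np.pi, np.pi, n))
ur = clean + 0.5 * np.exp(1j * rng.uniform(-np.pi, np.pi, n))

conventional = imaging_condition(us, ur)
interferometric = imaging_condition(pseudo_wigner(us, 8), pseudo_wigner(ur, 8))
print(conventional.std(), interferometric.std())
```

On this toy example the interferometric image fluctuates less across x than the conventional one, which is the "statistical stability" the abstract describes: the lag average attenuates the random-phase noise before the imaging condition is applied.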

