Estimating the statistical performance of different approaches to meta-analysis of data from animal studies in identifying the impact of aspects of study design

2018
Author(s): Qianying Wang, Jing Liao, Kaitlyn Hair, Alexandra Bannach-Brown, Zsanett Bahor, ...

Abstract
Background: Meta-analysis is increasingly used to summarise the findings identified in systematic reviews of animal studies modelling human disease. Such reviews typically identify a large number of individually small studies, testing efficacy under a variety of conditions. This leads to substantial heterogeneity, and identifying potential sources of this heterogeneity is an important function of such analyses. However, the statistical performance of different approaches (normalised compared with standardised mean difference estimates of effect size; stratified meta-analysis compared with meta-regression) is not known.
Methods: We used data from 3116 experiments in focal cerebral ischaemia to construct a linear model predicting observed improvement in outcome contingent on 25 independent variables. We used stochastic simulation to attribute these variables to simulated studies according to their prevalence. To ascertain the ability to detect an effect of a given variable, we additionally introduced a "variable of interest" of given prevalence and effect. To establish any impact of a latent variable on the apparent influence of the variable of interest, we also introduced a "latent confounding variable" with given prevalence and effect, and allowed the prevalence of the variable of interest to differ in the presence and absence of the latent variable.
Results: Generally, the normalised mean difference (NMD) approach had higher statistical power than the standardised mean difference (SMD) approach. Even when the effect size and the number of studies contributing to the meta-analysis were small, there was good statistical power to detect the overall effect, with a low false positive rate. For detecting an effect of the variable of interest, stratified meta-analysis was associated with a substantial false positive rate with NMD estimates of effect size, while using an SMD estimate of effect size had very low statistical power. Univariate and multivariable meta-regression performed substantially better, with a low false positive rate for both NMD and SMD approaches; power was higher for NMD than for SMD. The presence or absence of a latent confounding variable only introduced an apparent effect of the variable of interest when there was substantial asymmetry in the prevalence of the variable of interest in the presence or absence of the confounding variable.
Conclusions: In meta-analysis of data from animal studies, NMD estimates of effect size should be used in preference to SMD estimates, and meta-regression should, where possible, be chosen over stratified meta-analysis. The power to detect the influence of the variable of interest depends on its effect and its prevalence, but unless effects are very large, adequate power is only achieved once at least 100 experiments are included in the meta-analysis.
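To make the two effect-size metrics concrete, here is a minimal Python sketch computing a normalised mean difference (NMD) and a standardised mean difference (SMD, Cohen's d with a pooled SD) for a single simulated experiment. The function names, sign convention and example numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nmd(ctrl_mean, trt_mean):
    """Normalised mean difference: improvement expressed as a percentage
    of the control outcome. The sign convention (larger outcome = worse
    injury) is an assumption for this example."""
    return 100.0 * (ctrl_mean - trt_mean) / ctrl_mean

def smd(ctrl_mean, trt_mean, ctrl_sd, trt_sd, n_ctrl, n_trt):
    """Standardised mean difference: the raw difference scaled by the
    pooled standard deviation (Cohen's d)."""
    pooled_sd = np.sqrt(((n_ctrl - 1) * ctrl_sd**2 + (n_trt - 1) * trt_sd**2)
                        / (n_ctrl + n_trt - 2))
    return (ctrl_mean - trt_mean) / pooled_sd

# One hypothetical experiment (e.g. infarct volume, arbitrary units):
print(nmd(100.0, 70.0))                      # 30.0  -> 30% improvement
print(smd(100.0, 70.0, 20.0, 18.0, 10, 10))  # ~1.58 -> 1.58 pooled SDs
```

The difference visible here is that NMD depends only on group means (and hence on the scale of the control group), whereas SMD divides by the observed variability, which is why the two metrics behave differently in simulation.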

2019
Author(s): Amanda Kvarven, Eirik Strømland, Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to those of 15 large-scale multiple-labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


2009, Vol 53 (7), pp. 2949–2954
Author(s): Isabel Cuesta, Concha Bielza, Pedro Larrañaga, Manuel Cuenca-Estrella, Fernando Laguna, ...

ABSTRACT European Committee on Antimicrobial Susceptibility Testing (EUCAST) breakpoints classify Candida strains with a fluconazole MIC ≤ 2 mg/liter as susceptible, those with a fluconazole MIC of 4 mg/liter as of intermediate susceptibility, and those with a fluconazole MIC > 4 mg/liter as resistant. Machine learning models are supported by complex statistical analyses that assess whether the results are statistically relevant. The aim of this work was to use supervised classification algorithms to analyze the clinical data used to produce the EUCAST fluconazole breakpoints. Five supervised classifiers (J48, Classification and Regression Trees [CART], OneR, Naïve Bayes, and Simple Logistic) were used to analyze two cohorts of patients with oropharyngeal candidosis and candidemia. The target variable was the outcome of the infections, and the predictor variables consisted of the MIC or the ratio between the dose administered and the MIC of the isolate (dose/MIC). Statistical power was assessed by determining values for sensitivity and specificity, the false-positive rate, the area under the receiver operating characteristic (ROC) curve, and the Matthews correlation coefficient (MCC). CART obtained the best statistical power for a MIC > 4 mg/liter for detecting failures (sensitivity, 87%; false-positive rate, 8%; area under the ROC curve, 0.89; MCC index, 0.80). For dose/MIC determinations, the target was >75, with a sensitivity of 91%, a false-positive rate of 10%, an area under the ROC curve of 0.90, and an MCC index of 0.80. The other classifiers gave similar breakpoints with lower statistical power. EUCAST fluconazole breakpoints have been validated by means of machine learning methods. These computer tools must be incorporated into the process of developing breakpoints to avoid researcher bias, thus enhancing the statistical power of the model.
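As a rough illustration of the kind of evaluation described above (not the authors' pipeline), the sketch below fits a depth-one decision tree, a stand-in for CART, to simulated MIC data and reports the same four statistics with scikit-learn. The data-generating threshold and noise level are invented for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, matthews_corrcoef

rng = np.random.default_rng(0)

# Hypothetical data: log2(MIC) as the single predictor; outcome 1 = failure.
log2_mic = rng.integers(-2, 7, size=200)   # MICs from 0.25 to 64 mg/liter
outcome = (log2_mic > 2).astype(int)       # failures mostly above 4 mg/liter
outcome ^= rng.random(200) < 0.1           # 10% label noise, an assumption

# A depth-1 tree learns a single MIC breakpoint, analogous to a CART stump.
tree = DecisionTreeClassifier(max_depth=1).fit(log2_mic.reshape(-1, 1), outcome)
pred = tree.predict(log2_mic.reshape(-1, 1))
prob = tree.predict_proba(log2_mic.reshape(-1, 1))[:, 1]

tn, fp, fn, tp = confusion_matrix(outcome, pred).ravel()
print("sensitivity:", tp / (tp + fn))
print("false-positive rate:", fp / (fp + tn))
print("ROC AUC:", roc_auc_score(outcome, prob))
print("MCC:", matthews_corrcoef(outcome, pred))
```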


2018
Author(s): Cox Lwaka Tamba, Yuan-Ming Zhang

Abstract
Background: Recent developments in technology result in the generation of big data. In genome-wide association studies (GWAS), we can get tens of millions of SNPs that need to be tested for association with a trait of interest. Indeed, this poses a great computational challenge. There is a need to develop fast algorithms for GWAS methodologies. These algorithms must ensure high power in QTN detection, high accuracy in QTN estimation and a low false positive rate.
Results: Here, we accelerated the mrMLM algorithm using ideas from GEMMA together with matrix transformations and identities. The target functions and derivatives in vector/matrix form for each marker scan are transformed into simple forms that are easy and efficient to evaluate during each optimization step. All potentially associated QTNs with P-values ≤ 0.01 are then evaluated in a multi-locus model by the LARS algorithm and/or EM-Empirical Bayes. We call the algorithm FASTmrMLM. Numerical simulation studies and real data analyses validated FASTmrMLM. FASTmrMLM reduces the running time of mrMLM by more than 50%. FASTmrMLM also shows high statistical power in QTN detection, high accuracy in QTN estimation and a low false positive rate compared with GEMMA, FarmCPU and mrMLM. Real data analysis shows that FASTmrMLM detected more previously reported genes than all the other methods: GEMMA/EMMA, FarmCPU and mrMLM.
Conclusions: FASTmrMLM is a fast and reliable algorithm for multi-locus GWAS that ensures high statistical power, high accuracy of estimates and a low false positive rate.
Author Summary: Current developments in technology result in the generation of vast amounts of data. In genome-wide association studies, we can get tens of millions of markers that need to be tested for association with a trait of interest. To address the computational challenge this poses, we developed a fast algorithm for genome-wide association studies. Our approach is a two-stage method. In the first stage, we use matrix transformations and identities to speed up the testing of each random marker effect: the target functions and derivatives, which are in vector/matrix form, are transformed into simple forms that are easy and efficient to evaluate during each optimization step. In the second stage, we select all potentially associated SNPs and evaluate them in a multi-locus model (see the sketch below). From simulation studies, our algorithm significantly reduces computing time. The new method also shows high statistical power in detecting significant markers, high accuracy in marker effect estimation and a low false positive rate. We also used the new method to identify relevant genes in real data analysis. We recommend our approach as a fast and reliable method for carrying out a multi-locus genome-wide association study.
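The two-stage structure described in the Author Summary can be sketched in a few lines of Python. The example below runs a single-marker scan, keeps markers with P ≤ 0.01, and fits a multi-locus LARS model on the survivors. It deliberately omits the mixed-model machinery (kinship, polygenic background) that mrMLM-style methods rely on, so it is an assumption-laden illustration of the scan-then-select idea rather than FASTmrMLM itself.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import Lars

rng = np.random.default_rng(1)
n, m = 500, 5000                                   # individuals, markers
X = rng.integers(0, 3, size=(n, m)).astype(float)  # genotypes coded 0/1/2
beta = np.zeros(m)
beta[[10, 200, 3000]] = [0.8, -0.6, 0.5]           # three simulated QTNs
y = X @ beta + rng.normal(size=n)                  # phenotype with noise

# Stage 1: single-marker association tests; keep markers with P <= 0.01.
pvals = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(m)])
candidates = np.where(pvals <= 0.01)[0]

# Stage 2: joint multi-locus model over the candidates via LARS.
lars = Lars(n_nonzero_coefs=10).fit(X[:, candidates], y)
selected = candidates[lars.coef_ != 0]
print("candidates:", len(candidates), "selected markers:", selected)
```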


2016
Author(s): Weikang Gong, Lin Wan, Wenlian Lu, Liang Ma, Fan Cheng, ...

Abstract
The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression [Cheng et al., 2015a,b, 2016], multiple-comparison correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis tests using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR) control, it can reduce the false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method considerably reduces the computational complexity of a permutation- or simulation-based approach; thus, it can efficiently tackle large datasets with ultra-high-resolution images. The utility of our method is shown in a case-control study: our approach identifies altered functional connectivities in a major depressive disorder dataset where existing methods failed. A software package is available at https://github.com/weikanggong/BWAS.
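The GRF-based correction itself requires the spatial smoothness estimates described in the paper, but the two baselines it is compared against are easy to state. Below is a minimal sketch of Bonferroni FWER control and Benjamini–Hochberg FDR control applied to a vector of toy "connexel" p-values; the data are invented.

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Family-wise error control: reject p_i <= alpha / m."""
    return pvals <= alpha / len(pvals)

def benjamini_hochberg(pvals, alpha=0.05):
    """FDR control: find the largest k with p_(k) <= (k/m) * alpha
    and reject the k smallest p-values."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

# Toy p-values: mostly null, with a handful of strong true signals.
rng = np.random.default_rng(2)
p = np.concatenate([rng.uniform(size=9990), rng.uniform(0, 1e-6, size=10)])
print("Bonferroni rejections:", bonferroni(p).sum())
print("BH rejections:", benjamini_hochberg(p).sum())
```

The paper's topological inference replaces these generic thresholds with ones informed by the spatial smoothness of the fMRI data, which is where the extra power comes from.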


2017, Vol 52 (12), pp. 1168–1170
Author(s): Zachary K. Winkelmann, Ashley K. Crossway

Reference/Citation:  Harmon KG, Zigman M, Drezner JA. The effectiveness of screening history, physical exam, and ECG to detect potentially lethal cardiac disorders in athletes: a systematic review/meta-analysis. J Electrocardiol. 2015;48(3):329–338. Clinical Question:  Which screening method should be considered best practice to detect potentially lethal cardiac disorders during the preparticipation physical examination (PE) of athletes? Data Sources:  The authors completed a comprehensive literature search of MEDLINE, CINAHL, Cochrane Library, Embase, Physiotherapy Evidence Database (PEDro), and SPORTDiscus from January 1996 to November 2014. The following key words were used individually and in combination: ECG, athlete, screening, pre-participation, history, and physical. A manual review of reference lists and key journals was performed to identify additional studies. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed for this review. Study Selection:  Studies selected for this analysis involved (1) outcomes of cardiovascular screening in athletes using the history, PE, and electrocardiogram (ECG); (2) history questions and PE based on the American Heart Association recommendations and guidelines; and (3) ECGs interpreted following modern standards. The exclusion criteria were (1) articles not in English, (2) conference abstracts, and (3) clinical commentary articles. Study quality was assessed on a 7-point scale for risk of bias; a score of 7 indicated the highest quality. Articles with potential bias were excluded. Data Extraction:  Data included number and sex of participants, number of true- and false-positives and negatives, type of ECG criteria used, number of cardiac abnormalities, and specific cardiac conditions. The sensitivity, specificity, false-positive rate, and positive predictive value of each screening tool were calculated and summarized using a bivariate random-effects meta-analysis model. Main Results:  Fifteen articles reporting on 47 137 athletes were fully reviewed. The overall quality of the 15 articles ranged from 5 to 7 on the 7-item assessment scale (ie, participant selection criteria, representative sample, prospective data with at least 1 positive finding, modern ECG criteria used for screening, cardiovascular screening history and PE per American Heart Association guidelines, individual test outcomes reported, and abnormal screening findings evaluated by appropriate diagnostic testing). The athletes (66% males and 34% females) were ethnically and racially diverse, were from several countries, and ranged in age from 5 to 39 years. The sensitivity and specificity of the screening methods were, respectively, ECG, 94% and 93%; history, 20% and 94%; and PE, 9% and 97%. The overall false-positive rate for ECG (6%) was less than that for history (8%) or PE (10%). The positive likelihood ratios of each screening method were 14.8 for ECG, 3.22 for history, and 2.93 for PE. The negative likelihood ratios were 0.055 for ECG, 0.85 for history, and 0.93 for PE. A total of 160 potentially lethal cardiovascular conditions were detected, for a rate of 0.3%, or 1 in 294 patients. The most common conditions were Wolff-Parkinson-White syndrome (n = 67, 42%), long QT syndrome (n = 18, 11%), hypertrophic cardiomyopathy (n = 18, 11%), dilated cardiomyopathy (n = 11, 7%), coronary artery disease or myocardial ischemia (n = 9, 6%), and arrhythmogenic right ventricular cardiomyopathy (n = 4, 3%). 
Conclusions:  The most effective strategy to screen athletes for cardiovascular disease was ECG. This test was 5 times more sensitive than history and 10 times more sensitive than PE, and it had a higher positive likelihood ratio, lower negative likelihood ratio, and lower false-positive rate than history or PE. The 12-lead ECG interpreted using modern criteria should be considered the best practice in screening athletes for cardiovascular disease, and the use of history and PE alone as screening tools should be reevaluated.
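The pooled figures above come from a bivariate random-effects model, so they will not reproduce exactly from a single 2×2 table, but the per-test definitions do. A minimal sketch follows, with hypothetical counts chosen only to land near the ECG ballpark (sensitivity ≈ 94%, specificity ≈ 93%); these are not the review's data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, false-positive rate, and likelihood
    ratios from a 2x2 screening table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "false_positive_rate": 1 - spec,
        "LR+": sens / (1 - spec),  # how much a positive result raises the odds
        "LR-": (1 - sens) / spec,  # how much a negative result lowers them
    }

# Hypothetical counts, roughly in the ballpark of the pooled ECG estimates:
print(screening_metrics(tp=150, fp=700, fn=10, tn=9300))
```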


2020
Author(s): Se Jin Cho, Leonard Sunwoo, Sung Hyun Baik, Yun Jung Bae, Byung Se Choi, ...

Abstract
Background: Accurate detection of brain metastasis (BM) is important for cancer patients. We aimed to systematically review the performance and quality of machine-learning-based BM detection on MRI in the relevant literature.
Methods: A systematic literature search was performed for relevant studies reported before April 27, 2020. We assessed the quality of the studies using modified tailored questionnaires of the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria and the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Pooled detectability was calculated using an inverse-variance weighting model.
Results: A total of 12 studies were included, which showed a clear transition from classical machine learning (cML) to deep learning (DL) after 2018. The studies on DL used larger sample sizes than those on cML. The cML and DL groups also differed in the composition of the dataset and in technical details such as data augmentation. The pooled proportions of detectability of BM were 88.7% (95% CI, 84–93%) and 90.1% (95% CI, 84–95%) in the cML and DL groups, respectively. The false-positive rate per person was lower in the DL group than in the cML group (10 vs 135, P < 0.001). In the patient selection domain of QUADAS-2, three studies (25%) were designated as high risk due to non-consecutive enrollment and arbitrary exclusion of nodules.
Conclusion: Comparable detectability of BM, with a lower false-positive rate per person, was found in the DL group compared with the cML group. Improvements are required in terms of quality and study design.
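A minimal sketch of the inverse-variance weighting model named in the Methods, here as fixed-effect pooling of per-study detection proportions. The study counts are invented, and the simple p(1−p)/n variance (rather than the logit transform often preferred near the boundaries) is an assumption of this example.

```python
import numpy as np

def pool_proportions(detected, total):
    """Fixed-effect inverse-variance pooling of proportions.
    Each study's weight is 1 / var(p_hat), with var = p(1-p)/n."""
    p = detected / total
    w = 1.0 / (p * (1 - p) / total)
    pooled = (w * p).sum() / w.sum()
    se = np.sqrt(1.0 / w.sum())
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical per-study detection counts (not the review's data):
detected = np.array([45, 88, 130, 52])
total = np.array([50, 100, 150, 60])
pooled, ci = pool_proportions(detected, total)
print(f"pooled detectability: {pooled:.1%}, 95% CI ({ci[0]:.1%}, {ci[1]:.1%})")
```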


2021
Author(s): Bihan Li, Zhiyong Sun, Ying Liu, Jinxia Song, Lijuan Wang, ...

Abstract
Purpose: Severe combined immunodeficiency (SCID) threatens newborns' quality of life. The aim of our study was to evaluate the value of measuring the copy number of T-cell receptor excision circles (TRECs) in screening newborns for SCID.
Methods: We searched for eligible studies in PubMed, Web of Science, EMBASE, the Cochrane Library and the China National Knowledge Infrastructure (CNKI), and used EndNote software to screen the studies. Meta-analysis was conducted with Meta-DiSc 1.4 and STATA 12.0 software. We used sensitivity and the false positive rate (FPR) to evaluate this screening method, and subgroup analysis to explore sources of heterogeneity. Literature quality was assessed with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-II) tool.
Results: Fifteen studies were included, containing 3,570 cases in the decreased-TREC-copy-number group and 12,186,899 cases in the control group. All the included studies are of high quality. The heterogeneity in the diagnostic odds ratio (DOR) is 77.9%. The summarized estimates revealed that the pooled sensitivity is 100% (95% confidence interval (CI): 99–100%), the false positive rate is 0.00 (95% CI: 0.00–0.00) and the positive predictive value (PPV) is 0.14 (95% CI: 0.08–0.19). The current evidence shows that measuring TREC copy number has screening value, with high sensitivity and a high positive likelihood ratio (LR+) for newborn SCID, and no publication bias (p > 0.05). Our results suggest 90 and 40 copies/μl of TRECs as the primary and secondary cut-off values.
Conclusion: Measuring individual TREC content could be used as the basis for diagnosing SCID in newborns.
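To make the quoted heterogeneity figure concrete, here is a minimal sketch of per-study log diagnostic odds ratios (with a 0.5 continuity correction for zero cells) and Higgins' I² computed from Cochran's Q. The 2×2 counts are hypothetical, not the review's data.

```python
import numpy as np

def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio per study with the Haldane-Anscombe
    0.5 correction, plus its approximate variance."""
    tp, fp, fn, tn = (np.asarray(x, float) + 0.5 for x in (tp, fp, fn, tn))
    log_or = np.log((tp * tn) / (fp * fn))
    var = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_or, var

def i_squared(effects, variances):
    """Higgins' I^2 from Cochran's Q under a fixed-effect model."""
    w = 1.0 / np.asarray(variances)
    pooled = (w * effects).sum() / w.sum()
    q = (w * (effects - pooled) ** 2).sum()
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Hypothetical per-study 2x2 counts (not the review's data):
tp = [30, 42, 18];  fp = [5, 2, 9];  fn = [1, 0, 2];  tn = [900, 1200, 700]
lor, var = log_dor(tp, fp, fn, tn)
print(f"I^2 = {i_squared(lor, var):.1f}%")
```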

