Enhancing feedback on performance measures: the difference in outlier detection using a binary versus continuous outcome funnel plot and implications for quality improvement

2020 ◽  
Vol 30 (1) ◽  
pp. 38-45 ◽  
Author(s):  
Laurien Kuhrij ◽  
Erik van Zwet ◽  
Renske van den Berg-Vos ◽  
Paul Nederkoorn ◽  
Perla J Marang-van de Mheen

Background: Hospitals and providers receive feedback on how their performance compares with that of others, often using funnel plots to detect outliers. These funnel plots typically use binary outcomes, and continuous variables are dichotomised to fit this format. However, information is lost with a binary measure, which is sensitive only to differences in higher values (the tail) rather than across the entire distribution. This study therefore investigates whether different outlier hospitals are identified when using a funnel plot for a binary versus a continuous outcome. This is relevant for hospitals with suboptimal performance when deciding whether performance can be improved by targeting processes for all patients or only for the subgroup with higher values. Methods: We examined the door-to-needle time (DNT) of all 6080 patients with acute ischaemic stroke treated with intravenous thrombolysis in 65 hospitals in 2017, registered in the Dutch Acute Stroke Audit. We compared outlier hospitals in two funnel plots, one for the median DNT and one for the proportion of patients with substantially delayed DNT (above the 90th percentile (P90)), and assessed whether these were the same or different hospitals. Two sensitivity analyses were performed, using the proportion above the median and a continuous P90 funnel plot. Results: The median DNT was 24 min and the P90 was 50 min. In the binary funnel plot for the proportion of patients above P90, 58 hospitals had average performance, whereas in the funnel plot around the median, 14 of these hospitals (24%) had a significantly higher median DNT. These hospitals can likely improve their DNT by focusing on care processes for all patients, which is not shown by the binary outcome funnel plot. Sensitivity analyses showed similar results. Conclusion: Funnel plots for continuous versus binary outcomes identify different outlier hospitals, which may enhance hospital feedback and direct more targeted improvement initiatives.
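To make the information loss from dichotomisation concrete, here is a toy illustration with invented DNT values (only the 50-minute P90 threshold comes from the paper): two hypothetical hospitals share the same proportion of substantially delayed patients, yet have clearly different medians, so only the continuous measure would flag the second.

```python
import statistics

P90_THRESHOLD = 50  # minutes; registry-wide 90th percentile from the paper

# Invented door-to-needle times (minutes) for two hypothetical hospitals
hosp_a = [18, 20, 22, 24, 26, 28, 30, 32, 34, 55]
hosp_b = [30, 32, 34, 36, 38, 40, 42, 44, 46, 55]

for name, dnt in [("A", hosp_a), ("B", hosp_b)]:
    median = statistics.median(dnt)
    prop_delayed = sum(t > P90_THRESHOLD for t in dnt) / len(dnt)
    # Both hospitals have 10% of patients above P90, but B's median is
    # 12 minutes worse -- invisible to the binary measure.
    print(name, median, prop_delayed)
```

Here the binary funnel plot treats both hospitals identically, while the median reveals that hospital B is slower for essentially every patient.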

2020 ◽  
Vol 228 (1) ◽  
pp. 43-49 ◽  
Author(s):  
Michael Kossmeier ◽  
Ulrich S. Tran ◽  
Martin Voracek

Abstract. Currently, dedicated graphical displays to depict study-level statistical power in the context of meta-analysis are unavailable. Here, we introduce the sunset (power-enhanced) funnel plot to visualize this relevant information for assessing the credibility, or evidential value, of a set of studies. The sunset funnel plot highlights the statistical power of primary studies to detect an underlying true effect of interest in the well-known funnel display with color-coded power regions and a second power axis. This graphical display allows meta-analysts to incorporate power considerations into classic funnel plot assessments of small-study effects. Nominally significant, but low-powered, studies might be seen as less credible and as more likely to be affected by selective reporting. We exemplify the application of the sunset funnel plot with two published meta-analyses from medicine and psychology. Software to create this variation of the funnel plot is provided via a tailored R function. In conclusion, the sunset (power-enhanced) funnel plot is a novel and useful graphical display to critically examine and to present study-level power in the context of meta-analysis.
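As a rough sketch of what each study's power value represents, the power of a two-sided z-test can be computed from an assumed true effect and the study's standard error (a standard textbook construction; function and parameter names are illustrative, not taken from the paper's R function):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def study_power(theta, se, z_crit=1.96):
    """Power of a two-sided z-test (alpha = 0.05) for a single study with
    standard error `se` to detect an assumed true effect `theta`."""
    z = theta / se
    return 1.0 - norm_cdf(z_crit - z) + norm_cdf(-z_crit - z)

# A precise study (small se) relative to the assumed effect is well powered;
# when theta = 0 the function correctly returns the type I error rate.
p = study_power(0.5, 0.2)
```

In a sunset funnel plot, each study's standard error (the funnel's vertical axis) is mapped through exactly this kind of calculation to a color-coded power region.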


Methodology ◽  
2008 ◽  
Vol 4 (3) ◽  
pp. 132-138 ◽  
Author(s):  
Michael Höfler

A standardized index for effect intensity, the translocation relative to range (TRR), is discussed. TRR is defined as the difference between the expectations of an outcome under two conditions (the absolute increment) divided by the maximum possible amount for that difference. TRR measures the shift caused by a factor relative to the maximum possible magnitude of that shift. For binary outcomes, TRR simply equals the risk difference, also known as the inverse number needed to treat. TRR ranges from –1 to 1 but is – unlike a correlation coefficient – a measure of effect intensity, because it does not rely on variance parameters in a certain population as do effect size measures (e.g., correlations, Cohen’s d). However, the use of TRR is restricted to outcomes with fixed and meaningful endpoints, as given, for instance, by meaningful psychological questionnaires or Likert scales. The use of TRR vs. Cohen’s d is illustrated with three examples from Psychological Science 2006 (issues 5 through 8). It is argued that, whenever TRR applies, it should complement Cohen’s d to avoid the problems related to the latter. In any case, the absolute increment should complement d.
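A minimal sketch of the index as defined above (the Likert-scale example numbers are invented for illustration):

```python
def trr(mean_treat, mean_control, scale_min, scale_max):
    """Translocation relative to range: the absolute increment (difference
    in expectations) divided by the maximum possible difference on the
    bounded outcome scale. Ranges from -1 to 1."""
    return (mean_treat - mean_control) / (scale_max - scale_min)

# On a 1-7 Likert scale, a shift in means from 3.6 to 4.8 moves the
# outcome by 20% of the scale's range:
effect = trr(4.8, 3.6, 1, 7)

# For a binary (0/1) outcome, TRR reduces to the risk difference:
risk_diff = trr(0.40, 0.25, 0, 1)
```

Unlike Cohen's d, no standard deviation enters the calculation, which is why TRR does not depend on the variance in a particular population.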


2005 ◽  
Vol 52 (10-11) ◽  
pp. 503-508 ◽  
Author(s):  
K. Chandran ◽  
Z. Hu ◽  
B.F. Smets

Several techniques have been proposed for biokinetic estimation of nitrification. Recently, an extant respirometric assay has been presented that yields kinetic parameters for both nitrification steps with minimal physiological change to the microorganisms during the assay. Herein, the ability of biokinetic parameter estimates from the extant respirometric assay to adequately describe concurrently obtained NH4+-N and NO2−-N substrate depletion profiles is evaluated. Based on our results, in general, the substrate depletion profiles resulted in higher estimates of the maximum specific growth rate coefficient, μmax, for both NH4+-N to NO2−-N oxidation and NO2−-N to NO3−-N oxidation compared to estimates from the extant respirograms. The trends in the kinetic parameter estimates from the different biokinetic estimation techniques are paralleled in the nature of substrate depletion profiles obtained from best-fit parameters. Based on a visual inspection, in general, best-fit parameters from optimally designed complete respirograms provided a better description of the substrate depletion profiles than estimates from isolated respirograms. Nevertheless, the sum of the squared errors for the best-fit respirometry-based parameters was outside the 95% joint confidence interval computed for the best-fit substrate depletion-based parameters. Notwithstanding the differences in kinetic parameter estimates determined in this study, estimates from the different biokinetic estimation techniques are still close to those reported in the literature. Additional parameter identifiability and sensitivity analysis of parameters from substrate depletion assays revealed high precision of parameters and high parameter correlation. Although biokinetic estimation via automated extant respirometry is far more facile than via manual substrate depletion measurements, additional sensitivity analyses are needed to test the impact of differences in the resulting parameter values on continuous reactor performance.
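For context, substrate depletion profiles of the kind fitted here are typically generated from Monod kinetics; a minimal forward simulation under that standard model (all parameter values are illustrative, not the authors' estimates) might look like:

```python
def simulate_depletion(s0, x0, mu_max, k_s, y, dt=0.01, t_end=24.0):
    """Euler integration of Monod-type substrate depletion:
    dS/dt = -(mu / Y) * X,  dX/dt = mu * X,  mu = mu_max * S / (Ks + S).
    Standard biokinetic model; parameter values here are illustrative."""
    s, x, t = float(s0), float(x0), 0.0
    profile = [(t, s)]
    while t < t_end and s > 1e-6:
        mu = mu_max * s / (k_s + s)          # specific growth rate
        s = max(0.0, s - (mu / y) * x * dt)  # substrate consumed
        x += mu * x * dt                     # biomass growth
        t += dt
        profile.append((t, s))
    return profile

# e.g. 20 mg N/L ammonium with a small nitrifier inoculum (invented values)
profile = simulate_depletion(s0=20.0, x0=2.0, mu_max=0.05, k_s=0.5, y=0.15)
```

Fitting μmax, Ks and Y is then a matter of minimizing the squared error between such a simulated profile and the measured depletion data, which is where the joint confidence intervals discussed above come from.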


2010 ◽  
Vol 20 (6) ◽  
pp. 595-612 ◽  
Author(s):  
Steven A Julious ◽  
Roger J Owen

Non-inferiority trials are motivated in the context of clinical research where a proven active treatment exists and placebo-controlled trials are no longer acceptable for ethical reasons. Instead, active-controlled trials are conducted where a treatment is compared to an established treatment with the objective of demonstrating that it is non-inferior to this treatment. We review and compare the methodologies for calculating sample sizes and suggest appropriate methods to use. We demonstrate how the simplest method of using the anticipated response is predominantly consistent with simulations. In the context of trials with binary outcomes with expected high proportions of positive responses, we show how the sample size is quite sensitive to assumptions about the control response. We recommend when designing such a study that sensitivity analyses be performed with respect to the underlying assumptions and that the Bayesian methods described in this article be adopted to assess sample size.
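The sensitivity to the assumed control response can be seen even in the simplest normal-approximation formula (a common textbook calculation, not the simulation-based or Bayesian methods the article examines; all numbers are illustrative):

```python
import math

def ni_sample_size(p_control, p_treat, margin, z_alpha=1.645, z_beta=0.842):
    """Per-arm sample size for a non-inferiority trial with a binary
    outcome (one-sided alpha = 0.05, 80% power), assuming a higher
    response is better and a non-inferiority margin `margin` on the
    risk-difference scale. Normal approximation."""
    var = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    effect = p_treat - p_control + margin
    return math.ceil((z_alpha + z_beta) ** 2 * var / effect ** 2)

# With a high anticipated response, the answer moves sharply with p_control:
n_90 = ni_sample_size(0.90, 0.90, margin=0.10)  # both arms at 90%
n_95 = ni_sample_size(0.95, 0.95, margin=0.10)  # both arms at 95%
```

Shifting the assumed control response from 90% to 95% roughly halves the required sample size here, which is exactly the kind of sensitivity the authors recommend probing before fixing a design.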


2018 ◽  
Vol 25 (12) ◽  
pp. 1887-1891
Author(s):  
Malik Jamil Ahmed ◽  
Muhammad Nasir ◽  
Aamir Furqan

Objectives: To investigate whether the addition of dexamethasone and chlorpheniramine to oral ketamine premedication affects the incidence of postoperative vomiting. Study Design: Randomized controlled trial. Setting: Department of Anesthesia and Intensive Care, Nishtar Hospital, Multan. Period: March 2016 to March 2017. Methodology: After obtaining approval from the hospital's ethical review board, data were entered into SPSS version 23.1 and analyzed. Continuous variables such as age, weight, sedation time, anesthesia time, admission time and PACU time were presented as mean and standard deviation. Categorical variables such as gender, ASA status and postoperative vomiting were presented as frequencies. Student's t-test and the chi-square test were applied to assess associations with the outcome variable. A p value of 0.05 was taken as significant. Results: Overall, 80 patients of both genders were included in this study and divided into two equal groups of 40 (50%) each: group K (ketamine) and group KD (ketamine-dexamethasone). The main outcome variable of this study was postoperative vomiting, which was observed in 35% (n=10) of patients in group K and 10% (n=4) in group KD. The difference was statistically significant (p=0.007). Conclusion: The addition of dexamethasone and chlorpheniramine to ketamine as premedication reduces the incidence of postoperative vomiting.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-26
Author(s):  
Xiaoli Tian ◽  
Meiling Niu ◽  
Jiangshui Ma ◽  
Zeshui Xu

TODIM is a well-known multiple-criteria decision-making (MCDM) method which considers the bounded rationality of decision makers (DMs) based on prospect theory (PT). However, in the classical TODIM, the perceived probability weighting function and the difference in risk attitudes for gains and losses are not consistent with the original idea of PT. Moreover, probabilistic hesitant fuzzy information shows its superiority in handling situations where the DMs hesitate among several possible values with different possibilities. Hence, a novel TODIM with probabilistic hesitant fuzzy information is proposed in this paper to simulate the perceptions of the DMs in PT. To show the advantages of the proposed method, a novel TODIM is also combined with hesitant fuzzy information for comparison. Finally, a case study is carried out to demonstrate the feasibility of the proposed method, and a series of comparative analyses and sensitivity analyses are used to show its stability.
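The differing risk attitudes for gains and losses referred to here come from prospect theory's value function; a minimal sketch with the commonly cited Tversky-Kahneman parameter estimates (illustrative of PT itself, not of the paper's full TODIM formulation):

```python
def pt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect theory value function: concave for gains, convex and
    steeper for losses (loss aversion, lambda > 1) relative to the
    reference point x = 0. Parameter values are the commonly cited
    Tversky-Kahneman estimates."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

gain = pt_value(0.5)   # subjective value of a moderate gain
loss = pt_value(-0.5)  # the same-sized loss hurts more than the gain helps
```

TODIM builds its pairwise dominance degrees on exactly this asymmetry: a criterion on which an alternative loses is penalized more heavily than an equal-sized gain is rewarded.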


2011 ◽  
Vol 139 (9) ◽  
pp. 3069-3074 ◽  
Author(s):  
Andreas P. Weigel ◽  
Simon J. Mason

This article refers to the study of Mason and Weigel, where the generalized discrimination score D has been introduced. This score quantifies whether a set of observed outcomes can be correctly discriminated by the corresponding forecasts (i.e., it is a measure of the skill attribute of discrimination). Because of its generic definition, D can be adapted to essentially all relevant verification contexts, ranging from simple yes–no forecasts of binary outcomes to probabilistic forecasts of continuous variables. For most of these cases, Mason and Weigel have derived expressions for D, many of which have turned out to be equivalent to scores that are already known under different names. However, no guidance was provided on how to calculate D for ensemble forecasts. This gap is aggravated by the fact that there are currently very few measures of forecast quality that could be directly applied to ensemble forecasts without requiring that probabilities be derived from the ensemble members prior to verification. This study seeks to close this gap. A definition is proposed of how ensemble forecasts can be ranked; the ranks of the ensemble forecasts can then be used as a basis for attempting to discriminate between corresponding observations. Given this definition, formulations of D are derived that are directly applicable to ensemble forecasts.
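The core idea of ranking ensemble forecasts by pairwise member comparison can be sketched as follows (a simplified illustration in the spirit of the approach described, not the paper's exact formulation):

```python
import itertools

def ensemble_greater(x, y):
    """Fraction of member pairs (xi, yj) with xi > yj, counting ties as
    half: the probability that a random member of ensemble x exceeds a
    random member of ensemble y. Used here to rank ensemble forecasts."""
    wins = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return wins / (len(x) * len(y))

def generalized_discrimination(forecasts, obs):
    """Fraction of observation pairs with distinct outcomes that the
    ensemble forecasts rank in the correct order (ties count half).
    A simplified sketch of a generalized discrimination score D."""
    correct, total = 0.0, 0
    for (fi, oi), (fj, oj) in itertools.combinations(zip(forecasts, obs), 2):
        if oi == oj:
            continue  # pairs with identical outcomes are uninformative
        p = ensemble_greater(fi, fj)
        if (p > 0.5 and oi > oj) or (p < 0.5 and oi < oj):
            correct += 1
        elif p == 0.5:
            correct += 0.5
        total += 1
    return correct / total if total else float("nan")
```

A score of 1 indicates perfect discrimination, 0.5 is no better than chance, and 0 means the forecasts rank every pair backwards; note that no probabilities need to be derived from the ensemble members before verification.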


Stroke ◽  
2021 ◽  
Vol 52 (1) ◽  
pp. 40-47
Author(s):  
James E. Siegler ◽  
Alicia M. Zha ◽  
Alexandra L. Czap ◽  
Santiago Ortega-Gutierrez ◽  
Mudassir Farooqui ◽  
...  

Background and Purpose: The pandemic caused by the novel coronavirus disease 2019 (COVID-19) has led to an unprecedented paradigm shift in medical care. We sought to evaluate whether the COVID-19 pandemic may have contributed to delays in acute stroke management at comprehensive stroke centers. Methods: Pooled clinical data of consecutive adult stroke patients from 14 US comprehensive stroke centers (January 1, 2019, to July 31, 2020) were queried. The rate of thrombolysis for nontransferred patients within the Target: Stroke goal of 60 minutes was compared between patients admitted from March 1, 2019, to July 31, 2019 (pre–COVID-19), and from March 1, 2020, to July 31, 2020 (COVID-19). The times from arrival to imaging and to treatment with thrombolysis or thrombectomy, as continuous variables, were also assessed. Results: Of the 2955 patients who met inclusion criteria, 1491 were admitted during the pre–COVID-19 period and 1464 were admitted during COVID-19, 15% of whom underwent intravenous thrombolysis. Patients treated during COVID-19 were at lower odds of receiving thrombolysis within 60 minutes of arrival (odds ratio, 0.61 [95% CI, 0.38–0.98]; P=0.04), with a median delay in door-to-needle time of 4 minutes (P=0.03). The lower odds of achieving treatment within the Target: Stroke goal persisted after adjustment for all variables associated with earlier treatment (adjusted odds ratio, 0.55 [95% CI, 0.35–0.85]; P<0.01). The delay in thrombolysis appeared driven by the longer delay from imaging to bolus (median, 29 [interquartile range, 18–41] versus 22 [interquartile range, 13–37] minutes; P=0.02). There was no significant delay in door-to-groin puncture for patients who underwent thrombectomy (median, 83 [interquartile range, 63–133] versus 90 [interquartile range, 73–129] minutes; P=0.30). Delays in thrombolysis were observed in the months of June and July.
Conclusions: Evaluation for acute ischemic stroke during the COVID-19 period was associated with a small but significant delay in intravenous thrombolysis but no significant delay in thrombectomy time metrics. Taking steps to reduce delays from imaging to bolus time has the potential to attenuate this collateral effect of the pandemic.


2015 ◽  
Vol 97 (1) ◽  
pp. 52-55 ◽  
Author(s):  
JK Dickson ◽  
A Davies ◽  
S Rahman ◽  
C Sethu ◽  
JRO Smith ◽  
...  

Introduction Dissection of regional lymph nodes (RLNs) can lead to significant morbidity and a high prevalence of complications. Published guidance states that these procedures should be carried out by surgeons who are members of a specialist skin multidisciplinary team and who carry out a combined minimum of 15 axillary/groin dissections per year. However, there is little evidence to support this minimum figure. We report on the burden of service provision and prevalence of complications across the South West of England and Wales. Methods A 12-month review of dissections of RLNs for skin cancer was undertaken covering five Plastic Surgery Units with a collective catchment of 8.4 million people. Detailed data were collected on patient demographics, pathology, timing of surgery, and prevalence of complications. Results A total of 163 dissections were carried out. Forty-three per cent of patients experienced one or more complications. In that 12-month period, an average of 8 axillary/groin dissections was carried out per surgeon. A funnel plot demonstrated that the prevalence of complications for individual surgeons was within the limits of the plot but, in many cases, this was based on only a relatively small number of procedures per consultant. If surgeons carried out 10 procedures per year, the upper and lower limits on the plot were 73% and 11%, respectively. Conclusions Funnel plots can provide a useful guide as to whether the prevalence of complications for individual surgeons lies within acceptable limits. Based on these results, 10 procedures per consultant per year should be sufficient to enable meaningful assessment of the prevalence of complications.
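The quoted limits can be reproduced approximately with standard normal-approximation control limits around the overall complication rate (a common funnel-plot construction; the paper does not state its exact method):

```python
import math

def funnel_limits(p0, n, z=1.96):
    """Approximate 95% funnel-plot control limits for a proportion around
    a target rate p0 at case volume n (normal approximation, truncated
    to [0, 1])."""
    se = math.sqrt(p0 * (1.0 - p0) / n)
    return max(0.0, p0 - z * se), min(1.0, p0 + z * se)

# Around the observed 43% complication rate, at 10 procedures per year:
lo, hi = funnel_limits(0.43, 10)  # roughly 0.12 and 0.74
```

The wide interval at n = 10 makes the paper's caution concrete: with so few procedures per consultant, only extreme complication rates would fall outside the funnel.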


Author(s):  
Sander M. J. van Kuijk ◽  
Frank J. W. M. Dankers ◽  
Alberto Traverso ◽  
Leonard Wee

Abstract. This is the first chapter of five that cover an introduction to developing and validating models for predicting outcomes for the individual patient. Such prediction models can be used for predicting the occurrence or recurrence of an event, or the most likely value of a continuous outcome. We will mainly focus on the prediction of binary outcomes, such as the occurrence of a complication, recurrence of disease, the presence of metastases, remission, survival, etc. This chapter deals with the selection of an appropriate study design for a study on prediction, and with methods to manipulate the data before the statistical modelling can begin.
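For a binary outcome, the prediction models discussed in these chapters typically take the form of a logistic regression; a minimal sketch of applying a hypothetical, already-fitted model to one patient (the predictors and coefficients are invented for illustration):

```python
import math

def predict_risk(intercept, coefs, x):
    """Predicted probability of a binary outcome from a fitted logistic
    regression: the inverse-logit (expit) of the linear predictor."""
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical model: log-odds of a complication from age (in decades)
# and tumour stage; all numbers are illustrative.
risk = predict_risk(-3.0, [0.4, 0.8], [6.5, 2])
```

Developing such a model means estimating the intercept and coefficients from a suitable study design, and validating it means checking that these predicted probabilities are accurate in new patients.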

