Assessing baseline dependency of anchor-based minimal important change (MIC): don’t stratify on the baseline score!

2021 ◽  
Author(s):  
Berend Terluin ◽  
Ewa M Roos ◽  
Caroline B Terwee ◽  
Jonas B Thorlund ◽  
Lina H Ingelsrud

Purpose: The minimal important change (MIC) of a patient-reported outcome measure (PROM) is often suspected to be baseline dependent, typically in the sense that patients who are in a poorer baseline health condition need greater improvements to qualify as minimally important. Testing MIC baseline dependency is commonly performed by creating two or more subgroups, stratified on the baseline PROM score. This study’s purpose was to show that this practice produces biased subgroup MIC estimates resulting in spurious MIC baseline dependency, and to develop alternative methods to evaluate MIC baseline dependency. Methods: Datasets with PROM baseline and follow-up scores and transition ratings were simulated with and without MIC baseline dependency. Mean change MICs, ROC-based MICs, predictive MICs and adjusted MICs were estimated before and after stratification on the baseline score. Three alternative methods were developed and evaluated. The methods were applied in a real data example for illustration. Results: Baseline stratification resulted in biased subgroup MIC estimates and the false impression of MIC baseline dependency, due to redistribution of measurement error. Two of the alternative methods require a second baseline measurement with the same PROM or another correlated PROM. The third method involves the construction of two parallel tests based on splitting the PROM’s item set. Two methods could be applied to the real data. Conclusion: MIC baseline dependency should not be tested in subgroups based on stratification on the baseline PROM score. Instead, one or more of the suggested alternative methods should be used.
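The "redistribution of measurement error" described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation design: all distributions, sample sizes, and the toy anchor (improved when true change exceeds 5 points) are assumptions chosen for illustration, and the mean observed change of improved patients serves only as a crude stand-in for a mean change MIC. True change is generated independently of the baseline score, so any gap between the strata is spurious.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Simulate a PROM with NO true MIC baseline dependency (all values assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        true_base = rng.gauss(50, 10)   # error-free baseline score
        true_change = rng.gauss(5, 5)   # true change, independent of baseline
        obs_base = true_base + rng.gauss(0, 5)                  # baseline plus error
        obs_follow = true_base + true_change + rng.gauss(0, 5)  # follow-up plus error
        improved = true_change > 5      # toy anchor: same threshold for everyone
        rows.append((true_base, obs_base, obs_follow - obs_base, improved))
    return rows


def mean_change_gap(rows, key_index):
    """Mean observed change of improved patients: low stratum minus high stratum."""
    cut = statistics.median(r[key_index] for r in rows)
    low = [r[2] for r in rows if r[3] and r[key_index] < cut]
    high = [r[2] for r in rows if r[3] and r[key_index] >= cut]
    return statistics.mean(low) - statistics.mean(high)


rows = simulate()
gap_observed = mean_change_gap(rows, 1)  # median split on the observed baseline
gap_true = mean_change_gap(rows, 0)      # median split on the error-free baseline
print(f"gap when stratifying on observed baseline: {gap_observed:.2f}")
print(f"gap when stratifying on true baseline:     {gap_true:.2f}")
```

Under these assumptions, the low observed-baseline stratum contains a surplus of negative baseline errors, which inflates its observed change, while the high stratum is deflated. Splitting on the error-free baseline, which is unavailable in real data, leaves essentially no gap.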


2021 ◽  
Author(s):  
Selma Flora Nordqvist ◽  
Victor Brun Boesen ◽  
Åse Krogh Rasmussen ◽  
Ulla Feldt-Rasmussen ◽  
Laszlo Hegedüs ◽  
...  

Objective: ThyPRO is the standard thyroid patient-reported outcome (PRO). The change in scores that patients perceive as important remains to be ascertained. The purpose of this study was to determine values for minimal important change (MIC) for ThyPRO. Methods: A total of 435 patients treated for benign thyroid diseases completed ThyPRO at baseline and 6 weeks following treatment initiation. At 6-weeks follow-up, patients also completed Global Rating of Change items. For each 0-100 scale, two MIC values were identified: an MIC for groups, using the ROC curve method and an MIC for individual patients, using the reliable change Index. Results: ROC analyses provided group-MIC estimates of 6.3 to 14.3 (score range 0-100). Evaluation of area under the curve (AUC) supported the robustness for 9 of 14 scales (AUC > 0.7). Reliable change index estimates of individual-MIC were 8.0 to 21.1. For all scales but two, the individual-MIC values were larger than the group-MIC values. Conclusions: Interpretability of ThyPRO was improved by the establishment of MIC values, which was 6.3 to 14.3 for groups and 8.0 to 21.1 for individuals. Thus, estimates of which changes are clinically relevant, are now available for future studies. We recommend using MIC values found by ROC analyses to evaluate changes in groups of patients, whereas MIC values identified by a dual criterion, including the reliability of changes, should be used for individual patients, e.g. to identify individual responders in clinical studies or practice.
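An individual-level threshold based on the reliable change index is commonly computed from the standard error of measurement. The sketch below uses the conventional Jacobson-Truax form with a 95% criterion; whether the ThyPRO study used exactly this variant is an assumption, and the input values are hypothetical, not taken from the study.

```python
import math


def reliable_change_threshold(sd_baseline, reliability, z=1.96):
    """Smallest individual change exceeding measurement error at the given z.

    SEM = SD * sqrt(1 - reliability); threshold = z * sqrt(2) * SEM.
    """
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    return z * math.sqrt(2.0) * sem


# Hypothetical scale: baseline SD of 20 points, reliability of 0.90.
threshold = reliable_change_threshold(sd_baseline=20.0, reliability=0.90)
print(f"reliable change threshold: {threshold:.2f} points")
```

Because the threshold grows as reliability falls, an individual-level value of this kind will often exceed a group-level ROC-based MIC, which is consistent with the pattern reported for most ThyPRO scales.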


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Gregory Katz ◽  
Alexandra Rouquette ◽  
François Lignereux ◽  
Thierry Mourgues ◽  
Michel Weber ◽  
...  

Background: The Catquest-9SF questionnaire is a patient-reported outcome measure that quantifies the visual benefits from cataract surgery. The purpose of this study was to translate and adapt the Catquest-9SF questionnaire for France, to assess its psychometric properties via Rasch analysis, and to assess its validity when completed using an electronic notepad. Methods: The Catquest-9SF questionnaire was translated following the guidelines of the International Society for Pharmacoeconomics and Outcomes Research. Catquest-9SF and clinical data were collected from patients before and after routine cataract surgery. All questionnaire data were collected via an electronic notepad. Rasch analysis was performed to assess psychometric properties, and sensitivity to change was analysed for patients with complete paired pre- and post-operative questionnaires. Results: A completely filled-in preoperative questionnaire was obtained for 848 patients. Rasch analysis showed good precision (person separation: 2.32, person reliability: 0.84), ordered category probability curves, no item misfit, and unidimensionality. The respondents were slightly more able than the level of item difficulty (targeting: −1.12 logits). Sensitivity was analysed on 211 paired questionnaires, and the postoperative questionnaires showed a clear ceiling effect. The effect size was 2.6. The use of an electronic notepad for completing the questionnaire worked out very well after some adjustments. Conclusions: The French version of Catquest-9SF has good psychometric properties and is suitable for use in French-speaking patients. The use of the Catquest-9SF questionnaire in an electronic format showed good validity.


2019 ◽  
Vol 6 (3) ◽  
pp. 69
Author(s):  
Jenny J. Ly ◽  
Rinah T. Yamamoto ◽  
Susan M. Dallabrida

Background: In migraine clinical trials, patients’ understanding of the terminology used in patient-reported outcome (PRO) measures is important, as variability in completing PRO measures can reduce the power to detect treatment efficacy. This study examines patients’ understanding of how to complete PRO measures in the absence of training, whether minimal training can improve the accuracy of answering PRO items, and patients’ opinions on the necessity of training and their preference for the method of training. Methods: Participants reporting a diagnosis of migraine completed online surveys. Participants were given scenarios of how to report headache days and pain severity. Respondents were asked about their opinions on the necessity of training, and their preference for the method of training. In a second study, participants were given a hypothetical scenario on how to report pain severity before and after a short training. Results: The majority of participants had different criteria to interpret PRO questions and provided incorrect answers to our scenarios. In the second study, with minimal training, errors were reduced by 7.5%. Over 90% of participants viewed educational materials and training as necessary and preferred electronic modes of training with the ability to review training materials as needed for the duration of the trial. Conclusions: Patient training may improve data quality and inter-rater reliability in clinical trials. Electronic interactive training could be used as an approach to reduce inconsistencies in PRO measures and improve data quality.


2019 ◽  
Vol 53 (23) ◽  
pp. 1474-1478 ◽  
Author(s):  
Ewa M Roos ◽  
Eleanor Boyle ◽  
Richard B Frobell ◽  
L Stefan Lohmander ◽  
Lina Holm Ingelsrud

Introduction: In sports physiotherapy, medicine and orthopaedic randomised controlled trials (RCT), the investigators (and readers) focus on the difference between groups in change scores from baseline to follow-up. Mean score changes are difficult to interpret (‘is an improvement of 20 units good?’), and follow-up scores may be more meaningful. We investigated how applying three different responder criteria to change and follow-up scores would affect the ‘outcome’ of RCTs. Responder criteria refer to participants’ perceptions of how the intervention affected them. Methods: We applied three different criteria, minimal important change (MIC), patient acceptable symptom state (PASS) and treatment failure (TF), to the aggregate Knee injury and Osteoarthritis Outcome Score (KOOS4) and the five KOOS subscales, the primary and secondary outcomes of the KANON trial (ISRCTN84752559). This trial included young active adults with an acute ACL injury and compared two treatment strategies: exercise therapy plus early reconstructive surgery, and exercise therapy plus delayed reconstructive surgery, if needed. Results: MIC: At 2 years, more than 90% in the two treatment arms reported themselves to be minimally but importantly improved for the primary outcome KOOS4. PASS: About 50% of participants in both treatment arms reported their KOOS4 follow-up scores to be satisfactory. TF: Almost 10% of participants in both treatment arms found their outcomes so unsatisfactory that they thought their treatment had failed. There were no statistically significant or meaningful differences between treatment arms using these criteria. Conclusion: We applied change criteria as well as cross-sectional follow-up criteria to interpret trial outcomes with more clinical focus. We suggest researchers apply MIC, PASS and TF thresholds to enhance interpretation of KOOS and other patient-reported scores. The findings from this study can improve shared decision-making processes for people with an acute ACL injury.


Neurology ◽  
2018 ◽  
Vol 91 (20) ◽  
pp. e1880-e1892 ◽  
Author(s):  
Ilse Baert ◽  
Tori Smedal ◽  
Alon Kalron ◽  
Kamila Rasova ◽  
Adnan Heric-Mansrud ◽  
...  

Objective: To determine responsiveness of functional mobility measures, and provide reference values for clinically meaningful improvements, according to disability level, in persons with multiple sclerosis (pwMS) in response to physical rehabilitation. Methods: Thirteen mobility measures (clinician- and patient-reported) were assessed before and after rehabilitation in 191 pwMS from 17 international centers (European and United States). Combined anchor- and distribution-based methods were used. A global rating of change scale, from patients' and therapists' perspective, served as external criteria when determining the area under the receiver operating characteristic curve (AUC), the minimally important change (MIC), and the smallest real change (SRC). Patients were stratified into 2 subgroups based on disability level (Expanded Disability Status Scale score ≤4 [n = 72], >4 [n = 119]). Results: The Multiple Sclerosis Walking Scale–12, physical subscale of the Multiple Sclerosis Impact Scale–29 (especially for the mildly disabled pwMS), Rivermead Mobility Index, and 5-repetition sit-to-stand test (especially for the moderately to severely disabled pwMS) were the most sensitive measures in detecting improvements in mobility. Findings were determined once the AUC (95% confidence interval) was above 0.5, MIC was greater than SRC, and results were comparable from the patient and therapist perspective. Conclusions: Responsiveness, clinically meaningful improvement, and real changes of frequently used mobility measures were calculated, showing great heterogeneity, and were dependent on disability level in pwMS.


2019 ◽  
Vol 47 (2) ◽  
pp. 364-371 ◽  
Author(s):  
Julia C.A. Noorduyn ◽  
Victor A. van de Graaf ◽  
Lidwine B. Mokkink ◽  
Nienke W. Willigenburg ◽  
Rudolf W. Poolman ◽  
...  

Background: Responsiveness and the minimal important change (MIC) are important measurement properties to evaluate treatment effects and to interpret clinical trial results. The International Knee Documentation Committee (IKDC) Subjective Knee Form is a reliable and valid instrument for measuring patient-reported knee-specific symptoms, functioning, and sports activities in a population with meniscal tears. However, evidence on responsiveness is of limited methodological quality, and the MIC has not yet been established for patients with symptomatic meniscal tears. Purpose: To evaluate the responsiveness and determine the MIC of the IKDC for patients with meniscal tears. Study Design: Cohort study (design); Level of evidence 2. Methods: This study was part of the ESCAPE trial: a noninferiority multicenter randomized controlled trial comparing arthroscopic partial meniscectomy with physical therapy. Patients aged 45 to 70 years who were treated for a meniscal tear by arthroscopic partial meniscectomy or physical therapy completed the IKDC and 3 other questionnaires (RAND 36-Item Health Survey, EuroQol-5D-5L, and visual analog scales for pain) at baseline and 6-month follow-up. Responsiveness was evaluated by testing predefined hypotheses about the relation of the change in IKDC with regard to the change in the other self-reported outcomes. An external anchor question was used to distinguish patients reporting improvement versus no change in daily functioning. The MIC was determined by the optimal cutoff point in the receiver operating characteristic curve, which quantifies the IKDC score that best discriminated between patients with and without improvement in daily function. Results: Data from all 298 patients who completed baseline and 6-month follow-up questionnaires were analyzed. Responsiveness of the IKDC was confirmed in 7 of 10 predefined hypotheses about the change in IKDC score with regard to other patient-reported outcome measures. One hypothesis differed in the expected direction, while 2 hypotheses failed to meet the expected magnitude by 0.02 and 0.01 points. An MIC of 10.9 points was calculated for the IKDC of middle-aged and older patients with meniscal tears. Conclusion: This study showed that the IKDC is responsive to change among patients aged 45 to 70 years with meniscal tears, with an MIC of 10.9 points. This strengthens the value of the IKDC in quantifying treatment effects in this population.
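The ROC-based MIC described in this abstract can be sketched as a search over candidate change-score cutoffs. The abstract does not state which optimality criterion was used, so the Youden index (sensitivity + specificity − 1) is assumed here; the function name `roc_mic` and the toy data are hypothetical and are not taken from the ESCAPE trial.

```python
def roc_mic(changes, improved):
    """Return (cutoff, youden_j) maximizing sensitivity + specificity - 1.

    A patient is classified as improved when change >= cutoff.
    Candidate cutoffs are the observed change scores (an assumption).
    """
    pos = [c for c, y in zip(changes, improved) if y]      # anchor: improved
    neg = [c for c, y in zip(changes, improved) if not y]  # anchor: not improved
    best_cut, best_j = None, -1.0
    for cut in sorted(set(changes)):
        sens = sum(c >= cut for c in pos) / len(pos)
        spec = sum(c < cut for c in neg) / len(neg)
        j = sens + spec - 1.0
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy change scores and anchor responses (hypothetical).
changes = [2, 4, 5, 8, 9, 11, 12, 15, 18, 20]
improved = [False, False, False, True, False, False, True, True, True, True]
cutoff, youden = roc_mic(changes, improved)
print(f"ROC-based MIC: {cutoff} (Youden J = {youden:.2f})")
```

In this toy sample the search settles on a cutoff of 12 points, where 4 of 5 improved patients and all 5 unimproved patients are classified correctly. Other criteria, such as the point closest to the top-left corner of the ROC curve, can give a different cutoff on the same data.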


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Thérèse Eriksson ◽  
Hans Tropp ◽  
Ann-Britt Wiréhn ◽  
Lars-Åke Levin

Background: Value-based reimbursement programs have become increasingly common. However, little is known about the effect of such programs on patient-reported outcomes. Thus, the aim of this study was to analyze the effect of introducing a value-based reimbursement program on patient-reported outcome measures and to explore whether a selection bias towards less complicated patients occurred. Methods: This is a retrospective observational study with a before-and-after design based on the introduction of a value-based reimbursement program in Region Stockholm, Sweden. We analyzed patient-level data from inpatient and outpatient care of patients undergoing lumbar spine surgery during 2006–2015. The patient-reported outcome measures used were Global Assessment, EQ-5D-3L and Oswestry Disability Index. The case-mix of surgically treated patients was analyzed using medical and socioeconomic factors. Results: The value-based reimbursement program did not have any effect on targeted or non-targeted patient-reported outcome measures. Moreover, the share of surgically treated patients with risk factors such as having comorbidities and being born outside of Europe increased after the introduction. Hence, the value-based reimbursement program did not encourage discrimination against sicker patients. However, income was higher among patients surgically treated after the introduction of the value-based reimbursement. This indicates that a value-based reimbursement program may contribute to increased inequalities in access to healthcare. Conclusions: The value-based reimbursement program did not have any effect on patient-reported outcome measures. Our study contributes to the understanding of the effects of a value-based reimbursement program on patient-reported outcome measures and to what extent cherry-picking arises.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 2292-2292
Author(s):  
Marjorie E Zettler ◽  
Ethan Basch ◽  
Chadi Nabhan

Introduction: Patient-reported outcomes (PROs), defined as any report on the status of a patient's health condition that comes directly from the patient without interpretation by anyone else, play an increasingly important role in drug development. In 2009 the FDA issued final guidance on using PRO measures to support labeling claims, to incorporate the patient perspective into the drug approval process. The 21st Century Cures Act emphasizes PROs as a differentiating element in the FDA approval process of new drugs, beyond traditional clinical outcome measures. Further, recent data have shown that intervention based on PROs can improve survival in metastatic cancers (Basch et al; 2017). The incorporation of PROs into the labeling of new drugs for malignant hematology disorders has not been studied and is the subject of this investigation. Methods: We reviewed the FDA's Novel New Drug Summaries (2011-2017) to identify drugs approved for malignant hematology indications. Drug approval packages and product labeling were accessed via the Drugs@FDA database and analyzed for PRO endpoints, measures, and labeling claims. Clinical trial designs and published study results were retrieved via the ClinicalTrials.gov website and PubMed. Results: Of 250 novel drugs approved by the FDA between 2011 and 2017, 22 (8.8%) were approved for malignant hematology indications. Interestingly, only 1 had PRO-based claims in its labeling, even though 13 of the 22 drugs (59%) collected PRO data in pivotal trials that led to their approval. Notably, the proportion of malignant hematology trials assessing PROs has increased over time, with 4 of the 5 drugs approved in 2017 for malignant hematology indications evaluating PROs in their development programs, compared with 9 of the 17 drugs approved in the preceding 6 years (80% vs. 53%). PROs evaluated included generic instruments such as the EQ-5D, and disease-specific instruments such as the EORTC QLQ-C30 (see table). Reasons cited for rejection of PRO data inclusion in drug labeling were single arm trial design, excessive missing data, statistical issues, and use of an inappropriate PRO instrument. Conclusions: While the FDA encourages PRO data submission as part of the new drug approval process, and although more than half of all malignant hematology drugs approved in the past 7 years assessed PROs during development, only 1 was able to successfully acquire labeling claims. Whether this is due to lack of PRO expertise on clinical development teams or absence of strong regulatory guidance on how best to implement PROs remains unknown and requires further research. Designing strategies to develop, validate and report PROs effectively is needed to meet regulatory requirements and enhance patients' voices in their own care. Disclosures: Nabhan: Cardinal Health: Employment, Equity Ownership.

