Evaluation of the Impact of Calibration of Patient-Reported Outcomes Measures on Clinical Trial Results: A Simulation Study based on Rasch Measurement Theory

Author(s):  
Angély Loubert ◽  
Antoine Regnault ◽  
Véronique Sébille ◽  
Jean-Benoit Hardouin

Abstract Background In the analysis of clinical trial endpoints, calibration of patient-reported outcomes (PRO) instruments ensures that the resulting “scores” represent the same quantity of the measured concept across applications. Rasch measurement theory (RMT) is a psychometric approach that guarantees algebraic separation of person and item parameter estimates, allowing formal calibration of PRO instruments. In the RMT framework, calibration is performed using the item parameter estimates obtained from a previous “calibration” study. But if calibration is based on poorly estimated item parameters (e.g., because the calibration sample was small), this may hamper the ability to detect a treatment effect, and direct estimation of item parameters from the trial data (non-calibration) may then be preferred. The objective of this simulation study was to assess the impact of calibration on the comparison of PRO results between treatment groups, using different analysis methods. Methods PRO results were simulated following a polytomous Rasch model, for a calibration sample and a trial sample. Scenarios varied the sample sizes, the number of items and response modalities of the instrument, and the distributions of the item parameters. Different treatment effect sizes and distributions of the two patient samples were also explored. Comparison of treatment groups was performed using different methods based on a random effect Rasch model. Calibrated and non-calibrated approaches were compared based on type-I error, power, bias, and variance of the estimates for the difference between groups. Results There was no impact of the calibration approach on type-I error, power, bias, and dispersion of the estimates. Among other findings, mistargeting between the PRO instrument and patients from the trial sample (regarding the level of measured concept) resulted in a lower power and higher position bias than appropriate targeting.
Conclusions Calibration of PROs in clinical trials does not compromise the ability to accurately assess a treatment effect and is essential to properly interpret PRO results. Given its important added value, calibration should thus always be performed when a PRO instrument is used as an endpoint in a clinical trial, in the RMT framework.
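
As a rough illustration of the kind of data generation described in the Methods, the sketch below simulates sum scores for two treatment groups under a polytomous Rasch model. It assumes the partial credit parameterization, which is one common polytomous Rasch model; the abstract does not say which parameterization the authors used. The item thresholds, group means, sample sizes, and function names are all hypothetical choices made for the example.

```python
import math
import random


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under a partial credit model.

    theta: latent level of the person.
    thresholds: step parameters delta_1..delta_m (hypothetical values here).
    Returns a list of probabilities for response categories 0..m.
    """
    # Category x has logit sum_{k<=x} (theta - delta_k); category 0 has logit 0.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_sum_score(theta, items, rng):
    """Draw one response per item and return the person's sum score."""
    score = 0
    for thresholds in items:
        probs = pcm_probabilities(theta, thresholds)
        score += rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return score


# Hypothetical 3-item instrument with 4 response categories per item.
ITEMS = [[-1.0, 0.0, 1.0], [-0.5, 0.5, 1.5], [-1.5, -0.5, 0.5]]

rng = random.Random(2024)
# Assumed latent distributions: control ~ N(0, 1), treated ~ N(0.5, 1).
control = [simulate_sum_score(rng.gauss(0.0, 1.0), ITEMS, rng) for _ in range(200)]
treated = [simulate_sum_score(rng.gauss(0.5, 1.0), ITEMS, rng) for _ in range(200)]
print(sum(control) / len(control), sum(treated) / len(treated))
```

In a calibrated analysis, the thresholds in `ITEMS` would be fixed at values estimated in an earlier calibration sample; in a non-calibrated analysis they would be re-estimated from the trial responses themselves. The sketch covers only data generation, not the random effect Rasch model used for the group comparison.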

2021 ◽  
pp. 107699862199436
Author(s):  
Yue Liu ◽  
Hongyun Liu

The prevalence and serious consequences of noneffortful responses from unmotivated examinees are well-known in educational measurement. In this study, we propose to apply an iterative purification process based on a response time residual method with fixed item parameter estimates to detect noneffortful responses. The proposed method is compared with the traditional residual method and the noniterative method with fixed item parameters in two simulation studies in terms of noneffort detection accuracy and parameter recovery. The results show that when the severity of noneffort is high, the proposed method leads to a much higher true positive rate with a small increase in the false discovery rate. In addition, parameter estimation is significantly improved by the strategies of fixing item parameters and iterative cleansing. These results suggest that the proposed method is a potential solution to reduce the impact of data contamination due to severe low test-taking effort and to obtain more accurate parameter estimates. An empirical study is also conducted to show the differences in the detection rate and parameter estimates among different approaches.


2017 ◽  
Vol 78 (3) ◽  
pp. 460-481 ◽  
Author(s):  
Margarita Olivera-Aguilar ◽  
Samuel H. Rikoon ◽  
Oscar Gonzalez ◽  
Yasemin Kisbu-Sakarya ◽  
David P. MacKinnon

When testing a statistical mediation model, it is assumed that factorial measurement invariance holds for the mediating construct across levels of the independent variable X. The consequences of failing to address the violations of measurement invariance in mediation models are largely unknown. The purpose of the present study was to systematically examine the impact of mediator noninvariance on the Type I error rates, statistical power, and relative bias in parameter estimates of the mediated effect in the single mediator model. The results of a large simulation study indicated that, in general, the mediated effect was robust to violations of invariance in loadings. In contrast, most conditions with violations of intercept invariance exhibited severely positively biased mediated effects, Type I error rates above acceptable levels, and statistical power larger than in the invariant conditions. The implications of these results are discussed and recommendations are offered.


2009 ◽  
Vol 24 (S1) ◽  
pp. 1-1
Author(s):  
A. Leon

Dr. Leon will present the biostatistical considerations that contribute to a clinical trial design and the strategies to enhance signal detection. These include minimizing bias in the estimate of treatment effect while maintaining a nominal level of type I error (i.e., false positive results) and maintaining sufficient statistical power (i.e., reducing the likelihood of false negative results). Particular attention will be paid to reducing the problems of attrition and the hazards of multiplicity. Methods to examine moderators of the treatment effect will also be explored. Examples from psychopharmacologic and psychotherapy trials for the treatment of depression and panic disorder will be provided to illustrate these issues. Following the didactic session, the participants will be encouraged to bring forth their own questions regarding clinical trial design for a 45-minute interactive discussion with the presenters. The objectives of the workshop are to improve the participants’ understanding of the goals of clinical trial design and methods to achieve those goals in order to improve their own research techniques, grantsmanship, and abilities to more accurately judge the results of studies presented in the literature.


2021 ◽  
pp. 096228022110082
Author(s):  
Yang Li ◽  
Wei Ma ◽  
Yichen Qin ◽  
Feifang Hu

Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite its extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been mainly studied for continuous responses; in particular, it is well known that the usual two-sample t-test for treatment effect is typically conservative. Invalid tests have also been found for generalized linear models that do not adjust for the covariates, and these are sometimes more worrisome because of inflated Type I error. The purpose of this study is to examine the unadjusted test for treatment effect under generalized linear models and covariate-adaptive randomization. For a large class of covariate-adaptive randomization methods, we obtain the asymptotic distribution of the test statistic under the null hypothesis and derive the conditions under which the test is conservative, valid, or anti-conservative. Several commonly used generalized linear models, such as logistic regression and Poisson regression, are discussed in detail. An adjustment method is also proposed to achieve a valid size based on the asymptotic results. Numerical studies confirm the theoretical findings and demonstrate the effectiveness of the proposed adjustment method.


Stats ◽  
2019 ◽  
Vol 2 (2) ◽  
pp. 174-188
Author(s):  
Yoshifumi Ukyo ◽  
Hisashi Noma ◽  
Kazushi Maruo ◽  
Masahiko Gosho

The mixed-effects model for repeated measures (MMRM) approach has been widely applied for longitudinal clinical trials. Many of the standard inference methods of MMRM could possibly lead to the inflation of type I error rates for the tests of treatment effect, when the longitudinal dataset is small and involves missing measurements. We propose two improved inference methods for the MMRM analyses, (1) the Bartlett correction with the adjustment term approximated by bootstrap, and (2) the Monte Carlo test using an estimated null distribution by bootstrap. These methods can be implemented regardless of model complexity and missing patterns via a unified computational framework. Through simulation studies, the proposed methods maintain the type I error rate properly, even for small and incomplete longitudinal clinical trial settings. Applications to a postnatal depression clinical trial are also presented.


Author(s):  
Samantha Cruz Rivera ◽  
Derek G. Kyte ◽  
Olalekan Lee Aiyegbusi ◽  
Anita L. Slade ◽  
Christel McMullan ◽  
...  

Abstract Background Patient-reported outcomes (PROs) are commonly collected in clinical trials and should provide impactful evidence on the effect of interventions on patient symptoms and quality of life. However, it is unclear how PRO impact is currently realised in practice. In addition, the different types of impact associated with PRO trial results, their barriers and facilitators, and appropriate impact metrics are not well defined. Therefore, our objectives were: i) to determine the range of potential impacts from PRO clinical trial data; ii) to identify potential PRO impact metrics; iii) to identify barriers/facilitators to maximising PRO impact; and iv) to examine real-world evidence of PRO trial data impact based on Research Excellence Framework (REF) impact case studies. Methods Two independent investigators searched MEDLINE, EMBASE, CINAHL+, HMIC databases from inception until December 2018. Articles were eligible if they discussed research impact in the context of PRO clinical trial data. In addition, the REF 2014 database was systematically searched. REF impact case studies were included if they incorporated PRO data in a clinical trial. Results Thirty-nine of 11,480 screened publications met the inclusion criteria. Nine types of PRO trial impact were identified, the most frequent of which centred around PRO data informing clinical decision-making. The included publications identified several barriers and facilitators around PRO trial design, conduct, analysis and reporting that can hinder or promote the impact of PRO trial data. Sixty-nine of 209 screened REF 2014 case studies were included. Twelve (17%) REF case studies led to demonstrable impact, including changes to international guidelines, changes to national guidelines, influence on cost-effectiveness analysis, and influence on drug approvals.
Conclusions PRO trial data may potentially lead to a range of benefits for patients and society, which can be measured through appropriate impact metrics. However, in practice there is relatively limited evidence demonstrating directly attributable and indirect real world PRO-related research impact. In part, this is due to the wider challenges of measuring the impact of research and PRO-specific issues around design, conduct, analysis and reporting. Adherence to guidelines and multi-stakeholder collaboration are essential to maximise the use of PRO trial data, facilitate impact and minimise research waste. Trial registration Systematic Review registration PROSPERO CRD42017067799.


2015 ◽  
Vol 33 (7_suppl) ◽  
pp. 293-293 ◽  
Author(s):  
Alexander M. Helfand ◽  
Cheryl T. Lee ◽  
Khaled Hafez ◽  
Maha Hussain ◽  
Monica Liebert ◽  
...  

293 Background: We conducted a phase II trial to evaluate combination therapy with intravesical BCG + sunitinib for prevention of recurrence and progression of NMIBC. Methods: Patients with high-grade clinical ≤ T1N0M0 NMIBC without BCG in the past year were eligible and received induction BCG followed 2 weeks later by 28 days of sunitinib (50 mg). The primary endpoint was 3-month complete response (CR) by biopsy and cytology. Patients with incomplete response were eligible for a second cycle of BCG + sunitinib. Secondary endpoints included 2-year recurrence-free and progression-free survival (RFS, PFS). Toxicity was graded according to the NCI CTCAE v.3.0. The Simon Minimax 2-stage study had 80% power with a 5% type I error assuming a 3m CR of 75% with sunitinib + BCG compared to 55% with BCG alone. If ≥ 25/36 evaluable patients achieved a 3m CR, then the treatment would be considered for further study. Binomial proportions, confidence intervals and Kaplan-Meier estimates are reported. Results: Of 36 evaluable patients, median age was 65.9 years (IQR 59-72). Initial stage was T1 (19), Ta (9), and CIS (8). Thirty-six percent completed sunitinib without interruption. Treatment was delayed (median 12 days (IQR 9-16)) and dose was reduced to 37.5 mg in 13 patients. One patient had reduction to 25 mg with re-escalation to 37.5 mg. One patient completed a 2nd cycle of BCG + sunitinib for incomplete response. BCG maintenance therapy was given to 21 patients. Of 133 adverse events in 34/36 patients, 6 (4.5%) in 5 patients were ≥ grade 3: thrombocytopenia, diarrhea (2), shingles, extremity rash/pain and hand + foot syndrome. CR at 3m was achieved in 26/36 patients (72%, 95% CI [55, 86]), reaching the primary endpoint. The patient who completed a 2nd cycle of BCG induction and sunitinib had CR at 6 months. 2y RFS (patients with intact bladder) was 77% (95% CI [58, 88]) and 2y PFS was 100%. Conclusions: The primary endpoint threshold of 25 CRs at 3m was reached. Combined treatment with BCG + sunitinib is associated with low rates of recurrence and progression. Adverse events were frequent, but few were serious. BCG + sunitinib may produce outcomes superior to BCG alone. (Study supported by Pfizer, Inc) Clinical trial information: NCT00794950.
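
The abstract reports a binomial proportion with a 95% confidence interval for the 3-month CR rate (26/36) but does not state which interval method was used. The sketch below assumes an exact (Clopper-Pearson) interval, a common choice for small single-arm trials, and computes it by bisection using only the Python standard library. The function names are hypothetical, and the result can be compared with the reported [55, 86].

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) = target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion.

    Assumption: this is an illustrative method, not necessarily the one
    used in the trial report.
    """
    # Lower bound: P(X >= successes | p) = alpha / 2, i.e.
    # P(X <= successes - 1 | p) = 1 - alpha / 2.
    if successes == 0:
        lower = 0.0
    else:
        lower = solve_decreasing(
            lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2
        )
    # Upper bound: P(X <= successes | p) = alpha / 2.
    if successes == n:
        upper = 1.0
    else:
        upper = solve_decreasing(lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(26, 36)
print(f"26/36 = {26 / 36:.1%}, 95% CI [{lower:.1%}, {upper:.1%}]")
```

The decision rule itself (≥ 25 of 36 responders) depends on the full two-stage design, including the first-stage sample size and stopping boundary, which the abstract does not give, so it is not reproduced here.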


2015 ◽  
Vol 46 (3) ◽  
pp. 586-603 ◽  
Author(s):  
Ma Dolores Hidalgo ◽  
Isabel Benítez ◽  
Jose-Luis Padilla ◽  
Juana Gómez-Benito

The growing use of scales in survey questionnaires makes it necessary to address how polytomous differential item functioning (DIF) affects observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the type I error and effect size of the independent samples t-test on the observed total scale scores. A simulation study was conducted, focusing on potential variables related to DIF in polytomous items, such as DIF pattern, sample size, magnitude, and percentage of DIF items. The results showed that DIF patterns and the number of DIF items affected the type I error rates and effect size of t-test values. The results highlighted the need to analyze DIF before making comparative group interpretations.

