Development and assessment of the inter-rater and intra-rater reproducibility of a self-administration version of the ALSFRS-R

2019 ◽  
Vol 91 (1) ◽  
pp. 75-81 ◽  
Author(s):  
Leonhard A Bakker ◽  
Carin D Schröder ◽  
Harold H G Tan ◽  
Simone M A G Vugts ◽  
Ruben P A van Eijk ◽  
...  

Objective: The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of the study is to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods: The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess the inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess the intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results: The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion: The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals both for individual comparisons, facilitating the management of clinical care, and for group comparisons in clinical trials.
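
The agreement analysis described in this abstract rests on Bland-Altman limits of agreement. As a minimal sketch of that calculation, assuming the common definition (mean paired difference plus or minus 1.96 standard deviations of the differences), the following may help. The function name and the scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(scores_a, scores_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for two sets of paired scores.

    bias is the mean of the paired differences (a - b); the limits are
    bias -/+ z times the sample standard deviation of those differences.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical total scores: self-administered versus rater-administered.
self_rated = [40, 38, 35, 42]
rater = [39, 38, 36, 40]

bias, lower, upper = bland_altman_limits(self_rated, rater)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero suggests no systematic difference between the two administrations, while the width of the limits indicates how far an individual pair of scores may plausibly differ.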

2013 ◽  
Vol 25 (9) ◽  
pp. 1503-1511 ◽  
Author(s):  
Florindo Stella ◽  
Orestes Vicente Forlenza ◽  
Jerson Laks ◽  
Larissa Pires de Andrade ◽  
Michelle A. Ljubetic Avendaño ◽  
...  

Abstract Background: Patients with dementia may be unable to describe their symptoms, and caregivers frequently suffer emotional burden that can interfere with judgment of the patient's behavior. The Neuropsychiatric Inventory-Clinician rating scale (NPI-C) was therefore developed as a comprehensive and versatile instrument to assess and accurately measure neuropsychiatric symptoms (NPS) in dementia, using information from caregiver and patient interviews and any other relevant available data. The present study is a follow-up to the original, cross-national NPI-C validation, evaluating the reliability and concurrent validity of the NPI-C in quantifying psychopathological symptoms in dementia in a large Brazilian cohort. Methods: Two blinded raters evaluated 312 participants (156 patient-knowledgeable informant dyads) using the NPI-C for a total of 624 observations in five Brazilian centers. Inter-rater reliability was determined through intraclass correlation coefficients for the NPI-C domains and the traditional NPI. Convergent validity included correlations of specific domains of the NPI-C with the Brief Psychiatric Rating Scale (BPRS), the Cohen-Mansfield Agitation Index (CMAI), the Cornell Scale for Depression in Dementia (CSDD), and the Apathy Inventory (AI). Results: Inter-rater reliability was strong for all NPI-C domains. There were high correlations between NPI-C/delusions and BPRS, NPI-C/apathy-indifference with the AI, NPI-C/depression-dysphoria with the CSDD, NPI-C/agitation with the CMAI, and NPI-C/aggression with the CMAI. There was moderate correlation between the NPI-C/aberrant vocalizations and CMAI and the NPI-C/hallucinations with the BPRS. Conclusion: The NPI-C is a comprehensive tool that provides accurate measurement of NPS in dementia with high concurrent validity and inter-rater reliability in the Brazilian setting. In addition to universal assessment, the NPI-C can be completed by individual domains.
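
Inter-rater reliability here is summarised with intraclass correlation coefficients. The abstract does not say which ICC form was used, so the sketch below assumes one common choice: the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name and the ratings table are illustrative only and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings is a list of rows, one row per subject, one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative table: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # roughly 0.29 for this table
```

Because this form penalises systematic offsets between raters, it is lower than a consistency-type ICC whenever one rater scores consistently higher than another, as in the illustrative table.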


Author(s):  
Zukiswa Zingela ◽  
Louise Stroud ◽  
Johan Cronje ◽  
Max Fink ◽  
Stephan van Wyk

Abstract Background Clinical assessment of catatonia includes the use of diagnostic systems, such as the Diagnostic and Statistical Manual, Fifth Edition (DSM-5) and the International Classification of Disease, Tenth Revision (ICD-10), or screening tools such as the Bush Francis Catatonia Screening Instrument (BFCSI)/Bush Francis Catatonia Rating Scale (BFCRS) and the Braunig Catatonia Rating Scale. In this study, we describe the inter-rater reliability (IRR), utilizing the BFCSI, BFCRS, and DSM-5 to screen for catatonia. Methods Data from 10 participants recruited as part of a larger prevalence study (of 135 participants) were used to determine the IRR by five assessors after they were trained in the application of the 14-item BFCSI, 23-item BFCRS, and DSM-5 to assess catatonia in new admissions. Krippendorff’s α was used to compute the IRR, and Spearman’s correlation was used to determine the concordance between screening tools. The study site was a 35-bed acute mental health unit in Dora Nginza Hospital, Nelson Mandela Bay Metro. Participants were mostly involuntary admissions under the Mental Health Care Act of 2002 and between the ages of 13 and 65 years. Results Of the 135 participants, 16 (11.9%) had catatonia. The majority (92 [68.1%]) were between 16 and 35 years old, with 126 (93.3%) of them being Black and 89 (66.4%) being male. The BFCRS (complete 23-item scale) had the greatest level of inter-rater agreement with α = 0.798, while the DSM-5 had the lowest level of inter-rater agreement with α = 0.565. The highest correlation coefficients were observed between the BFCRS and the BFCSI. Conclusion The prevalence rate of catatonia was 11.9%, with the BFCSI and BFCRS showing the highest pick-up rate and a high IRR with high correlation coefficients, while the DSM-5 had deficiencies in screening for catatonia with low IRR and the lowest correlation with the other two tools.


2016 ◽  
Vol 10 (1) ◽  
pp. 179-189 ◽  
Author(s):  
Michael Pierrynowski ◽  
Colleen McPhee ◽  
Saurabh P. Mehta ◽  
Joy C. MacDermid ◽  
Anita Gross

Background: Whiplash-Associated Disorders (WAD) are common following a motor vehicle accident. The Functional Impairment Test - Hand, and Neck/Shoulder/Arm (FIT-HaNSA) assesses upper extremity physical performance. It has been validated in patients with shoulder pathology but not in those with WAD. Objectives: Establish the intra- and inter-rater reliability and the known-group and construct validity of the FIT-HaNSA in patients with Grade II WAD (WAD2). Methods: Twenty-five patients with WAD2 and 41 healthy controls were recruited. Numeric Pain Rating Scale (NPRS), Neck Disability Index (NDI), Disabilities of the Arm, Shoulder and Hand (DASH), cervical range of motion (CROM), and FIT-HaNSA were completed at two sessions conducted 2 to 7 days apart by two raters. Intraclass correlation coefficients (ICC) were used to describe intra- and inter-rater reliability. Spearman rank correlation coefficients (ρ) were used to quantify the associations between scores of the FIT-HaNSA and other measures in the WAD2 group (convergent construct validity). Results: The intra- and inter-rater ICCs for the FIT-HaNSA scores ranged from 0.88 to 0.89 in the control group and 0.78 to 0.85 in the WAD2 group. Statistically significant differences in FIT-HaNSA performance between the two groups suggested known-group construct validity (P < 0.001). The correlations between the NPRS, NDI, DASH, CROM and FIT-HaNSA were generally poor (ρ < 0.4). Conclusion: The study results indicate that the total FIT-HaNSA score has good intra- and inter-rater reliability and construct validity in WAD2 and healthy controls.


2021 ◽  
Author(s):  
Zukiswa Zingela ◽  
Louise Stroud ◽  
Johan Cronje ◽  
Max Fink ◽  
Stephan van Wyk

Abstract Background: Clinical assessment of catatonia includes the use of diagnostic systems, like the Diagnostic and Statistical Manual 5 (DSM-5) and International Classification of Disease Manual 10 (ICD-10), or screening tools such as the Bush Francis Catatonia Screening Instrument (BFCSI)/Bush Francis Catatonia Rating Scale (BFCRS) and Braunig’s Catatonia Rating Scale. In this study, we describe the inter-rater reliability (IRR), utilizing the BFCSI, BFCRS and DSM-5 to screen for catatonia. Method: Data from 10 participants recruited as part of a larger prevalence study were used to determine IRR by five assessors after they were trained in the application of the 14-item BFCSI, 23-item BFCRS and DSM-5 to assess catatonia in new admissions. Krippendorff’s α was used to compute the IRR, and Spearman’s correlation was used to determine concordance between screening tools. The study was performed at a 35-bed acute mental health unit in Dora Nginza Hospital, Nelson Mandela Bay Metro. Participants were mostly involuntary admissions under the Mental Health Care Act of 2002 and between the ages of 13 and 65 years and older. Results: The majority (8, 80%) were 35 years or less and males (8, 80%). At least six (60%) of the ten participants were found to have catatonia based on the BFCSI/BFCRS and DSM-5 criteria. The BFCRS (complete 23-item scale) had the greatest level of inter-rater agreement with α=0.798, while the DSM-5 had the lowest level of inter-rater agreement with α=0.565. The highest correlation coefficients were observed between the BFCRS and BFCSI. Conclusion: The BFCSI and BFCRS were found to be sensitive tools for screening for catatonia with high IRR and high correlation coefficients, while the DSM-5 showed some deficiencies in screening for catatonia with low IRR and the lowest correlation with the other two tools.


2017 ◽  
Vol 10 (5) ◽  
pp. 462-466 ◽  
Author(s):  
Scott L Zuckerman ◽  
Nikita Lakomkin ◽  
Jordan A Magarik ◽  
Jan Vargas ◽  
Marcus Stephens ◽  
...  

Background: The angiographic evaluation of previously coiled aneurysms can be difficult yet remains critical for determining re-treatment. Objective: The main objective of this study was to determine the inter-rater reliability for both the Raymond Scale and per cent embolization among a group of neurointerventionalists evaluating previously embolized aneurysms. Methods: A panel of 15 neurointerventionalists examined 92 distinct cases of immediate post-coil embolization and 1 year post-embolization angiographs. Each case was presented four times throughout the study, along with alterations in demographics in order to evaluate intra-rater reliability. All respondents were asked to provide the per cent embolization (0–100%) and Raymond Scale grade (1-3) for each aneurysm. Inter-rater reliability was evaluated by computing weighted kappa values (for the Raymond Scale) and intraclass correlation coefficients (ICC) for per cent embolization. Results: 10 neurosurgeons and 5 interventional neuroradiologists evaluated 368 simulated cases. The agreement among all readers employing the Raymond Scale was fair (κ=0.35) while concordance in per cent embolization was good (ICC=0.64). Clinicians with fewer than 10 years of experience demonstrated a significantly greater level of agreement than the group with greater than 10 years (κ=0.39 and ICC=0.70 vs κ=0.28 and ICC=0.58). When the same aneurysm was presented multiple times, clinicians demonstrated excellent consistency when assessing per cent embolization (ICC=0.82), but moderate agreement when employing the Raymond classification (κ=0.58). Conclusions: Identifying the per cent embolization in previously coiled aneurysms resulted in good inter- and intra-rater agreement, regardless of years of experience. The strong agreement among providers employing per cent embolization may make it a valuable tool for embolization assessment in this patient population.


2021 ◽  
Vol 12 ◽  
Author(s):  
Wei Xia ◽  
William Ho Cheung Li ◽  
Tingna Liang ◽  
Yuanhui Luo ◽  
Laurie Long Kwan Ho ◽  
...  

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest. The confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores of performance checklist (Pearson's γ = 0.781), indicating a large convergent validity, and knowledge on drug abuse (Pearson's γ = 0.833), indicating a moderate concurrent validity. Conclusion: The results support that the Chinese CCS-R is a valid and reliable measure of the counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.


10.2196/20172 ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. e20172
Author(s):  
Masanori Tanaka ◽  
Manabu Saito ◽  
Michio Takahashi ◽  
Masaki Adachi ◽  
Kazuhiko Nakamura

Background Early detection and intervention for neurodevelopmental disorders are effective. Several types of paper questionnaires have been developed to assess these conditions in early childhood; however, the psychometric equivalence between the web-based and the paper versions of these questionnaires is unknown. Objective This study examined the interformat reliability of the web-based parent-rated version of the Autism Spectrum Screening Questionnaire (ASSQ), Attention-Deficit/Hyperactivity Disorder Rating Scale (ADHD-RS), Developmental Coordination Disorder Questionnaire 2007 (DCDQ), and Strengths and Difficulties Questionnaire (SDQ) among Japanese preschoolers in a community developmental health check-up setting. Methods A set of paper-based questionnaires were distributed for voluntary completion to parents of children aged 5 years. The package of the paper format questionnaires included the ASSQ, ADHD-RS, DCDQ, parent-reported SDQ (P-SDQ), and several additional demographic questions. Responses were received from 508 parents of children who agreed to participate in the study. After 3 months, 300 parents, who were among the initial responders, were randomly selected and asked to complete the web-based versions of these questionnaires. A total of 140 parents replied to the web-based format and were included as a final sample in this study. Results We obtained the McDonald ω coefficients for both the web-based and paper formats of the ASSQ (web-based: ω=.90; paper: ω=.86), ADHD-RS total and subscales (web-based: ω=.88-.94; paper: ω=.87-.93), DCDQ total and subscales (web-based: ω=.82-.94; paper: ω=.74-.92), and P-SDQ total and subscales (web-based: ω=.55-.81; paper: ω=.52-.80). 
The intraclass correlation coefficients between the web-based and paper formats were all significant at the 99.9% confidence level: ASSQ (r=0.66, P<.001); ADHD-RS total and subscales (r=0.66-0.74, P<.001); DCDQ total and subscales (r=0.66-0.71, P<.001); P-SDQ Total Difficulties and subscales (r=0.55-0.73, P<.001). There were no significant differences between the web-based and paper formats for total mean score of the ASSQ (P=.76), total (P=.12) and subscale (P=.11-.47) mean scores of DCDQ, and the P-SDQ Total Difficulties mean score (P=.20) and mean subscale scores (P=.28-.79). Although significant differences were found between the web-based and paper formats for mean ADHD-RS scores (total: t132=2.83, P=.005; Inattention subscale: t133=2.15, P=.03; Hyperactivity/Impulsivity subscale: t133=3.21, P=.002), the effect sizes were small (Cohen d=0.18-0.22). Conclusions These results suggest that the web-based versions of the ASSQ, ADHD-RS, DCDQ, and P-SDQ were equivalent, with the same level of internal consistency and intrarater reliability as the paper versions, indicating the applicability of the web-based versions of these questionnaires for assessing neurodevelopmental disorders.


2019 ◽  
Author(s):  
Marco Bardus ◽  
Nathalie Awada ◽  
Lilian A Ghandour ◽  
Elie-Jacques Fares ◽  
Tarek Gherbal ◽  
...  

BACKGROUND With thousands of health apps in app stores globally, it is crucial to systematically and thoroughly evaluate the quality of these apps due to their potential influence on health decisions and outcomes. The Mobile App Rating Scale (MARS) is the only currently available tool that provides a comprehensive, multidimensional evaluation of app quality, which has been used to compare medical apps from American and European app stores in various areas, available in English, Italian, Spanish, and German. However, this tool is not available in Arabic. OBJECTIVE This study aimed to translate and adapt MARS to Arabic and validate the tool with a sample of health apps aimed at managing or preventing obesity and associated disorders. METHODS We followed a well-established and defined “universalist” process of cross-cultural adaptation using a mixed methods approach. Early translations of the tool, accompanied by confirmation of the contents by two rounds of separate discussions, were included and culminated in a final version, which was then back-translated into English. Two trained researchers piloted the MARS in Arabic (MARS-Ar) with a sample of 10 weight management apps obtained from Google Play and the App Store. Interrater reliability was established using intraclass correlation coefficients (ICCs). After reliability was ascertained, the two researchers independently evaluated a set of additional 56 apps. RESULTS MARS-Ar was highly aligned with the original English version. The ICCs for MARS-Ar (0.836, 95% CI 0.817-0.853) and MARS English (0.838, 95% CI 0.819-0.855) were good. The MARS-Ar subscales were highly correlated with the original counterparts (P<.001). The lowest correlation was observed in the area of usability (r=0.685), followed by aesthetics (r=0.827), information quality (r=0.854), engagement (r=0.894), and total app quality (r=0.897). Subjective quality was also highly correlated (r=0.820). 
CONCLUSIONS MARS-Ar is a valid instrument to assess app quality among trained Arabic-speaking users of health and fitness apps. Researchers and public health professionals in the Arab world can use the overall MARS score and its subscales to reliably evaluate the quality of weight management apps. Further research is necessary to test the MARS-Ar on apps addressing various health issues, such as attention or anxiety prevention, or sexual and reproductive health.


2020 ◽  
Vol 15 (1) ◽  
Author(s):  
Ana Jovanovic ◽  
Philipp Klassen ◽  
Peter Heuschmann ◽  
Claudia Sommer ◽  
Mark Roberts ◽  
...  

Abstract Background Pain is an early symptom of Fabry disease (FD) and is characterized by a unique phenotype with mainly episodic acral and triggerable burning pain. Recently, we designed and validated the first pain questionnaire for adult FD patients in an interview and a self-administered version in German: the Würzburg Fabry Pain Questionnaire (FPQ). We now report the validation of the English version of the self-administered FPQ (enFPQ). Methods After two forward–backward translations of the FPQ by native German and native English speakers, the enFPQ was applied at The Mark Holland Metabolic Unit, Manchester, UK for validation. Consecutive patients with genetically ascertained FD and current or previous FD pain underwent a face-to-face interview using the enFPQ. Two weeks later, patients filled in the self-administered enFPQ at home. The agreement between entries collected by supervised administration and self-administration of the enFPQ was assessed via Gwet’s AC1-statistics (AC1) for nominal-scaled scores and intraclass correlation coefficient (ICC) for interval-scaled elements. Results Eighty-three FD patients underwent the face-to-face interview and 54 patients sent back a completed self-administered version of the enFPQ 2 weeks later. We found high agreement with a mean AC1-statistics of 0.725 for 55 items, and very high agreement with a mean ICC of 0.811 for 9 items. Conclusions We provide the validated English version of the FPQ for self-administration in adult FD patients. The enFPQ collects detailed information on the individual FD pain phenotype and thus builds a solid basis for better pain classification and treatment in patients with FD.
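
Agreement on the nominal items in this abstract is reported with Gwet's AC1. As a rough sketch of the two-rater form of that statistic, the code below treats the interview and the self-administered questionnaire as the two "raters". The function name, the optional `categories` parameter, and the yes/no answers are hypothetical and are not data from the study.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories=None):
    """Gwet's AC1 for two raters on a nominal scale.

    categories lists every possible answer; if omitted, the categories
    actually observed are used (an assumption of this sketch).
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two non-empty rating lists of equal length")
    if categories is None:
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")

    # Observed agreement: share of items with identical answers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum of pi_k * (1 - pi_k) over categories, divided
    # by (q - 1), where pi_k is the average share of answers in category k.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    chance = 0.0
    for c in categories:
        pi = (count_a[c] + count_b[c]) / (2 * n)
        chance += pi * (1 - pi)
    chance /= q - 1

    return (observed - chance) / (1 - chance)


# Hypothetical answers to one yes/no item from five patients.
interview = ["yes", "yes", "yes", "yes", "no"]
self_administered = ["yes", "yes", "yes", "no", "no"]
print(round(gwet_ac1(interview, self_administered), 3))
```

Unlike Cohen's kappa, AC1 stays comparatively stable when one answer category dominates, which is one reason it is sometimes preferred for items with skewed response distributions.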

