The Neurobehavioral Rating Scale: An Interrater Reliability Study in the HIV Seropositive Population

1990 ◽  
Vol 22 (1) ◽  
pp. 36-42 ◽  
Author(s):  
Genell Hilton ◽  
Rebecca Sisson ◽  
Edward Freeman
1996 ◽  
Vol 8 (2) ◽  
pp. 247-266 ◽  
Author(s):  
Stefanie R. Auer ◽  
Isabel M. Monteiro ◽  
Barry Reisberg

A clinician should not rely entirely upon a caregiver's report regarding behavioral pathology when planning a treatment strategy. Direct observational evaluation instruments as well as caregiver-based assessments are necessary. A new scale for the empirical (observational) evaluation of behavioral symptoms in Alzheimer's disease (AD) and related dementias, the Empirical Behavioral Pathology in Alzheimer's Disease Rating Scale (E-BEHAVE-AD), was developed. Interrater reliability of this new assessment instrument was examined. Additionally, the observed occurrence of behavioral symptomatology on this new rating instrument was compared with the occurrence reported on a similarly designed, caregiver-based instrument. In the interrater reliability study, two raters simultaneously evaluated 20 dementia patients. The comparative study employed a cross-sectional design (N = 49). Individuals were evaluated in an outpatient clinic setting. The study population consisted of cognitively normal individuals and dementia patients. Evaluations included the new, observationally based behavioral assessment (the E-BEHAVE-AD), a caregiver-based behavioral assessment (the Behavioral Pathology in Alzheimer's Disease Rating Scale; BEHAVE-AD), a clinical global measure (the Global Deterioration Scale), and a mental status assessment (the Mini-Mental State Examination). The interrater reliability study revealed an intraclass correlation coefficient of .97 (p < .01) for total scores on the new E-BEHAVE-AD rating scale. The correlation coefficient for the amount of agreement on the presence of symptoms in six symptomatic categories between caregiver-based information about the patient's behavioral pathology assessed on the BEHAVE-AD and the clinician's observations assessed with the new E-BEHAVE-AD rating instrument was .51 (p < .01). The new E-BEHAVE-AD rating instrument showed excellent interrater reliability. Furthermore, there was a statistically significant relationship between clinician observation of the occurrence of behavioral pathology assessed using the E-BEHAVE-AD and caregiver-reported pathology assessed with the BEHAVE-AD. However, the magnitude of the correlation between these measures indicated that the majority of variance was independent and nonoverlapping. Consequently, these data support theoretical models suggesting that the assessment of behavioral pathology in dementia might ideally encompass both direct observational and caregiver-report approaches, using measures such as the E-BEHAVE-AD as well as measures such as the BEHAVE-AD.
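
As a rough illustration of why a correlation of .51 leaves most of the variance unshared: assuming the reported coefficient can be read like a Pearson-type correlation (the abstract does not name the statistic), squaring it gives the proportion of variance the two instruments have in common.

```latex
r^2 = (0.51)^2 \approx 0.26, \qquad 1 - r^2 \approx 0.74
```

Under that assumption, roughly a quarter of the variance overlaps between the caregiver-based and observational scores, and about three quarters does not.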


2020 ◽  
Vol 25 (1) ◽  
pp. 37-42 ◽  
Author(s):  
Ros Whelan ◽  
Eric Prince ◽  
David M. Mirsky ◽  
Robert Naftel ◽  
Aashim Bhatia ◽  
...  

OBJECTIVE: Pediatric adamantinomatous craniopharyngiomas (ACPs) are histologically benign brain tumors that confer significant neuroendocrine morbidity. Previous studies have demonstrated that injury to the hypothalamus is associated with worsened quality of life and a shorter lifespan. This insight helps many surgeons define the goals of surgery for patients with ACP. Puget and colleagues proposed a 3-tiered preoperative and postoperative grading system based on the degree of hypothalamic involvement identified on MRI. In a prospective cohort from their institution, the authors found that use of the system to guide operative goals was associated with decreased morbidity. To date, however, the Puget system has not been externally validated. Here, the authors present an interrater reliability study that assesses the generalizability of this system for surgeons planning initial operative intervention for children with craniopharyngiomas. METHODS: A panel of 6 experts, consisting of pediatric neurosurgeons and pediatric neuroradiologists, graded 30 preoperative and postoperative MRI scans according to the Puget system. Interrater reliability was calculated using Fleiss’ κ and Krippendorff’s α statistics. RESULTS: Interrater reliability in the preoperative context demonstrated moderate agreement (κ = 0.50, α = 0.51). Interrater reliability in the postoperative context was 0.27 for both methods of statistical evaluation. CONCLUSIONS: Interrater reliability for the system as defined is moderate. Slight refinements of the Puget MRI grading system, such as collapsing the 3 grades into 2, may improve its reliability, making the system more generalizable.
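
For readers unfamiliar with Fleiss' κ, the statistic named in the abstract above, the following is a minimal sketch of the standard textbook formula. The function name `fleiss_kappa` and the example counts are hypothetical and are not taken from the study; the authors' actual software and data are not described in the abstract.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters.
    The value is undefined (division by zero) if all ratings fall in one category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    pairs = n_raters * (n_raters - 1)
    p_observed = sum(
        (sum(c * c for c in row) - n_raters) / pairs for row in counts
    ) / n_subjects

    # Chance agreement from the overall proportion of ratings in each category.
    total = n_subjects * n_raters
    p_chance = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(len(counts[0]))
    )
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical counts: 4 scans, 6 raters, three grades (NOT the study's data).
example = [
    [6, 0, 0],
    [0, 5, 1],
    [1, 3, 2],
    [0, 1, 5],
]
print(round(fleiss_kappa(example), 3))
```

Collapsing two grades into one, as the conclusions suggest, would correspond to summing the two relevant columns of the table before calling the function.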


2020 ◽  
Vol 80 (4) ◽  
pp. 808-820
Author(s):  
Cindy M. Walker ◽  
Sakine Göçer Şahin

The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared with traditional interrater reliability measures. Three different procedures that can be used as measures of interrater reliability were compared: (1) intraclass correlation coefficient (ICC), (2) Cohen’s kappa statistic, and (3) DIF statistic obtained from Poly-SIBTEST. The results of this investigation indicated that DIF procedures appear to be a promising alternative to assess the interrater reliability of constructed response items, or other polytomous types of items, such as rating scales. Furthermore, using DIF to assess interrater reliability does not require a fully crossed design and allows one to determine if a rater is either more severe, or more lenient, in their scoring of each individual polytomous item on a test or rating scale.
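
To make one of the traditional measures mentioned above concrete, here is a minimal sketch of unweighted Cohen's kappa for two raters. The helper `mean_signed_difference` is a deliberately crude, hypothetical stand-in for the question of whether one rater is more severe or more lenient; it is not the Poly-SIBTEST DIF statistic used in the study, and the scores shown are invented.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)

    # Observed proportion of items on which the two raters agree exactly.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    # Undefined (division by zero) if both raters use a single identical category.
    return (p_observed - p_chance) / (1 - p_chance)


def mean_signed_difference(rater_a, rater_b):
    """Positive when rater A scores higher on average than rater B."""
    return sum(a - b for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical scores on a 0-3 constructed-response item (illustration only).
scores_a = [3, 2, 2, 1, 0, 3, 2, 1]
scores_b = [3, 1, 2, 1, 0, 2, 2, 0]
print(cohen_kappa(scores_a, scores_b))
print(mean_signed_difference(scores_a, scores_b))
```

Kappa summarizes how often the raters agree beyond chance but carries no sign, which is why a directional check of some kind is needed to say which rater is the more severe one.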


Author(s):  
Linye Jing ◽  
Maria I. Grigos

Purpose: Forming accurate and consistent speech judgments can be challenging when working with children with speech sound disorders who produce a large number and varied types of error patterns. Rating scales offer a systematic approach to assessing the whole word rather than individual sounds. Thus, these scales can be an efficient way for speech-language pathologists (SLPs) to monitor treatment progress. This study evaluated the interrater reliability of an existing 3-point rating scale using a large group of SLPs as raters. Method: Utilizing an online platform, 30 SLPs completed a brief training and then rated single words produced by children with typical speech patterns and children with speech sound disorders. Words were closely balanced across the three rating categories of the scale. The interrater reliability of the SLPs' ratings to a consensus judgment was examined. Results: The majority of SLPs (87%) reached substantial interrater reliability to a consensus judgment using the 3-point rating scale. Correct productions had the highest interrater reliability. Productions with extensive errors had higher agreement than those with minor errors. Certain error types, such as vowel distortions, were especially challenging for SLPs to judge. Conclusions: This study demonstrated substantial interrater reliability to a consensus judgment among a large majority of 30 SLPs using a 3-point rating scale. The clinical implications of the findings are discussed along with proposed modifications to the training procedure to guide future research.


2015 ◽  
Vol 10 (2) ◽  
pp. 183-190 ◽  
Author(s):  
Joel B. Chidley ◽  
Alexandra L. MacGregor ◽  
Caoimhe Martin ◽  
Calum A. Arthur ◽  
Jamie H. Macdonald

Purpose: To identify physiological, psychological, and skill characteristics that explain performance in downhill (DH) mountain bike racing. Methods: Four studies were used to (1) identify factors potentially contributing to DH performance (using an expert focus group), (2) develop and validate a measure of rider skill (using video analysis and expert judge evaluation), (3) evaluate whether physiological, psychological, and skill variables contribute to performance at a DH competition, and (4) test the specific contribution of aerobic capacity to DH performance. Results: Study 1 identified aerobic capacity, handgrip endurance, anaerobic power, rider skill, and self-confidence as potentially important for DH. In study 2 the rider-skill measure displayed good interrater reliability. Study 3 found that rider skill and handgrip endurance were significantly related to DH ride time (β = –0.76 and –0.14, respectively; R² = .73), with exploratory analyses suggesting that DH ride time may also be influenced by self-confidence and aerobic capacity. Study 4 confirmed aerobic capacity as an important variable influencing DH performance (for a DH ride, mean oxygen uptake was 49 ± 5 mL · kg⁻¹ · min⁻¹, and 90% of the ride was completed above the 1st ventilatory threshold). Conclusions: In order of importance, rider skill, handgrip endurance, self-confidence, and aerobic capacity were identified as variables influencing DH performance. Practically, this study provides a novel assessment of rider skill that could be used by coaches to monitor training and identify talent. Novel intervention targets to enhance DH performance were also identified, including self-confidence and aerobic capacity.


2018 ◽  
Vol 55 (5) ◽  
pp. 655-663 ◽  
Author(s):  
Supakit Peanchitlertkajorn ◽  
Ana Mercado ◽  
John Daskalogiannakis ◽  
Ronald Hathaway ◽  
Kathleen Russell ◽  
...  

Objective: To compare nasolabial appearance outcomes of patients with complete unilateral cleft lip and palate (CUCLP) in preadolescence from 4 cleft centers including a center using nasoalveolar molding (NAM) and primary nasal reconstruction. Design: Retrospective cohort study. Setting: Four cleft centers in North America. Patients: 135 subjects with repaired CUCLP. Methods: Frontal and profile facial pictures were assessed using the Asher-McDade rating scale. Intra- and interrater reliability were tested using weighted Kappa statistics. Median scores by center were compared with Kruskal-Wallis statistics. Results: Intrarater reliability scores were moderate to good. Interrater reliability scores were moderate. Significant differences (P < .05) among centers were found. For nasal form, center G (median = 2.83) had better scores than centers C and D (C median = 3.33, D median = 3.17). For nose symmetry, center G had better scores (median = 2.33) than all other centers (B median = 2.67, C median = 2.83, D median = 2.83). For vermillion border, center G had better scores (median = 2.58) than centers B and C (B median = 3.17, C median = 3.17). For nasolabial profile, center G (median score = 2.67) had better scores than center C (median = 3.00). For total nasolabial score, center G (median = 2.67) had better scores than all other centers (B median = 2.83, C median = 3, D median = 2.83). Conclusion: The protocol followed by center G, the only center that performed NAM and primary nasal reconstruction, produced better results in all categories when compared to center C, the only center that did not perform presurgical orthopedics or lip/nose revisions. When compared to centers that performed traditional presurgical orthopedics and surgical revisions (B and D), center G was not consistently better in all categories. As with other uncontrolled, retrospective intercenter studies, it is not possible to attribute the outcomes to a specific protocol component.


2020 ◽  
Vol 100 (3) ◽  
pp. 468-476 ◽  
Author(s):  
Bolette S Rafn ◽  
Chiara A Singh ◽  
Julie Midtgaard ◽  
Pat G Camp ◽  
Margaret L McNeely ◽  
...  

Background: Early identification of breast cancer–related upper body issues is important to enable timely physical therapist treatment. Objective: This study evaluated the feasibility and reliability of women performing self-managed prospective surveillance for upper body issues in the early postoperative phase as part of a hospital-based physical therapy program. Design: This was a prospective, single-site, single-group feasibility and reliability study. Methods: Presurgery arm circumference measurements were completed at home and at the hospital by participants and by a physical therapist. Instruction in self-measurement was provided using a video guide. After surgery, all circumference measurements were repeated along with self-assessment and therapist assessment for shoulder flexion and abduction active range of motion. Feasibility was determined by recruitment/retention rates and participant-reported ease of performing self-measurements (1 [very difficult] to 10 [very easy]). Reliability was determined as intrarater reliability, interrater reliability, and agreement. Results: Thirty-three women who were 53.4 (SD = 11.4) years old participated, with recruitment and retention rates of 79% and 94%, respectively. Participant-reported ease of measurement was 8.2 (SD = 2.2) before surgery and 8.0 (SD = 1.9) after surgery. The intrarater reliability and interrater reliability were excellent before surgery (intraclass correlation coefficient [ICC] ≥ 0.94; 95% confidence interval = 0.87–0.97) and after surgery (ICC ≥ 0.91; 95% confidence interval = 0.76–0.96). Agreement between self-assessed and therapist-assessed active shoulder flexion (κ = 0.79) and abduction (κ = 0.71) was good. Limitations: Further testing is needed using a prospective design with a longer follow-up to determine whether self-managed prospective surveillance and timely treatment can hinder the development of chronic breast cancer–related upper body issues. Conclusions: Self-measured arm circumference and shoulder range of motion are reliable, and their inclusion in a hospital-based program of prospective surveillance for upper body issues seems feasible. This approach may improve early detection and treatment.
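
Several abstracts in this listing report an intraclass correlation coefficient without stating which form was used. As an illustration only, the sketch below computes one common variant, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rating), from a complete subjects-by-raters table. The choice of variant, the function name `icc_2_1`, and the example measurements are assumptions, not details from any of the studies.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    ratings[i][j] is the score given to subject i by rater j.
    The table must be complete (no missing values), with at least
    two subjects and two raters.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical arm circumferences in cm: [self-measured, therapist-measured].
measurements = [
    [28.5, 28.7],
    [31.0, 30.6],
    [26.2, 26.4],
    [33.1, 33.0],
    [29.4, 29.9],
]
print(round(icc_2_1(measurements), 3))
```

Because this variant penalizes systematic offsets between raters, a participant who consistently measures slightly higher than the therapist would lower the coefficient even if the two sets of measurements were perfectly correlated.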

