Reconsidering the Psychometrics of the Gifted Rating Scales – School Form: Evidence for Parsimony in Measurement

Author(s):  
Yaacov Petscher ◽  
Steven I. Pfeiffer

The authors evaluated measurement-level, factor-level, item-level, and scale-level revisions to the Gifted Rating Scales – School Form (GRS-S). Measurement-level considerations tested the extent to which treating the Likert scale rating as categorical or continuous produced different fit across unidimensional, correlated trait, and bi-factor latent factor structures. Item- and scale-level analyses demonstrated that the GRS-S could be reduced from a 72-item assessment on a 9-point rating scale down to a 30-item assessment on a 3-point rating scale. Reliability from the reduced assessment was high (ω > .95). ROC curve comparisons between the original and reduced versions of the GRS-S showed that diagnostic accuracy (i.e., area under the curve) of the scales was comparable when considering cut-scores of 120, 125, and 130 on the WISC-IV full scale and verbal IQ and the WIAT-III composite score. The findings suggest that a brief form of the GRS-S can be used as a universal or selective screener for giftedness without sacrificing key psychometric considerations.

2019 ◽  
Vol 46 (1) ◽  
pp. 55-66
Author(s):  
Yaacov Petscher ◽  
Steven I. Pfeiffer

The authors evaluated measurement-level, factor-level, item-level, and scale-level revisions to the Gifted Rating Scales–School Form (GRS-S). Measurement-level considerations tested the extent to which treating the Likert-type scale rating as categorical or continuous produced different fit across unidimensional, correlated trait, and bifactor latent factor structures. Item- and scale-level analyses demonstrated that the GRS-S could be reduced from a 72-item assessment on a 9-point rating scale down to a 30-item assessment on a 3-point rating scale. Reliability from the reduced assessment was high (ω > .95). Receiver operating characteristic (ROC) curve comparisons between the original and reduced versions of the GRS-S showed that diagnostic accuracy (i.e., area under the curve) of the scales was comparable when considering cut scores of 120, 125, and 130 on the Wechsler Intelligence Scale for Children–Fourth Edition (WISC-IV) Full Scale and verbal IQ and on the Wechsler Individual Achievement Test–Third Edition (WIAT-III) composite score. The findings suggest that a brief form of the GRS-S can be used as a universal or selective screener for giftedness without sacrificing key psychometric considerations.
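The area-under-the-curve comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the function names are hypothetical, and it assumes that a child counts as a positive case when the criterion score (for example, an IQ composite) is at or above the chosen cut score. The AUC is computed as the probability that a randomly chosen positive case has a higher rating-scale score than a randomly chosen negative case, with ties counted as one half.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def auc_at_cut(rating_scores, criterion_scores, cut):
    """Split cases at a criterion cut score, then compute the AUC of the ratings.

    Assumption for illustration: criterion >= cut defines a positive case.
    """
    pairs = list(zip(rating_scores, criterion_scores))
    pos = [r for r, c in pairs if c >= cut]
    neg = [r for r, c in pairs if c < cut]
    return auc(pos, neg)
```

Running `auc_at_cut` once for the full-length scale scores and once for the shortened scale scores, at each cut score of interest, gives the pair of AUC values that such a comparison would contrast.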


2005 ◽  
Vol 48 (2) ◽  
pp. 323-335 ◽  
Author(s):  
Rahul Shrivastav ◽  
Christine M. Sapienza ◽  
Vuday Nandur

Rating scales are commonly used to study voice quality. However, recent research has demonstrated that perceptual measures of voice quality obtained using rating scales suffer from poor interjudge agreement and reliability, especially in the midrange of the scale. These findings, along with those obtained using multidimensional scaling (MDS), have been interpreted to show that listeners perceive voice quality in an idiosyncratic manner. Based on psychometric theory, the present research explored an alternative explanation for the poor interlistener agreement observed in previous research. This approach suggests that poor agreement between listeners may result, in part, from measurement errors related to a variety of factors rather than true differences in the perception of voice quality. In this study, 10 listeners rated breathiness for 27 vowel stimuli using a 5-point rating scale. Each stimulus was presented to the listeners 10 times in random order. Interlistener agreement and reliability were calculated from these ratings. Agreement and reliability were observed to improve when multiple ratings of each stimulus from each listener were averaged and when standardized scores were used instead of absolute ratings. The probability of exact agreement was found to be approximately .9 when using averaged ratings and standardized scores. In contrast, the probability of exact agreement was only .4 when a single rating from each listener was used to measure agreement. These findings support the hypothesis that poor agreement reported in past research partly arises from errors in measurement rather than individual differences in the perception of voice quality.
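The two adjustments this abstract reports (averaging repeated ratings and standardizing each listener's scores) can be sketched as follows. This is an illustrative sketch only: the function names, the nested-list data layout, and the tolerance-based definition of agreement are assumptions, since the abstract does not say how exact agreement was computed for averaged or standardized scores.

```python
from statistics import mean, pstdev


def average_repeats(ratings):
    """ratings[listener][stimulus] is a list of repeated ratings.

    Returns one averaged score per stimulus for each listener.
    """
    return [[mean(reps) for reps in listener] for listener in ratings]


def standardize(listener_scores):
    """Convert one listener's scores to z-scores.

    This removes differences in how each listener uses the scale
    (overall level and spread).
    """
    m = mean(listener_scores)
    s = pstdev(listener_scores)
    if s == 0:
        raise ValueError("listener gave identical scores to every stimulus")
    return [(x - m) / s for x in listener_scores]


def agreement(a, b, tol=0.0):
    """Share of stimuli on which two listeners differ by at most tol.

    tol is a hypothetical parameter; tol=0.0 means exact agreement.
    """
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tol)
    return hits / len(a)
```

For example, two listeners who order the stimuli identically but use different parts of the scale disagree on every raw rating yet agree fully after standardization, which is the kind of measurement artifact the study separates from true perceptual differences.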


Author(s):  
Linye Jing ◽  
Maria I. Grigos

Purpose: Forming accurate and consistent speech judgments can be challenging when working with children with speech sound disorders who produce a large number and varied types of error patterns. Rating scales offer a systematic approach to assessing the whole word rather than individual sounds. Thus, these scales can be an efficient way for speech-language pathologists (SLPs) to monitor treatment progress. This study evaluated the interrater reliability of an existing 3-point rating scale using a large group of SLPs as raters. Method: Utilizing an online platform, 30 SLPs completed a brief training and then rated single words produced by children with typical speech patterns and children with speech sound disorders. Words were closely balanced across the three rating categories of the scale. The interrater reliability of the SLPs' ratings to a consensus judgment was examined. Results: The majority of SLPs (87%) reached substantial interrater reliability to a consensus judgment using the 3-point rating scale. Correct productions had the highest interrater reliability. Productions with extensive errors had higher agreement than those with minor errors. Certain error types, such as vowel distortions, were especially challenging for SLPs to judge. Conclusions: This study demonstrated substantial interrater reliability to a consensus judgment among a large majority of 30 SLPs using a 3-point rating scale. The clinical implications of the findings are discussed along with proposed modifications to the training procedure to guide future research.


2020 ◽  
Vol 35 (7) ◽  
pp. 1094-1108
Author(s):  
Morgan E Nitta ◽  
Brooke E Magnus ◽  
Paul S Marshall ◽  
James B Hoelzle

Abstract There are many challenges associated with assessment and diagnosis of ADHD in adulthood. Utilizing the graded response model (GRM) from item response theory (IRT), a comprehensive item-level analysis of adult ADHD rating scales in a clinical population was conducted with Barkley's Adult ADHD Rating Scale-IV, Self-Report of Current Symptoms (CSS), a self-report diagnostic checklist, and a similar self-report measure quantifying retrospective report of childhood symptoms, Barkley's Adult ADHD Rating Scale-IV, Self-Report of Childhood Symptoms (BAARS-C). Differences in item functioning were also considered after identifying and excluding individuals with suspect effort. Items associated with symptoms of inattention (IA) and hyperactivity/impulsivity (H/I) are endorsed differently across the lifespan, and these data suggest that they vary in their relationship to the theoretical constructs of IA and H/I. Screening for sufficient effort did not meaningfully change item-level functioning. The application of IRT to direct item-to-symptom measures allows for a unique psychometric assessment of how the current DSM-5 symptoms represent latent traits of IA and H/I. Meeting a symptom threshold of five or more symptoms may be misleading. Closer attention given to specific symptoms in the context of the clinical interview and reported difficulties across domains may lead to more informed diagnosis.


2020 ◽  
Author(s):  
Alexandria Remus ◽  
Valerie Smith ◽  
Francesca Wuytack

Abstract Background: As the development of core outcome sets (COS) increases, guidance for developing and reporting high-quality COS continues to evolve; however, a number of methodological uncertainties still remain. The objectives of this study were: (1) to explore the impact of including patient interviews in developing a COS, (2) to examine the impact of using a 5-point versus a 9-point rating scale during Delphi consensus methods on outcome selection and (3) to inform and contribute to COS development methodology by advancing the evidence base on COS development techniques. Methods: Semi-structured patient interviews and a nested randomised controlled parallel group trial as part of the Pelvic Girdle Pain Core Outcome Set project (PGP-COS). Patient interviews, as an adjunct to a systematic review of outcomes reported in previous studies, were undertaken to identify preliminary outcomes for including in a Delphi consensus survey. In the Delphi survey, participants were randomised (1:1) to a 5-point or 9-point rating scale for rating the importance of the list of preliminary outcomes. Results: Four of the eight patient-interview-derived outcomes were included in the preliminary COS; however, none of these outcomes were included in the final PGP-COS. The 5-point rating scale resulted in twice as many outcomes reaching consensus after the 3-round Delphi survey compared to the 9-point scale. Consensus on all five outcomes included in the final PGP-COS was achieved by participants allocated the 5-point rating scale, whereas consensus on four of these was achieved by those using the 9-point scale. Conclusions: Using patient interviews to identify preliminary outcomes as an adjunct to conducting a systematic review of outcomes measured in the literature did not appear to influence outcome selection in developing the COS in this study. The use of different rating scales in a Delphi survey, however, did appear to impact on outcome selection. The 5-point scale demonstrated greater congruency than the 9-point scale with the outcomes included in the final PGP-COS. Future research to substantiate our findings and to explore the impact of other rating scales on outcome selection during COS development, however, is warranted.


2015 ◽  
Vol 20 (3) ◽  
pp. 159-163
Author(s):  
Nadine Wickboldt ◽  
Georges Savoldelli ◽  
Benno Rehberg-Klug

BACKGROUND: Assessment of dynamic changes in painful experiences, such as labour, using conventional rating scales (eg, numerical rating scale [NRS]) has limitations. An alternative for continuous pain evaluation could be a signal generated by voluntary action of the parturient. Remifentanil administration for obstetric analgesia could be improved by these dynamic measures of labour pain. In the present study, handgrip force was measured by a dynamometer to signal labour pain. OBJECTIVES: To evaluate: whether continuous monitoring of labour pain using handgrip force allows for determination of pain measurement during contractions; and the correlation between handgrip force and pain intensity on NRS. METHODS: The present observational, single-centre study included 43 parturients. After calibration of the dynamometer for individual hand muscle strength, pain was recorded during early and late labour using a dynamometer and an NRS. The primary end point was the correlation coefficient between NRS ratings and peak intensity recorded by the dynamometer. RESULTS: All dynamometer-registered readings were also registered by the external tocogram. All contractions recorded by external tocogram were also registered by the dynamometer. Handgrip force was moderately correlated with pain scores on the NRS. Mean handgrip force during contractions had the highest correlation coefficient (Pearson’s r=0.67) compared with peak handgrip force (r=0.56) and area under the curve of handgrip force (r=0.55). CONCLUSIONS: Pain intensity and duration can be assessed continuously using handgrip force measured via a dynamometer. The feedback of intensity and duration of pain could optimize patient-controlled remifentanil application for obstetric analgesia and other situations of highly variable pain intensity.
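The three handgrip summaries compared in this abstract (mean force, peak force, and area under the force curve) and their Pearson correlation with NRS ratings can be sketched as below. This is a hedged illustration, not the study's procedure: the function names are hypothetical, and it assumes force is sampled at a fixed interval `dt` within each contraction and that the area is taken with the trapezoidal rule.

```python
from math import sqrt


def summarize_contraction(force, dt):
    """Return (mean, peak, area) for one contraction's force samples.

    Assumptions for illustration: samples are evenly spaced dt apart,
    and the area under the curve uses the trapezoidal rule.
    """
    mean_force = sum(force) / len(force)
    peak_force = max(force)
    area = sum((a + b) / 2 * dt for a, b in zip(force, force[1:]))
    return mean_force, peak_force, area


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Computing one summary per contraction and passing the resulting list to `pearson_r` together with the matching NRS ratings would yield one coefficient per summary, which is the form of comparison the results describe.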


Author(s):  
Stamo Manouvelou ◽  
Vasilios Koutoulidis ◽  
Ioannis Tsougos ◽  
Maria Tolia ◽  
George Kyrgias ◽  
...  

Background: Frontotemporal dementia (FTD) represents the second most frequent form of early-onset dementia in people younger than 65 years. The main syndromes encompassed by the term FTD are behavioral variant of frontotemporal dementia (bvFTD), non-fluent variant primary progressive aphasia (nfvPPA) and semantic variant (SD). Aims: To assess bvFTD and SD, which represent the most common subtypes of FTD, using visual rating scales. Methods: Brain MRI exams of 77 patients with either bvFTD (n=43) or SD (n=34) were evaluated. The rating scales used were: Global cortical atrophy (GCA), Fazekas Scale: periventricular (PV) and white matter (WM) changes, Koedam rating scale and visual scales regarding specific cortical regions: dorsofrontal (DF), orbitofrontal (OF), anterior cingulate (AC), basal ganglia (BG), anterior-temporal (AT), insula, lateral-temporal (LT), entorhinal (ERC), perirhinal (PRC), anterior fusiform (AF), anterior hippocampus (AHIP) and posterior hippocampus (PHIP). Both Left (L) and Right (R) hemispheres were evaluated. Results: R-OF (p=0.059), L-OF (p<0.0005), L-AT (p=0.047) and L-AHIP (p=0.007) have a statistically significant effect on the variable occurrence of SD compared to bvFTD. The indicators with the highest value of the area under the curve (AUC) were R-AC (0.829), L-OF (0.808), L-AC (0.791) and L-AF (0.778). Highest sensitivity was achieved by R-OF (97%) and L-AF (75%). Highest specificity was achieved by L-OF (95%), L-AT (91%) followed by R-AC (84%). Best combination of sensitivity and specificity was achieved by L-AF (74%-79%), L-OF (56%-95%) and R-OF (97%-42%). Best combination of PPV and NPV was achieved by L-OF (90%-73%), L-AT (83%-72%) and R-AC (77%-77%). Conclusion: Visual rating scales can be a practical diagnostic tool in the characterization of patterns of atrophy in FTLD and may be used as an alternative to highly technical methods of quantification.
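The sensitivity, specificity, PPV and NPV figures reported in this abstract all derive from a 2×2 table of rating-scale result against diagnosis. A minimal sketch of those standard definitions follows; the function name is hypothetical and the counts in the usage note are invented for illustration, not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures.

    tp: scale positive, condition present
    fp: scale positive, condition absent
    fn: scale negative, condition present
    tn: scale negative, condition absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of positives that are true cases
        "npv": tn / (tn + fn),          # share of negatives that are non-cases
    }
```

With made-up counts such as `diagnostic_metrics(8, 2, 2, 8)`, every measure equals 0.8. In general the pairs move in opposite directions as the rating threshold changes, which is why the abstract reports best combinations rather than single maxima.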


2020 ◽  
Author(s):  
Alexandria Remus ◽  
Valerie Smith ◽  
Francesca Wuytack

Abstract Background: As the development of core outcome sets (COS) increases, guidance for developing and reporting high-quality COS continues to evolve; however, a number of methodological uncertainties still remain. The objectives of this study were: (1) to explore the impact of including patient interviews in developing a COS, (2) to examine the impact of using a 5-point versus a 9-point rating scale during Delphi consensus methods on outcome selection and the COS and (3) to inform and contribute to COS development methodology by advancing the evidence base on COS development techniques. Methods: Semi-structured patient interviews and a nested randomised controlled parallel group trial as part of the Pelvic Girdle Pain Core Outcome Set project (PGP-COS). Patient interviews, as an adjunct to a systematic review of outcomes reported in previous studies, were undertaken to identify preliminary outcomes for including in a Delphi consensus survey. In the Delphi survey, participants were randomised (1:1) to a 5-point or 9-point scale for rating the importance of the list of preliminary outcomes. Results: None of the outcomes identified from the patient interviews were included in the final PGP-COS. The 5-point rating scale resulted in twice as many outcomes reaching consensus after the 3-round Delphi survey compared to the 9-point scale. Consensus on all five outcomes included in the final PGP-COS was achieved by participants allocated the 5-point rating scale, whereas consensus on four of these was achieved by those using the 9-point scale. Conclusions: Using patient interviews to identify preliminary outcomes as an adjunct to conducting a systematic review of outcomes measured in the literature does not appear to influence outcome selection in developing a COS. The use of different rating scales in a Delphi survey, however, does appear to impact on outcome selection. The 5-point scale demonstrated greater congruency than the 9-point scale with the outcomes included in the final PGP-COS. Use of a 5-point scale might be considered for Delphi surveys in COS development studies. Future research to substantiate our findings and to explore the impact of other rating scales on outcome selection during COS development, however, is still warranted.


2017 ◽  
Vol 4 (3) ◽  
Author(s):  
Lalitha Subramanian ◽  
Sarah Manickaraj

The present study explored the relationship between knowledge, attitudes and competency skills of regular school teachers (n=149) pertaining to children with special education needs (CSEN) and Inclusive Education (IE). Knowledge about CSEN and IE was measured using a 4-point rating scale and an informal questionnaire that were specifically modified for this study. The attitudes, concerns and competency skills were measured using 4-point rating scales. Results of the study indicated significant correlations between knowledge about CSEN and attitudes towards disability and teacher efficacy for inclusive practices; attitudes towards disability and attitudes towards inclusive practices; and attitudes towards inclusive education and concerns about inclusive education and teacher efficacy for inclusive practices. The findings offer insight into the preparation of training programs for teachers for successful implementation of inclusive education.


2020 ◽  
Author(s):  
Alexandria Remus ◽  
Valerie Smith ◽  
Francesca Wuytack

Abstract Background: As the development of core outcome sets (COS) increases, guidance for developing and reporting high-quality COS continues to evolve; however, a number of methodological uncertainties still remain. The objectives of this study were: (1) to explore the impact of including patient interviews in developing a COS, (2) to examine the impact of using a 5-point versus a 9-point rating scale during Delphi consensus methods on outcome selection and the COS and (3) to inform and contribute to COS development methodology by advancing the evidence base on COS development techniques. Methods: Semi-structured patient interviews (i) and a nested randomised controlled parallel group trial (ii) as part of the Pelvic Girdle Pain Core Outcome Set project (PGP-COS). Patient interviews, as an adjunct to a systematic review of outcomes reported in previous studies, were undertaken to identify preliminary outcomes for including in a Delphi consensus survey. In the Delphi survey, participants were randomised (1:1) to a 5-point or 9-point Likert scale for rating the importance of the list of preliminary outcomes. Results: None of the outcomes identified only from the patient interviews were included in the final PGP-COS. The 5-point rating scale resulted in twice as many outcomes reaching consensus after the 3-round Delphi survey compared to the 9-point scale. Consensus on all five outcomes included in the final PGP-COS was achieved by participants allocated the 5-point rating scale, whereas consensus on four of these was achieved by those using the 9-point scale. Conclusions: Using patient interviews to identify preliminary outcomes as an adjunct to conducting a systematic review of outcomes measured in the literature does not appear to influence outcome selection in developing a COS. The use of different rating scales in a Delphi survey, however, does appear to impact on outcome selection. The 5-point scale demonstrated greater congruency than the 9-point scale with the outcomes included in the final PGP-COS. Use of a 5-point scale can thus be considered for Delphi surveys in COS development studies. Future research to explore the impact of other rating scales on outcome selection during COS development is still warranted.

