The Dichotic Digits difference Test (DDdT): Development, Normative Data, and Test–Retest Reliability Studies Part 1

2016 ◽  
Vol 27 (06) ◽  
pp. 458-469 ◽  
Author(s):  
Sharon Cameron ◽  
Helen Glyde ◽  
Harvey Dillon ◽  
Jessica Whitfield ◽  
John Seymour

Background: The dichotic digits test is one of the most widely used assessment tools for central auditory processing disorder. However, questions remain concerning the impact of cognitive factors on test results. Purpose: To develop the Dichotic Digits difference Test (DDdT), an assessment tool that could differentiate children with cognitive deficits from children with genuine dichotic deficits based on differential test results. The DDdT consists of four subtests: dichotic free recall (FR), dichotic directed left ear (DLE), dichotic directed right ear (DRE), and diotic. Scores for six conditions are calculated (FR left ear [LE], FR right ear [RE], and FR total, as well as DLE, DRE, and diotic). Scores for four difference measures are also calculated: dichotic advantage, right-ear advantage (REA) FR, REA directed, and attention advantage. Research Design: Experiment 1 involved development of the DDdT, including error rate analysis. Experiment 2 involved collection of normative and test–retest reliability data. Study Sample: Twenty adults (aged 25 yr 10 mo to 50 yr 7 mo, mean 36 yr 4 mo) took part in the development study; 62 normal-hearing, typically developing, primary-school children (aged 7 yr 1 mo to 11 yr 11 mo, mean 9 yr 4 mo) and 10 adults (aged 25 yr 0 mo to 51 yr 6 mo, mean 34 yr 10 mo) took part in the normative and test–retest reliability study. Data Collection and Analysis: In Experiment 1, error rate analysis was conducted on the 36 digit-pair combinations of the DDdT. Normative data collected in Experiment 2 were arcsine transformed to achieve a distribution that was closer to a normal distribution and z-scores calculated. Pearson product-moment correlations were used to determine the strength of relationships between DDdT conditions. Results: The development study revealed no significant differences in the adult population between test and retest on any DDdT condition. Error rates on 36 digit pairs ranged from 1.5% to 16.7%. 
The most and the least error-prone digits were removed before commencement of the normative data study, leaving 25 unique digit pairs. Average z-scores calculated from the arcsine-transformed data collected from the 62 children who took part in the normative data study revealed that FR dichotic processing (LE, RE, and total) was highly correlated with diotic processing (r ranging from 0.5 to 0.6; p < 0.0001). Significant improvements in performance on retest occurred for the FR LE, RE, total, and diotic conditions (p ranging from 0.05 to 0.0004), the conditions that would be expected to improve with practice if the participant’s response strategies are better the second time around. Conclusions: The addition of a diotic control task—that shares many response demands with the usual dichotic tasks—opens up the possibility of differentiating children who perform below expectations because of poor dichotic processing skills from those who perform poorly because of impaired attention, memory, or other cognitive abilities. The high correlation between dichotic and diotic performance suggests that factors other than dichotic performance play a substantial role in a child’s ability to perform a dichotic listening task. This hypothesis is investigated further in the cognitive correlation study that follows in the companion paper (DDdT Study Part 2; Cameron et al, 2016).
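The analysis step named in the abstract above (arcsine transformation of the scores, then z-scores) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which arcsine variant was used, so the common arcsine square-root form is assumed here, the scores are hypothetical proportions correct, and the z-scores are computed against the sample itself rather than against age-specific norms.

```python
import math
from statistics import mean, stdev


def arcsine_transform(proportion):
    """Arcsine square-root transform of a proportion in [0, 1].

    Assumption: the abstract does not name the exact variant; this is
    the common asin(sqrt(p)) form, which stretches scores near ceiling.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))


def z_scores(values):
    """Standardise values against their own sample mean and SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical proportion-correct scores, not data from the study.
scores = [0.95, 0.80, 0.90, 0.70, 0.85]
transformed = [arcsine_transform(p) for p in scores]
standardised = z_scores(transformed)

for p, z in zip(scores, standardised):
    print(f"proportion={p:.2f}  z={z:+.2f}")
```

The transform is applied before standardising because proportion-correct scores bunch up near 100%, which is the skew the abstract says the transformation was meant to reduce.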


2018 ◽  
Vol 29 (02) ◽  
pp. 135-150
Author(s):  
Sharon Cameron ◽  
Nicky Chong-White ◽  
Kiri Mealings ◽  
Tim Beechey ◽  
Harvey Dillon ◽  
...  

Abstract. Background: Previous research suggests that a proportion of children experiencing reading and listening difficulties may have an underlying primary deficit in the way that the central auditory nervous system analyses the perceptually important, rapidly varying, formant frequency components of speech. Purpose: The Phoneme Identification Test (PIT) was developed to investigate the ability of children to use spectro-temporal cues to perceptually categorize speech sounds based on their rapidly changing formant frequencies. The PIT uses an adaptive two-alternative forced-choice procedure whereby the participant identifies a synthesized consonant-vowel (CV) (/ba/ or /da/) syllable. CV syllables differed only in the second formant (F2) frequency along an 11-step continuum (between 0% and 100%, representing an ideal /ba/ and /da/, respectively). The CV syllables were presented in either quiet (PIT Q) or noise at a 0 dB signal-to-noise ratio (PIT N). Research Design: Development of the PIT stimuli and test protocols, and collection of normative and test–retest reliability data. Study Sample: Twelve adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 5 mo) and 137 typically developing, primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo). There were 73 males and 76 females. Data Collection and Analysis: Data were collected using a touchscreen computer. Psychometric functions were automatically fit to individual data by the PIT software. Performance was determined by the width of the continuum for which responses were neither clearly /ba/ nor /da/ (referred to as the uncertainty region [UR]). A shallower psychometric function slope reflected greater uncertainty. Age effects were determined based on raw scores. Z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UR exceeded a maximum allowable value, were removed. 
Nonparametric tests were used as the data were skewed toward negative performance. Results: Across participants, the median value of the F2 range that resulted in uncertain responses was 33% in quiet and 40% in noise. There was a significant effect of age on the width of this UR (p < 0.00001) in both quiet and noise, with performance becoming adult-like by age 9 on the PIT Q and age 10 on the PIT N. A skewed distribution toward negative performance occurred in both quiet (p = 0.01) and noise (p = 0.006). Median UR scores were significantly wider in noise than in quiet (T = 2041, p < 0.0000001). Performance (z scores) across the two tests was significantly correlated (r = 0.36, p = 0.000009). Test–retest z scores were significantly correlated in both quiet and noise (r = 0.4 and 0.37, respectively, p < 0.0001). Conclusions: The PIT normative data show that the ability to identify phonemes based on changes in formant transitions improves with age, and that some children in the general population have performance much worse than their age peers. In children, uncertainty increases when the stimuli are presented in noise. The test is suitable for use in planned studies in a clinical population.



2018 ◽  
Vol 29 (02) ◽  
pp. 151-163
Author(s):  
Sharon Cameron ◽  
Nicky Chong-White ◽  
Kiri Mealings ◽  
Tim Beechey ◽  
Harvey Dillon ◽  
...  

Abstract. Background: Intensity peaks and valleys in the acoustic signal are salient cues to syllable structure, which is accepted to be a crucial early step in phonological processing. As such, the ability to detect low-rate (envelope) modulations in signal amplitude is essential to parse an incoming speech signal into smaller phonological units. Purpose: The Parsing Syllable Envelopes (ParSE) test was developed to quantify the ability of children to recognize syllable boundaries using an amplitude modulation detection paradigm. The envelope of a 750-msec steady-state /a/ vowel is modulated into two or three pseudo-syllables using notches with modulation depths varying between 0% and 100% along an 11-step continuum. In an adaptive three-alternative forced-choice procedure, the participant identified whether one, two, or three pseudo-syllables were heard. Research Design: Development of the ParSE stimuli and test protocols, and collection of normative and test–retest reliability data. Study Sample: Eleven adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 10 mo) and 134 typically developing, primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo). There were 73 males and 72 females. Data Collection and Analysis: Data were collected using a touchscreen computer. Psychometric functions (PFs) were automatically fit to individual data by the ParSE software. Performance was related to the modulation depth at which syllables can be detected with 88% accuracy (referred to as the upper boundary of the uncertainty region [UBUR]). A shallower PF slope reflected a greater level of uncertainty. Age effects were determined based on raw scores. Z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UBUR exceeded a maximum allowable value, were removed. Nonparametric tests were used as the data were skewed toward negative performance. Results: Across participants, the performance criterion (UBUR) was met with a median modulation depth of 42%. 
The effect of age on the UBUR was significant (p < 0.00001). The UBUR ranged from 50% modulation depth for 6-yr-olds to 25% for adults. Children aged 6–10 had significantly higher uncertainty region boundaries than adults. A skewed distribution toward negative performance occurred (p = 0.00007). There was no significant difference in performance on the ParSE between males and females (p = 0.60). Test–retest z scores were strongly correlated (r = 0.68, p < 0.0000001). Conclusions: The ParSE normative data show that the ability to identify syllable boundaries based on changes in amplitude modulation improves with age, and that some children in the general population have performance much worse than their age peers. The test is suitable for use in planned studies in a clinical population.



2016 ◽  
Vol 23 (3) ◽  
pp. 239 ◽  
Author(s):  
D. Rodin ◽  
B. Banihashemi ◽  
L. Wang ◽  
A. Lau ◽  
S. Harris ◽  
...  

Purpose: We evaluated the feasibility, reliability, and validity of the Brain Metastases Symptom Checklist (BMSC), a novel self-report measure of common symptoms experienced by patients with brain metastases. Methods: Patients with first-presentation symptomatic brain metastases (n = 137) referred for whole-brain radiotherapy (WBRT) completed the BMSC at time points before and after treatment. Their caregivers (n = 48) provided proxy ratings twice on the day of consultation to assess reliability, and at week 4 after WBRT to assess responsiveness to change. Correlations with 4 other validated assessment tools were evaluated. Results: The symptoms reported on the BMSC were largely mild to moderate, with tiredness (71%) and difficulties with balance (61%) reported most commonly at baseline. Test–retest reliability for individual symptoms had a median intraclass correlation of 0.59 (range: 0.23–0.85). Caregiver proxy and patient responses had a median intraclass correlation of 0.52. Correlation of absolute scores on the BMSC and other symptom assessment tools was low, but consistency in the direction of symptom change was observed. At week 4, change in symptoms was variable, with improvements in weight gain and sleep of 42% and 41% respectively, and worsening of tiredness and drowsiness of 62% and 59% respectively. Conclusions: The BMSC captures a wide range of symptoms experienced by patients with brain metastases, and it is sensitive to change. It demonstrated adequate test–retest reliability and face validity in terms of its responsiveness to change. Future research is needed to determine whether modifications to the BMSC itself or correlation with more symptom-specific measures will enhance validity.



BMJ Open ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. e021734 ◽  
Author(s):  
Alison Griffiths ◽  
Rachel Toovey ◽  
Prue E Morgan ◽  
Alicia J Spittle

Objective: Gross motor assessment tools have a critical role in identifying, diagnosing and evaluating motor difficulties in childhood. The objective of this review was to systematically evaluate the psychometric properties and clinical utility of gross motor assessment tools for children aged 2–12 years. Method: A systematic search of MEDLINE, Embase, CINAHL and AMED was performed between May and July 2017. Methodological quality was assessed with the COnsensus-based Standards for the selection of health status Measurement INstruments checklist and an outcome measures rating form was used to evaluate reliability, validity and clinical utility of assessment tools. Results: Seven assessment tools from 37 studies/manuals met the inclusion criteria: Bayley Scale of Infant and Toddler Development-III (Bayley-III), Bruininks-Oseretsky Test of Motor Proficiency-2 (BOT-2), Movement Assessment Battery for Children-2 (MABC-2), McCarron Assessment of Neuromuscular Development (MAND), Neurological Sensory Motor Developmental Assessment (NSMDA), Peabody Developmental Motor Scales-2 (PDMS-2) and Test of Gross Motor Development-2 (TGMD-2). Methodological quality varied from poor to excellent. Validity and internal consistency varied from fair to excellent (α=0.5–0.99). The Bayley-III, NSMDA and MABC-2 have evidence of predictive validity. Test–retest reliability is excellent in the BOT-2 (intraclass correlation coefficient (ICC)=0.80–0.99), PDMS-2 (ICC=0.97), MABC-2 (ICC=0.83–0.96) and TGMD-2 (ICC=0.81–0.92). TGMD-2 has the highest inter-rater (ICC=0.88–0.93) and intrarater reliability (ICC=0.92–0.99). Conclusions: The majority of gross motor assessments for children have good to excellent validity. Test–retest reliability is highest in the BOT-2, MABC-2, PDMS-2 and TGMD-2. The Bayley-III has the best predictive validity at 2 years of age for later motor outcome. None of the assessment tools demonstrate good evaluative validity. Further research on evaluative gross motor assessment tools is urgently needed.



2015 ◽  
Vol 126 (2) ◽  
pp. 356-364 ◽  
Author(s):  
U. Hofstadt-van Oy ◽  
P.M. Keune ◽  
J. Muenssinger ◽  
D. Hagenburger ◽  
P. Oschmann


2006 ◽  
Vol 21 (1) ◽  
pp. 117-127 ◽  
Author(s):  
Audrey Hokoda ◽  
Luciana Ramos-Lira ◽  
Patricia Celaya ◽  
Keleigh Vilhauer ◽  
Manuel Angeles ◽  
...  

Research on the prevalence and correlates of dating violence in Mexican teens is challenged by the lack of culturally and linguistically appropriate assessment tools. This study modified, translated, and back-translated the Conflict in Adolescent Dating Relationships Inventory (CADRI; Wolfe et al., 2001) and the Attitudes Towards Dating Violence Scales (Price, Byers, & the Dating Violence Research Team, 1999) for Mexican adolescents. Analyses on 307 adolescents (15–18 years old) from Monterrey and Mexicali, Mexico, revealed that most of the translated CADRI subscales and Attitudes Towards Dating Violence Scales had acceptable internal consistency and test-retest reliability coefficients. The study offers some evidence that the measures may be useful in assessing dating violence in Mexican teens.



2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Stefanie Bühn ◽  
Peggy Ober ◽  
Tim Mathes ◽  
Uta Wegewitz ◽  
Anja Jacobs ◽  
...  

Abstract. Background: Systematic Reviews (SRs) can build the groundwork for evidence-based health care decision-making. A sound methodological quality of SRs is crucial. AMSTAR (A Measurement Tool to Assess Systematic Reviews) is a widely used tool developed to assess the methodological quality of SRs of randomized controlled trials (RCTs). Research shows that AMSTAR seems to be valid and reliable in terms of interrater reliability (IRR), but the test–retest reliability (TRR) of AMSTAR has never been investigated. In our study we investigated the TRR of AMSTAR to evaluate the importance of its measurement and contribute to the discussion of the measurement properties of AMSTAR and other quality assessment tools. Methods: Seven raters at three institutions independently assessed the methodological quality of SRs in the field of occupational health with AMSTAR. Approximately two years elapsed between the first and second ratings. Answers were dichotomized, and we calculated the TRR of all raters and AMSTAR items using Gwet’s AC1 coefficient. To investigate the impact of variation in the ratings over time, we obtained summary scores for each review. Results: AMSTAR item 4 (Was the status of publication used as an inclusion criterion?) provided the lowest median TRR of 0.53 (moderate agreement). Perfect agreement among all reviewers was found for AMSTAR item 1 (Gwet’s AC1 of 1). The median TRR of the individual raters varied between 0.69 (substantial agreement) and 0.89 (almost perfect agreement). Variation of two or more points in yes-scored AMSTAR items was observed in 65% (73/112) of all assessments. Conclusions: The high variation between the first and second AMSTAR ratings suggests that consideration of the TRR is important when evaluating the psychometric properties of AMSTAR. However, more evidence is needed to investigate this neglected issue of measurement properties. 
Our results may initiate discussion of the importance of considering the TRR of assessment tools. A further examination of the TRR of AMSTAR, as well as other recently established rating tools such as AMSTAR 2 and ROBIS (Risk Of Bias In Systematic reviews), would be useful.
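For readers unfamiliar with the agreement statistic named in the AMSTAR abstract, the sketch below computes Gwet's AC1 for one rater's dichotomized answers at two time points. It is a minimal illustration, not the authors' analysis code: the function name `gwet_ac1` and the example ratings are hypothetical, and only the two-rating, two-category case is covered.

```python
def gwet_ac1(first, second):
    """Gwet's AC1 for two sets of dichotomous (0/1) ratings of the same items.

    AC1 = (p_a - p_e) / (1 - p_e), where p_a is the observed agreement and
    p_e = 2 * pi * (1 - pi), with pi the mean proportion of "yes" (1)
    answers across the two rating occasions.
    """
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    pi_yes = (sum(first) + sum(second)) / (2 * n)
    chance = 2 * pi_yes * (1 - pi_yes)  # at most 0.5, so 1 - chance > 0
    return (observed - chance) / (1 - chance)


# Hypothetical yes/no answers to eight items at the first and second rating.
rating_1 = [1, 1, 1, 0, 1, 0, 1, 1]
rating_2 = [1, 1, 0, 0, 1, 0, 1, 1]
print(f"AC1 = {gwet_ac1(rating_1, rating_2):.2f}")
```

Unlike Cohen's kappa, the chance-agreement term here cannot exceed 0.5, so the coefficient stays stable when nearly all answers fall in one category, which is one common reason for choosing AC1 with skewed yes/no data.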



2018 ◽  
Author(s):  
Sarah Patrick ◽  
Peter Connick

Abstract. Background: Depression affects approximately 25% of people with multiple sclerosis (pwMS) at any given time. It is, however, under-recognised in clinical practice, in part due to a lack of uptake of brief assessment tools and uncertainty about their psychometric properties. The 9-item Patient Health Questionnaire (PHQ-9) is an attractive candidate for this role. Objective: To synthesise published findings on the psychometric properties of the PHQ-9 when applied to pwMS. Data sources: PubMed, Medline and ISI Web of Science databases, supplemented by hand-searching of references from all eligible sources. Study eligibility criteria: Primary literature written in English and published following peer review with a primary aim to evaluate the performance of the PHQ-9 in pwMS. Outcome measures: Psychometric performance with respect to appropriateness, reliability, validity, responsiveness, precision, interpretability, acceptability, and feasibility. Results: Seven relevant studies were identified; these were of high quality and included 5080 participants from all MS disease-course groups. Strong evidence was found supporting the validity of the PHQ-9 as a unidimensional measure of depression. Used as a screening tool for major depressive disorder (MDD) with a cut-point of 11, sensitivity was 95% and specificity 88.3% (PPV 51.4%, NPV 48.6%). Alternative scoring systems that may address the issue of overlap between somatic features of depression and features of MS per se are being developed, although their utility remains unclear. However, data on reliability were limited, and no specific evidence was available on test-retest reliability, responsiveness, acceptability, or feasibility. Conclusions: The PHQ-9 represents a suitable tool to screen for MDD in pwMS. However, use as a diagnostic tool cannot currently be recommended, and the potential value for monitoring depressive symptoms cannot be established without further evidence on test-retest reliability, responsiveness, acceptability, and feasibility. PROSPERO register ID: CRD42017067814


