Reliability and Validity of Bifactor Models of Dimensional Psychopathology in Youth from three Continents

Bifactor models are a promising strategy to parse general from specific aspects of psychopathology in youth. Currently, there are multiple configurations of bifactor models originating from different theoretical and empirical perspectives. Our aim is to identify different bifactor models of psychopathology using the Child Behavior Checklist (CBCL) and to test their reliability, validity, measurement invariance, and intercorrelation. We used data from the Reproducible Brain Charts (RBC) initiative (N = 7,011, ages 5 to 22 years, 40.2% females). Factor models were tested using the baseline data. To address our aim, we a) mapped the published bifactor models using the CBCL; b) tested their global model fit; c) calculated model-based reliability indices; d) tested associations with symptoms' impact in everyday life; e) tested measurement invariance across many characteristics; and f) analyzed the observed factor correlation across the models. We found 11 bifactor models ranging from 39 to 116 items. Their global model fit was broadly similar. Factor determinacy and H index were acceptable for the p-factors and for the internalizing, externalizing, and somatic specific factors in most models. However, only the p- and attention factors were predictors of symptoms' impact in all models. Models were broadly invariant across different characteristics. P-factors were highly correlated across models (r = 0.88 to 0.99). Homotypic specific factors were also highly correlated. Regardless of item selection and the strategy used to compose CBCL bifactor models, results suggest that they all assess very similar constructs. Our results provide support for the robustness of the bifactor model of psychopathology across distinct study characteristics.
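
To illustrate one of the model-based reliability indices named in this abstract, the following minimal Python sketch computes the H index (construct replicability) for a single factor from its standardized loadings, using the usual formula H = 1 / (1 + 1 / Σ[λ² / (1 − λ²)]). The function name and the example loadings are hypothetical and are not taken from the study.

```python
def h_index(loadings):
    """H index for one factor, given its standardized loadings.

    Assumes every loading satisfies |loading| < 1 and at least one
    loading is nonzero; values here are illustrative only.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    if any(abs(l) >= 1 for l in loadings):
        raise ValueError("standardized loadings must lie strictly between -1 and 1")
    # Sum of the ratio of explained to unexplained variance per item.
    ratio_sum = sum(l * l / (1.0 - l * l) for l in loadings)
    if ratio_sum == 0:
        raise ValueError("at least one loading must be nonzero")
    return 1.0 / (1.0 + 1.0 / ratio_sum)


if __name__ == "__main__":
    # Hypothetical loadings for a three-item specific factor.
    print(round(h_index([0.7, 0.7, 0.7]), 3))
```

With a single indicator the index reduces to the squared loading, which is a quick sanity check on the formula.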

Evaluating Bifactor Models of Psychopathology Using Model-Based Reliability Indices

10.31234/osf.io/6tf7j ◽

2019 ◽

Cited By ~ 1

Author(s):

Matthew Constantinou ◽

Peter Fonagy

Keyword(s):

Bifactor Model ◽

Common Variance ◽

Reliability Indices ◽

Bifactor Models ◽

Study Characteristics ◽

Specific Factors ◽

P Factor ◽

Explained Common Variance ◽

Internalizing And Externalizing ◽

Variance Explained

There has been a rapid increase in the number of quantitative researchers applying the bifactor model to psychopathology data. The bifactor model, which typically includes a general p factor and internalizing and externalizing residual factors, consistently demonstrates superior model fit to competing models, including the correlated factors model, which typically includes internalizing and externalizing factors. However, the bifactor model’s superior fit might stem from its tendency to overfit noise and flexibly fit most datasets. An alternative approach to evaluating bifactor models that does not rely on fit statistics is model-based reliability assessment. Reliability indices, including omega/omega hierarchical, explained common variance, and percent uncontaminated correlations, can be used to determine the viability of the general and specific psychopathology factors and the extent to which the underlying data structure and its measurement are multidimensional. In this methodological review, we identified 49 studies published between 2009 and 2019 that applied the bifactor model to at least two separate symptom domains and calculated reliability indices from the standardized factor loading matrices. We also predicted variation in the p factor’s strength, indexed by the explained common variance, from study characteristics. We found that psychopathology measures tend to be multidimensional, with 57% of the variance explained by the p factor and the remaining variance explained by specific factors. By contrast, most of the variance in observed total scores (74%) was explained by the p factor, while relatively little of the variance in observed subscale scores (37%) was explained by specific factors beyond the p factor. Finally, 62% of the variability in the p factor’s strength could be predicted by study characteristics, most notably the informant (in a simultaneous regression model), but also age, percent uncontaminated correlations, and the number of items (in separate regression models). We conclude that the latent structure of psychopathology is multidimensional, but its measurement is governed by a single dimension, the strength of which is predicted by study characteristics, particularly the informant.
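
The indices named in this abstract can all be derived from a standardized bifactor loading matrix. The Python sketch below shows one common way to compute explained common variance (ECV), omega, omega hierarchical, and percent uncontaminated correlations (PUC). It assumes orthogonal factors and that each item loads on the general factor and on at most one specific factor. The function name, argument layout, and example loadings are hypothetical and do not reproduce the review's own computations.

```python
def bifactor_indices(general, specific):
    """Model-based reliability indices for an orthogonal bifactor model.

    general:  list of n standardized loadings on the general factor.
    specific: list of k lists, each of length n, holding standardized
              loadings on one specific factor (0.0 where an item does
              not load on that factor).
    """
    n = len(general)
    if any(len(col) != n for col in specific):
        raise ValueError("every specific-factor column needs one entry per item")

    # ECV: share of common variance attributable to the general factor.
    sum_g_sq = sum(g * g for g in general)
    sum_s_sq = sum(s * s for col in specific for s in col)
    ecv = sum_g_sq / (sum_g_sq + sum_s_sq)

    # Variance components of the unit-weighted total score.
    general_part = sum(general) ** 2
    specific_part = sum(sum(col) ** 2 for col in specific)
    uniqueness = sum(
        1.0 - general[i] ** 2 - sum(col[i] ** 2 for col in specific)
        for i in range(n)
    )
    total_var = general_part + specific_part + uniqueness

    omega = (general_part + specific_part) / total_var
    omega_h = general_part / total_var

    # PUC: share of item pairs that do not share a specific factor.
    total_pairs = n * (n - 1) / 2
    contaminated = 0.0
    for col in specific:
        m = sum(1 for s in col if s != 0)
        contaminated += m * (m - 1) / 2
    puc = 1.0 - contaminated / total_pairs

    return {"ecv": ecv, "omega": omega, "omega_h": omega_h, "puc": puc}


if __name__ == "__main__":
    # Hypothetical six-item measure with two three-item specific factors.
    g = [0.6] * 6
    s1 = [0.4, 0.4, 0.4, 0.0, 0.0, 0.0]
    s2 = [0.0, 0.0, 0.0, 0.4, 0.4, 0.4]
    print(bifactor_indices(g, [s1, s2]))
```

In this layout, a high ECV together with a high PUC suggests that total scores mostly reflect the general factor, which is the pattern the review describes for observed total scores.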

Issues in Estimating Interpretable Lower Order Factors in Second-Order Hierarchical Models: Commentary on Clark et al. (2021)

Clinical Psychological Science ◽

10.1177/21677026211035114 ◽

2021 ◽

pp. 216770262110351

Author(s):

Tyler M. Moore ◽

Benjamin B. Lahey

Keyword(s):

Lower Order ◽

Hierarchical Modeling ◽

Simulated Data ◽

Second Order ◽

General Factor ◽

Bifactor Models ◽

Specific Factors ◽

Factor Modeling ◽

Highly Correlated ◽

Criterion Variables

In a previous issue of Clinical Psychological Science, Clark and colleagues asserted that lower order factors in second-order models are comparable with specific factors in bifactor models when residualized on the general factor. Modeling simulated data demonstrated that residualized lower order factors are correlated with bifactor-specific factors only to the extent that factor loadings are proportional. Modeling actual data with violations of proportionality showed that specific and residualized lower order factors are not always highly correlated and have differential correlations with criterion variables even when both models fit acceptably. Because proportionality constraints limit only second-order models, bifactor models should be the first option for hierarchical modeling.

Feasibility of Race by Sex Intersectionality Research on Suicidality in the Adolescent Brain Cognitive Development (ABCD) Study

Children ◽

10.3390/children8060437 ◽

2021 ◽

Vol 8 (6) ◽

pp. 437

Author(s):

Shervin Assari ◽

Shanika Boyce ◽

Mohsen Bazargan

Keyword(s):

Cognitive Development ◽

Sample Size ◽

Child Behavior Checklist ◽

Reliability And Validity ◽

Self Report ◽

Large Sample Size ◽

Cross Sectional ◽

Cronbach Alpha ◽

Rich Data ◽

Suicidal Ideas

Intersectional research on childhood suicidality requires studies with a reliable and valid measure of suicidality, as well as a large sample size that shows some variability of suicidality across sex by race intersectional groups. Objectives: We aimed to investigate the feasibility of intersectionality research on childhood suicidality in the Adolescent Brain Cognitive Development (ABCD) study. We specifically explored the reliability and validity of the measure, sample size, and variability of suicidality across sex by race intersectional groups. Methods: We used cross-sectional data (wave 1) from the ABCD study, which sampled 9013 non-Hispanic white (NHW) or non-Hispanic black (NHB) children aged 9 to 10 years between 2016 and 2018. Four intersectional groups were built based on race and sex: NHW males (n = 3554), NHW females (n = 3158), NHB males (n = 1164), and NHB females (n = 1137). The outcome measure was the count of suicidality symptoms, reflecting all positive history and symptoms of suicidal ideas, plans, and attempts. To validate our measure, we tested the correlation between our suicidality measure and depression and Child Behavior Checklist (CBCL) sub-scores. Cronbach alpha was calculated for reliability within each intersectional group. We also compared groups for suicidality. Results: We observed some suicidality history in 3.2% (n = 101) of NHW females, 4.9% (n = 175) of NHW males, 5.4% (n = 61) of NHB females, and 5.8% (n = 68) of NHB males. Our measure’s reliability was acceptable in all race by sex groups (Cronbach alpha of .70 or higher in all intersectional groups). Our measure was valid in all intersectional groups, documented by a positive correlation with depression and CBCL sub-scores. We could successfully model suicidality across sex by race groups, using multivariable models. Conclusion: Given the large sample size, the reliability and validity of the suicidality measure, and the variability of suicidality, it is feasible to investigate correlates of suicidality across race by sex intersections in the ABCD study. We also found evidence of higher suicidality in NHB than NHW children in the ABCD study. The ABCD rich data in domains of social context, self-report, schools, parenting, psychopathology, personality, and brain imaging provides a unique opportunity to study intersectional differences in neural circuits associated with youth suicidality.
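
The reliability check described in this abstract relies on Cronbach alpha computed within each group. As a rough illustration, the Python sketch below applies the standard formula alpha = k / (k − 1) × (1 − Σ item variances / variance of total scores) to a small table of item responses. The function name and the toy data are hypothetical and are not ABCD data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator). Assumes complete data,
    at least two items, at least two respondents, and a total score
    that varies across respondents.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same number of items")
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical responses from three children on two items.
    print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

To mirror the per-group analysis, the same function could be called once on the rows belonging to each race by sex group.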

How Universal Is a Construct of Loneliness? Measurement Invariance of the UCLA Loneliness Scale in Indonesia, Germany, and the United States

Assessment ◽

10.1177/10731911211034564 ◽

2021 ◽

pp. 107319112110345

Author(s):

Joevarian Hudiyana ◽

Tania M. Lincoln ◽

Steffi Hartanto ◽

Muhammad A. Shadiqi ◽

Mirra N. Milla ◽

...

Keyword(s):

United States ◽

Measurement Invariance ◽

Social Isolation ◽

Total Sample ◽

The United States ◽

Cross Cultural ◽

Model Fit ◽

Short Version ◽

Loneliness Scale ◽

The One

The UCLA Loneliness Scale (ULS-20) and its short version (ULS-8) are widely used to measure loneliness. However, the question remains whether previous studies using the scale have measured the construct equivalently across countries. The present study examined the measurement invariance (MI) of both scales in Germany, Indonesia, and the United States (N = 2350). The one-, two-, and three-factor structures of the ULS-20 did not meet the model fit cut-off criteria in the total sample. The ULS-8 met the model fit cut-off criteria and showed configural but not metric invariance, because two items unrelated to social isolation were not invariant. The final six items (ULS-6), which relate exclusively to social isolation, showed complete MI. Participants from the United States scored highest on the ULS-6, followed by participants from Germany and then Indonesia. We conclude that the ULS-6 is an appropriate measure for cross-cultural studies on loneliness.

A short food-group-based dietary questionnaire is reliable and valid for assessing toddlers' dietary risk in relatively advantaged samples

British Journal Of Nutrition ◽

10.1017/s0007114514001184 ◽

2014 ◽

Vol 112 (4) ◽

pp. 627-637 ◽

Cited By ~ 15

Author(s):

Lucinda K. Bell ◽

Rebecca K. Golley ◽

Anthea M. Magarey

Keyword(s):

Risk Score ◽

Dietary Patterns ◽

Reliability And Validity ◽

Food Group ◽

Dietary Guidelines ◽

Risk Category ◽

Food Groups ◽

Sweetened Beverages ◽

Dietary Risk ◽

Highly Correlated

Identifying toddlers at dietary risk is crucial for determining who requires intervention to improve dietary patterns and reduce health consequences. The objectives of the present study were to develop a simple tool that assesses toddlers' dietary risk and investigate its reliability and validity. The nineteen-item Toddler Dietary Questionnaire (TDQ) is informed by dietary patterns observed in Australian children aged 14 (n = 552) and 24 (n = 493) months and the Australian dietary guidelines. It assesses the intake of ‘core’ food groups (e.g. fruit, vegetables and dairy products) and ‘non-core’ food groups (e.g. high-fat, high-sugar and/or high-salt foods and sweetened beverages) over the previous 7 d, which is then scored against a dietary risk criterion (0–100; higher score = higher risk). Parents of toddlers aged 12–36 months (Socio-Economic Index for Areas decile range 5–9) were asked to complete the TDQ for their child (n = 111) on two occasions, 3·2 (SD 1·8) weeks apart, to assess test–retest reliability. They were also asked to complete a validated FFQ from which the risk score was calculated and compared with the TDQ-derived risk score (relative validity). Mean scores were highly correlated and not significantly different for reliability (intra-class correlation = 0·90; TDQ1 30·2 (SD 8·6) v. TDQ2 30·9 (SD 8·9); P = 0·14) and validity (r = 0·83; average TDQ ((TDQ1 + TDQ2)/2) 30·5 (SD 8·4) v. FFQ 31·4 (SD 8·1); P = 0·05). All the participants were classified into the same (reliability 75 %; validity 79 %) or adjacent (reliability 25 %; validity 21 %) risk category (low (0–24), moderate (25–49), high (50–74) and very high (75–100)). Overall, the TDQ is a valid and reliable screening tool for identifying at-risk toddlers in relatively advantaged samples.
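
The risk categories and the same-or-adjacent agreement figures reported in this abstract can be sketched in a few lines of Python. The cut-points come from the abstract. The handling of non-integer scores (for example, treating 24·5 as low) is an assumption, and the function names and example scores are hypothetical.

```python
CATEGORIES = ["low", "moderate", "high", "very high"]


def tdq_risk_category(score):
    """Map a 0-100 dietary risk score to the category named in the abstract.

    Bands: low (0-24), moderate (25-49), high (50-74), very high (75-100).
    Assumption: fractional scores fall into the band below the next cut-point.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 25:
        return "low"
    if score < 50:
        return "moderate"
    if score < 75:
        return "high"
    return "very high"


def category_agreement(first_scores, second_scores):
    """Proportion of paired scores in the same and in adjacent categories."""
    if len(first_scores) != len(second_scores) or not first_scores:
        raise ValueError("need two non-empty score lists of equal length")
    same = adjacent = 0
    for a, b in zip(first_scores, second_scores):
        gap = abs(CATEGORIES.index(tdq_risk_category(a))
                  - CATEGORIES.index(tdq_risk_category(b)))
        if gap == 0:
            same += 1
        elif gap == 1:
            adjacent += 1
    n = len(first_scores)
    return same / n, adjacent / n


if __name__ == "__main__":
    # Hypothetical scores from two administrations of the questionnaire.
    print(category_agreement([20, 30, 80], [30, 31, 20]))
```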

Riskier Tests of the Validity of Bifactor Models of Psychopathology

10.31234/osf.io/3f28h ◽

2019 ◽

Cited By ~ 1

Author(s):

Ashley L. Watts ◽

Holly Poore ◽

Irwin Waldman

Keyword(s):

Community Sample ◽

General Factor ◽

Symptom Dimensions ◽

Bifactor Models ◽

Specific Factors ◽

Psychopathology Symptom ◽

Using Data ◽

P Factor ◽

Correlated Factors ◽

Additional Variance

We advanced several “riskier tests” of the validity of bifactor models of psychopathology, which included that the general and specific factors should be reliable and well-represented by their indicators, and that including a general factor should improve the correlated factor model’s external validity. We compared bifactor and correlated factors models using data from a community sample of youth (N=2498) whose parents provided ratings on psychopathology and external criteria (i.e., temperament, aggression, antisociality). Bifactor models tended to yield either general or specific factors that were unstable and difficult to interpret. The general factor appeared to reflect a differentially-weighted amalgam of psychopathology rather than a liability for psychopathology broadly construed. With rare exceptions, bifactor models did not explain additional variance in psychopathology symptom dimensions or external criteria compared with correlated factors models. Together, our findings call into question the validity of bifactor models of psychopathology, and the p-factor more broadly.

The reliability and validity of a screening scale for online gaming disorder among Chinese adolescents and young adults

BMC Psychiatry ◽

10.1186/s12888-021-03678-1 ◽

2022 ◽

Vol 22 (1) ◽

Author(s):

Xuechan Lyu ◽

Tianzhen Chen ◽

Zhe Wang ◽

Jing Lu ◽

Chenyi Ma ◽

...

Keyword(s):

Young Adults ◽

Age Groups ◽

Reliability And Validity ◽

Online Gaming ◽

Chinese Adolescents ◽

Model Fit ◽

Adolescents And Young Adults ◽

Gaming Disorder ◽

Good Reliability ◽

Screening Scale

Background: In recent years, there have been frequent reports of gaming disorder in China, with more focus on young people. We developed and psychometrically tested a Gaming Disorder Screening Scale (GDSS) for Chinese adolescents and young adults, based on the existing scales and diagnostic criteria, but also considering the development status of China. Methods: For testing content and criterion validity, 1747 participants completed the GDSS and the Internet Addiction Test (IAT). After 15 days, 400 participants were retested with the scales to assess test-retest reliability. In addition, 200 game players were interviewed for a diagnosis of gaming disorder. Results: The Cronbach’s alpha coefficient of the GDSS was 0.93. The test-retest coefficient was 0.79. Principal components analysis identified three factors accounting for 62.4% of the variance: behavior, functioning, and cognition and emotion. Confirmatory factor analysis showed a good model fit to the data (χ2/df = 5.581; RMSEA = 0.074; TLI = 0.916; CFI = 0.928). The overall model fit was significantly good in the measurement invariance tested across genders and different age groups. Based on the clinical interview, the screening cut-off point was determined to be ≥47 (sensitivity 41.4%, specificity 82.3%). Conclusions: The GDSS demonstrated good reliability and validity for screening online gaming disorder among Chinese adolescents and young adults.
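
The sensitivity and specificity reported for the ≥47 cut-off follow from comparing screening scores with interview diagnoses. The Python sketch below shows that calculation in minimal form. The cut-off default comes from the abstract, while the function name and the example scores and diagnoses are hypothetical.

```python
def screening_accuracy(scores, diagnosed, cutoff=47):
    """Sensitivity and specificity of a 'score >= cutoff' screening rule.

    scores:    screening scale totals, one per participant.
    diagnosed: booleans, True where the clinical interview gave a diagnosis.
    """
    if len(scores) != len(diagnosed):
        raise ValueError("scores and diagnoses must have the same length")
    tp = fn = fp = tn = 0
    for score, case in zip(scores, diagnosed):
        positive = score >= cutoff
        if case and positive:
            tp += 1
        elif case:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one case and one non-case")
    sensitivity = tp / (tp + fn)   # share of true cases flagged by the screen
    specificity = tn / (tn + fp)   # share of non-cases correctly not flagged
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical scores for six interviewed players.
    scores = [50, 47, 30, 60, 20, 46]
    diagnosed = [True, True, True, False, False, False]
    print(screening_accuracy(scores, diagnosed))
```

Re-running the function over a range of cut-offs is one simple way to see how sensitivity trades off against specificity.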

Three Studies on the Reliability and Validity of a Self-Report Measure of Posttraumatic Stress Disorder

Assessment ◽

10.1177/107319119600300102 ◽

1996 ◽

Vol 3 (1) ◽

pp. 17-25 ◽

Cited By ~ 73

Author(s):

Dean Lauterbach ◽

Scott Vrana

Keyword(s):

Posttraumatic Stress Disorder ◽

Posttraumatic Stress ◽

Discriminant Validity ◽

Stress Disorder ◽

Traumatic Event ◽

Clinical Sample ◽

Reliability And Validity ◽

Self Report ◽

Highly Correlated ◽

The Impact

This paper describes three studies of the reliability and validity of a newly revised version of the Purdue Posttraumatic Stress Disorder scale (PPTSD-R). The PPTSD-R is a 17-item questionnaire that yields four scores: Reexperiencing, Avoidance, Arousal, and Total. It is highly internally consistent (α = .91), and the scores are relatively stable across time. The PPTSD-R is highly correlated with other measures of PTSD symptomatology and moderately correlated with measures of related psychopathology, providing preliminary support for the measure's convergent and discriminant validity. It reliably distinguishes between groups of people who were and were not traumatized, it is sensitive to the impact of different types of traumatic events, and (within a clinical sample) it discriminates between those who did and did not seek treatment for difficulty coping with the traumatic event being assessed. The PPTSD-R shows promise as a measure of PTSD symptoms in the college population.

The Military Suicide Research Consortium Common Data Elements: An Examination of Measurement Invariance Across Current Service Members and Veterans

Assessment ◽

10.1177/1073191118777635 ◽

2018 ◽

Vol 26 (6) ◽

pp. 963-975 ◽

Cited By ~ 1

Author(s):

Ian H. Stanley ◽

Jennifer M. Buchman-Schmitt ◽

Carol Chu ◽

Megan L. Rogers ◽

Anna R. Gai ◽

...

Keyword(s):

Measurement Invariance ◽

Model Fit ◽

Service Members ◽

Common Data Elements ◽

Adequate Model ◽

Increased Risk ◽

Research Consortium ◽

Data Elements ◽

Military Suicide ◽

The Military

Suicide rates within the U.S. military are elevated, necessitating greater efforts to identify those at increased risk. This study utilized a multigroup confirmatory factor analysis to examine measurement invariance of the Military Suicide Research Consortium Common Data Elements (CDEs) across current service members (n = 2,015), younger veterans (<35 years; n = 377), and older veterans (≥35 years; n = 1,001). Strong factorial invariance was supported with adequate model fit observed for current service members, younger veterans, and older veterans. The structures of all models were generally comparable with few exceptions. The Military Suicide Research Consortium CDEs demonstrate at least adequate model fit for current military service members and veterans, regardless of age. Thus, the CDEs can be validly used across military and veteran populations. Given similar latent structures, research findings in one group may inform clinical and policy decision making for the other.

Best Practices in the Use of Bifactor Models: Conceptual Grounds, Fit Indices and Complementary Indicators

Revista Evaluar ◽

10.35670/1667-4545.v18.n3.22221 ◽

2018 ◽

Vol 18 (3) ◽

Cited By ~ 3

Author(s):

Pablo Ezequiel Flores-Kanter ◽

Sergio Dominguez-Lara ◽

Mario Alberto Trógolo ◽

Leonardo Adrián Medrano

Keyword(s):

Best Practices ◽

Empirical Studies ◽

Model Fit ◽

Fit Indices ◽

Research Article ◽

Bifactor Models ◽

Model Fit Indices ◽

Personality Psychopathology ◽

Using Data ◽

Published Research

Bifactor models have gained increasing popularity in the literature concerned with personality, psychopathology and assessment. Empirical studies using bifactor analysis generally judge the estimated model using SEM model fit indices, which may lead to erroneous interpretations and conclusions. To address this problem, several researchers have proposed multiple criteria to assess bifactor models, such as a) conceptual grounds, b) overall model fit indices, and c) specific bifactor model indicators. In this article, we provide a brief summary of these criteria. An example using data gathered from a recently published research article is also provided to show how taking into account all criteria, rather than solely SEM model fit indices, may prevent researchers from drawing wrong conclusions.
