Person Fit Across Subgroups: An Achievement Testing Example

Author(s):  
Rob R. Meijer ◽  
Edith M. L. A. van Krimpen-Stoop
2020 ◽  
Vol 29 (3) ◽  
pp. 1226-1240
Author(s):  
Janet L. Patterson ◽  
Barbara L. Rodríguez ◽  
Philip S. Dale

Purpose Early identification is a key element for accessing appropriate services for preschool children with language impairment. However, there is a high risk of misidentifying typically developing dual language learners as having language impairment if inappropriate tools designed for monolingual children are used. In this study of children with bilingual exposure, we explored performance on brief dynamic assessment (DA) language tasks using graduated prompting because this approach has potential applications for screening. We asked if children's performance on DA language tasks earlier in the year was related to their performance on a year-end language achievement measure. Method Twenty 4-year-old children from Spanish-speaking homes attending Head Start preschools in the southwestern United States completed three DA graduated prompting language tasks 3–6 months prior to the Head Start preschools' year-end achievement testing. The DA tasks, Novel Adjective Learning, Similarities in Function, and Prediction, were administered in Spanish, but correct responses in English or Spanish were accepted. The year-end achievement measure, the Learning Accomplishment Profile–Third Edition (LAP3), was administered by the children's Head Start teachers, who also credited correct responses in either language. Results Children's performance on two of the three DA language tasks was significantly and positively related to year-end LAP3 language scores, and there was a moderate and significant relationship for one of the DA tasks, even when controlling for age and initial LAP3 scores. Conclusions Although the relationship of performance on DA with year-end performance varies across tasks, the findings indicate potential for using a graduated prompting approach to language screening with young dual language learners. 
Further research is needed to select the best tasks for administration in a graduated prompting framework and determine accuracy of identification of language impairment.


2019 ◽  
Vol 35 (1) ◽  
pp. 126-136 ◽  
Author(s):  
Tour Liu ◽  
Tian Lan ◽  
Tao Xin

Abstract. Random response is a very common aberrant response behavior in personality tests and may negatively affect the reliability, validity, or other analytical aspects of psychological assessment. Typically, researchers use a single person-fit index to identify random responses. This study recommends a three-step person-fit analysis procedure. Unlike typical single-index methods, the three-step procedure identifies both global misfit and local misfit individuals using different person-fit indices. This procedure was able to identify more local misfit individuals than the single-index method, and a graphical method was used to visualize the particular items in which random response behaviors appear. This method may be useful to researchers in that it provides more information about response behaviors, allowing better evaluation of scale administration and development of more plausible explanations. Real data were used in this study instead of simulated data. In order to create real random responses, an experimental test administration was designed. Four different random response samples were produced using this experimental system.
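To illustrate what a single person-fit index looks like in practice, the sketch below computes the standardized log-likelihood statistic l_z for dichotomous responses under a Rasch model. The abstract above does not name the indices used in its three-step procedure, so the choice of l_z, the function names, and the item difficulties are assumptions made for this example only. The trait level theta is treated as known, which is a simplification.

```python
import math


def rasch_prob(theta, b):
    """Probability of a keyed (1) response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def lz_statistic(responses, theta, difficulties):
    """Standardized log-likelihood person-fit statistic l_z.

    responses:    list of 0/1 item scores for one person
    theta:        trait level, assumed known here (illustrative simplification)
    difficulties: Rasch item difficulties, one per item (hypothetical values)
    """
    l0 = 0.0        # observed log-likelihood of the response pattern
    expected = 0.0  # expectation of l0 under the model
    variance = 0.0  # variance of l0 under the model
    for u, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    # Note: variance is zero if every item has p == 0.5; not handled here.
    return (l0 - expected) / math.sqrt(variance)


# Hypothetical five-item scale, easiest item first.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
consistent = lz_statistic([1, 1, 0, 0, 0], 0.0, difficulties)
aberrant = lz_statistic([0, 0, 0, 1, 1], 0.0, difficulties)
print(consistent, aberrant)
```

Large negative values point to response patterns that are unlikely under the model, such as missing the easy items while endorsing the hard ones. A global index of this kind does not show which items are responsible, which is the gap the local-misfit step described in the abstract is meant to address.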


Methodology ◽  
2006 ◽  
Vol 2 (4) ◽  
pp. 142-148 ◽  
Author(s):  
Pere J. Ferrando

In the IRT person-fluctuation model, the individual trait levels fluctuate within a single test administration whereas the items have fixed locations. This article studies the relations between the person and item parameters of this model and two central properties of item and test scores: temporal stability and external validity. For temporal stability, formulas are derived for predicting and interpreting item response changes in a test-retest situation on the basis of the individual fluctuations. As for validity, formulas are derived for obtaining disattenuated estimates and for predicting changes in validity in groups with different levels of fluctuation. These latter formulas are related to previous research in the person-fit domain. The results obtained and the relations discussed are illustrated with an empirical example.


Methodology ◽  
2015 ◽  
Vol 11 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Jochen Ranger ◽  
Jörg-Tobias Kuhn

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982). This test can be interpreted as a test of trait stability during the measurement situation. The test follows approximately a χ²-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres to the nominal Type-I error rate well, although it tends to be conservative in very short scales. The power of the test is compared to the power of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to the power of the alternative tests. Advantages and areas of application of the information matrix test are discussed.


2019 ◽  
Author(s):  
CHIEN WEI ◽  
Chi Chow Julie ◽  
Chou Willy

Background: Dengue fever (DF) is an important public health issue in Asia. However, the disease is extremely hard to detect using traditional dichotomous (i.e., absent vs. present) evaluations of symptoms. A convolutional neural network (CNN), a well-established deep learning method, can improve prediction accuracy because it uses a large number of parameters for modeling. Whether the HT person fit statistic can be combined with CNN to increase the prediction accuracy of the model and develop an application (APP) to detect DF in children remains unknown. Objectives: The aim of this study is to build a model for the automatic detection and classification of DF with symptoms to help patients, family members, and clinicians identify the disease at an early stage. Methods: We extracted 19 feature variables of DF-related symptoms from 177 pediatric patients (69 diagnosed with DF) using CNN to predict DF risk. The accuracy of two sets of characteristics (19 symptoms and four other variables, including person mean, standard deviation, and two HT-related statistics matched to DF+ and DF−) for predicting DF was then compared. Data were separated into training and testing sets, and the former was used to predict the latter. We calculated the sensitivity (Sens), specificity (Spec), and area under the receiver operating characteristic curve (AUC) across studies for comparison. Results: We observed that (1) the 23-item model yields a higher accuracy rate (0.95) and AUC (0.94) than the 19-item model (accuracy = 0.92, AUC = 0.90) based on the 177-case training set; (2) the Sens values are almost always higher than the corresponding Spec values (in 90% of the 10 scenarios) for predicting DF; (3) the Sens and Spec values of the 23-item model are consistently higher than those of the 19-item model. An APP was subsequently designed to detect DF in children. 
Conclusion: The 23-item model yielded higher accuracy rates (0.95) and AUC (0.94) than the 19-item model (accuracy = 0.92, AUC = 0.90). An APP could be developed to help patients, family members, and clinicians discriminate DF from other febrile illnesses at an early stage.
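The three evaluation measures named in this abstract (Sens, Spec, and AUC) can be sketched in a few lines of plain Python. This is a minimal illustration only: the labels, scores, 0.5 cut-off, and function names below are invented for the example and are not the study's data or code. AUC is computed here with the pairwise-comparison (Mann-Whitney) formulation.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = DF+, 0 = DF-) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec, auc(y_true, scores))
```

Sens and Spec depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is why the abstract reports both kinds of measure.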


2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
J Ese ◽  
C Ihlebak

Abstract Background Public health problems often constitute so-called “wicked problems”, and the importance of involving multiple stakeholders in order to address such problems is acknowledged, for instance through the SDG17 guidelines. Partnerships between academia and the public sector have been deemed especially promising. However, sustainable partnerships might be difficult due to divergent understandings and interests. Although there is a substantial research literature on academic-public partnerships in general, partnerships addressing public health specifically are less investigated. The aim of the project was therefore to identify enablers for sustainable public health partnerships between academia and the public sector. Methods A mixed methods design was used. A survey regarding partnerships was sent to 41 European, Asian and American regions, with a response rate of 72%. Based on survey data, an interview guide was developed and four best cases (Canada, Bulgaria, the Netherlands and Norway) were identified. Site visits and group interviews with representatives from stakeholders of the partnerships were conducted. Interview data and answers to open-ended questions from questionnaires were analysed. Results Three main findings became apparent through the analysis. Important enablers were: 1) person-to-person fit between individuals, 2) national incentive schemes for collaboration, and 3) formal partnership agreements that provided a framework that allowed for manoeuvring. The enablers identified are on the macro, meso and micro levels. Furthermore, they can be categorised as political, organisational, and social. Conclusions The data support the notion that partnerships are complex social structures that need to be initiated and managed on different levels and with different measures. 
At the same time, data demonstrate that across different geographical, political, and social contexts the same enablers are reappearing as important for sustaining public health partnerships. Key messages Similar enablers for sustaining public health partnerships are found across geographical, political, and social contexts. Important enablers for partnerships are person-to-person fit, national incentive schemes, and formal agreements.


1980 ◽  
Vol 50 (2) ◽  
pp. 293-314 ◽  
Author(s):  
Gale Roid ◽  
Tom Haladyna

The emerging technology of item writing for achievement tests is reviewed. Several different approaches to item development are discussed. A continuum of item-writing methods is proposed ranging from informal-subjective methods to algorithmic-objective methods. Examples of techniques include objective-based item writing, amplified objectives, item forms, facet design, domain-referenced concept testing, and computerized techniques. Each item-writing technique is critically reviewed, and empirical studies of methods are described. Recommendations for further research and for applications to achievement testing are presented.


Psychometrika ◽  
2000 ◽  
Vol 65 (1) ◽  
pp. 29-42 ◽  
Author(s):  
Ivo Ponocny

1979 ◽  
Vol 73 (4) ◽  
pp. 134-139
Author(s):  
T. Ernest Newland

Against a general background of learning aptitude and educational achievement testing of blind children, and of basic orientation for such work, the development and nature of the Blind Learning Aptitude Test (BLAT), an individual test, are described. Standardized upon 961 widely representative educationally blind children, it had high reliability and its validity particularly with respect to the more complex school learnings was clearly indicated. It had particular value not only when its results were combined with those of the Hayes-Binet (though understandably lower when combined with those of the verbal portion of the Wechsler Intelligence Scale for Children), but especially when used with blind children coming from backgrounds offering limited cultural nurturance.

