3pl model Latest Research Papers

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

10.31234/osf.io/5wemf ◽

2021 ◽

Author(s):

Joseph Rios ◽

Jim Soland

Keyword(s):

Classification Accuracy ◽

Negative Impact ◽

Parameter Recovery ◽

Individual Level ◽

Irt Model ◽

Ability Estimates ◽

Misclassification Rates ◽

Individual Scores ◽

3Pl Model ◽

The Impact

Suboptimal effort is a major threat to valid score-based inferences. While the effects of such behavior have been frequently examined in the context of mean group comparisons, minimal research has considered its effects on individual score use (e.g., identifying students for remediation). Focusing on the latter context, this study addressed two related questions via simulation and applied analyses. First, we investigated how much including noneffortful responses in scoring using a three-parameter logistic (3PL) model affects person parameter recovery and classification accuracy for noneffortful responders. Second, we explored whether improvements in these individual-level inferences were observed when employing the Effort Moderated IRT (EM-IRT) model under conditions in which its assumptions were met and violated. Results demonstrated that including 10% noneffortful responses in scoring led to average bias in ability estimates and misclassification rates by as much as 0.15 SDs and 7% respectively. These results were mitigated when employing the EM-IRT model, particularly when model assumptions were met. However, once model assumptions were violated, the EM-IRT model’s performance deteriorated, though still outperforming the 3PL model. Thus, findings from this study show that: (a) including noneffortful responses when using individual scores can lead to potential unfounded inferences and potential score misuse; and (b) the negative impact that noneffortful responding has on person ability estimates and classification accuracy can be mitigated by employing the EM-IRT model, particularly when its assumptions are met.

Download Full-text

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Applied Psychological Measurement ◽

10.1177/01466216211013896 ◽

2021 ◽

pp. 014662162110138

Author(s):

Joseph A. Rios ◽

James Soland

Keyword(s):

Classification Accuracy ◽

Negative Impact ◽

Parameter Recovery ◽

Individual Level ◽

Irt Model ◽

Ability Estimates ◽

Misclassification Rates ◽

Individual Scores ◽

3Pl Model ◽

The Impact

Suboptimal effort is a major threat to valid score-based inferences. While the effects of such behavior have been frequently examined in the context of mean group comparisons, minimal research has considered its effects on individual score use (e.g., identifying students for remediation). Focusing on the latter context, this study addressed two related questions via simulation and applied analyses. First, we investigated how much including noneffortful responses in scoring using a three-parameter logistic (3PL) model affects person parameter recovery and classification accuracy for noneffortful responders. Second, we explored whether improvements in these individual-level inferences were observed when employing the Effort Moderated IRT (EM-IRT) model under conditions in which its assumptions were met and violated. Results demonstrated that including 10% noneffortful responses in scoring led to average bias in ability estimates and misclassification rates by as much as 0.15 SDs and 7%, respectively. These results were mitigated when employing the EM-IRT model, particularly when model assumptions were met. However, once model assumptions were violated, the EM-IRT model’s performance deteriorated, though still outperforming the 3PL model. Thus, findings from this study show that (a) including noneffortful responses when using individual scores can lead to potential unfounded inferences and potential score misuse, and (b) the negative impact that noneffortful responding has on person ability estimates and classification accuracy can be mitigated by employing the EM-IRT model, particularly when its assumptions are met.

Download Full-text

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

10.31234/osf.io/tws5a ◽

2021 ◽

Author(s):

Joseph Rios

Keyword(s):

Parameter Estimation ◽

Response Times ◽

Item Parameter ◽

Measurement Properties ◽

Estimation Accuracy ◽

Parameter Estimates ◽

Irt Model ◽

Item Parameter Estimation ◽

Ability Parameter ◽

3Pl Model

The presence of rapid guessing (RG) presents a challenge to practitioners in obtaining accurate estimates of measurement properties and examinee ability. In response to this concern, researchers have utilized response times as a proxy of RG, and have attempted to improve parameter estimation accuracy by filtering RG responses using popular scoring approaches, such as the Effort-moderated IRT (EM-IRT) model. However, such an approach assumes that RG can be correctly identified based on an indirect proxy of examinee behavior. A failure to meet this assumption leads to the inclusion of distortive and psychometrically uninformative information in parameter estimates. To address this issue, a simulation study was conducted to examine how violations to the assumption of correct RG classification influences EM-IRT item and ability parameter estimation accuracy and compares these results to parameter estimates from the three-parameter logistic (3PL) model, which includes RG responses in scoring. Two RG misclassification factors were manipulated: type (underclassification vs. overclassification) and rate (10%, 30%, and 50%). Results indicated that the EMIRT model provided improved item parameter estimation over the 3PL model regardless of misclassification type and rate. Furthermore, under most conditions, increased rates of RG underclassification were associated with the greatest bias in ability parameter estimates from the EM-IRT model. In spite of this, the EM-IRT model with RG misclassifications demonstrated more accurate ability parameter estimation than the 3PL model when the mean ability of RG subgroups did not differ. This suggests that in certain situations it may be better for practitioners to: (a) imperfectly identify RG than to ignore the presence of such invalid responses, and (b) select liberal over conservative response time thresholds to mitigate bias from underclassified RG.

Download Full-text

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

Educational and Psychological Measurement ◽

10.1177/00131644211003640 ◽

2021 ◽

pp. 001316442110036

Author(s):

Joseph A. Rios

Keyword(s):

Parameter Estimation ◽

Response Times ◽

Item Parameter ◽

Measurement Properties ◽

Estimation Accuracy ◽

Parameter Estimates ◽

Irt Model ◽

Item Parameter Estimation ◽

Ability Parameter ◽

3Pl Model

The presence of rapid guessing (RG) presents a challenge to practitioners in obtaining accurate estimates of measurement properties and examinee ability. In response to this concern, researchers have utilized response times as a proxy of RG and have attempted to improve parameter estimation accuracy by filtering RG responses using popular scoring approaches, such as the effort-moderated item response theory (EM-IRT) model. However, such an approach assumes that RG can be correctly identified based on an indirect proxy of examinee behavior. A failure to meet this assumption leads to the inclusion of distortive and psychometrically uninformative information in parameter estimates. To address this issue, a simulation study was conducted to examine how violations to the assumption of correct RG classification influences EM-IRT item and ability parameter estimation accuracy and compares these results with parameter estimates from the three-parameter logistic (3PL) model, which includes RG responses in scoring. Two RG misclassification factors were manipulated: type (underclassification vs. overclassification) and rate (10%, 30%, and 50%). Results indicated that the EM-IRT model provided improved item parameter estimation over the 3PL model regardless of misclassification type and rate. Furthermore, under most conditions, increased rates of RG underclassification were associated with the greatest bias in ability parameter estimates from the EM-IRT model. In spite of this, the EM-IRT model with RG misclassifications demonstrated more accurate ability parameter estimation than the 3PL model when the mean ability of RG subgroups did not differ. This suggests that in certain situations it may be better for practitioners to (a) imperfectly identify RG than to ignore the presence of such invalid responses and (b) select liberal over conservative response time thresholds to mitigate bias from underclassified RG.

Download Full-text

Analysis of Quality of Critical Thinking Skills Test Based on Item Response Theory Using R-Program

Psychology and Education Journal ◽

10.17762/pae.v58i1.867 ◽

2021 ◽

Vol 58 (1) ◽

pp. 1167-1174

Author(s):

Kaharuddin Arafah Et al.

Keyword(s):

Critical Thinking ◽

Fluid Mechanics ◽

Thinking Skills ◽

Critical Thinking Skills ◽

Thinking Skill ◽

Student Responses ◽

Test Items ◽

Skill Test ◽

3Pl Model

This research aims to describe the quality of critical thinking skill test of fluid mechanics material in senior high schools in Makassar City. The emphasis is the content validity aspect, the characteristics of each item for the One-Parameter Logistic (1PL) model, the Two-Parameter Logistic (2PL) model, dan the Three-Parameter Logistic (3PL) model. It is a descriptive quantitative research with the subject of all student responses to the critical thinking skills test of fluid mechanics material at senior high schools in Makassar City. There were 726 student responses. The data were collected online by google form and analyzed by using descriptive quantitative technique. The results showed that the critical thinking skill test of fluid mechanics material had fulfilled the content validity. Analysis of the quality of CTS test items showed that the models of 1PL, 2PL, and 3PL were all consistent, showing that the test items were mostly able to discriminate the high abilitytestees from the low abilitytestees. Of the three logistic model approaches used to estimate the parameters of item difficulty level, item discrimination, and guess factor, the 3PL model is better than the other twomodels.

Download Full-text

Quantifying Individuals’ Theory-Based Knowledge Using Probabilistic Causal Graphs: A Bayesian Hierarchical Approach

Volume 3: 17th International Conference on Design Education (DEC) ◽

10.1115/detc2020-22613 ◽

2020 ◽

Author(s):

Atharva Hans ◽

Ashish M. Chaudhari ◽

Ilias Bilionis ◽

Jitesh H. Panchal

Keyword(s):

Engineering Students ◽

Likelihood Function ◽

Computational Design ◽

Directed Acyclic Graphs ◽

Causal Knowledge ◽

Bayesian Hierarchical ◽

Proposed Model ◽

Significant Research ◽

Ability Parameter ◽

3Pl Model

Abstract Extracting an individual’s knowledge structure is a challenging task as it requires formalization of many concepts and their interrelationships. While there has been significant research on how to represent knowledge to support computational design tasks, there is limited understanding of the knowledge structures of human designers. This understanding is necessary for comprehension of cognitive tasks such as decision making and reasoning, and for improving educational programs. In this paper, we focus on quantifying theory-based causal knowledge, which is a specific type of knowledge held by human designers. We develop a probabilistic graph-based model for representing individuals’ concept-specific causal knowledge for a given theory. We propose a methodology based on probabilistic directed acyclic graphs (DAGs) that uses logistic likelihood function for calculating the probability of a correct response. The approach involves a set of questions for gathering responses from 205 engineering students, and a hierarchical Bayesian approach for inferring individuals’ DAGs from the observed responses. We compare the proposed model to a baseline three-parameter logistic (3PL) model from the item response theory. The results suggest that the graph-based logistic model can estimate individual students’ knowledge graphs. Comparisons with the 3PL model indicate that knowledge assessment is more accurate when quantifying knowledge at the level of causal relations than quantifying it using a scalar ability parameter. The proposed model allows identification of parts of the curriculum that a student struggles with and parts they have already mastered which is essential for remediation.

Download Full-text

Comparison of Scale Identification Methods in Mixture IRT Models

Journal of Modern Applied Statistical Methods ◽

10.22237/jmasm/1556669700 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Youn-Jeng Choi ◽

Allan S. Cohen

Keyword(s):

Simulation Study ◽

Constraint Effect ◽

Identification Methods ◽

Irt Models ◽

Correct Model ◽

Mixture Irt ◽

3Pl Model

The effects of three scale identification constraints in mixture IRT models were studied. A simulation study found no constraint effect on the mixture Rasch and mixture 2PL models, but the item anchoring constraint was the only one that worked well on selecting correct model with the mixture 3PL model.

Download Full-text

The Power of Crossing SIBTEST

Applied Psychological Measurement ◽

10.1177/0146621620909907 ◽

2020 ◽

Vol 44 (5) ◽

pp. 393-408

Author(s):

Zhushan Li

Keyword(s):

Logistic Model ◽

Model Simulation ◽

Rejection Rate ◽

Theoretical Formula ◽

Test Statistic ◽

Item Functioning ◽

Alternative Hypotheses ◽

Size Standard ◽

Observed Rejection ◽

3Pl Model

Crossing SIBTEST or CSIB is designed to detect crossing differential item functioning (DIF) as well as unidirectional DIF. A theoretical formula for the power of CSIB is derived based on the asymptotic distribution of the test statistic under the null and alternative hypotheses. The derived power formula provides insights on the factors that influence the CSIB power, including DIF effect size, standard error, and sample size. The power formula and those influencing factors are further discussed in the context of the item response theory (IRT) three parameter logistic model (3PL) model. Simulation results show the consistency between the theoretical power and the observed rejection rate. The power of CSIB is compared with the unidirectional SIBTEST in theory and through simulation.

Download Full-text

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

Journal of Intelligence ◽

10.3390/jintelligence8010005 ◽

2020 ◽

Vol 8 (1) ◽

pp. 5 ◽

Cited By ~ 4

Author(s):

Paul-Christian Bürkner

Keyword(s):

Cognitive Ability ◽

Item Response ◽

Short Form ◽

Response Models ◽

Item Response Models ◽

Irt Models ◽

Dichotomous Responses ◽

Uncertainty Estimates ◽

Raven's Standard Progressive Matrices ◽

3Pl Model

Raven’s Standard Progressive Matrices (SPM) test and related matrix-based tests are widely applied measures of cognitive ability. Using Bayesian Item Response Theory (IRT) models, I reanalyzed data of an SPM short form proposed by Myszkowski and Storme (2018) and, at the same time, illustrate the application of these models. Results indicate that a three-parameter logistic (3PL) model is sufficient to describe participants dichotomous responses (correct vs. incorrect) while persons’ ability parameters are quite robust across IRT models of varying complexity. These conclusions are in line with the original results of Myszkowski and Storme (2018). Using Bayesian as opposed to frequentist IRT models offered advantages in the estimation of more complex (i.e., 3–4PL) IRT models and provided more sensible and robust uncertainty estimates.

Download Full-text

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

10.31234/osf.io/gn5hk ◽

2019 ◽

Author(s):

Paul - Christian Bürkner

Keyword(s):

Cognitive Ability ◽

Item Response ◽

Short Form ◽

Response Models ◽

Item Response Models ◽

Irt Models ◽

Dichotomous Responses ◽

Uncertainty Estimates ◽

Raven's Standard Progressive Matrices ◽

3Pl Model

Raven’s Standard Progressive Matrices (SPM) test and related matrix-based tests are widely applied measures of cognitive ability. Using Bayesian Item Response Theory (IRT) models, I reanalyse data of an SPM short form proposed by Myszkowski & Storme (2018) and, at the same time, illustrate the application of these models. Results indicate that a 3-parameter logistic (3PL) model is sufficient to describe participants dichotomous responses (correct vs. incorrect) while persons' ability parameters are quite robust across IRT models of varying complexity. These conclusions are in line with the original results of Myszkowski & Storme (2018). Using Bayesian as opposed to frequentist IRT models offered advantages in the estimation of more complex (i.e., 3-4PL) IRT models and provided more sensible and robust uncertainty estimates.

Download Full-text

3pl model
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

Analysis of Quality of Critical Thinking Skills Test Based on Item Response Theory Using R-Program

Quantifying Individuals’ Theory-Based Knowledge Using Probabilistic Causal Graphs: A Bayesian Hierarchical Approach

Comparison of Scale Identification Methods in Mixture IRT Models

The Power of Crossing SIBTEST

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

Export Citation Format

3pl modelRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications

Analysis of Quality of Critical Thinking Skills Test Based on Item Response Theory Using R-Program

Quantifying Individuals’ Theory-Based Knowledge Using Probabilistic Causal Graphs: A Bayesian Hierarchical Approach

Comparison of Scale Identification Methods in Mixture IRT Models

The Power of Crossing SIBTEST

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

Analysing Standard Progressive Matrices (SPM-LS) with Bayesian Item Response Models

3pl model
Recently Published Documents