A Reanalysis of the Survey-based Navigation Strategy Scale based on Item Response Theory

2018 · Author(s): Jimmy Y. Zhong

Focusing on the 12 allocentric/survey-based strategy items of the Navigation Strategy Questionnaire (Zhong & Kozhevnikov, 2016), the current study applied item response theory-based analyses to determine whether a bidimensional model could better describe the latent structure of the survey-based strategy. Results from item and model fit diagnostics, categorical response curves, and item information curves showed that the item with the lowest rotated component loading (.27), SURVEY12, could be considered for exclusion in future studies, and that a bidimensional model with three preference-related items constituting a content factor represented the latent structure better than a unidimensional model. Mean scores from these three items also correlated significantly with performance on a pointing-to-landmarks task, at a magnitude comparable to that of the mean scores from all items, and from all items excluding SURVEY12. These findings provide early evidence that the three preference-related items could constitute a subscale for deriving quick estimates of large-scale allocentric spatial processing in healthy adults in both experimental and clinical settings. Potential cognitive and brain mechanisms are discussed, followed by calls for future studies to gather further evidence on the predictive validity of the full scale and subscale, and to design new items focusing on environmental familiarity. [COPYRIGHT CC-BY-NC-ND 4.0 J. Y. ZHONG 2018]. AUTHOR'S NOTE: Officially published as "Reanalysis of an Allocentric Navigation Strategy Scale based on Item Response Theory"
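For readers unfamiliar with the curves referenced above, the sketch below (illustrative Python, not the study's code) computes category response curves and item information curves under Samejima's graded response model, the usual IRT model for Likert-type strategy items; the discrimination and threshold values are hypothetical, not the paper's estimates.

```python
# A minimal sketch of category response and item information curves under
# Samejima's graded response model (GRM). Parameter values are illustrative.
import numpy as np

def grm_boundaries(theta, a, b):
    """Boundary curves P*(X >= k), padded with P*_0 = 1 and P*_K = 0."""
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - np.asarray(b)[None, :])))
    return np.hstack([np.ones((len(theta), 1)), p_star, np.zeros((len(theta), 1))])

def grm_category_probs(theta, a, b):
    """Category probabilities P(X = k | theta) = P*_{k-1} - P*_k."""
    p_star = grm_boundaries(theta, a, b)
    return p_star[:, :-1] - p_star[:, 1:]

def grm_item_information(theta, a, b):
    """Samejima's item information: a^2 * sum_k (W_{k-1} - W_k)^2 / P_k,
    where W_k = P*_k (1 - P*_k)."""
    p_star = grm_boundaries(theta, a, b)
    w = p_star * (1.0 - p_star)
    p_cat = p_star[:, :-1] - p_star[:, 1:]
    return (a ** 2) * np.sum((w[:, :-1] - w[:, 1:]) ** 2
                             / np.clip(p_cat, 1e-12, None), axis=1)

theta = np.linspace(-4, 4, 161)
# Hypothetical items: a weakly discriminating item (like SURVEY12) vs a strong one.
info_weak = grm_item_information(theta, a=0.5, b=[-1.5, 0.0, 1.5])
info_strong = grm_item_information(theta, a=2.0, b=[-1.5, 0.0, 1.5])
print(info_weak.max(), info_strong.max())  # the weak item contributes little information
```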


2017 · Vol 78 (5) · pp. 805-825 · Author(s): Dimiter M. Dimitrov

This article presents new developments in the methodology of an approach to scoring and equating tests with binary items, referred to as delta scoring (D-scoring), which is being piloted with large-scale assessments at the National Center for Assessment in Saudi Arabia. The presentation builds on previous work on delta scoring and adds procedures for scaling and equating, an item response function, and estimation of true values and standard errors of D scores. Also, unlike previous work on this topic, in which D-scoring involved estimates of item and person parameters in the framework of item response theory, the approach presented here does not require item response theory calibration.
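As a rough illustration of difficulty-weighted scoring without IRT calibration, the hedged sketch below computes a weighted proportion-correct score with item weights taken as delta_i = 1 - p_i (one minus the classical proportion correct). This weighting rule is an assumption made for illustration, not a reproduction of Dimitrov's exact procedures.

```python
# A hedged sketch in the spirit of D-scoring: each binary item is weighted by
# an estimated difficulty delta_i, and a person's D score is the weighted
# proportion of items answered correctly. Taking delta_i = 1 - p_i is an
# illustrative assumption, not necessarily Dimitrov's exact rule.
import numpy as np

def d_scores(responses):
    """responses: (n_persons, n_items) 0/1 matrix. Returns D scores in [0, 1]."""
    p = responses.mean(axis=0)     # classical item facility (proportion correct)
    delta = 1.0 - p                # harder items receive larger weights
    return responses @ delta / delta.sum()

rng = np.random.default_rng(0)
X = (rng.random((500, 40)) < rng.uniform(0.3, 0.9, 40)).astype(int)
print(d_scores(X)[:5])             # one score per examinee, no IRT calibration needed
```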


2018 · Vol 22 (2) · pp. 130-142 · Author(s): Thomas Mbenu Nulangi, Djemari Mardapi

This study aimed to describe (1) the characteristics of the items based on item response theory; (2) the level of cheating in the implementation of the national examination, based on Angoff's B-Index method, the Pair 1 method, the Pair 2 method, the Modified Error Similarity Analysis (MESA) method, and the G2 method; and (3) the most accurate method for detecting cheating on the senior secondary school mathematics national examination in the 2015/2016 academic year in East Nusa Tenggara Province. The item response theory analysis showed that 17 (42.5%) items of the mathematics national examination fit the 3-PL model, with a maximum information function of 58.0128 at θ = 1.6 and a measurement error of 0.1313. The number of pairs detected as cheating was 63 by Angoff's B-Index method, 52 by the Pair 1 method, 141 by the Pair 2 method, 67 by the MESA method, and 183 by the G2 method. Ordered by the number of cheating pairs detected, the methods were the G2 method, the Pair 2 method, the MESA method, Angoff's B-Index method, and the Pair 1 method. Ordered by accuracy of cheating detection, based on the computed standard errors, the methods were Angoff's B-Index method, the G2 method, the MESA method, the Pair 1 method, and the Pair 2 method.
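A common ingredient of the pair-based indices named above is a count of identical incorrect answers (exact errors in common) for each examinee pair. The sketch below illustrates that ingredient with a crude flagging threshold; it is not the exact B-Index, Pair, MESA, or G2 computation.

```python
# A minimal sketch of the shared ingredient of pair-based copying indices such
# as Angoff's B-Index: counting exact errors in common (EEIC) per examinee
# pair. The flagging threshold below is a simple illustration only.
import numpy as np
from itertools import combinations

def exact_errors_in_common(answers, key):
    """answers: (n_persons, n_items) chosen options; key: (n_items,) correct options.
    Returns {(i, j): EEIC count} for every examinee pair."""
    wrong = answers != key                     # True where the answer is incorrect
    eeic = {}
    for i, j in combinations(range(answers.shape[0]), 2):
        shared = wrong[i] & wrong[j] & (answers[i] == answers[j])
        eeic[(i, j)] = int(shared.sum())
    return eeic

rng = np.random.default_rng(1)
key = rng.integers(0, 5, 40)                   # 40 five-option items
answers = rng.integers(0, 5, (30, 40))
counts = exact_errors_in_common(answers, key)
mean, sd = np.mean(list(counts.values())), np.std(list(counts.values()))
flagged = [pair for pair, c in counts.items() if c > mean + 3 * sd]  # crude screen
print(len(flagged), "suspect pair(s)")
```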


2021 · Vol 6 (1) · pp. 93 · Author(s): Rahmat Danni, Ajeng Wahyuni, Tauratiya Tauratiya

This study describes the characteristics of the Arabic final-semester examination items at MAN 1 Pangkalpinang using an item response theory approach. The problem motivating this research is that the Arabic final assessment items were not developed through the proper stages. The study is quantitative; its subjects were 176 grade XI students of MAN 1 Pangkalpinang. The data consist of responses to the Arabic final-semester examination, comprising 40 multiple-choice items with five options each. The results showed that the Arabic final-semester examination items (1) proved valid, as indicated by all 40 items (100%) having acceptable loading factors; (2) proved reliable, as indicated by a reliability coefficient of 0.884; (3) included 33 items (82.5%) of the 40 with a good level of difficulty and discriminating power, which can be stored in the question bank and reused, while 7 items (17.5%), namely items 10, 26, 27, 29, 32, 34, and 35, do not meet the criteria for a good level of difficulty and need to be revised or eliminated; and (4) are suitable for students with low to moderate ability (θ) in the range of -3.5 to +1.5 logits. Future research is expected to analyze Arabic language items in the form of large-scale descriptive (essay-type) tests, or to develop high-quality higher-order thinking skills items in Arabic.
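The sketch below illustrates the kind of difficulty/discrimination screen described, using classical indices (proportion correct and corrected item-total point-biserial correlation). The acceptance bands are common rules of thumb, not the cutoffs used in this study.

```python
# A sketch of a classical item screen: difficulty = proportion correct,
# discrimination = corrected item-total point-biserial. The bands
# (0.3-0.7 difficulty, r >= 0.2) are illustrative rules of thumb.
import numpy as np

def item_screen(scores, diff_band=(0.3, 0.7), min_disc=0.2):
    """scores: (n_students, n_items) 0/1 matrix. Returns (p, disc, keep flags)."""
    p = scores.mean(axis=0)                              # item difficulty
    total = scores.sum(axis=1)
    disc = np.empty(scores.shape[1])
    for i in range(scores.shape[1]):
        rest = total - scores[:, i]                      # corrected item-total score
        disc[i] = np.corrcoef(scores[:, i], rest)[0, 1]  # point-biserial correlation
    keep = (p >= diff_band[0]) & (p <= diff_band[1]) & (disc >= min_disc)
    return p, disc, keep

rng = np.random.default_rng(2)
scores = (rng.random((176, 40)) < 0.6).astype(int)       # 176 students, 40 items (as above)
p, disc, keep = item_screen(scores)
print(f"{keep.sum()} of 40 items pass the screen")
```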


Author(s): Dani Gamerman, Tufi M. Soares, Flávio Gonçalves

This article discusses the use of a Bayesian model that incorporates differential item functioning (DIF) in analysing whether cultural differences may affect the performance of students from different countries on the various test items which make up the OECD's Programme for International Student Assessment (PISA) test of mathematics ability. The PISA tests in mathematics and other subjects are used to compare the educational attainment of fifteen-year-old students in different countries. The article first provides background on PISA, DIF, and item response theory (IRT) before describing a hierarchical three-parameter logistic model for the probability of a correct response on an individual item, used to determine the extent of DIF remaining in the mathematics test of 2003. The results of the Bayesian analysis illustrate the importance of appropriately accounting for all sources of heterogeneity present in educational testing and highlight the advantages of the Bayesian paradigm when applied to large-scale educational assessment.
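The response function at the core of such a model can be written as a three-parameter logistic with a group-specific difficulty shift; the sketch below shows that function with illustrative parameter values. The hierarchical priors and Bayesian estimation the article relies on are omitted, and the notation is this sketch's, not the article's.

```python
# A minimal sketch of a DIF-augmented 3PL response function: the difficulty of
# an item is shifted by a group (country) specific amount d_group, and DIF
# corresponds to d_group differing across groups. Values are illustrative.
import numpy as np

def p_correct_3pl_dif(theta, a, b, c, d_group):
    """P(correct) = c + (1 - c) * logistic(a * (theta - b - d_group))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b - d_group)))

theta = 0.5                 # student ability
a, b, c = 1.2, 0.0, 0.2     # discrimination, difficulty, guessing
for country, d in {"A": 0.0, "B": 0.6}.items():   # the item is harder in country B
    print(country, round(p_correct_3pl_dif(theta, a, b, c, d), 3))
```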


2019 · Vol 45 (4) · pp. 383-402 · Author(s): Paul A. Jewsbury, Peter W. van Rijn

For large-scale educational assessment data consistent with a simple-structure multidimensional item response theory (MIRT) model, in which every item measures only one latent variable, separate unidimensional item response theory (UIRT) models for each latent variable are often calibrated for practical reasons. While this approach can be valid for data from a linear test, unacceptable item parameter estimates are obtained when the data arise from a multistage test (MST). We explore this situation from a missing-data perspective and show mathematically that MST data are problematic for calibrating multiple UIRT models but not MIRT models. This occurs because some items used in the routing decision are excluded from each separate UIRT model, because they measure a different latent variable. Both simulated and real data from the National Assessment of Educational Progress are used to further confirm and explore the unacceptable item parameter estimates. The theoretical and empirical results confirm that only MIRT models are valid for item calibration with multidimensional MST data.
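A toy simulation (not the authors' study design) can make the missing-data argument concrete: when routing depends on stage-1 items from both dimensions, a dimension-2-only UIRT calibration discards the dimension-1 routing items and leaves nonignorable missingness behind.

```python
# A toy simulation of the MST missing-data problem: routing to the hard
# stage-2 module depends on stage-1 items from BOTH latent dimensions, so a
# separate UIRT calibration of dimension-2 items cannot condition on the
# dimension-1 routing items that drove the selection.
import numpy as np

rng = np.random.default_rng(3)
n = 5000
# Two correlated latent traits; simple structure (each item loads on one trait).
theta = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=n)

def sim_items(t, n_items, b):
    """Rasch-type 0/1 responses for n_items of difficulty b."""
    return (rng.random((len(t), n_items)) < 1 / (1 + np.exp(-(t[:, None] - b)))).astype(float)

stage1_d1 = sim_items(theta[:, 0], 5, 0.0)   # stage-1 items measuring dimension 1
stage1_d2 = sim_items(theta[:, 1], 5, 0.0)   # stage-1 items measuring dimension 2
route_hard = (stage1_d1.sum(1) + stage1_d2.sum(1)) >= 6   # routing uses ALL stage-1 items

stage2_d2 = sim_items(theta[:, 1], 5, 1.0)   # hard dimension-2 module
stage2_d2[~route_hard] = np.nan              # unrouted examinees never see it

# The observed stage-2 data are selected on dimension-1 items as well, which a
# dimension-2-only UIRT model cannot condition on:
obs = ~np.isnan(stage2_d2[:, 0])
print("P(correct on a hard d2 item | routed):", np.nanmean(stage2_d2[:, 0]))
print("mean d1 stage-1 score, routed vs not:",
      stage1_d1[obs].sum(1).mean(), stage1_d1[~obs].sum(1).mean())
```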

