A Reanalysis of the Survey-based Navigation Strategy Scale based on Item Response Theory

2018 · Author(s): Jimmy Y. Zhong

Focusing on the 12 allocentric/survey-based strategy items of the Navigation Strategy Questionnaire (Zhong & Kozhevnikov, 2016), the current study applied item response theory-based analyses to determine whether a bidimensional model could better describe the latent structure of the survey-based strategy. Results from item and model fit diagnostics, categorical response curves, and item information curves showed that the item with the lowest rotated component loading (.27), SURVEY12, could be considered for exclusion in future studies, and that a bidimensional model with three preference-related items constituting a content factor represented the latent structure better than a unidimensional model. Mean scores from these three items also correlated significantly with performance on a pointing-to-landmarks task, at a magnitude comparable to that of the mean scores from all items, and from all items excluding SURVEY12. These findings provide early evidence that the three preference-related items could constitute a subscale for deriving quick estimates of large-scale allocentric spatial processing in healthy adults in both experimental and clinical settings. Potential cognitive and brain mechanisms are discussed, followed by calls for future studies to gather further evidence on the predictive validity of the full scale and subscale, and to design new items focusing on environmental familiarity. [COPYRIGHT CC-BY-NC-ND 4.0 J. Y. ZHONG 2018]. AUTHOR'S NOTE: Officially published as "Reanalysis of an Allocentric Navigation Strategy Scale based on Item Response Theory"
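For readers unfamiliar with the curves referenced above, the sketch below (illustrative Python, not the study's code) computes category response curves and item information curves under Samejima's graded response model, the usual IRT model for Likert-type strategy items; the discrimination and threshold values are hypothetical, not the paper's estimates.

```python
# A minimal sketch of category response and item information curves under
# Samejima's graded response model (GRM). Parameter values are illustrative.
import numpy as np

def grm_boundaries(theta, a, b):
    """Boundary curves P*(X >= k), padded with P*_0 = 1 and P*_K = 0."""
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - np.asarray(b)[None, :])))
    return np.hstack([np.ones((len(theta), 1)), p_star, np.zeros((len(theta), 1))])

def grm_category_probs(theta, a, b):
    """Category probabilities P(X = k | theta) = P*_{k-1} - P*_k."""
    p_star = grm_boundaries(theta, a, b)
    return p_star[:, :-1] - p_star[:, 1:]

def grm_item_information(theta, a, b):
    """Samejima's item information: a^2 * sum_k (W_{k-1} - W_k)^2 / P_k,
    where W_k = P*_k (1 - P*_k)."""
    p_star = grm_boundaries(theta, a, b)
    w = p_star * (1.0 - p_star)
    p_cat = p_star[:, :-1] - p_star[:, 1:]
    return (a ** 2) * np.sum((w[:, :-1] - w[:, 1:]) ** 2
                             / np.clip(p_cat, 1e-12, None), axis=1)

theta = np.linspace(-4, 4, 161)
# Hypothetical items: a weakly discriminating item (like SURVEY12) vs a strong one.
info_weak = grm_item_information(theta, a=0.5, b=[-1.5, 0.0, 1.5])
info_strong = grm_item_information(theta, a=2.0, b=[-1.5, 0.0, 1.5])
print(info_weak.max(), info_strong.max())  # the weak item contributes little information
```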


2017 · Vol 78 (5) · pp. 805-825 · Author(s): Dimiter M. Dimitrov

This article presents new developments in the methodology of an approach to scoring and equating tests with binary items, referred to as delta scoring (D-scoring), which is being piloted with large-scale assessments at the National Center for Assessment in Saudi Arabia. The presentation builds on previous work on delta scoring and adds procedures for scaling and equating, an item response function, and estimation of true values and standard errors of D scores. Also, unlike previous work on this topic, in which D-scoring involved estimates of item and person parameters in the framework of item response theory, the approach presented here does not require item response theory calibration.
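As a rough illustration of difficulty-weighted scoring without IRT calibration, the hedged sketch below computes a weighted proportion-correct score with item weights taken as delta_i = 1 - p_i (one minus the classical proportion correct). This weighting rule is an assumption made for illustration, not a reproduction of Dimitrov's exact procedures.

```python
# A hedged sketch in the spirit of D-scoring: each binary item is weighted by
# an estimated difficulty delta_i, and a person's D score is the weighted
# proportion of items answered correctly. Taking delta_i = 1 - p_i is an
# illustrative assumption, not necessarily Dimitrov's exact rule.
import numpy as np

def d_scores(responses):
    """responses: (n_persons, n_items) 0/1 matrix. Returns D scores in [0, 1]."""
    p = responses.mean(axis=0)     # classical item facility (proportion correct)
    delta = 1.0 - p                # harder items receive larger weights
    return responses @ delta / delta.sum()

rng = np.random.default_rng(0)
X = (rng.random((500, 40)) < rng.uniform(0.3, 0.9, 40)).astype(int)
print(d_scores(X)[:5])             # one score per examinee, no IRT calibration needed
```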


2018 · Vol 22 (2) · pp. 130-142 · Author(s): Thomas Mbenu Nulangi, Djemari Mardapi

This study aimed to describe (1) the characteristics of the items based on item response theory; (2) the level of cheating in the implementation of the national examination, based on Angoff's B-Index method, the Pair 1 method, the Pair 2 method, the Modified Error Similarity Analysis (MESA) method, and the G2 method; and (3) the most accurate method for detecting cheating on the senior secondary school mathematics national examination in the 2015/2016 academic year in East Nusa Tenggara Province. The item response theory analysis showed that 17 (42.5%) items of the mathematics national examination fit the 3-PL model, with a maximum information function of 58.0128 at θ = 1.6 and a measurement error of 0.1313. The number of pairs detected as cheating was 63 by Angoff's B-Index method, 52 by the Pair 1 method, 141 by the Pair 2 method, 67 by the MESA method, and 183 by the G2 method. Ordered by the number of cheating pairs detected, the methods were the G2 method, the Pair 2 method, the MESA method, Angoff's B-Index method, and the Pair 1 method. Ordered by accuracy of cheating detection, based on the computed standard errors, the methods were Angoff's B-Index method, the G2 method, the MESA method, the Pair 1 method, and the Pair 2 method.
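A common ingredient of the pair-based indices named above is a count of identical incorrect answers (exact errors in common) for each examinee pair. The sketch below illustrates that ingredient with a crude flagging threshold; it is not the exact B-Index, Pair, MESA, or G2 computation.

```python
# A minimal sketch of the shared ingredient of pair-based copying indices such
# as Angoff's B-Index: counting exact errors in common (EEIC) per examinee
# pair. The flagging threshold below is a simple illustration only.
import numpy as np
from itertools import combinations

def exact_errors_in_common(answers, key):
    """answers: (n_persons, n_items) chosen options; key: (n_items,) correct options.
    Returns {(i, j): EEIC count} for every examinee pair."""
    wrong = answers != key                     # True where the answer is incorrect
    eeic = {}
    for i, j in combinations(range(answers.shape[0]), 2):
        shared = wrong[i] & wrong[j] & (answers[i] == answers[j])
        eeic[(i, j)] = int(shared.sum())
    return eeic

rng = np.random.default_rng(1)
key = rng.integers(0, 5, 40)                   # 40 five-option items
answers = rng.integers(0, 5, (30, 40))
counts = exact_errors_in_common(answers, key)
mean, sd = np.mean(list(counts.values())), np.std(list(counts.values()))
flagged = [pair for pair, c in counts.items() if c > mean + 3 * sd]  # crude screen
print(len(flagged), "suspect pair(s)")
```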


2021 · Vol 6 (1) · pp. 93 · Author(s): Rahmat Danni, Ajeng Wahyuni, Tauratiya Tauratiya

This study describes the characteristics of the Arabic final-semester examination items at MAN 1 Pangkalpinang using an item response theory approach. The problem motivating this research is that the Arabic final assessment items were not developed through the proper stages. The study is quantitative; its subjects were 176 grade XI students of MAN 1 Pangkalpinang. The data consist of responses to the Arabic final-semester examination, comprising 40 multiple-choice items with five options each. The results showed that the Arabic final-semester examination items (1) proved valid, as indicated by all 40 items (100%) having acceptable loading factors; (2) proved reliable, as indicated by a reliability coefficient of 0.884; (3) included 33 items (82.5%) of the 40 with a good level of difficulty and discriminating power, which can be stored in the question bank and reused, while 7 items (17.5%), namely items 10, 26, 27, 29, 32, 34, and 35, do not meet the criteria for a good level of difficulty and need to be revised or eliminated; and (4) are suitable for students with low to moderate ability (θ) in the range of -3.5 to +1.5 logits. Future research is expected to analyze Arabic language items in the form of large-scale descriptive (essay-type) tests, or to develop high-quality higher-order thinking skills items in Arabic.
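The sketch below illustrates the kind of difficulty/discrimination screen described, using classical indices (proportion correct and corrected item-total point-biserial correlation). The acceptance bands are common rules of thumb, not the cutoffs used in this study.

```python
# A sketch of a classical item screen: difficulty = proportion correct,
# discrimination = corrected item-total point-biserial. The bands
# (0.3-0.7 difficulty, r >= 0.2) are illustrative rules of thumb.
import numpy as np

def item_screen(scores, diff_band=(0.3, 0.7), min_disc=0.2):
    """scores: (n_students, n_items) 0/1 matrix. Returns (p, disc, keep flags)."""
    p = scores.mean(axis=0)                              # item difficulty
    total = scores.sum(axis=1)
    disc = np.empty(scores.shape[1])
    for i in range(scores.shape[1]):
        rest = total - scores[:, i]                      # corrected item-total score
        disc[i] = np.corrcoef(scores[:, i], rest)[0, 1]  # point-biserial correlation
    keep = (p >= diff_band[0]) & (p <= diff_band[1]) & (disc >= min_disc)
    return p, disc, keep

rng = np.random.default_rng(2)
scores = (rng.random((176, 40)) < 0.6).astype(int)       # 176 students, 40 items (as above)
p, disc, keep = item_screen(scores)
print(f"{keep.sum()} of 40 items pass the screen")
```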


Author(s): Dani Gamerman, Tufi M. Soares, Flávio Gonçalves

This article discusses the use of a Bayesian model that incorporates differential item functioning (DIF) in analysing whether cultural differences may affect the performance of students from different countries on the various test items which make up the OECD's Programme for International Student Assessment (PISA) test of mathematics ability. The PISA tests in mathematics and other subjects are used to compare the educational attainment of fifteen-year-old students in different countries. The article first provides background on PISA, DIF, and item response theory (IRT) before describing a hierarchical three-parameter logistic model for the probability of a correct response on an individual item, used to determine the extent of DIF remaining in the mathematics test of 2003. The results of the Bayesian analysis illustrate the importance of appropriately accounting for all sources of heterogeneity present in educational testing and highlight the advantages of the Bayesian paradigm when applied to large-scale educational assessment.
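The response function at the core of such a model can be written as a three-parameter logistic with a group-specific difficulty shift; the sketch below shows that function with illustrative parameter values. The hierarchical priors and Bayesian estimation the article relies on are omitted, and the notation is this sketch's, not the article's.

```python
# A minimal sketch of a DIF-augmented 3PL response function: the difficulty of
# an item is shifted by a group (country) specific amount d_group, and DIF
# corresponds to d_group differing across groups. Values are illustrative.
import numpy as np

def p_correct_3pl_dif(theta, a, b, c, d_group):
    """P(correct) = c + (1 - c) * logistic(a * (theta - b - d_group))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b - d_group)))

theta = 0.5                 # student ability
a, b, c = 1.2, 0.0, 0.2     # discrimination, difficulty, guessing
for country, d in {"A": 0.0, "B": 0.6}.items():   # the item is harder in country B
    print(country, round(p_correct_3pl_dif(theta, a, b, c, d), 3))
```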


2019 · Vol 45 (4) · pp. 383-402 · Author(s): Paul A. Jewsbury, Peter W. van Rijn

For large-scale educational assessment data consistent with a simple-structure multidimensional item response theory (MIRT) model, in which every item measures only one latent variable, separate unidimensional item response theory (UIRT) models for each latent variable are often calibrated for practical reasons. While this approach can be valid for data from a linear test, unacceptable item parameter estimates are obtained when the data arise from a multistage test (MST). We explore this situation from a missing-data perspective and show mathematically that MST data are problematic for calibrating multiple UIRT models but not MIRT models. This occurs because some items used in the routing decision are excluded from each separate UIRT model, because they measure a different latent variable. Both simulated and real data from the National Assessment of Educational Progress are used to further confirm and explore the unacceptable item parameter estimates. The theoretical and empirical results confirm that only MIRT models are valid for item calibration with multidimensional MST data.
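A toy simulation (not the authors' study design) can make the missing-data argument concrete: when routing depends on stage-1 items from both dimensions, a dimension-2-only UIRT calibration discards the dimension-1 routing items and leaves nonignorable missingness behind.

```python
# A toy simulation of the MST missing-data problem: routing to the hard
# stage-2 module depends on stage-1 items from BOTH latent dimensions, so a
# separate UIRT calibration of dimension-2 items cannot condition on the
# dimension-1 routing items that drove the selection.
import numpy as np

rng = np.random.default_rng(3)
n = 5000
# Two correlated latent traits; simple structure (each item loads on one trait).
theta = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=n)

def sim_items(t, n_items, b):
    """Rasch-type 0/1 responses for n_items of difficulty b."""
    return (rng.random((len(t), n_items)) < 1 / (1 + np.exp(-(t[:, None] - b)))).astype(float)

stage1_d1 = sim_items(theta[:, 0], 5, 0.0)   # stage-1 items measuring dimension 1
stage1_d2 = sim_items(theta[:, 1], 5, 0.0)   # stage-1 items measuring dimension 2
route_hard = (stage1_d1.sum(1) + stage1_d2.sum(1)) >= 6   # routing uses ALL stage-1 items

stage2_d2 = sim_items(theta[:, 1], 5, 1.0)   # hard dimension-2 module
stage2_d2[~route_hard] = np.nan              # unrouted examinees never see it

# The observed stage-2 data are selected on dimension-1 items as well, which a
# dimension-2-only UIRT model cannot condition on:
obs = ~np.isnan(stage2_d2[:, 0])
print("P(correct on a hard d2 item | routed):", np.nanmean(stage2_d2[:, 0]))
print("mean d1 stage-1 score, routed vs not:",
      stage1_d1[obs].sum(1).mean(), stage1_d1[~obs].sum(1).mean())
```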

