How does Rasch modeling reveal difficulty and suitability level the fraction test question?

Jurnal Elemen ◽  
2022 ◽  
Vol 8 (1) ◽  
pp. 66-76
Author(s):  
Karlimah Karlimah

This article explains how to analyze test items on arithmetic operations with fractions to obtain the items' level of difficulty and fit. Data were collected using multiple-choice questions given to 50 fourth-grade students at an elementary school in Tasikmalaya city. The answers were then analyzed with the Rasch model in the Winsteps 3.75 application, using a combination of the standard deviation (SD) and the mean logit value (Mean). The score data for each person and question were used to estimate the true score on the logit scale, indicating the level of difficulty of the test items. The categories were difficult (logit value +1 SD); very difficult (0.0 logit +1 SD); easy (0.0 logit -1 SD); very easy (logit value –SD). Three criteria were used to determine the level of difficulty and fit of the questions: the Outfit Z-Standard (ZSTD) value, the Outfit Mean Square (MNSQ), and the Point Measure Correlation. The analysis produced a collection of test items suitable for use at several levels of difficulty, namely difficult, very difficult, easy, and very easy, whereas the previous items had been categorized as difficult, medium, and easy. The Rasch model can help categorize questions and students' ability levels.
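The SD-based banding described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' Winsteps procedure: it assumes the cut points are 0.0 logit and plus or minus one SD of the item logits, with the hardest band above +1 SD and the easiest below -1 SD. The function name, item names, and logit values are all hypothetical.

```python
from statistics import stdev

def categorize_item(logit: float, sd: float) -> str:
    """Band one item difficulty (in logits) using 0.0 and +/- 1 SD as cut points.

    Assumed reading: above +1 SD is the hardest band, below -1 SD the easiest.
    """
    if logit > sd:
        return "very difficult"
    if logit >= 0.0:
        return "difficult"
    if logit >= -sd:
        return "easy"
    return "very easy"

# Made-up item difficulty estimates (logits), for illustration only.
item_logits = {"Q1": 1.8, "Q2": 0.4, "Q3": -0.3, "Q4": -1.9}

# SD of the item logits; in practice this would come from the Rasch software output.
sd = stdev(list(item_logits.values()))

categories = {name: categorize_item(b, sd) for name, b in item_logits.items()}
print(categories)
```

With real data, the logits and their SD would be read from the item measure table rather than typed in by hand.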

2021 ◽  
Vol 12 (2) ◽  
pp. 399-412
Author(s):  
Adilla Desy Rizbudiani ◽  
Amat Jaedun ◽  
Abdul Rahim ◽  
Arief Nurrahman

A high-quality test has a balanced level of difficulty and can be completed by respondents according to their level of ability. This study analyzed the test instrument used to measure students' mathematics abilities in the semester final exam on Systems of Linear Equations in Two Variables. Purposive sampling was applied to select the respondent students (N=195). The test consisted of twenty multiple-choice items. The researchers analyzed the data using the Rasch model Item Response Theory (IRT) approach with the QUEST program. The analysis revealed that all twenty items fit the Rasch model, with INFIT MNSQ values ranging from 0.89 to 1.17. Items on the final semester exam can be used if their estimated OUTFIT t-value is less than or equal to 2.00. The OUTFIT t analysis yielded nineteen qualified items and one unqualified item.
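As a rough illustration of what the INFIT and OUTFIT mean-square statistics measure, the sketch below computes both for a single dichotomous item from Rasch residuals. It is not the QUEST implementation; the function names and the sample data are hypothetical, and the person abilities and item difficulty are assumed to be already estimated.

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Return (infit MNSQ, outfit MNSQ) for one dichotomous item.

    outfit = mean of squared standardized residuals (unweighted)
    infit  = sum of squared residuals / sum of model variances (information-weighted)
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        variances.append(p * (1.0 - p))
        squared_residuals.append((x - p) ** 2)
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Made-up abilities (logits) and scored responses for five respondents on one item.
thetas = [-1.5, -0.5, 0.0, 0.5, 1.5]
responses = [0, 0, 1, 1, 1]
infit, outfit = item_fit(responses, thetas, b=0.0)
print(f"infit MNSQ = {infit:.2f}, outfit MNSQ = {outfit:.2f}")
```

Values near 1.0 indicate responses about as noisy as the model expects; the t-standardized versions reported by Rasch software apply a further transformation that is not shown here.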


Author(s):  
Nina Wahyuni ◽  
Yoga Budi Bhakti ◽  
Tatan Zenal Mutakin ◽  
Irnin Agustina Dwi Astuti

Students' difficulties in understanding concepts need to be identified so that improvements can be made and students can eventually master the concepts well. Misconception diagnostic tests can be an alternative way to uncover the causes or factors of students' misconceptions. This study aims to develop a four-tier diagnostic test instrument to diagnose students' misconceptions about circular motion. This research is a development study using the 4D model. The developed four-tier diagnostic test instrument consists of 40 multiple-choice questions. Content validation by five experts rated the instrument as very feasible, with an average percentage of 86.34%. Empirical trial results show 40 valid diagnostic test items with adequate-to-good levels of difficulty and discrimination power. The reliability of the questions, computed with the Kuder-Richardson formula, was 0.785, which falls in the high category. This instrument can reduce the time spent identifying misconceptions because it uses a multiple-choice format with varied questions.
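The abstract does not say which Kuder-Richardson formula was used. As a hedged sketch, the widely used KR-20 coefficient for dichotomously scored items can be computed as below; the function name and the tiny response matrix are hypothetical, and population variance is assumed for the total scores.

```python
from statistics import pvariance

def kr20(scores):
    """KR-20 reliability for a matrix of 0/1 item scores (rows = respondents).

    KR-20 = k / (k - 1) * (1 - sum(p_i * q_i) / variance of total scores)
    """
    n_respondents = len(scores)
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    total_variance = pvariance(totals)  # assumption: population variance
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_respondents  # proportion correct
        sum_pq += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - sum_pq / total_variance)

# Made-up scored responses: 4 respondents x 3 items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(scores))  # 0.75 for this toy matrix
```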


2019 ◽  
Vol 5 (2) ◽  
pp. 69-76
Author(s):  
Ridho Adi Negoro ◽  
Viga Karina

Students' difficulties in understanding concepts need to be identified so that improvements can be made and students can eventually master the concepts well. Misconception diagnostic tests can be an alternative way to uncover the causes or factors of students' misconceptions. This study aims to develop a four-tier diagnostic test instrument to diagnose students' misconceptions about vibrations and waves by adapting the Mechanical Waves Conceptual Survey by Thongchai, whose content is compatible with the segmentation of the Indonesian curriculum material. This research is a development study using the 3D development model, which consists of three stages: 1) Define, 2) Design, and 3) Develop. The developed four-tier diagnostic test instrument consists of 22 multiple-choice questions. Content validation by three experts rated the instrument as very feasible, with an average percentage of 87%. Empirical trial results show 22 valid diagnostic test items with adequate-to-good levels of difficulty and discrimination power. The reliability of the questions, computed with the Kuder-Richardson formula, was 0.765, which falls in the high category. This instrument can reduce the time spent identifying misconceptions because it uses a multiple-choice format with varied questions.
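The "level of difficulty" and "discrimination power" mentioned here are classical item statistics. The sketch below shows one common way to compute them, assuming the difficulty index is the proportion of correct answers and the discrimination index is the difference in that proportion between upper and lower scoring groups (27% each by default). The abstract does not state the exact procedure, so the function name, the grouping fraction, and the data are assumptions.

```python
def item_indices(scores, item, group_fraction=0.27):
    """Return (difficulty index, discrimination index) for one 0/1-scored item.

    difficulty     = proportion of all respondents answering correctly
    discrimination = proportion correct in upper group - proportion in lower group
    """
    n = len(scores)
    ranked = sorted(scores, key=sum, reverse=True)  # highest total score first
    group_size = max(1, round(n * group_fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    difficulty = sum(row[item] for row in scores) / n
    p_upper = sum(row[item] for row in upper) / group_size
    p_lower = sum(row[item] for row in lower) / group_size
    return difficulty, p_upper - p_lower

# Made-up scored responses: 4 respondents x 3 items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
difficulty, discrimination = item_indices(scores, item=1)
print(difficulty, discrimination)
```

With only four respondents each group holds a single person, so the discrimination value is coarse; real analyses use far larger samples.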


2019 ◽  
Vol 16 (2) ◽  
pp. 89-98
Author(s):  
E. B. Belov ◽  
M. V. Alekseev ◽  
N. P. Kitaev ◽  
A. I. Kuchumov

Purpose of the study. The Rasch model is often used in processing test results. However, when this model is used with the maximum likelihood method (ML), the estimates of the respondents' ability levels depend only on the number of correctly answered test items and not on the difficulties of those items. The purpose of the research is to analyze the influence of item difficulties on the respondents' ability levels using the weighted maximum likelihood method (WML). The item total scores are used to obtain the WML weights. Materials and methods. The influence of item difficulties on the respondents' ability levels is investigated using the dichotomous table obtained from testing the knowledge of 19 respondents in the course “Fundamentals of Electronics”. Indicator variables of 16 test items were used. For each item, we calculate the item total score, which determines its difficulty. The weighting coefficients of the WML depend on the item total scores and on the coefficient of influence K. When K = 0, WML reduces to ML. As K increases from 0 to 2, the weighting coefficients increase, and it becomes possible to analyze in detail the influence of item difficulties on the respondents’ ability levels. To calculate the parameters of the Rasch model based on WML, programs (M-files) for the MATLAB environment and Ministep (Winsteps) are used. Results. Using WML with weighting coefficients derived from the item total scores allowed us to further differentiate the respondents’ ability levels in the dichotomous Rasch model. The analysis of the electronics test data shows that, ceteris paribus, the new ability levels increase if respondents answer difficult items correctly and, conversely, decrease if they answer easy items correctly. At the same time, the difficulty levels of the items remain practically unchanged. As a rule, the greater the coefficient of influence K, the more the WML-based ability estimates differ from the ML-based estimates. However, there are respondents whose ability level does not change, or changes only slightly, when K is increased from 0 to 2. For the electronics test data with K ≤ 1, the original order of respondents by ML-based ability level is preserved. With an increased coefficient of influence K ≥ 1.5, the new ability levels calculated using WML change the ordering of respondents by ability level. Calculations performed using the MATLAB package are confirmed by data obtained using the Winsteps program. Excluding extreme respondents, the differences do not exceed 0.01 logit at the maximum value of K, equal to 2. Conclusion. On the basis of WML, a method is proposed for taking into account the influence of item difficulties on the respondents’ ability levels in the dichotomous Rasch model using the item total scores. The analysis of the electronics test data shows that this method differentiates the ability levels of respondents who obtain the same score. Note that the results obtained using WML on the electronics test data do not contradict those obtained with the classical dichotomous Rasch model and ML. Rather, the WML-based results refine the ML-based ability levels of the respondents.
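The starting observation of this abstract, that plain ML ability estimates in the dichotomous Rasch model depend only on the raw score, can be checked with a short sketch. The code below is an illustrative Newton-Raphson ML estimator with made-up item difficulties; it does not implement the authors' WML weighting, whose exact formula is not given in the abstract, and all names are hypothetical.

```python
import math

def ml_ability(responses, difficulties, iterations=50):
    """Newton-Raphson ML estimate of ability (logits) for one respondent.

    The likelihood equation is: raw score = sum of model probabilities,
    so only the raw score of the response pattern enters the estimate.
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("no finite ML estimate for extreme (zero or perfect) scores")
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = raw_score - sum(probs)                  # first derivative of log-likelihood
        information = sum(p * (1.0 - p) for p in probs)    # negative second derivative
        theta += gradient / information
    return theta

# Made-up item difficulties (logits), ordered from easiest to hardest.
difficulties = [-1.0, 0.0, 1.0, 2.0]

solved_easy_items = [1, 1, 0, 0]   # raw score 2, on the two easiest items
solved_hard_items = [0, 0, 1, 1]   # raw score 2, on the two hardest items

theta_easy = ml_ability(solved_easy_items, difficulties)
theta_hard = ml_ability(solved_hard_items, difficulties)
print(theta_easy, theta_hard)  # identical estimates despite different patterns
```

Both respondents receive the same estimate because their raw scores match, which is the behavior the WML approach is designed to refine. The `ValueError` branch mirrors the special handling of extreme respondents mentioned in the abstract.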


2018 ◽  
Vol 9 (1) ◽  
pp. 67-86
Author(s):  
Komalasari

The main purpose of this study was to evaluate the quality of the post-test used by LPMP Central Kalimantan, Indonesia, in Curriculum 2013 training for grade X teachers. It uses Rasch analysis to explore item fit, reliability (item and person), item difficulty, and the Wright map of the post-test. The study also applies Classical Test Theory (CTT) to determine item discrimination and distractor quality. Following a series of iterative Rasch analyses that adopted the “data should fit the model” approach, the 30-item post-test of the Curriculum 2013 training was analyzed using ACER ConQuest 4, software based on the Rasch measurement model. All items of the post-test fit the Rasch model sufficiently. The difficulty levels (i.e., item measures) of the 30 items range from –1.746 logits to +1.861 logits. The item separation reliability is acceptable at 0.990, and the person separation reliability is low at 0.485. The Wright map indicates that the test is difficult for the teachers, or that the teachers have low ability in their knowledge of Curriculum 2013. The post-test items cannot cover the full range of the teachers’ ability levels. Item discrimination for the post-test falls into two groups: fair discrimination (items 2, 4, 5, 8, 11, 18) and poor discrimination (items 1, 3, 6, 7, 9, 10, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30). Some distractors in items 1, 2, 6, 7, 8, 9, 11, 13, 16, 17, 18, 19, 20, 22, 24, 25, 27, 28, 29, and 30 are problematic and require further investigation or revision. Key words: Rasch analysis, training, curriculum 2013, post test


2021 ◽  
Vol 9 (3) ◽  
pp. 329-345
Author(s):  
Ayi Darmana ◽  
Ani Sutiani ◽  
Haqqi Annazili Nasution ◽  
Ismanisa Ismanisa* ◽  
Nurhaswinda Nurhaswinda

The score obtained from a test is often interpreted as an indicator of the student's ability level. This is one of the weaknesses of classical analysis, which is unable to provide meaningful and fair information. Identical scores obtained on test items of different difficulty levels should indicate different abilities. Rasch model analysis overcomes this weakness. The purpose of this study was to analyze the quality of the items by validating the national chemistry exam instrument using the Rasch model. The research sample comprised 212 new students of the Department of Chemistry at the State University of Medan. The data, collected using the documentation method, consisted of the respondents' answers to the 2013 chemistry UN questions, a set of 40 multiple-choice items. The data were analyzed using the Rasch model with Ministep software. The results show that the quality of the Chemistry National Exam (UN) questions is categorized as very good based on the following aspects: unidimensionality, item fit, person-item map, item difficulty level, and person and item reliability. One item was found to be gender-biased, favoring men over women. The average chemistry ability of the respondents is above the average difficulty level of the test items.


2017 ◽  
Vol 2 (1) ◽  
pp. 1-10
Author(s):  
Aulia Sanova ◽  
Abu Bakar ◽  
Afrida Afrida

Test questions are a very important measuring tool for determining the success of the teaching and learning process. However, the main weakness of teacher-made items is their poorly structured construction. This weakness usually stems from the limited time that teachers at SMPN 17 Jambi have for constructing test items. Analyzing these items manually requires a lot of time. Therefore, this community service activity uses the ANATES V4 program to analyze the test items. The program was chosen because it is easy and simple to use. The training methods were lectures, discussion and question-and-answer sessions, demonstrations, and direct experiments. During the activity, the participants were enthusiastic about practicing item analysis directly, and they became capable of analyzing both multiple-choice and essay tests. By analyzing tests, teachers can indirectly collect test items into a “test item bank” that has validity and reliability.


Pythagoras ◽  
2009 ◽  
Vol 0 (69) ◽  
Author(s):  
Belinda Huntley ◽  
Johann Engelbrecht ◽  
Ansie Harding

In this study we propose a taxonomy for assessment in mathematics, which we call the assessment component taxonomy, to identify those components of mathematics that can be successfully assessed using alternative assessment formats. Based on the literature on assessment models and taxonomies in mathematics, this taxonomy consists of seven mathematics assessment components, hierarchically ordered by cognitive level, as well as the nature of the mathematical tasks associated with each component. Using a model that we developed earlier for measuring the quality of mathematics test items, we investigate which of the assessment components can be successfully assessed in the provided response question (PRQ) format, in particular multiple choice questions (MCQs), and which can be better assessed in the constructed response question (CRQ) format. The results of this study show that MCQs can be constructed to evaluate higher order levels of thinking and learning. The conclusion is that MCQs can be successfully used as an assessment format in undergraduate mathematics, more so in some assessment components than in others. The inclusion of the PRQ assessment format in all seven assessment components can reduce the large marking loads, associated with continuous assessment practices in undergraduate mathematics, without compromising the validity of the assessment.


Author(s):  
Ebru EZBERCI-CEVIK ◽  
Mehmet Altan KURNAZ

The aim of this study is to develop a concept test on the subject of stars that is suitable for model analysis and to evaluate the conformity of this model. The study was performed using a screening model, which is a type of quantitative research method. The study also sought to fill the gap in the existing literature regarding the use of quantitative methods. To develop the test, existing studies on stars were first examined, interviews with teacher candidates were conducted, and finally the necessary amendments to the test items were made in light of experts’ opinions. To assure its validity and reliability, the test was administered to a total of 175 teacher candidates studying in the science teaching departments of schools of education at three different universities; all candidates had taken astronomy courses and been taught the subject matter. The final form of the test comprised 26 multiple-choice questions, each with 5 possible answers. The Cronbach’s alpha reliability coefficient of the test was calculated to be 0.735. In addition, the mean difficulty of the test was found to be 0.370 and the mean discrimination was found to be 0.390. Statistical analyses revealed that the concept test developed in this study is a valid and reliable test that conforms to the model analysis.
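For readers unfamiliar with the reliability coefficient reported here, the following is a minimal sketch of Cronbach's alpha computed from a matrix of item scores. The function name and the toy data are hypothetical, and sample variance is assumed for both the items and the total scores.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (rows = respondents, columns = items).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_items = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)

# Made-up scored responses: 4 respondents x 3 items (1 = correct, 0 = incorrect).
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(cronbach_alpha(answers))  # 0.75 for this toy matrix
```

The same function works for polytomous (for example, partial-credit) item scores, since it relies only on variances.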


2020 ◽  
Vol 6 (2) ◽  
pp. 203-209
Author(s):  
Dinar Pratama ◽  
Ihda Husnayaini

Studies have suggested that students' reading comprehension in English is influenced by how accurately teaching strategies are applied. Appropriate teaching strategies are required to accommodate the diversity of students' abilities. Therefore, measurements are needed to provide actual information about students' abilities. This study aims to examine students' levels of reading comprehension using the Rasch model. The subjects of this study were 8th-grade students, who provided a total of 200 responses to the teacher's reading comprehension test with five alternative responses. The data were analyzed with the one-parameter Rasch model, covering person reliability, item-person distribution maps, and item-person fit. The findings indicated that the students' average reading comprehension ability fell in the high category, with a value (Mean person) of 1.29 logits, above the average difficulty level of the test items (Mean item) of 0. Further research is expected to examine whether adding test items affects the value of person reliability.
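To give a sense of what a mean person measure of 1.29 logits against a mean item difficulty of 0 implies, the sketch below evaluates the standard dichotomous Rasch (one-parameter) response function. The function name is hypothetical; the two numbers are taken from the abstract.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """P(correct) under the dichotomous Rasch model: 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student at the reported mean ability facing an item of average difficulty.
p_average_item = rasch_probability(theta=1.29, b=0.0)
print(round(p_average_item, 3))  # 0.784
```

Under these assumptions, a student of average ability would be expected to answer an item of average difficulty correctly about 78% of the time.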

