Analysis of Rasch Model for the Validation of Chemistry National Exam Instruments

2021 ◽  
Vol 9 (3) ◽  
pp. 329-345
Author(s):  
Ayi Darmana ◽  
Ani Sutiani ◽  
Haqqi Annazili Nasution ◽  
Ismanisa Ismanisa* ◽  
Nurhaswinda Nurhaswinda

The raw score obtained on a test is often interpreted as an indicator of a student's ability level. This is one of the weaknesses of classical analysis, which cannot provide meaningful and fair information: the same score, if it comes from test items with different levels of difficulty, should indicate different abilities. Analysis with the Rasch model overcomes this weakness. The purpose of this study was to analyze the quality of the items by validating the national chemistry exam instrument using the Rasch model. The research sample was 212 new students of the Department of Chemistry at the State University of Medan. The data, collected using the documentation method, consisted of respondents' answers to the 2013 chemistry National Exam (UN) questions, comprising 40 multiple-choice items. Data were analyzed with the Rasch model using Ministep software. The results show that the quality of the Chemistry National Exam (UN) questions is very good with respect to unidimensionality, item fit, the person-item map, item difficulty level, and person and item reliability. One item was found to be gender biased, favoring men over women. The average chemistry ability of the respondents is above the average difficulty level of the test items.
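A standard statement of the dichotomous Rasch model (a textbook formulation, not quoted from the article) makes the argument concrete:

```latex
P(X_{ni}=1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

Here \theta_n is the ability of person n and \delta_i the difficulty of item i, both expressed in logits. Because the probability of success depends only on the difference \theta_n - \delta_i, the same raw score earned on harder items implies a higher ability estimate than that score earned on easier items, which is exactly the distinction classical raw-score analysis cannot make.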

2018 ◽  
Vol 9 (1) ◽  
pp. 67-86
Author(s):  
Komalasari

The main purpose of this study was to evaluate the quality of the post-test used by LPMP Central Kalimantan, Indonesia, in the Curriculum 2013 training for grade X teachers. It uses Rasch analysis to explore item fit, reliability (item and person), item difficulty, and the Wright map of the post-test. The study also applies Classical Test Theory (CTT) to determine item discrimination and distracter quality. Following a series of iterative Rasch analyses that adopted the "data should fit the model" approach, the 30-item Curriculum 2013 training post-test was analyzed using ACER ConQuest 4, software based on the Rasch measurement model. All post-test items fit the Rasch model sufficiently. The difficulty levels (item measures) for the 30 items range from -1.746 logits to +1.861 logits. The item separation reliability is acceptable at 0.990, while the person separation reliability is low at 0.485. The Wright map indicates that the test is difficult for the teachers, or that the teachers have low knowledge of Curriculum 2013; the post-test items cannot cover the full range of the teachers' ability levels. The item discrimination indices fall into fair discrimination (items 2, 4, 5, 8, 11, 18) and poor discrimination (items 1, 3, 6, 7, 9, 10, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30). Some distracters from items 1, 2, 6, 7, 8, 9, 11, 13, 16, 17, 18, 19, 20, 22, 24, 25, 27, 28, 29, and 30 are problematic; these distracters require further investigation or revision.

Key words: Rasch analysis, training, curriculum 2013, post test
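The separation reliabilities reported here (0.990 for items, 0.485 for persons) are conventionally computed as the adjusted ("true") variance of the Rasch measures divided by their observed variance. A minimal sketch of that calculation follows; the measures and standard errors are hypothetical illustrations, not the study's output.

```python
import numpy as np

def separation_reliability(measures, standard_errors):
    """Rasch separation reliability: true variance / observed variance of the measures."""
    measures = np.asarray(measures, dtype=float)
    standard_errors = np.asarray(standard_errors, dtype=float)
    observed_var = measures.var(ddof=1)          # variance of the estimates
    error_var = np.mean(standard_errors ** 2)    # mean square measurement error
    true_var = max(observed_var - error_var, 0)  # adjusted ("true") variance
    return true_var / observed_var

# Hypothetical item measures (logits) and their standard errors
item_measures = [-1.746, -0.9, -0.3, 0.2, 0.8, 1.861]
item_se = [0.12, 0.10, 0.09, 0.09, 0.10, 0.13]
print(round(separation_reliability(item_measures, item_se), 3))
```

A low person reliability, as in this study, typically reflects a narrow spread of person measures relative to their measurement error rather than a defect of the items themselves.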


2018 ◽  
Vol 3 (1) ◽  
pp. 73
Author(s):  
Yulinda Erma Suryani

Abstract: Objective measurement in the social sciences and in educational assessment must satisfy five criteria: 1) it gives a linear measure with equal intervals; 2) it uses a proper estimation process; 3) it detects misfitting items and outliers; 4) it can handle missing data; 5) it yields replicable measurements (independent of the parameters studied). Of these five measurement conditions, so far only the Rasch model can fulfill them all. Intelligence measurements made with the Rasch model therefore have the same metric quality as measurements of physical dimensions in physics. The logit (log odds unit) scale generated by the Rasch model is an equal-interval, linear scale derived from the data's odds ratios. Based on the analysis of the IST test instrument, the quality of the IST test is generally in the good category. Of the 176 IST test items, only one item is of poor quality, namely item 155 (WU19), so item 155 should be discarded. Based on the DIF analysis, 28 items favor one gender only, so these twenty-eight items should be revised.
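The basic idea behind the gender DIF screen mentioned above is to estimate each item's difficulty separately in the two groups and flag items whose difficulty contrast is large (a cut-off of about 0.5 logits is common in applied work). The sketch below uses a crude logit-of-proportion difficulty estimate and hypothetical data; it is not the Winsteps/Ministep DIF procedure used in the study.

```python
import numpy as np

def item_difficulty_logits(responses):
    """Crude item difficulty: log-odds of an incorrect response (persons x items, 0/1)."""
    p_correct = responses.mean(axis=0).clip(0.01, 0.99)  # avoid log(0)
    return np.log((1 - p_correct) / p_correct)

def dif_contrast(responses, is_male):
    """Difficulty difference (logits) between female and male subsamples."""
    d_male = item_difficulty_logits(responses[is_male])
    d_female = item_difficulty_logits(responses[~is_male])
    return d_female - d_male

rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(200, 10))   # hypothetical scored responses
is_male = rng.random(200) < 0.5                  # hypothetical gender flags
flagged = np.where(np.abs(dif_contrast(responses, is_male)) >= 0.5)[0]
print("items flagged for possible DIF:", flagged)
```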


2013 ◽  
Vol 2013 ◽  
pp. 1-7 ◽  
Author(s):  
Michaela Wagner-Menghin ◽  
Ingrid Preusche ◽  
Michael Schmidts

Background. The relevant literature reports no increase in individual scores when test items are reused, but information on changes in item difficulty is lacking. Purpose. To test an approach for quantifying the effect of item reuse on item difficulty. Methods. A total of 671 students sat a newly introduced exam in four testing shifts. The test forms experimentally combined published, unused, and reused items. Figures quantifying the reuse effect were obtained by using the Rasch model to compare item difficulties across different person samples. Results. The observed decrease in mean item difficulty for reused items was not significant. Students who self-scheduled into the last testing shift performed worse than the other students. Conclusion. The availability of leaked material did not translate into higher individual scores, as mastering leaked material does not guarantee transfer of knowledge to new exam items. Exam quality will not automatically deteriorate when a low proportion of randomly selected items is reused.


2016 ◽  
Vol 12 (8) ◽  
pp. 212
Author(s):  
Muhammad Iqbal Tariq Idris ◽  
Abdul Hafidz Omar ◽  
Dayang Hjh Tiawa Awang Hj Hamid ◽  
Fahmi Bahri Sulaiman

The Hajj Instrument (HAJI) was developed to determine hajj pilgrims' wellness. This study used Rasch measurement to evaluate the psychometric properties of the HAJI, including validity and reliability. The respondents were 300 Malaysian hajj pilgrims. HAJI consists of eight constructs, namely physical care, physical activity, healthy eating, knowledge, mental toughness, intrapersonal, interpersonal, and relationship with the Creator and nature. The validity of each construct and its content was examined through dimensionality, item fit, and item polarity, while reliability was assessed through person and item separation. The results showed that item and person reliability were 0.99 and 0.96, respectively. In addition, no items needed to be dropped based on the PTMEA CORR and INFIT MNSQ results. The study revealed that the items of HAJI fit the Rasch model and are able to measure hajj pilgrims' wellness.
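For readers unfamiliar with the item-polarity criterion named here, PTMEA CORR is the correlation between each item's scores and the person measures; items with non-positive correlations are candidates for removal. The sketch below is a generic illustration with hypothetical data and a crude total-score stand-in for the Rasch person measures, not the HAJI analysis itself.

```python
import numpy as np

def point_measure_correlations(responses, person_measures):
    """Correlate each item's scores with the person measures (a PTMEA CORR analogue)."""
    responses = np.asarray(responses, dtype=float)
    person_measures = np.asarray(person_measures, dtype=float)
    return np.array([
        np.corrcoef(responses[:, i], person_measures)[0, 1]
        for i in range(responses.shape[1])
    ])

rng = np.random.default_rng(1)
scores = rng.integers(1, 6, size=(300, 8))    # hypothetical 8-item rating-scale data
measures = scores.sum(axis=1).astype(float)   # crude stand-in for Rasch person measures
ptmea = point_measure_correlations(scores, measures)
print("items with non-positive point-measure correlation:", np.where(ptmea <= 0)[0])
```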


2021 ◽  
Vol 6 (2) ◽  
pp. 256
Author(s):  
Sayit Abdul Karim ◽  
Suryo Sudiro ◽  
Syarifah Sakinah

Apart from teaching, English language teachers need to assess their students by giving tests to gauge the students' achievement. In general, teachers rarely conduct item analysis on their tests. As a result, they have no idea about the quality of the tests distributed to their students. The present study attempts to determine the levels of difficulty (LD) and the discriminating power (DP) of the multiple-choice (MC) test items constructed by an English teacher for a reading comprehension test, using test item analysis. This study employs a qualitative approach. For this purpose, a 50-item MC reading comprehension test was obtained from the students' test results. Thirty-five grade-eight students took part in the MC test try-out: 15 male and 20 female students of junior high school 2 Kempo, West Nusa Tenggara Province. The findings revealed that 16 of the 50 test items were rejected due to poor difficulty levels and discrimination indices. Meanwhile, 12 items need to be reviewed due to their mediocre quality, and 11 items are of good quality. In addition, 11 of the 50 test items were considered excellent, as their DP scores ranged from 0.44 to 0.78. The implications of the present study shed light on the quality of teacher-made test items, especially MC tests.
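The two indices the study reports are conventionally computed as the proportion of examinees answering correctly (level of difficulty) and the difference in that proportion between upper and lower scoring groups (discriminating power). A minimal sketch under those conventions follows; the 27% grouping, the cut-off, and the data are illustrative assumptions, not the teacher-made test itself.

```python
import numpy as np

def item_analysis(responses, group_fraction=0.27):
    """Return (difficulty, discrimination) per item from a persons x items 0/1 matrix."""
    responses = np.asarray(responses, dtype=float)
    totals = responses.sum(axis=1)
    order = np.argsort(totals)
    n_group = max(1, int(round(group_fraction * len(totals))))
    lower, upper = responses[order[:n_group]], responses[order[-n_group:]]
    difficulty = responses.mean(axis=0)                       # proportion answering correctly
    discrimination = upper.mean(axis=0) - lower.mean(axis=0)  # upper minus lower group
    return difficulty, discrimination

rng = np.random.default_rng(2)
data = rng.integers(0, 2, size=(35, 50))   # hypothetical 35 students x 50 items
ld, dp = item_analysis(data)
print("items with excellent discrimination (DP >= 0.40):", np.where(dp >= 0.40)[0])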


2017 ◽  
Vol 6 (1) ◽  
pp. 52
Author(s):  
Wahyu Arta S ◽  
Abdul Asib ◽  
Dewi Sri Wahyuni

The objective of this study is to identify the quality of the test items used as the final test in the second semester for eleventh-grade students at a state senior high school (SMA N) in Magetan. This research used a descriptive method. In collecting the data, the writer used documents (the English final test items, the syllabus, and the students' answer sheets) as data sources. The data were analyzed using the formula given by Ahmann and Glock. The results of this study show that 57.5% of the items have a good discrimination index, 45% of the items fulfill the satisfactory criteria for difficulty level, and 11 items possess effective distracters; 92.5% of the items are compatible with the learning indicators mentioned in the syllabus, and, in terms of construction, 75% of the items possess a good stem and 82.5% fulfill all the aspects of good alternatives. In short, the items used in the final test are of good quality in their construction and in their compatibility with the syllabus. However, some items are less effective in terms of difficulty level and distracter effectiveness.
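One common operationalisation of distracter effectiveness is that a wrong option should attract at least about 5% of examinees; options chosen less often than that do little work. The sketch below illustrates that rule of thumb with hypothetical answer choices; it is a generic check, not the Ahmann and Glock formula used in the study.

```python
from collections import Counter

def weak_distracters(choices, key, options="ABCDE", min_share=0.05):
    """Return distracters chosen by fewer than min_share of examinees, with their shares."""
    counts = Counter(choices)
    n = len(choices)
    return {
        opt: counts.get(opt, 0) / n
        for opt in options
        if opt != key and counts.get(opt, 0) / n < min_share
    }

# Hypothetical answer choices for one item whose key is "C"
choices = list("CCCBACCCDCCBCCCCACCCBCCCCCCCCACCCBCCCCCC")
print("weak distracters and their shares:", weak_distracters(choices, key="C"))
```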


2020 ◽  
Vol 4 (2) ◽  
Author(s):  
Eyus Sudihartinih ◽  
Sufyani Prabawanto

The purpose of this study is to describe the quality of students' initial mathematical ability in geometry concepts through the Rasch model. This is descriptive qualitative research conducted at a university in Indonesia. The participants were 44 students enrolled in analytic geometry (30 women and 14 men). The research instrument consisted of four essay questions on geometry concepts. The results show that all items meet the standard measurement criteria, so the questions can be used as instruments in further research.


2017 ◽  
Vol 9 (3-2) ◽  
Author(s):  
Samah Ali Mohsen Mofreh ◽  
Mohamed Najib Abul Gafar ◽  
Abdul Hafidz Hj Omar ◽  
Adibah Abdul Latif ◽  
Dayang Hjh Tiawa Awg Hj Hamid

This study aims to explore a set of items measuring lecturers' teaching practice in order to develop a true scale of teaching practice. The Rasch model is applied to produce specific measurements of lecturers' teaching practice so that results can be generalized and used in inferential studies. The items proved to measure a single dimension of lecturers' teaching practice. The developed instrument, termed LTP, covers five dimensions. Construct validity was established through Rasch model analysis using the dimensionality, item fit, and item polarity parameters. The reliability of the instrument was established by conducting person and item separation analysis, computing Cronbach's alpha, and calculating person and item reliability estimates. The results of the Rasch model analysis show that the items of the LTP fit the model appropriately.


2020 ◽  
Vol 3 (1) ◽  
pp. 45-50
Author(s):  
Azizah Azizah ◽  
Sapti Wahyuningsih

This research was conducted to analyze the test instrument used to measure students' ability in actuarial mathematics courses in the Mathematics Department, State University of Malang. Forty students participated in the course. The test consisted of 50 multiple-choice items related to premium calculation in life insurance. The Rasch model was used to obtain fitting items, with the analysis carried out using Winsteps software. From the Winsteps output, 25 items were found to fit the Rasch model, with average Outfit MNSQ values of 0.98 for both persons and items. The Outfit ZSTD values for persons and items are 0.00 and -0.01, respectively, and the reliability of the instrument, expressed as Cronbach's alpha, is 0.85.
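For reference, Cronbach's alpha, the reliability coefficient this abstract reports as 0.85, can be computed directly from the scored response matrix as k/(k-1) times one minus the ratio of summed item variances to total-score variance. The sketch below uses simulated data purely for illustration, not the study's responses.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

rng = np.random.default_rng(3)
ability = rng.normal(size=(40, 1))                               # hypothetical person abilities
responses = (rng.normal(size=(40, 50)) < ability).astype(int)    # hypothetical 40 x 50 scored items
print("alpha =", round(cronbach_alpha(responses), 2))
```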

