Test Item Analysis and Relationship Between Difficulty Level and Discrimination Index of Test Items in an Achievement Test in Biology

2012
Vol 3 (6)
pp. 56-58
Author(s):
Suruchi Suruchi
Surender Singh Rana
2021
Vol 6 (2)
pp. 256
Author(s):
Sayit Abdul Karim
Suryo Sudiro
Syarifah Sakinah

Apart from teaching, English language teachers need to assess their students by giving tests to gauge their achievement. In general, however, teachers rarely conduct item analysis on their tests; as a result, they have little idea of the quality of the tests they distribute to students. The present study attempts to determine the level of difficulty (LD) and the discriminating power (DP) of the multiple-choice (MC) items constructed by an English teacher for a reading comprehension test, using test item analysis. The study employs a qualitative approach. For this purpose, a 50-item MC reading comprehension test was analyzed from the students' test results. Thirty-five grade-eight students, 15 male and 20 female, of junior high school 2 Kempo in West Nusa Tenggara Province took part in the MC test try-out. The findings revealed that 16 of the 50 test items were rejected owing to poor or very poor levels of difficulty and discrimination indices. A further 12 items need to be reviewed because of their mediocre quality, and 11 items were judged to be of good quality. In addition, 11 of the 50 items were considered excellent, with DP scores ranging from roughly 0.44 to 0.78. The implications of the present study shed light on the quality of teacher-made test items, especially MC tests.
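The abstract does not give the formulas used, but the level of difficulty and the discriminating power of an MC item are conventionally computed from the proportion of correct answers and from upper/lower criterion groups. Below is a minimal sketch, assuming the usual upper/lower 27% split; the function name and the response data are illustrative and not taken from the study.

```python
# A minimal sketch (not the authors' code) of how level of difficulty (LD)
# and discriminating power (DP) are commonly computed for one MC item
# using upper/lower 27% criterion groups.
import numpy as np

def item_ld_dp(scores, item_correct, group_frac=0.27):
    """scores: total test score per student; item_correct: 1/0 per student
    for one item. Returns (LD, DP)."""
    scores = np.asarray(scores)
    item_correct = np.asarray(item_correct)
    n = len(scores)
    k = max(1, int(round(group_frac * n)))      # size of each criterion group
    order = np.argsort(scores)                  # students sorted by total score
    lower, upper = order[:k], order[-k:]
    ld = item_correct.mean()                    # proportion answering correctly
    dp = item_correct[upper].mean() - item_correct[lower].mean()
    return ld, dp

# Example with made-up responses from 10 students on one item
ld, dp = item_ld_dp(scores=[45, 38, 30, 28, 25, 22, 20, 18, 15, 10],
                    item_correct=[1, 1, 1, 1, 0, 1, 0, 0, 0, 0])
print(f"LD = {ld:.2f}, DP = {dp:.2f}")
```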


Author(s):  
Amardeep Kaur

The present study was conducted to construct and standardize an achievement test in English for standard IX students. Test items were selected from the class VIII syllabus prescribed by the Punjab School Education Board (P.S.E.B.), Mohali. Since the achievement test was intended for standard IX, the class VIII English textbook was used to construct it. The entire syllabus was thoroughly scrutinized and items were then drawn from the class VIII books of the P.S.E.B. In all, 130 items covering 14 aspects of the class VIII syllabus were prepared. After seeking expert opinion, the pool was reduced to 120 items, each carrying one mark. A further 20 items were rejected on the basis of their difficulty level and discriminating value, leaving 100 items with values between .40 and .60. Content validity of the achievement test in English was established with the help of experts' opinion, i.e., English teachers of different schools. The split-half method was used to establish reliability, and the calculated reliability is 0.86.
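The abstract reports a split-half reliability of 0.86 without describing the computation. The sketch below assumes the standard procedure, an odd-even split followed by the Spearman-Brown correction; the data are randomly generated and purely illustrative, so the printed value will not match the study's figure.

```python
# A minimal sketch of split-half reliability with the Spearman-Brown
# correction (an assumption; the study's exact procedure is not stated).
import numpy as np

def split_half_reliability(item_matrix):
    """item_matrix: students x items array of 0/1 scores."""
    x = np.asarray(item_matrix, dtype=float)
    odd = x[:, 0::2].sum(axis=1)              # score on odd-numbered items
    even = x[:, 1::2].sum(axis=1)             # score on even-numbered items
    r_half = np.corrcoef(odd, even)[0, 1]     # correlation between the halves
    return 2 * r_half / (1 + r_half)          # Spearman-Brown full-test estimate

rng = np.random.default_rng(0)
demo = (rng.random((35, 100)) < 0.6).astype(int)  # illustrative random data only
print(f"split-half reliability ≈ {split_half_reliability(demo):.2f}")
```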


2021
Vol 17 (2)
pp. 187-197
Author(s):  
Eun-Yeong Shin

Purpose: The purpose of the present study was to develop phoneme perception test lists for school-aged children. Methods: The 127 initial-consonant and 94 final-consonant test items were modified according to difficulty level (the number of multiple-choice alternatives was reduced, and the familiarity of target and foil words was controlled). The validity of the results for children with normal hearing was evaluated. Through discussions with experts in various fields, the target word list was revised. Words with a low percentage (<90%) of correct answers, vowel-consonant words among the initial-consonant items, and consonant-vowel (CV) words among the final-consonant items were eliminated, producing the final revised consonant perception test items for school-aged children. Results: Each consonant test item consisted of three multiple-choice words of consonant-vowel-consonant or CV type. The 50 initial-consonant and 25 final-consonant perception test items feature a high degree of familiarity and phoneme frequencies corresponding to children's everyday speech sounds. Conclusion: The results of this study are useful for analyzing the phoneme perception ability of school-aged children by listening and for evaluating phoneme errors in children with congenital high-frequency hearing loss.


2020
Vol 19 (1)
Author(s):
Surajit Kundu
Jaideo M Ughade
Anil R Sherke
Yogita Kanwar
Samta Tiwari
...  

Background: Multiple-choice questions (MCQs) are the most widely accepted tool for evaluating comprehension, knowledge, and application among medical students. Single-best-response MCQs (items) can assess higher orders of cognition. It is essential to develop valid and reliable MCQs, as flawed items interfere with unbiased assessment. The present paper attempts to discuss the art of framing well-structured items, drawing on the references provided, and puts forward a practice for committed medical educators to improve the skill of writing quality MCQs through enhanced Faculty Development Programs (FDPs). Objectives: The objective of the study is also to test the quality of MCQs by item analysis. Methods: In this study, 100 MCQs from set I or set II were distributed to 200 MBBS students of Late Shri Lakhiram Agrawal Memorial Govt. Medical College Raigarh (CG) for item analysis. Set I and set II consisted of MCQs written by 60 medical faculty before and after FDP, respectively. All MCQs had a single stem with three incorrect options and one correct answer. The data were entered into Microsoft Excel 2016 for analysis. The difficulty index (DIF I), discrimination index (DI), and distractor efficiency (DE) were the item analysis parameters used to evaluate the effect of adhering to the guidelines for framing MCQs. Results: The mean difficulty index, discrimination index, and distractor efficiency were 56.54%, 0.26, and 89.93%, respectively. Among the 100 items, 14 were of higher difficulty (DIF I < 30%), 70 were of moderate difficulty, and 16 were easy (DIF I > 60%). A total of 10 items had very good DI (≥ 0.40), 32 had recommended values (0.30 - 0.39), and 25 were acceptable with changes (0.20 - 0.29). Of the 100 MCQs, 27 had a DE of 66.66% and 11 had a DE of 33.33%. Conclusions: In this study, the number of higher cognitive-domain MCQs increased after training, while recurrent-type MCQs and MCQs with item-writing flaws decreased, thereby making our results more statistically significant. Nine MCQs satisfied all the criteria of item analysis.
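The reported DE values of 66.66% and 33.33% are consistent with the usual definition of distractor efficiency, in which a distractor counts as functional when at least 5% of examinees select it and DE is the share of functional distractors per item. A minimal sketch under that assumption follows; the option counts are invented for illustration and are not data from the study.

```python
# A minimal sketch of distractor efficiency (DE): a distractor is "functional"
# if at least 5% of examinees choose it, and DE = functional / total distractors.
# With three distractors per item, DE can only be 100%, 66.67%, 33.33%, or 0%.
# (Illustrative only; not the authors' code.)

def distractor_efficiency(option_counts, correct_option, threshold=0.05):
    """option_counts: dict option -> number of examinees choosing it."""
    total = sum(option_counts.values())
    distractors = {k: v for k, v in option_counts.items() if k != correct_option}
    functional = sum(1 for v in distractors.values() if v / total >= threshold)
    return 100.0 * functional / len(distractors)

# One item answered by 200 students; option 'B' is the key
counts = {"A": 30, "B": 120, "C": 44, "D": 6}   # 'D' chosen by 3% -> non-functional
print(f"DE = {distractor_efficiency(counts, 'B'):.2f}%")   # -> 66.67%
```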


Author(s):  
Durgesh Prasad Sahoo
Rakesh Singh

Background: Multiple choice questions (MCQs), or items, form an important part of student assessment in different educational streams. MCQ testing is an objective mode of assessment whose validity and reliability depend on the characteristics of its items, i.e., difficulty index, discrimination index, and distractor efficiency. The aim was to evaluate MCQs and build a bank of high-quality test items by assessing the difficulty index, discrimination index, and distractor efficiency, and to revise, store, or remove errant items based on the results obtained. Methods: A preliminary examination of Third MBBS Part-1 was conducted by the Department of Community Medicine for 100 students. Two separate papers, each with 30 MCQs and 90 distractors, were analyzed and compared. Descriptive as well as inferential statistics were used to analyze the data. Results: Most of the items fell within the acceptable range of difficulty; however, some items were rejected due to poor discrimination indices. Overall, paper I was found to be more difficult and more discriminating, but its distractor efficiency was slightly lower than that of paper II. Conclusions: The analysis helped in the selection of quality MCQs with high discrimination, average difficulty, and three functional distractors. Such analysis should be incorporated into future evaluations to improve test scores and properly discriminate among students.


2019
Vol 20 (2)
pp. 72-87
Author(s):
Ujang Suparman

The objectives of this research are to analyze critically the quality of the test items used in SMP and SMA examinations (mid-semester, final-semester, and National Examination practice tests) in terms of overall reliability, level of difficulty, discriminating power, and the quality of the answer keys and distractors. The test items were analyzed with the item analysis program ITEMAN, using two kinds of descriptive statistics: one for analyzing the test items and another for analyzing the options. The findings are very far from what is commonly believed: the quality of the majority of the test items, as well as of the answer keys and distractors, is unsatisfactory. Based on the results of the analysis, conclusions are drawn and recommendations are put forward.


1998
Vol 14 (3)
pp. 197-201
Author(s):
Ana R. Delgado
Gerardo Prieto

This study examined the validity of an item-writing rule concerning the optimal number of options in the design of multiple-choice test items. Although measurement textbooks typically recommend the use of four or five options - and most ability and achievement tests still follow this rule - theoretical papers as well as empirical research over a period of more than half a century reveal that three options may be more suitable for most ability and achievement test items. Previous results show that three-option items, compared with their four-option versions, tend to be slightly easier (i.e., with higher traditional difficulty indexes) without showing any decrease in discrimination. In this study, two versions (with four and three options) of 90 items comprising three computerized examinations were administered in successive years, showing the expected trend. In addition, there were no systematic changes in test reliability, which adds to the evidence favoring the use of the three-option test item.


2019
Vol 2 (1)
pp. 59
Author(s):
Neti Hartati
Hendro Pratama Supra Yogi

This study is a small-scale item analysis of a teacher-made summative test. It examines the quality of multiple-choice items in terms of difficulty level, discriminating power, and the effectiveness of distractors. The study employed a qualitative approach supplemented by simple quantitative analysis, examining the quality of the test items through document analysis of the teacher's English summative test and the students' answer sheets. The results show that the summative test has more easy than difficult items, with a ratio of 19:25:6 for easy, medium, and difficult items, whereas the ratio should be 1:2:1. In terms of discriminating power, 3, 13, and 16 items fall at the excellent, good, and satisfactory levels respectively, while 17 and 2 items fall at the poor and bad levels. Forty-three (21.5%) of all distractors are dysfunctional, which in turn makes the items too easy and causes them to fail to discriminate the upper-group students from the lower-group ones. Therefore, the 43 dysfunctional distractors should be revised to adjust the difficulty level and improve the discriminating power. This research is expected to serve as a reflective means for teachers to examine their own tests and ensure the quality of their test items.
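The abstract does not state the exact cut-offs used to label items easy/medium/difficult or to grade discriminating power. The sketch below shows how such binning is commonly done; the thresholds (0.30/0.70 for difficulty, Ebel-style DP bands) and the per-item values are assumptions for illustration, not figures taken from the study.

```python
# A minimal sketch of binning items by difficulty and by discriminating power
# to obtain counts such as the 19:25:6 ratio above. Cut-offs are conventional
# assumptions, not the study's values.
from collections import Counter

def difficulty_band(p):
    return "difficult" if p < 0.30 else "easy" if p > 0.70 else "medium"

def dp_band(d):
    if d >= 0.40: return "excellent"
    if d >= 0.30: return "good"
    if d >= 0.20: return "satisfactory"
    if d >= 0.00: return "poor"
    return "bad"

# Illustrative per-item (difficulty, DP) pairs
items = [(0.85, 0.10), (0.55, 0.35), (0.25, 0.45), (0.75, 0.05), (0.50, 0.22)]
print(Counter(difficulty_band(p) for p, _ in items))
print(Counter(dp_band(d) for _, d in items))
```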


2021
Vol 3 (1)
pp. 11-20
Author(s):
Ulfah Zahiroh
Pangoloan Soleman Ritonga

This research aimed at determining the quality of test items in terms of validity, reliability, difficulty level, discriminating power, and distractor effectiveness. A quantitative descriptive method was used. Interview and documentation were the data collection techniques. The data sources were the even-semester final exam questions in multiple-choice form, the students' answer sheets, and the answer key. The Anates 4.0.9 program was used to analyze the quality of the test items. The analysis of the multiple-choice items on the Chemistry semester final exam at the eleventh grade of State Islamic Senior High School 2 Kepulauan Meranti showed the following: in the validity analysis there were 6 valid items (17%) and 29 non-valid items (83%); the reliability analysis yielded a reliability score of 0.955; in the difficulty level analysis there were 12 easy items (34%), 17 medium items (49%), and 6 hard items (17%); in the discriminating power analysis there were 4 very good items (11.5%), 1 good item (3%), 19 items (54%) that should be revised, and 11 items (31.5%) that should be eliminated; in the distractor effectiveness analysis there were 26 very good options (19%), 10 good options (7%), 25 poor options (18%), 55 bad options (39%), and 24 very bad options (17%). Therefore, it can be concluded that the overall quality of the test items is poor.
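Programs such as Anates report item validity as an item-total correlation; whether Anates uses exactly the point-biserial formulation sketched below is an assumption, and the data here are random, so this only illustrates the kind of computation involved rather than reproducing the study's analysis.

```python
# A minimal sketch of item validity as an item-total (point-biserial)
# correlation. Illustrative only; not a reimplementation of Anates.
import numpy as np

def item_total_correlation(item_matrix):
    """item_matrix: students x items array of 0/1 scores.
    Returns one correlation per item (item score vs. total score)."""
    x = np.asarray(item_matrix, dtype=float)
    total = x.sum(axis=1)
    return np.array([np.corrcoef(x[:, j], total)[0, 1] for j in range(x.shape[1])])

rng = np.random.default_rng(1)
demo = (rng.random((40, 35)) < 0.55).astype(int)   # illustrative random data only
r = item_total_correlation(demo)
print("items with r >= 0.30 (a common validity cut-off):", int((r >= 0.30).sum()))
```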


Author(s):  
Ade Pryta Romanauli Simaremare

Background: Assessment of learning outcomes is important evaluation material showing how the teaching and learning process has been carried out. It can be obtained from formative and summative assessments, after which students are given feedback on the results. One method of formative evaluation is the progress test. Since its implementation at the HKBP Nommensen University Faculty of Medicine, there had never been an analysis of the results of the progress test. This study analyzed the results of the progress test held in the even semester of the 2018/2019 academic year. Methods: This study used an observational descriptive design with a cross-sectional method. The sample comprised all 215 students of the Faculty of Medicine who were actively studying in the even semester of the 2018/2019 academic year. Item analysis was performed on the basic and clinical medicine questions in terms of difficulty level and discrimination index, broken down by students' length of study. Results: The passing rate of students who attended the progress test was very low; however, the scores achieved increased with the length of students' study period. Item analysis showed that the majority of items were of medium difficulty and that the majority of discrimination indices were at the poor level for both the basic and clinical medicine categories. Conclusion: Progress testing can be used as a tool to help curriculum designers see the development of students' knowledge and skills, both individually and at the population level.

