ITEM ANALYSIS OF AN ENGLISH SUMMATIVE TEST

Author(s):  
Leni Amelia Suek

Although assessing students makes up almost half of teachers' classroom activity, many teachers receive no assessment literacy training and are therefore unable to produce good tests for measuring students' level of knowledge and skills. This study analyzed the item difficulty and item discrimination of a test written by an English teacher at a junior high school in Kupang. It was descriptive qualitative research, and the research instruments were the test items, the answer keys, and the students' answer sheets. For the difficulty index, more than half of the test items proved easy, while only 2% were difficult. In terms of the discrimination index, only 10% of the items were excellent, and the largest share (46%) was poor. These findings indicate that the English test had a poor item difficulty index and a low item discrimination index; hence it did not fulfill the criteria of a good test and could not measure students' true ability. It is highly recommended that teachers improve the test items and that the government provide assessment training so that teachers can produce good tests.

2020 ◽  
Vol 3 (2) ◽  
pp. 133
Author(s):  
Thresia Trivict Semiun ◽  
Fransiska Densiana Luruk

This study examined the quality of an English summative test for grade VII at a public school in Kupang. In particular, it examined content validity and reliability and conducted an item analysis covering item validity, item difficulty, item discrimination, and distractor effectiveness. The study was descriptive evaluative research using documentation to collect data. The data were analyzed quantitatively, except for content validity, which was assessed qualitatively by matching the test items against the materials stated in the curriculum. The findings revealed that the English summative test had high content validity. Reliability was estimated with the Kuder-Richardson formula (KR-20); the result showed that the test was reliable and very good for a classroom test. The item analysis, conducted with ITEMAN 3.0, revealed that the test consisted mostly of easy items, that most items could discriminate among students, that most distractors performed well, and that most items were valid.
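The abstract cites the Kuder-Richardson formula (KR-20) without reproducing it. As a reference point only, below is a minimal Python sketch of KR-20 for a dichotomously scored test; the `responses` matrix and function name are illustrative and do not come from the study.

```python
# Minimal sketch of the Kuder-Richardson formula 20 (KR-20) for
# estimating the reliability of a dichotomously scored test.
# `responses`: rows = students, columns = items, entries 1 (correct)
# or 0 (incorrect). Illustrative names, not from the paper.

def kr20(responses: list[list[int]]) -> float:
    n_students = len(responses)
    n_items = len(responses[0])

    # p_i: proportion answering item i correctly; q_i = 1 - p_i
    p = [sum(row[i] for row in responses) / n_students for i in range(n_items)]
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)

    # Population variance of the total scores
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
```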


Author(s):  
Bai Koyu ◽  
Rajkumar Josmee Singh ◽  
L. Devarani ◽  
Ram Singh ◽  
L. Hemochandra

A knowledge test was developed to measure the knowledge level of kiwi growers. In all, 36 items were constructed, designed primarily to reward reasoning rather than rote memorization and to discriminate well-informed kiwi growers from poorly informed ones. The scores obtained from sample respondents were subjected to item analysis, comprising the item difficulty index and the item discrimination index. In the final selection, the scale consisted of 15 items with difficulty indices ranging from 30 to 80 and discrimination indices ranging from 0.30 to 0.55. The split-half method was employed to check the reliability of the knowledge test, which was found to be 0.711.
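The split-half reliability reported here (and in the companion apple and large cardamom studies below) can be illustrated with a short sketch. The odd/even item split and the Spearman-Brown correction are conventional assumptions; the paper does not state which split it used.

```python
# Sketch of a split-half reliability check, assuming an odd/even item
# split and the Spearman-Brown correction. `responses`: rows =
# respondents, columns = items, 1/0 scoring. Names are illustrative.

def pearson_r(x: list[float], y: list[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(responses: list[list[int]]) -> float:
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown prophecy formula: scales the half-test
    # correlation up to the reliability of the full-length test.
    return 2 * r_half / (1 + r_half)
```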


Author(s):  
Bai Koyu ◽  
Rajkumar Josmee Singh ◽  
L. Devarani ◽  
Ram Singh ◽  
L. Hemochandra

A knowledge test was developed to measure the knowledge level of apple growers. Initially, 32 items were constructed, designed to reward reasoning rather than rote memorization and to distinguish well-informed apple growers from poorly informed ones. The scores obtained from sample respondents were subjected to item analysis, comprising the item difficulty index and the item discrimination index. In the final selection, the scale consisted of 22 items with difficulty indices ranging from 30 to 80 and discrimination indices ranging from 0.30 to 0.55. The split-half method was employed to check the reliability of the knowledge test, which was found to be 0.701.


2013 ◽  
Vol 66 (1) ◽  
Author(s):  
Ong Eng Tek ◽  
Mohd Al-Junaidi Mohamad

This study aims to develop a valid and reliable multiple-choice test, the Test of Basic and Integrated Process Skills (T-BIPS), for secondary schools to measure the acquisition of the full range of 12 science process skills (SPS): 7 basic SPS and 5 integrated SPS. The study involved two phases. Phase one entailed generating test items according to a set of item objectives and establishing content and face validity, as well as response objectivity, qualitatively through a panel of experts. Phase two involved validating the psychometric properties of the instrument using field-testing data from 104 Form 4 students from top, average, and bottom sets in urban and rural schools. The final set of T-BIPS consists of 60 items: 28 items for basic SPS (KR-20 reliability of 0.86) and 32 items for integrated SPS (KR-20 reliability of 0.89). The mean item difficulty index is 0.60 (range 0.37 to 0.75), while the mean item discrimination index is 0.52 (range 0.20 to 0.77). The item analysis results indicate that T-BIPS, with these psychometric characteristics, is an acceptable, valid, and reliable test for measuring the acquisition of science process skills.


2020 ◽  
Vol 41 (S1) ◽  
pp. s303-s303
Author(s):  
Kelly Holmes ◽  
Mishga Moinuddin ◽  
Sandi Steinfeld

Background: Developing an interrater reliability (IRR) process for healthcare-associated infection surveillance is a valuable learning tool for infection preventionists (IPs) and increases accuracy and consistency in applying National Healthcare Safety Network (NHSN) definitions (1-3). Case studies from numerous resources were distributed to infection preventionists of varying experience levels (4-6). Item analysis, including the item difficulty index and item discrimination index, was applied to individual test questions to determine the validity of the case scenarios for measuring individual mastery of the NHSN surveillance definitions (7-8).
Methods: Beginning in 2016, a mandatory internal IRR program was developed and distributed to IPs of varying experience levels. Each year through 2019, a test containing 30–34 case studies with multiple-choice questions was developed. Case studies were analyzed using 2 statistical methods to determine item difficulty and the validity of the written scenarios. P values for each test question were calculated using the item difficulty index formula, with harder questions yielding values closer to 0.0. The point-biserial correlation, which ranges from −1.0 to 1.0, was applied to each question to identify highly discriminating questions.
Results: Between 2016 and 2019, 124 questions were developed and 145 respondents participated in the mandatory IRR program. The overall test difficulty was 0.70 (range, 0.64–0.74). Moreover, 17 questions (14%) showed “excellent” discrimination, 41 questions (33%) showed “good” discrimination, 57 questions (46%) showed “poor” discrimination, and 9 questions (7%) had negative discrimination values.
Conclusions: IRR testing identifies educational opportunities for IPs responsible for the correct application of NHSN surveillance definitions. Valid test scenarios are foundational components of IRR tests. Case scenarios determined to have a high discrimination index should be used to develop future test questions to better assess mastery of applying surveillance definitions to clinical cases.
Funding: None
Disclosures: None
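As an illustration of the point-biserial statistic described in the methods, here is a hedged Python sketch: the correlation between a single 0/1 item score and examinees' total test scores. The function and variable names are invented for this example.

```python
# Sketch of the point-biserial discrimination statistic: the
# correlation between one dichotomous item and the total scores.
# Assumes at least one correct and one incorrect response on the item.

def point_biserial(item_scores: list[int], total_scores: list[float]) -> float:
    n = len(item_scores)
    p = sum(item_scores) / n          # proportion answering correctly
    q = 1 - p

    mean_correct = sum(t for s, t in zip(item_scores, total_scores) if s == 1) / (p * n)
    mean_incorrect = sum(t for s, t in zip(item_scores, total_scores) if s == 0) / (q * n)

    mean_total = sum(total_scores) / n
    sd_total = (sum((t - mean_total) ** 2 for t in total_scores) / n) ** 0.5

    # Values near +1 mark highly discriminating questions; negative
    # values flag items that stronger examinees miss more often than
    # weaker ones.
    return (mean_correct - mean_incorrect) / sd_total * (p * q) ** 0.5
```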


2020 ◽  
Vol 5 (2) ◽  
pp. 491
Author(s):  
Amalia Vidya Maharani ◽  
Nur Hidayanto Pancoro Setyo Putro

Numerous studies have examined item analysis of English tests. However, investigations into the characteristics of a good English final-semester test remain rare in several districts of East Java. This research examined the quality of the English final-semester test for the 2018/2019 academic year in Ponorogo. A total of 151 samples, in the form of students' answers to the test, were analyzed for item difficulty, item discrimination, and distractor effectiveness using the Quest program. This descriptive quantitative research revealed that the test did not have a good proportion of easy, medium, and difficult items. For item discrimination, the test had 39 excellent items (97.5%), which meant that the test could discriminate between high and low achievers. In addition, the distractors could distract students, since 32 items (80%) had effective distractors. These findings show that item analysis is an important step in test construction: it reveals the quality of the test, which directly affects the accuracy of students' scores.
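Distractor effectiveness, as examined here, is commonly judged by whether a wrong option attracts some minimum share of examinees. The sketch below assumes the widely used 5% rule of thumb and four options A–D; neither detail comes from the study.

```python
# Sketch of a distractor-effectiveness check under the common rule of
# thumb that a distractor should attract at least 5% of examinees.
# The threshold and option labels are assumptions, not from the study.

def distractor_effectiveness(answers: list[str], key: str,
                             options: tuple[str, ...] = ("A", "B", "C", "D"),
                             threshold: float = 0.05) -> dict[str, bool]:
    n = len(answers)
    return {
        opt: answers.count(opt) / n >= threshold
        for opt in options
        if opt != key  # only the non-key options are distractors
    }

# Example: an item keyed "B", answered by ten students.
print(distractor_effectiveness(list("BBABCBDBCB"), key="B"))
# {'A': True, 'C': True, 'D': True}
```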


Kilat ◽  
2018 ◽  
Vol 7 (1) ◽  
pp. 15-23
Author(s):  
Redaksi Tim Jurnal

This paper discusses the design of an application for item analysis using a quantitative method with the classical approach, as an evaluation medium for teachers in determining the quality of the questions in an exam or test. The classical approach examines each item in the test, based on the students' answers, by calculating the Difficulty Index (Dif I) and the Discrimination Index (DI) of each question. The Difficulty Index is the ratio between the number of students who answered an item (or question) correctly and the total number of students who took the test. The Discrimination Index measures how well an item (or question) distinguishes between stronger and weaker students. The data used in this research were obtained from class VII of SMPN 10 (Junior High School) Makassar, specifically for the mathematics subject. The data comprise the math questions that were tested, the answers given by the class VII students, and data about the students. The results of the item analysis indicate whether a question should be accepted, revised, or rejected. The application produced by this research is expected to assist teachers in performing item analysis, and the results can help teachers compile better-quality questions in the future.
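The two indices the paper defines can be computed directly from the formulas given. In the sketch below, the upper/lower 27% split used for the discrimination index is a common convention, not something the paper specifies, and all names are illustrative.

```python
# Sketch of the two classical statistics defined above. The difficulty
# index is the share of examinees answering correctly; the
# discrimination index contrasts the top and bottom scorers (a 27%
# split is a conventional choice, not from the paper).

def difficulty_index(item_scores: list[int]) -> float:
    return sum(item_scores) / len(item_scores)

def discrimination_index(item_scores: list[int],
                         total_scores: list[float],
                         fraction: float = 0.27) -> float:
    n_group = max(1, round(len(item_scores) * fraction))
    # Rank examinees by total score, highest first.
    ranked = sorted(zip(total_scores, item_scores),
                    key=lambda pair: pair[0], reverse=True)
    upper = [s for _, s in ranked[:n_group]]   # top scorers' item results
    lower = [s for _, s in ranked[-n_group:]]  # bottom scorers' item results
    return (sum(upper) - sum(lower)) / n_group
```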


2019 ◽  
pp. 1-6
Author(s):  
Bai Koyu ◽  
Rajkumar Josmee Singh ◽  
L. Devarani ◽  
Ram Singh ◽  
L. Hemochandra

A knowledge test was developed to measure the knowledge of large cardamom growers. All 32 items were constructed primarily to reward reasoning rather than rote memorization and to discriminate well-informed large cardamom growers from poorly informed ones. The scores from the selected respondents were subjected to item analysis, consisting of the item difficulty index and the item discrimination index. In the final selection, the scale consisted of 17 items with difficulty indices ranging from 30 to 80 and discrimination indices ranging from 0.30 to 0.55. The reliability of the knowledge test, tested using the split-half method, was found to be 0.704.


2020 ◽  
Vol 12 (2-2) ◽  
Author(s):  
Nor Aisyah Saat

Item analysis is the process of examining students' responses to individual test items in order to obtain a clear picture of the quality of each item and of the test as a whole. Teachers are encouraged to perform item analysis for each administered test to determine which items should be retained, modified, or discarded. This study analyzes the items in two summative examination papers using classical test theory (CTT). The instruments were the 2019 SPM Mathematics Trial Examination Paper 1, taken by 50 Form 5 students, and the 2019 SPM Mathematics Trial Examination Paper 2, taken by 20 students. Paper 1 contains 40 objective questions, while Paper 2 contains 25 subjective questions. The data obtained were analyzed with Microsoft Excel using the formulas for the item difficulty index and the discrimination index. This analysis can help teachers better understand the difficulty level of the items used. Finally, based on the analysis, the items were classified as good, good but needing improvement, marginal, or weak.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Endrati Jati Siwi ◽  
Rosyita Anindyarini ◽  
Sabiqun Nahar

Yureka Education Center (YEC) is one of the institutions that has developed an online English proficiency test, the English Proficiency Online Test (EPOT), which follows the TOEFL ITP (Institutional Testing Program) framework. This study analyzed the characteristics of the EPOT instrument, consisting of Listening, Structure, and Reading subtests, in order to identify the quality of each test item. The study used a descriptive quantitative approach, describing the characteristics of the EPOT items in terms of the item difficulty index, the item discrimination index, the test information function, and the test's measurement error. Data were collected through EPOT trials taken by 2,652 online test takers from 20 provinces in Indonesia. The data were then analyzed with the Item Response Theory (IRT) approach using the BILOG program across all logistic parameter models, beginning with a test of item fit against each model. The analysis showed that all subtests matched the 3-PL model. Most EPOT items had a good range of difficulty and discrimination indices, and the test information function showed that the items, under the 3-PL model, measured accurately within a certain ability range. The findings suggest that EPOT can serve as an alternative English proficiency test that is easy to use and useful.
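For reference, the three-parameter logistic (3-PL) model that all EPOT subtests were found to fit gives the probability of a correct response as P(θ) = c + (1 − c) / (1 + e^(−a(θ − b))), where a is discrimination, b is difficulty, c is the guessing (lower-asymptote) parameter, and θ is examinee ability. A minimal sketch with invented parameter values follows; the item-information formula shown is the standard 3-PL form underlying the test information function the study reports.

```python
import math

# Sketch of the 3-PL item response model. Parameter values in any
# usage are invented for illustration; none come from the EPOT study.

def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    # Probability that an examinee of ability theta answers correctly.
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float, c: float) -> float:
    # Standard 3-PL item information:
    # I(theta) = a^2 * (P - c)^2 / (1 - c)^2 * (1 - P) / P
    # Summing I(theta) over items gives the test information function.
    p = p_correct_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) ** 2 / (1 - c) ** 2) * ((1 - p) / p)
```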

