Item Analysis of English Final Semester Test

This study examined the quality of an English summative test for grade VII in a public school in Kupang. Specifically, it examined content validity and reliability and conducted an item analysis covering item validity, item difficulty, item discrimination, and distracter effectiveness. This was descriptive evaluative research that used documentation to collect data. The data were analyzed quantitatively, except for content validity, which was analyzed qualitatively by matching the test items with the materials stated in the curriculum. The findings revealed that the English summative test had high content validity. Reliability was estimated with the Kuder-Richardson formula (K-R20); the result showed that the test was reliable and very good for a classroom test. The item analysis was conducted using ITEMAN 3.0. It revealed that the test consisted mostly of easy items, most of the items could discriminate among the students, most distracters performed well, and most of the items were valid.
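
For readers unfamiliar with the K-R20 estimate named in this abstract, the following is a minimal sketch of the standard formula, KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores). It is not the study's own computation. The function name `kr20`, the input layout (one list of 0/1 item scores per student), and the use of population variance are assumptions made for illustration.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (0/1) item scores.

    responses: list of rows, one row per student, one 0/1 entry per item.
    Assumption: population variance is used for the total scores.
    """
    n_students = len(responses)
    k = len(responses[0])  # number of items
    # Proportion of students answering each item correctly (p_i).
    p = [sum(row[i] for row in responses) / n_students for i in range(k)]
    # Sum of p_i * q_i, where q_i = 1 - p_i.
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Variance of each student's total score.
    var_total = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / var_total)
```

The function is undefined when every student has the same total score, because the total-score variance is then zero.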


Analysis of a Research Instrument to Map English Teachers’ Proficiency

IJEE (Indonesian Journal of English Education) ◽

10.15408/ijee.v6i1.11888 ◽

2019 ◽

Vol 6 (1) ◽

pp. 48-64

Author(s):

Siti Mina Tamah ◽

Anita Lie

Keyword(s):

English Education ◽

Item Difficulty ◽

Item Analysis ◽

English Teachers ◽

Multiple Choice Questions ◽

Good Test ◽

Test Analysis ◽

Research Instrument ◽

D Values ◽

English Instruction

Teachers’ English proficiency can be measured with a research instrument in the form of a test, and the test devised must meet the requirements of a good test. This article discusses item analysis of multiple-choice questions used to measure the proficiency of Indonesian high school teachers involved in English instruction. The first set, a syllabus-oriented test, was tried out with 20 subjects, and the second set, a general-English-oriented test, with 28 subjects. The item difficulty indices range from .20 to 1 for the first set and from .07 to .89 for the second set. The item discrimination (d) values range from -0.33 to 1.0 for the first set and from -0.11 to .78 for the second set. Overall, the test has an ‘average’ level of difficulty and is ‘good’ at discriminating between high- and low-achieving test takers; before use in the actual research, the test was revised to eliminate the ‘bad’ items.

How to Cite: Tamah, S. M., Lie, A. (2019). Analysis of a Research Instrument to Map English Teachers’ Proficiency. IJEE (Indonesian Journal of English Education), 6(1), 48-64. doi:10.15408/ijee.v6i1.11888
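
The item difficulty indices and d values reported in this abstract are conventionally computed as a proportion correct and as a difference between upper-group and lower-group proportions. The sketch below illustrates that conventional calculation and is not the authors' procedure. The function names and the `fraction` parameter (the share of test takers placed in each group, set here to the commonly used 27%) are assumptions, since the abstract does not state how the groups were formed.

```python
def item_difficulty(responses, item):
    """Proportion of test takers who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower discrimination index D = p_upper - p_lower.

    responses: list of rows of 0/1 item scores, one row per test taker.
    fraction: assumed share of test takers in each of the two groups.
    """
    # Rank test takers by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    n_group = max(1, round(len(ranked) * fraction))
    upper = ranked[:n_group]
    lower = ranked[-n_group:]
    p_upper = sum(row[item] for row in upper) / n_group
    p_lower = sum(row[item] for row in lower) / n_group
    return p_upper - p_lower
```

A negative D, such as the -0.33 reported for the first set, means that more low scorers than high scorers answered the item correctly.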



THE QUALITY OF TEACHER-MADE TEST IN EFL CLASSROOM AT THE ELEMENTARY SCHOOL AND ITS WASHBACK IN THE LEARNING

Journal of English Education ◽

10.31327/jee.v2i2.289 ◽

2017 ◽

Vol 2 (2) ◽

pp. 97-104

Author(s):

Desrin Lebagi ◽

S. Sumardi ◽

S. Sudjoko

Keyword(s):

Elementary School ◽

Language Learning ◽

Test Item ◽

Teaching And Learning ◽

Item Analysis ◽

Good Test ◽

Test Analysis ◽

Test Items ◽

Difficulty Index

Measurement is one of the essential phases in language learning, so a test, as a measurement tool, must be well constructed. The quality of a test can be determined through test item analysis. However, teachers sometimes skip test item analysis because of time limitations and other responsibilities. Addressing this problem, this research aimed to describe the quality of test items, including the difficulty index, the discrimination index, the distractor index, and the reliability of the test, as well as the washback of a teacher-made test on students’ motivation in learning English. It was conducted at Gamaliel Elementary School in the 2016-2017 academic year. This case study used purposive sampling. The data were collected through interview, observation, and document analysis. The informants were an English teacher and students of Gamaliel Elementary School. The documents were students’ answer sheets. The test items were analyzed with the ITEMAN program. The results show that the teacher-made test can be classified as a good test. The test has both positive and negative washback on students’ motivation in learning. Therefore, it is recommended that the teacher conduct test analysis as a way of evaluating and improving his teaching and the test itself, and encourage the students to study even when they are not facing a test.
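
A distractor index of the kind mentioned in this abstract is usually based on how often each wrong option is chosen. The sketch below shows one simple way to tabulate this and does not reproduce ITEMAN's output. The function name `distractor_report`, the option labels, and the 5% threshold for calling a distractor "functioning" are illustrative assumptions, not values taken from the study.

```python
from collections import Counter


def distractor_report(answers, key, options="ABCD", threshold=0.05):
    """Share of test takers choosing each wrong option for one item.

    answers: list of chosen option labels, one per test taker.
    key: the correct option label.
    threshold: assumed minimum share for a distractor to count as functioning.
    Returns {option: (share, is_functioning)} for every wrong option.
    """
    counts = Counter(answers)
    n = len(answers)
    report = {}
    for option in options:
        if option == key:
            continue  # the key is not a distractor
        share = counts[option] / n  # Counter gives 0 for unchosen options
        report[option] = (share, share >= threshold)
    return report
```

An option that nobody selects gets a share of 0 and is flagged as not functioning, which is typically a signal to rewrite or replace it.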


ITEM ANALYSIS OF AN ENGLISH SUMMATIVE TEST

PEJLaC: Pattimura Excellence Journal of Language and Culture ◽

10.30598/pejlac.v1.i1.pp9-18 ◽

2021 ◽

Vol 1 (1) ◽

pp. 9-18

Author(s):

Leni Amelia Suek

Keyword(s):

Junior High School ◽

Item Difficulty ◽

Item Analysis ◽

Discrimination Index ◽

Item Discrimination ◽

Good Test ◽

Test Items ◽

Level Of Knowledge ◽

The Government ◽

Difficulty Index

While almost half of teachers’ activities involve assessing their students, teachers are not well prepared through assessment literacy training. Hence, they are unable to produce good tests to measure students’ level of knowledge and skills. This study aimed to analyze the item difficulty and item discrimination of a test made by an English teacher at a junior high school in Kupang. It was descriptive qualitative research, and the research instruments were test items, answer keys, and students’ answer sheets. For the difficulty index, more than half of the test items were easy, while only 2% were difficult. For the discrimination index, only 10% of the test items were excellent, and the largest share (46%) were poor. These findings indicated that the English test had a poor item difficulty index and a low item discrimination index. Hence, it did not fulfill the criteria of a good test and could not measure students’ true ability. It is highly recommended that the teachers improve the test items and that the government provide assessment training for teachers so that they can produce good tests.


How Do We Evaluate the Good Test of Arabic in High School in Indonesia?/كيف نواصف الاختبار الجيد لمادة اللغة العربية في المدرسة الثانوية بإندونيسيا؟

IJ-ATL (International Journal of Arabic Teaching and Learning) ◽

10.33650/ijatl.v2i1.299 ◽

2018 ◽

Vol 2 (1) ◽

pp. 78-94

Author(s):

Eva Lathifah

Keyword(s):

Secondary School ◽

The Other ◽

Final Exam ◽

Good Test ◽

Islamic Schools ◽

Teaching Objectives ◽

Test Specifications ◽

Measurement And Evaluation ◽

Academic Year

The teaching process consists of four pillars: the teaching objectives; the needs and readiness of the students, known as behavioral inputs; the educational experiences or activities; and measurement and evaluation. These four pillars are organically interconnected, so that none can be dispensed with. Therefore, this research aims to examine evaluation in general. It also addresses the specifications of a good test in Islamic schools, especially the secondary school, and the quality of the standardized final exam administered there in the 2015/2016 academic year.


Item Analysis of Final Test for the 9th Grade Students of SMPN 44 Surabaya in the Academic Year of 2019/2020

JournEEL (Journal of English Education and Literature) ◽

10.51836/journeel.v2i1.81 ◽

2020 ◽

Vol 2 (1) ◽

pp. 34-46

Author(s):

Siti Fatimah ◽

Achmad Bernhardo Elzamzami ◽

Joko Slamet

Keyword(s):

Test Scores ◽

Item Difficulty ◽

Statistical Tests ◽

Item Analysis ◽

Discrimination Power ◽

Test Items ◽

9Th Grade ◽

Research Findings ◽

Microsoft Office ◽

Academic Year

This research addressed the validity and reliability of the test scores and an item analysis covering discrimination power and difficulty index, in order to provide detailed information for improving test item construction. The quality of each item was analyzed in terms of item difficulty, item discrimination, and distractor analysis. The reliability of the test was computed with the Kuder-Richardson Formula (KR20). The analysis of 50 test items was computed using Microsoft Office Excel. A descriptive method was applied to describe and examine the data. The findings showed that the test had content validity, although it was categorized as low. The reliability of the test scores was 0.521010831 (0.52), categorized as low reliability and calling for revision of the test. Of the 50 items examined, 21 were in need of improvement and classified as “easy” on the difficulty index, and 26 items (52%) fell into the “poor” category for discriminability. This means that more than 50% of the test items need to be revised because they do not meet the criteria. To measure students’ performance effectively, items with a “poor” discrimination index should be reviewed and improved.


ANALISIS BUTIR SOAL UJIAN AKHIR SEMESTER MATA KULIAH PENGANTAR PENYUNTINGAN [Item Analysis of the Final Semester Exam for the Introduction to Editing Course]

Bahtera Indonesia; Jurnal Penelitian Bahasa dan Sastra Indonesia ◽

10.31943/bi.v6i2.121 ◽

2021 ◽

Vol 6 (2) ◽

pp. 203-212

Author(s):

Santi Pratiwi T Utami ◽

Imas Juidah ◽

Eko Muharudin ◽

Suhendi Syam ◽

Ramadhan Kusuma Yuda ◽

...

Keyword(s):

Quantitative Research ◽

Validity And Reliability ◽

Research Subjects ◽

Final Exam ◽

Discriminating Power ◽

Quantitative Data Analysis ◽

Level Of Difficulty ◽

Academic Year ◽

Descriptive Method

This study aims to describe the quality of the odd-semester final exam items for the Introduction to Editing course in the 2020/2021 academic year at the Indonesian Literature Program, Universitas Negeri Semarang. This is quantitative research using a descriptive method. The research subjects were 60 students. The data were collected through documentation. The quantitative data analysis covered level of difficulty, discriminating power, quality of distractors, validity, and reliability for 100 questions. The results show that the odd-semester final exam items in the Introduction to Editing course have a moderate level of difficulty, sufficient discriminating power, and very good distractor quality. In addition, 68 items are valid, and the reliability is 0.78, which is in the high category. The study concludes that the odd-semester final exam items for the Introduction to Editing course need to be revised to improve their quality.


Developing a Scratch-based coding achievement test

Information and Learning Sciences ◽

10.1108/ils-08-2018-0078 ◽

2019 ◽

Vol 120 (5/6) ◽

pp. 383-406 ◽

Cited By ~ 2

Author(s):

Ömer Demir ◽

Süleyman Sadi Seferoğlu

Keyword(s):

Developmental Stages ◽

Item Difficulty ◽

Item Analysis ◽

Achievement Test ◽

Measurement Tool ◽

Item Pool ◽

Spring Semester ◽

Content Type ◽

Difficulty Index ◽

Academic Year

Purpose: The lack of a reliable and valid measurement tool for coding achievement emerges as a major problem in Turkey. Therefore, the purpose of this study is to develop a Scratch-based coding achievement test. Design/methodology/approach: Initially, an item pool with 31 items was created. The item pool was classified within the framework of Bayman and Mayer’s (1988) types of coding knowledge to support content validity of the test. Then the item pool was applied to 186 volunteer undergraduates at Hacettepe University during the spring semester of the 2017-2018 academic year. Subsequently, the item analysis was conducted for construct validity of the test. Findings: In all, 13 items were discarded from the test, leaving a total of 18 items. Out of the 18-item version of the coding achievement test, 4, 5 and 9 items measured syntactic, conceptual and strategic knowledge, respectively, among the types of coding knowledge. Furthermore, average item discrimination index (0.531), average item difficulty index (0.541) and Cronbach Alpha reliability coefficient (0.801) of the test were calculated. Practical implications: Scratch users, especially those who are taking introductory courses at Turkish universities, could benefit from a reliable and valid coding achievement test developed in this study. Originality/value: This paper has theoretical and practical value, as it provides detailed developmental stages of a reliable and valid Scratch-based coding achievement test.
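
The Cronbach Alpha coefficient reported in this abstract follows the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The sketch below illustrates that formula and is not the authors' analysis. The function name `cronbach_alpha` and the input layout (one row of item scores per respondent) are assumptions.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one numeric score per item.
    Population variance is used throughout; using sample variance for both
    the items and the totals would give the same ratio.
    """
    k = len(scores[0])  # number of items
    # Variance of each item's scores across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Unlike a formula restricted to right/wrong scoring, this form also accepts items scored on more than two levels.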


AN ANALYSIS OF STUDENTS READING FINAL EXAMINATION BY USING ITEM ANALYSIS PROGRAM ON ELEVENTH GRADE OF SMA NEGERI 8 MEDAN

JETAL: Journal of English Teaching & Applied Linguistic ◽

10.36655/jetal.v1i1.98 ◽

2019 ◽

Vol 1 (1) ◽

pp. 13-19

Author(s):

Dewi Manalu ◽

Kammer Tuahman Sipayung ◽

Febrika Dwi Lestari

Keyword(s):

Quantitative Research ◽

Item Analysis ◽

Multiple Choice ◽

Final Examination ◽

Multiple Choice Questions ◽

Discrimination Power ◽

Reliability Level ◽

The Subject ◽

Program Version

The purpose of this study was to determine the quality of the reading final examination for the eleventh grade of SMA N8 Medan in terms of reliability, level of difficulty, discrimination power, and distractor level. This was qualitative-quantitative research. The research subjects were the grade XI students of SMA N8 Medan. The data were analyzed with the ANATES program, version 4.0.9. The analysis shows that: (1) 14 multiple-choice items (56%) are valid, while 11 items (44%) are invalid; (2) the multiple-choice items are reliable, with a reliability of 0.90; (3) by difficulty, 3 items (12%) are easy, 7 (28%) satisfactory, 2 (8%) difficult, 3 (12%) very easy, and 3 (12%) very difficult; (4) by discrimination, 12 items (48%) are poor, 2 (8%) average, 1 (4%) good, and 8 (32%) excellent.
