CALCULATION OF DISCRIMINATIVITY AND VALIDITY FOR EVALUATION OF THE QUALITY OF THE PEDAGOGICAL TEST FROM THE COURSE OF MATHEMATICS ON THE TOPIC "RANKS"

Author(s):  
L.V. Limanova ◽ 
N.V. Popov

For the last twenty years, the Department of Higher Mathematics and Applied Informatics of Samara State Technical University has used tests on every chapter of the Mathematics course to assess students' knowledge. To understand how objectively a particular test supports such an assessment, its quality must be examined. Many theories address this problem; this article analyzes the quality of a test and of its tasks using classical test theory. The study consists of the following stages: compiling a control test; administering it to a sample of students that meets the requirements of statistical analysis; processing the sample and calculating the basic characteristics; and analyzing the resulting coefficients, which shows whether the test can be used to assess students' level of knowledge and what changes would improve it. The article examines two characteristics of the test on the topic "Series" of the mathematics course: discrimination and validity. To analyze validity, point-biserial correlation coefficients were computed, that is, the correlation of each task's score with the student's individual score. The total validity coefficient was then calculated and found to be 0.53. This is a fairly high value, which cannot be said of the point-biserial correlation coefficients for tasks No. 2 and No. 9; these tasks must therefore be either modified or replaced. Analysis of the discrimination coefficients of the test tasks showed that their values satisfy the requirements set by experts.
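The point-biserial coefficient mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the response matrix is invented, the total score is assumed to include the task itself, and the upper-lower discrimination index with a 27% group size is a common convention that the abstract does not confirm.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 task score and total scores."""
    right = [t for x, t in zip(item, totals) if x == 1]
    wrong = [t for x, t in zip(item, totals) if x == 0]
    if not right or not wrong:
        raise ValueError("task needs both correct and incorrect answers")
    spread = pstdev(totals)  # population SD of the total scores
    if spread == 0:
        raise ValueError("total scores have no variance")
    p = len(right) / len(item)  # share of students who solved the task
    return (mean(right) - mean(wrong)) / spread * sqrt(p * (1 - p))


def discrimination_index(item, totals, fraction=0.27):
    """Share correct in the top-scoring group minus the bottom-scoring group.

    The 27% group size is an assumed convention, not taken from the abstract.
    """
    k = max(1, round(len(totals) * fraction))
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    low, high = order[:k], order[-k:]
    return sum(item[i] for i in high) / k - sum(item[i] for i in low) / k


# Hypothetical responses: one row per student, one column per task.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in responses]
for j in range(3):
    task = [row[j] for row in responses]
    print(j + 1, round(point_biserial(task, totals), 3),
          round(discrimination_index(task, totals), 3))
```

A task with a low or negative coefficient would be a candidate for revision, in the way the abstract describes for tasks No. 2 and No. 9.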

2017 ◽  
Vol 1 (1) ◽  
pp. 46
Author(s):  
A.A. Gede Yudha Paramartha

This study investigated the quality of a 40-item multiple-choice test taken from the Reading III class in the English Education Department. It was carried out to obtain a clearer picture of, and provide feedback on, how the Critical Reading test for the new curriculum should be developed and which factors should be taken into account in developing it. The data were collected from 24 fourth-semester students in the English Education Department, Undiksha, Bali, who took the reading test, and were analyzed using Classical Test Theory with the assistance of jMetrik software. In general, the results show that the test was consistent but easy, and that only 40% of the items are eligible for use. Alongside the findings and discussion, feedback is provided for the future development of the reading test.


2021 ◽  
pp. 153944922110608
Author(s):  
Lorrie George-Paschal ◽  
Nancy E. Krusen ◽  
Chia-Wei Fan

This study evaluated the psychometric properties of the Relative Mastery Scale (RMS). Valid and reliable client-centered instruments support practice in value-based health care and community-based settings. Participants were 368 community-dwelling adults aged 18 to 95 years. Researchers examined the validity and reliability of the RMS using classical test theory and the Rasch measurement model. A partial credit model allowed exploration of individual scale properties. Spearman’s correlation coefficients between items were statistically significant at the .01 level. Cronbach’s alpha coefficient was .94, showing strong internal consistency. In exploratory factor analysis, Factor 1 accounted for 71% of variance with an eigenvalue of 4.26. In Rasch analysis, the 5-point rating scale demonstrated adequate functioning, and the analysis confirmed unidimensionality and person/item separation. The RMS instrument demonstrates sound psychometric characteristics. A valid and reliable measure of internal occupational adaptation supports its use in monitoring the progress of internal occupational adaptation across a variety of individuals.
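For readers unfamiliar with the internal-consistency statistic reported above, Cronbach's alpha can be sketched from its standard formula, k/(k-1) · (1 - Σ item variances / variance of total scores). The ratings below are invented for illustration and are not RMS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have no variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point ratings: five respondents, four items.
ratings = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(ratings), 3))
```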


Pedagogika ◽  
2017 ◽  
Vol 127 (3) ◽  
pp. 104-118
Author(s):  
Gediminas Merkys ◽  
Daiva Bubelienė

This article introduces a newly created questionnaire for older schoolchildren, “Evaluate the teacher and his lessons”. It outlines the theoretical and practical context of the instrument, which is based on 87 primary questions, and presents the dimensional structure and metrological quality of the integrated scales and sub-scales formed from them. The scales and sub-scales were constructed following classical test theory, combining logical and factorial validation. Secondary factorization of the sub-scales indicated that it is expedient to distinguish two integrated lesson dimensions (scales). The first integrated scale reflects the quality of social relations and teacher-centered orientation; the second reflects the management and didactics of the educational process. The high correlation between the two integrated scales (r = 0.86) indicates that a generalized index for evaluating the teacher and his lessons can be derived by aggregating as many as 81 primary variables covering a wide variety of aspects of the lesson. The article describes the current basis for statistical norming of the questionnaire: N = 4024 schoolchildren and N = 200 teachers, drawn from schools of different types in various regions of the country. The wide content coverage of the questionnaire and the fairly good quality of its scales open good opportunities for its application in both school evaluation practice and research. The first, methodological purpose of the article is to introduce a new standardized survey instrument. Second, the article asks why indicators such as “abundance of homework” and “level of the requirements set by the teacher” show practically no correlation with the remaining scales, although the latter intercorrelate strongly. The paper elaborates the question, and related hypotheses, of whether these variables can in fact affect the didactic quality of the lesson counterproductively.


2016 ◽  
Vol 20 (1) ◽  
pp. 1-10
Author(s):  
C. Heri Sulistiawan

THE QUALITY OF MATHEMATICS SCHOOL EXAMINATION OF SCIENCE MAJOR AND ITS CONTRIBUTION TO THE RESULT OF NATIONAL EXAMINATION
This research aims to describe the quality of the Mathematics School Examination for the science program and the contribution of School Examination scores to National Examination results. It is a descriptive, exploratory study using a survey approach. The population is six private senior high schools run by Catholic foundations in the city of Yogyakarta; the sample is five senior high schools that are representative of the population. The test papers were reviewed by three experts to assess item quality qualitatively. Students' answers were analyzed using Classical Test Theory and Item Response Theory to assess item quality quantitatively. School Examination scores were correlated with National Examination scores to measure their contribution. The results: (1) qualitatively, item quality is very good (one school), good (two schools), quite good (one school), and not so good (one school); (2) quantitatively, according to Classical Test Theory, it is quite good (one school), not so good (three schools), and poor (one school), with Alpha reliability in the reliable range; (3) quantitatively, according to Item Response Theory, it is good (three schools), quite good (one school), and not so good (one school); (4) the consistency index of the item analyses is quite consistent; (5) the correlation of School Examination scores with National Examination results falls in the large category; (6) based on the content validity review, more than 90% of the items are valid, and School Examination scores are valid/accurate predictors of National Examination results.
Keywords: test quality, school examination, mathematics, national examination


2016 ◽  
Vol 3 (2) ◽  
pp. 162
Author(s):  
Hadi Sutrisno

This research aims to describe: (1) the quality of the Junior High School mathematics school examination test for the academic year 2015/2016 in Kabupaten Bangkalan based on qualitative analysis of the test items; (2) the quality of the same test based on quantitative analysis of the test items; and (3) the equating of the Junior High School mathematics school examination tests for the academic year 2015/2016 in Kabupaten Bangkalan. A test is said to be of good quality if it is valid, reliable, and has good characteristics. A test is said to be equivalent to another if the scores of one test can be exchanged with the scores of the other. The data were taken from the school examination papers together with the students' answer sheets. Qualitative data analysis was conducted with the help of expert judgement. Quantitative data analysis was conducted using Classical Test Theory with Iteman and Item Response Theory with BilogMG; these programs were used to assess test quality quantitatively. To analyze the equivalence between the series of tests, item characteristic curves were used; these curves were drawn with GeoGebra. The results show that: (1) qualitatively, the quality of the test plan for the mathematics school examination is quite good, while the quality of the school examination itself is rated either quite good or not so good; (2) quantitatively, the quality of the school examination test is good; and (3) in terms of equating, based on the item characteristic curves, the school examination tests are equivalent.
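The item characteristic curves mentioned above can be illustrated with a short sketch. The abstract does not say which Item Response Theory model was fitted, so the two-parameter logistic (2PL) form, the parameter values, and the gap measure below are all assumptions for illustration, not the author's equating procedure.

```python
from math import exp


def icc_2pl(theta, a, b):
    """Probability of a correct answer at ability theta under a 2PL model.

    a is the discrimination parameter, b the difficulty parameter.
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def max_icc_gap(item_x, item_y, grid):
    """Largest vertical distance between two curves over an ability grid."""
    return max(abs(icc_2pl(t, *item_x) - icc_2pl(t, *item_y)) for t in grid)


# Hypothetical (a, b) parameters for matching items from two test forms.
form_a_item = (1.2, 0.10)
form_b_item = (1.1, 0.15)
ability_grid = [x / 10 for x in range(-40, 41)]  # theta from -4.0 to 4.0
print(round(max_icc_gap(form_a_item, form_b_item, ability_grid), 4))
```

A small gap over the ability range suggests the two items behave similarly, which is the kind of visual comparison that plotted curves support.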


MADRASAH ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 29-39
Author(s):  
Nuril Huda ◽  
Tutik Sri Wahyuni

This research aims to: 1) determine the characteristics of the science items in the try-out of the National Standard School Examination (USBN) for the academic year 2018/2019 based on Classical Test Theory (CTT); 2) determine the distribution of those science items across cognitive levels. This is descriptive research with a quantitative approach. The data were the computer answer sheets of 5022 students who took the elementary school USBN try-out on February 21, 2019 in Tulungagung Regency. The results showed that: 1) based on CTT, the science items had the following characteristics: a) validity: 35 items are valid; b) reliability: the value of 0.818 is very high; c) difficulty level: 4 items (11.43%) are difficult, 9 items (25.71%) are moderate, 16 items (45.71%) are easy, and 6 items (17.14%) are very easy; d) discriminating power: 3 items (8.57%) are bad, 12 items (34.29%) are good enough, 15 items (42.86%) are moderate, and 5 items (14.29%) are good; e) quality of options: 17 items (48.57%) need no revision, 9 items (25.71%) need one option revised, 5 items (14.29%) need two options revised, and 4 items (11.43%) need three options revised; f) 13 items (37.14%) have quite good or good characteristics and can therefore be included in the question bank; 2) with respect to cognitive level, 11 items (31.43%) fall in category L1 (knowledge), 10 items (28.57%) in category L1 (understanding), 4 items (11.43%) in category L2 (application), and 10 items (28.57%) in category L3 (reasoning). The 13 items entered in the question bank are dominated by cognitive level L1 (knowledge and understanding).
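The percentage shares quoted in this abstract follow directly from the item counts over the 35 items. A minimal sketch of that arithmetic, using only the counts reported above (the function name `shares` is illustrative):

```python
def shares(counts):
    """Convert category counts to percentages of the total, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Item counts as reported in the abstract (35 items in total).
difficulty = {"difficult": 4, "moderate": 9, "easy": 16, "very easy": 6}
discrimination = {"bad": 3, "good enough": 12, "moderate": 15, "good": 5}

print(shares(difficulty))
print(shares(discrimination))
```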


2020 ◽  
Author(s):  
Chonghua Wan ◽ 
Ying Chen ◽  
Li Gao ◽  
Qingqing Zhang ◽  
Peng Quan ◽  
...  

Abstract Background: Quality of life (QOL) for patients with peptic ulcer disease (PUD) is of interest worldwide, and disease-specific instruments are needed for clinical research and practice. This paper focuses on the development and validation of the PUD scale within the system of Quality of Life Instruments for Chronic Diseases (QLICD-PU), using the modular approach and both classical test theory and Generalizability Theory. Methods: The QLICD-PU was developed through programmatic decision-making procedures, including multiple nominal group and focus group discussions, in-depth interviews, and quantitative statistical procedures. Based on data from 153 PUD inpatients, correlation analysis, factor analysis, t-tests, and Generalizability Theory analysis (including a generalizability study and a decision study, i.e. G-study and D-study) were used to assess the validity, reliability, and responsiveness of the scale. Results: With the widely used Short Form health survey (SF-36) as the criterion, correlation and factor analysis confirmed good construct validity and criterion-related validity of the QLICD-PU. Except for the social domain (0.62), the internal consistency α of all domains is higher than 0.70. The test-retest reliability coefficients (Pearson r and intra-class correlation, ICC) for the overall score and all domains are higher than 0.80, except in the social domain (0.77). After treatment, the overall score and the scores of all domains changed statistically significantly (P < 0.01), except for the social impact and sexual function scores. The standardized response mean (SRM) of domain-level scores ranges from 0.34 to 1.03. The G coefficient and the reliability index (Ф coefficient) further confirm the reliability of the scale through more accurate variance components and decision-making information about changes in the number of items. 
Conclusions: The QLICD-PU can be used as a useful measurement to assess the quality of life of PUD patients with good psychometric characteristics and multiple advantages.
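The standardized response mean (SRM) used above as a responsiveness measure is the mean score change divided by the standard deviation of the changes. A minimal sketch, with invented before-treatment and after-treatment scores rather than QLICD-PU data:

```python
from statistics import mean, stdev


def standardized_response_mean(before, after):
    """Mean of the paired score changes divided by their sample SD."""
    changes = [a - b for b, a in zip(before, after)]
    spread = stdev(changes)  # sample SD, needs at least two pairs
    if spread == 0:
        raise ValueError("score changes have no variance")
    return mean(changes) / spread


# Hypothetical domain scores for six patients, before and after treatment.
before = [52.0, 61.5, 48.0, 70.0, 55.5, 63.0]
after = [60.0, 66.0, 47.5, 78.5, 62.0, 64.5]
print(round(standardized_response_mean(before, after), 2))
```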


2020 ◽  
Author(s):  
Chonghua Wan ◽ 
Ying Chen ◽  
Li Gao ◽  
Qingqing Zhang ◽  
Peng Quan ◽  
...  

Abstract Background: Quality of life (QOL) for patients with peptic ulcer disease (PUD) is of interest worldwide, and disease-specific instruments are needed for clinical research and practice. This paper focuses on the development and validation of the PUD scale within the system of Quality of Life Instruments for Chronic Diseases (QLICD-PU), using the modular approach and both classical test theory and Generalizability Theory. Methods: The QLICD-PU was developed through programmatic decision-making procedures, including multiple nominal group and focus group discussions, in-depth interviews, and quantitative statistical procedures. Based on data from 153 PUD inpatients, correlation analysis, factor analysis, t-tests, and Generalizability Theory analysis (G and D studies) were used to assess the validity, reliability, and responsiveness of the scale. Results: With the SF-36 as the criterion, correlation and factor analysis confirmed good construct validity and criterion-related validity of the QLICD-PU. Except for the social domain (0.62), the internal consistency α of all domains is higher than 0.70. The retest reliability coefficients (Pearson r and intra-class correlation, ICC) for the overall score and all domains are higher than 0.80, except in the social domain (0.77). After treatment, the overall score and the scores of all domains changed statistically significantly (P < 0.01), except for the social impact and sexual function scores. The SRM of domain-level scores ranges from 0.34 to 1.03. The G coefficient and the reliability index (Ф coefficient) further confirm the reliability of the scale through more accurate variance components and decision-making information about changes in the number of items. Conclusions: The QLICD-PU can be used as a useful measurement to assess the quality of life of PUD patients with good psychometric characteristics and multiple advantages.

