Common Errors in Teacher-Made Test Design

Author(s):  
Michael Fields

Teaching and test writing require different skill sets. While teachers are often required to create language tests, they often have inadequate training. This may lead to tests that lack validity and reliability, making tests unfair for test-takers. Tests may contain more general errors in their development, leading to construct underrepresentation or construct-irrelevant variance, decreasing reliability and validity of the entire test and rendering results meaningless. Well-designed tests may also contain items that are not well constructed, which may again lower overall validity and reliability. These item-related errors include word matches, testing single words, issues with phrasing, developing good sets and distractors, and testing outside the text (such as language in the item or math skills). Increasing awareness of these issues and improving teacher skills in test writing will ensure more fairness for students in decisions based on test results.

2017 ◽  
Vol 38 (2) ◽  
pp. 83-93
Author(s):  
Jeffrey M. Cucina ◽  
Nicholas L. Vasilopoulos ◽  
Arwen H. DeCostanza

Abstract. Varimax rotated principal component scores (VRPCS) have previously been offered as a possible solution to the non-orthogonality of scores for the Big Five factors. However, few researchers have examined the reliability and validity of VRPCS. To address this gap, we use a lab study and a field study to investigate whether using VRPCS increases orthogonality, reliability, and criterion-related validity. Compared to the traditional unit-weighting scoring method, the use of VRPCS enhanced the reliability and discriminant validity of the Big Five factors, although there was little improvement in criterion-related validity. Results are discussed in terms of the benefit of using VRPCS instead of traditional unit-weighted sum scores.


Author(s):  
Muhammad Mathori ◽  
Uswatun Chasanah

This study aims to identify the determinants that influence consumers' attitudes toward green purchasing and their willingness to pay more for green products, specifically energy-saving lighting products. The variables studied were environmental knowledge, environmental awareness, and perceived effectiveness. Questionnaires were distributed to a sample of 196 students, of which only 189 responses could be processed. Validity testing showed loading factors above 0.4, which were treated as valid, and reliability testing showed Cronbach's alpha values above the 0.5 criterion, which were treated as reliable. Multiple linear regression analysis showed that environmental knowledge, environmental awareness, and perceived effectiveness have a positive and significant effect on green purchasing attitudes and on willingness to pay more. Green purchasing attitudes and willingness to pay more in turn have a positive and significant effect on green purchasing behavior, with green purchasing attitudes having the greater influence.
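The reliability criterion mentioned in this abstract can be illustrated with a minimal sketch of Cronbach's alpha. The abstract does not give the authors' data or software, so the function names and the example scores below are hypothetical; only the 0.5 cutoff comes from the abstract.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` is a list of k lists, one per questionnaire item, each holding
    the scores of the same n respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    # Sample variance of each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def is_reliable(alpha, threshold=0.5):
    """Apply the cutoff stated in the abstract (alpha above 0.5)."""
    return alpha > threshold


# Hypothetical scores for three items answered by five respondents.
example_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
example_alpha = cronbach_alpha(example_items)
print(round(example_alpha, 3), is_reliable(example_alpha))
```

Many methodology texts recommend a stricter cutoff than 0.5; the threshold is left as a parameter so that it can be changed.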


2018 ◽  
Vol 6 (1) ◽  
Author(s):  
Nuridin, SE., MM. ◽  
Dwi Ardika Prayudha

This research examines the influence of brand image and product quality on car purchase decisions at PT. Mitsubishi Krama Yudha Motors and Manufacturing. The data were analyzed quantitatively using validity and reliability tests, the coefficient of determination, and multiple linear regression analysis. The resulting regression equation is Y = 0.665 + 0.517 X1 + 0.416 X2. In the simultaneous test of brand image and product quality on purchase decision, the F count of 85.955 is larger than the F table value of 3.159, and sig. F of 0.000 is smaller than alpha 0.005. Based on the simultaneous test results, the independent variables (brand image and product quality) have a positive and significant impact on the purchase decision for Mitsubishi cars at PT. Krama Yudha Motors and Manufacturing. For future development and policy-making, the company should pay attention to brand image and product quality so that customers are more satisfied and continue to choose Mitsubishi cars.
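As a small illustration of how the reported regression equation and F test are read, the sketch below plugs the coefficients from this abstract into a prediction function and applies the usual "F count greater than F table" decision rule. The function names and the input scores are hypothetical, and the abstract does not state the scale on which X1 and X2 were measured.

```python
# Coefficients as reported in the abstract: Y = 0.665 + 0.517 X1 + 0.416 X2
INTERCEPT = 0.665
B_BRAND_IMAGE = 0.517      # coefficient on X1 (brand image)
B_PRODUCT_QUALITY = 0.416  # coefficient on X2 (product quality)


def predict_purchase_decision(brand_image, product_quality):
    """Predicted purchase-decision score Y for given X1 and X2 scores."""
    return (INTERCEPT
            + B_BRAND_IMAGE * brand_image
            + B_PRODUCT_QUALITY * product_quality)


def f_test_rejects_null(f_count, f_table):
    """Simultaneous F test: reject the null when F count exceeds F table."""
    return f_count > f_table


# Hypothetical input scores, chosen only to show the arithmetic.
print(predict_purchase_decision(brand_image=4.0, product_quality=3.0))

# Values reported in the abstract: F count 85.955, F table 3.159.
print(f_test_rejects_null(85.955, 3.159))
```

Because both slope coefficients are positive, a higher score on either predictor raises the predicted purchase-decision score, which matches the direction of effect the abstract reports.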


2020 ◽  
Vol 5 (2) ◽  
pp. 125
Author(s):  
Raden Alifian Setiawan ◽  
Hanna Hanna ◽  
Alberth Alberth

The use of videos in education makes it possible to overcome practical real-world constraints and explore the far greater possibilities provided by digital spaces, especially for videos uploaded to online platforms such as blogs. This study examines whether an online video blog used as a learning medium has a significant effect on students' achievement in the passive voice. It used a pre-experimental (one-group pre-test and post-test) design. The sample consisted of 10 students at 4J Operation. A pre-test and a post-test were conducted using multiple-choice questions as the instruments. The data were analyzed with a paired-sample t-test. Results showed an increase in the mean score from the pre-test (49.1) to the post-test (63.5). The paired-sample t-test gave a Sig. (2-tailed) value of 0.000, which is smaller than .05, meaning that there was a significant difference in mean scores after the treatment.


Author(s):  
Nugroho Budhiwaluyo ◽  
Rayandra Asyhar ◽  
Bambang Hariyadi

This research aims to produce a final product in the form of a performance-assessment instrument for the Cell Structure and Function experiment. The development model is ADDIE. Based on expert judgment, the instrument was valid and could be tested in the field. Field-test results showed that the product has high validity and reliability in measuring student performance on the Cell Structure and Function experiment. Therefore, it is concluded that this performance-assessment instrument is, both theoretically and practically, of good quality for measuring student performance in both the process and the product of the Cell Structure and Function experiment. Keywords: Development, Performance-Assessment Instrument, Cell Structure and Function Experiment


2021 ◽  
Author(s):  
Qi Zhang ◽  
Ke Zhang ◽  
Miao Li ◽  
Jiaxin Gu ◽  
Xintong Li ◽  
...  

Abstract. Objectives: To examine the validity and reliability of the Mandarin version of the Treatment Burden Questionnaire (TBQ) among stroke patients. Background: Stroke patients need long-term management of symptoms and life situation, and treatment burden has recently emerged as a new concept that can influence health outcomes during the rehabilitation process. Methods: Convenience sampling was used to recruit 187 stroke patients in a tertiary grade hospital in Tianjin for a formal investigation. Item analysis and reliability and validity tests were carried out. The reliability test included internal consistency and test–retest reliability. The validity test covered content, structure, and convergent validity. Results: Of the 187 completed questionnaires, only 180 (96.3%) were suitable for analysis. According to the experts’ evaluation, the I-CVI of each item ranged from 0.833 to 1.000, and the S-CVI was 0.967. The exploratory factor analysis yielded three factor components with a cumulative variation of 53.054%. Convergent validity was demonstrated using Morisky’s Medication Adherence Scale 8 (r = –0.450, P < 0.01). All correlations between items and global scores ranged from 0.403 to 0.638. Internal consistency reliability and test–retest reliability were found to be acceptable, as indicated by a Cronbach’s α of 0.824 and an intraclass correlation coefficient of 0.846, respectively. Conclusions: The Mandarin TBQ had acceptable validity and reliability. The use of the TBQ in the assessment of treatment burden in stroke survivors may benefit health resource allocation and provide tailored therapeutic interventions to construct minimally disruptive care.


2021 ◽  
pp. 1-8
Author(s):  
Angelo Picardi ◽  
Sara Panunzi ◽  
Sofia Misuraca ◽  
Chiara Di Maggio ◽  
Andrea Maugeri ◽  
...  

Introduction: The last decade has witnessed a resurgence of interest in the clinician’s subjectivity and its role in the diagnostic assessment. Integrating the criteriological, third-person approach to patient evaluation and psychiatric diagnosis with other approaches that take into account the patient’s subjective and intersubjective experience may bear particular importance in the assessment of very young patients. The ACSE (Assessment of Clinician’s Subjective Experience) instrument may provide a practical way to probe the intersubjective field of the clinical examination; however, its reliability and validity in child and adolescent psychiatrists seeing very young patients is still to be determined. Methods: Thirty-three clinicians and 278 first-contact patients aged 12–17 years participated in this study. The clinicians completed the ACSE instrument and the Brief Psychiatric Rating Scale after seeing the patient, and the Profile of Mood State (POMS) just before seeing the patient and immediately after. The ACSE was completed again for 45 patients over a short (1–4 days) retest interval. Results: All ACSE scales showed high internal consistency and moderate to high temporal stability. Also, they displayed meaningful correlations with the changes in conceptually related POMS scales during the clinical examination. Discussion: The findings corroborate and extend previous work on adult patients and suggest that the ACSE provides a valid and reliable measure of the clinician’s subjective experience in adolescent psychiatric practice, too. The instrument may prove to be useful to help identify patients in the early stages of psychosis, in whom subtle alterations of being with others may be the only detectable sign. Future studies are needed to determine the feasibility and usefulness of integrating the ACSE within current approaches to the evaluation of at-risk mental states.


2021 ◽  
Vol 11 (8) ◽  
pp. 402
Author(s):  
Linda Helene Sillat ◽  
Kairit Tammets ◽  
Mart Laanpere

The rapid increase in recent years in the number of digital competency frameworks, models, and strategies has strengthened the argument for evaluating and assessing digital competence. To support the process of digital competence assessment, it is consequently necessary to understand the different approaches and methods. This paper carries out a systematic literature review and includes an analysis of the existing proposals and conceptions of digital competence assessment processes and methods in higher education, with the aim of better understanding the field of research. The review follows three objectives: (i) describe the characteristics of digital competence assessment processes and methods in higher education; (ii) provide an overview of current trends; and, finally, (iii) identify challenges and issues in digital competence assessment in higher education with a focus on the reliability and validity of the proposed methods. On the basis of the findings, and as a result of the COVID-19 pandemic, digital competence assessment in higher education requires more attention, with a specific focus on instrument validity and reliability. Furthermore, it will be of great importance to further investigate the use of assessment tools to support systematic digital competence assessment processes. The analysis includes possible opportunities and ideas for future lines of work in digital competence evaluation in higher education.


2013 ◽  
Vol 2013 ◽  
pp. 1-16 ◽  
Author(s):  
Ruth A. Anderson ◽  
Donde Plowman ◽  
Kirsten Corazzini ◽  
Pi-Ching Hsieh ◽  
Hui Fang Su ◽  
...  

Objectives. To (1) describe participation in decision-making as a systems-level property of complex adaptive systems and (2) present empirical evidence of reliability and validity of a corresponding measure. Method. Study 1 was a mail survey of a single respondent (administrators or directors of nursing) in each of 197 nursing homes. Study 2 was a field study using a random, proportionally stratified sampling procedure that included 195 organizations with 3,968 respondents. Analysis. In Study 1, we analyzed the data to reduce the number of scale items and establish initial reliability and validity. In Study 2, we strengthened the psychometric test using a large sample. Results. Results demonstrated validity and reliability of the participation in decision-making instrument (PDMI) while measuring participation of workers in two distinct job categories (RNs and CNAs). We established reliability of the item scores aggregated at the organizational level. We established validity of the multidimensional properties using convergent and discriminant validity and confirmatory factor analysis. Conclusions. Participation in decision making, when modeled as a systems-level property of organization, has multiple dimensions and is more complex than is being traditionally measured. Managers can use this model to form decision teams that maximize the depth and breadth of expertise needed and to foster connection among them.

