Balancing the demands of validity and reliability in practice: Case study of a changing system of primary science summative assessment

2020 ◽  
Vol 18 (2) ◽  
Author(s):  
Sarah G. Earle

Teacher summative judgements of children’s attainment in science, which are statutory at age 11 in England, require consideration of both valid sampling of the construct and reliable comparison of outcomes. In order to develop understanding of the enacted ‘trade off’ between validity and reliability, this three-year case study, within the Teacher Assessment in Primary Science (TAPS) project, was undertaken during a period of statutory assessment change in England. The case demonstrates an ongoing balancing act between the demands of reliability and validity, and resulted in the development of a teacher assessment seesaw, which provides a model for both interpreting and supporting practice, within and beyond primary science.

2017 ◽  
Vol 38 (2) ◽  
pp. 83-93
Author(s):  
Jeffrey M. Cucina ◽  
Nicholas L. Vasilopoulos ◽  
Arwen H. DeCostanza

Abstract. Varimax rotated principal component scores (VRPCS) have previously been offered as a possible solution to the non-orthogonality of scores for the Big Five factors. However, few researchers have examined the reliability and validity of VRPCS. To address this gap, we use a lab study and a field study to investigate whether using VRPCS increases orthogonality, reliability, and criterion-related validity. Compared to the traditional unit-weighting scoring method, the use of VRPCS enhanced the reliability and discriminant validity of the Big Five factors, although there was little improvement in criterion-related validity. Results are discussed in terms of the benefit of using VRPCS instead of traditional unit-weighted sum scores.


2020 ◽  
Vol 4 (02) ◽  
pp. 34-45
Author(s):  
Naufal Dzikri Afifi ◽  
Ika Arum Puspita ◽  
Mohammad Deni Akbar

Shift to The Front II Komplek Sukamukti Banjaran is a project implemented by a company engaged in telecommunications. Like every project, it has a time limit specified in the contract. Project scheduling plays an important role in predicting both the cost and the duration of a project, and every project should be completed on or before the date specified in the contract. A delay can be anticipated by shortening the completion time using the crashing method with linear programming. Linear programming streamlines the crashing calculation; without it, the iterations would have to be repeated. The objective function in this scheduling is to minimize cost. This study aims to find a trade-off between the cost and the minimum time expected to complete the project. Acceleration was achieved by adding 1, 2, 3, or 4 hours of overtime work. The normal duration of the project is 35 days, with a service fee of Rp. 52,335,690. Based on the crashing analysis, the alternative chosen is to add 1 hour of overtime, which shortens the duration to 34 days at a total service cost of Rp. 52,375,492. This acceleration affects the entire project because Shift to The Front II covers 33 different locations; if all of these locations can be accelerated, the completion of the entire project will be more effective.
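The time-cost trade-off behind crashing can be illustrated with a minimal Python sketch. It is not the authors' model: the four-activity network, the crash limits, and the per-day crash costs below are all hypothetical, and a brute-force enumeration stands in for the linear program so that the example needs only the standard library.

```python
from itertools import product

# Hypothetical activities (not from the study):
# name -> (predecessors, normal_days, max_crash_days, crash_cost_per_day)
ACTIVITIES = {
    "A": ((), 4, 1, 30),
    "B": (("A",), 6, 2, 50),
    "C": (("A",), 3, 1, 20),
    "D": (("B", "C"), 5, 2, 40),
}


def project_duration(crash):
    """Longest-path length when each activity is shortened by crash[name] days."""
    finish = {}
    # The dict above is listed in topological order, so predecessors come first.
    for name, (preds, normal, _, _) in ACTIVITIES.items():
        start = max((finish[p] for p in preds), default=0)
        finish[name] = start + normal - crash[name]
    return max(finish.values())


def cheapest_crash(deadline):
    """Return (extra_cost, crash_plan) for the cheapest plan meeting the deadline.

    Returns None when even full crashing cannot meet the deadline.
    """
    names = list(ACTIVITIES)
    ranges = [range(ACTIVITIES[n][2] + 1) for n in names]
    best = None
    for combo in product(*ranges):
        crash = dict(zip(names, combo))
        if project_duration(crash) > deadline:
            continue
        cost = sum(crash[n] * ACTIVITIES[n][3] for n in names)
        if best is None or cost < best[0]:
            best = (cost, crash)
    return best


if __name__ == "__main__":
    print(project_duration({n: 0 for n in ACTIVITIES}))  # normal duration
    print(cheapest_crash(14))  # cheapest way to save one day
```

Enumeration is only practical for very small networks; a linear program expresses the same objective (minimize added cost subject to precedence and deadline constraints) without listing every combination.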


2021 ◽  
Vol 34 (5) ◽  
pp. 303-318
Author(s):  
Maarten Baele ◽  
An Vermeulen ◽  
Dimitri Adons ◽  
Roos Peeters ◽  
Angelique Vandemoortele ◽  
...  

2021 ◽  
Author(s):  
Qi Zhang ◽  
Ke Zhang ◽  
Miao Li ◽  
Jiaxin Gu ◽  
Xintong Li ◽  
...  

Abstract. Objectives: To examine the validity and reliability of the Mandarin version of the Treatment Burden Questionnaire (TBQ) among stroke patients. Background: Stroke patients need long-term management of symptoms and life situation, and treatment burden has recently emerged as a new concept that can influence health outcomes during the rehabilitation process. Methods: Convenience sampling was used to recruit 187 stroke patients in a tertiary-grade hospital in Tianjin for a formal investigation. Item analysis and reliability and validity tests were carried out. The reliability test included internal consistency and test–retest reliability. The validity test covered content, structural and convergent validity. Results: Of the 187 completed questionnaires, 180 (96.3%) were suitable for analysis. According to the experts’ evaluation, the I-CVI of each item ranged from 0.833 to 1.000, and the S-CVI was 0.967. The exploratory factor analysis yielded three factors explaining a cumulative 53.054% of the variance. Convergent validity was demonstrated against Morisky’s Medication Adherence Scale 8 (r = –0.450, P < 0.01). All correlations between items and global scores ranged from 0.403 to 0.638. Internal consistency reliability and test–retest reliability were found to be acceptable, as indicated by a Cronbach’s α of 0.824 and an intraclass correlation coefficient of 0.846, respectively. Conclusions: The Mandarin TBQ had acceptable validity and reliability. Using the TBQ to assess the treatment burden of stroke survivors may benefit health resource allocation and help tailor therapeutic interventions to construct minimally disruptive care.
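Two of the statistics reported above, Cronbach's α and the item-level content validity index (I-CVI), follow simple formulas that a short Python sketch can make concrete. The helper names and all data here are hypothetical, and the sketch assumes the common conventions: sample variances for α, and a 4-point relevance scale on which ratings of 3 or 4 count as "relevant" for I-CVI. The abstract does not state which conventions the authors used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    item_columns = list(zip(*scores))  # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def item_cvi(expert_ratings, relevant_from=3):
    """Share of experts rating an item as relevant (assumed: 3 or 4 on a 4-point scale)."""
    relevant = sum(1 for r in expert_ratings if r >= relevant_from)
    return relevant / len(expert_ratings)


if __name__ == "__main__":
    # Hypothetical responses: 5 respondents x 3 items.
    responses = [
        [3, 4, 3],
        [2, 2, 3],
        [4, 5, 4],
        [1, 2, 2],
        [3, 3, 4],
    ]
    print(round(cronbach_alpha(responses), 3))
    # Hypothetical panel of six experts, five of whom rate the item relevant.
    print(round(item_cvi([4, 4, 3, 3, 4, 2]), 3))
```

With six experts, five "relevant" ratings give 5/6 ≈ 0.833, which is one way a lower bound like the reported 0.833 could arise; the panel size is an assumption, not something the abstract states.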


2021 ◽  
pp. 1-8
Author(s):  
Angelo Picardi ◽  
Sara Panunzi ◽  
Sofia Misuraca ◽  
Chiara Di Maggio ◽  
Andrea Maugeri ◽  
...  

Introduction: The last decade has witnessed a resurgence of interest in the clinician’s subjectivity and its role in the diagnostic assessment. Integrating the criteriological, third-person approach to patient evaluation and psychiatric diagnosis with other approaches that take into account the patient’s subjective and intersubjective experience may bear particular importance in the assessment of very young patients. The ACSE (Assessment of Clinician’s Subjective Experience) instrument may provide a practical way to probe the intersubjective field of the clinical examination; however, its reliability and validity in child and adolescent psychiatrists seeing very young patients is still to be determined. Methods: Thirty-three clinicians and 278 first-contact patients aged 12–17 years participated in this study. The clinicians completed the ACSE instrument and the Brief Psychiatric Rating Scale after seeing the patient, and the Profile of Mood States (POMS) just before seeing the patient and immediately after. The ACSE was completed again for 45 patients over a short (1–4 days) retest interval. Results: All ACSE scales showed high internal consistency and moderate to high temporal stability. Also, they displayed meaningful correlations with the changes in conceptually related POMS scales during the clinical examination. Discussion: The findings corroborate and extend previous work on adult patients and suggest that the ACSE provides a valid and reliable measure of the clinician’s subjective experience in adolescent psychiatric practice, too. The instrument may prove to be useful to help identify patients in the early stages of psychosis, in whom subtle alterations of being with others may be the only detectable sign. Future studies are needed to determine the feasibility and usefulness of integrating the ACSE within current approaches to the evaluation of at-risk mental states.


2021 ◽  
Vol 11 (8) ◽  
pp. 402
Author(s):  
Linda Helene Sillat ◽  
Kairit Tammets ◽  
Mart Laanpere

The rapid increase in recent years in the number of digital competency frameworks, models, and strategies has led to growing calls to evaluate and assess digital competence. To support the process of digital competence assessment, it is consequently necessary to understand the different approaches and methods. This paper carries out a systematic literature review and analyzes the existing proposals and conceptions of digital competence assessment processes and methods in higher education, with the aim of better understanding the field of research. The review pursues three objectives: (i) describe the characteristics of digital competence assessment processes and methods in higher education; (ii) provide an overview of current trends; and, finally, (iii) identify challenges and issues in digital competence assessment in higher education with a focus on the reliability and validity of the proposed methods. On the basis of the findings, and as a result of the COVID-19 pandemic, digital competence assessment in higher education requires more attention, with a specific focus on instrument validity and reliability. Furthermore, it will be of great importance to further investigate the use of assessment tools to support systematic digital competence assessment processes. The analysis includes possible opportunities and ideas for future lines of work in digital competence evaluation in higher education.


2019 ◽  
Vol 5 (2) ◽  
pp. 37
Author(s):  
Faieza Chowdhury

In the current highly competitive global environment, teachers are under tremendous pressure to assess student learning in the most effective manner. Two tools that teachers commonly use to assess students in their classes are formative and summative assessment. In formative assessment, teachers gather data in order to improve student learning, and in summative assessment they use the data to assess students’ learning at the end of a specific course of study. The scores on both types of assessment should meet minimum standards of both reliability and validity. In this article we highlight the differences between the two forms of assessment, discuss the theories pertaining to summative and formative assessment, identify how educators at tertiary level in Bangladesh commonly use the two types of assessment, and report teachers’ opinions on whether the current assessment system is appropriate or needs further improvement. Findings from the study indicate that most teachers have an incomplete and unharmonious understanding of assessment, often failing to clearly distinguish between formative and summative assessments.


2013 ◽  
Vol 2013 ◽  
pp. 1-16 ◽  
Author(s):  
Ruth A. Anderson ◽  
Donde Plowman ◽  
Kirsten Corazzini ◽  
Pi-Ching Hsieh ◽  
Hui Fang Su ◽  
...  

Objectives. To (1) describe participation in decision-making as a systems-level property of complex adaptive systems and (2) present empirical evidence of reliability and validity of a corresponding measure. Method. Study 1 was a mail survey of a single respondent (administrators or directors of nursing) in each of 197 nursing homes. Study 2 was a field study using a random, proportionally stratified sampling procedure that included 195 organizations with 3,968 respondents. Analysis. In Study 1, we analyzed the data to reduce the number of scale items and establish initial reliability and validity. In Study 2, we strengthened the psychometric test using a large sample. Results. Results demonstrated validity and reliability of the participation in decision-making instrument (PDMI) while measuring participation of workers in two distinct job categories (RNs and CNAs). We established the reliability of item scores aggregated at the organizational level. We established validity of the multidimensional properties using convergent and discriminant validity and confirmatory factor analysis. Conclusions. Participation in decision making, when modeled as a systems-level property of an organization, has multiple dimensions and is more complex than is being traditionally measured. Managers can use this model to form decision teams that maximize the depth and breadth of expertise needed and to foster connection among them.

