Validation Evidence from using Generalizability Theory in a Basic-Science Course

2021 ◽  
Vol 12 (1) ◽  
pp. 16
Author(s):  
Michael Peeters ◽  
M Kenneth Cor ◽  
Sai Boddu ◽  
Jerry Nesamony

Description of the Problem: Reliability is critical validation evidence on which to base high-stakes decision-making. Often, one exam in a didactic course is not acceptably reliable on its own. But how much might multiple exams add when combined? The Innovation: To improve validation evidence for high-stakes decision-making, Generalizability Theory (G-Theory) can combine reliabilities from multiple exams into one composite-reliability (G_String IV software). Further, G-Theory decision-studies can illustrate changes in course-grade reliability, depending on the number of exams and exam-items. Critical Analysis: 101 first-year PharmD students took two midterm-exams and one final-exam in a pharmaceutics course. Individually, Exam1 had 50 MCQ (KR-20=0.69), Exam2 had 43 MCQ (KR-20=0.65), and Exam3 had 67 MCQ (KR-20=0.67). After combining exam occasions using G-Theory, the composite-reliability was 0.71 for overall course-grades, better than any exam alone. Remarkably, with more exam occasions, fewer items per exam and fewer items over all exams were needed to obtain an acceptable composite-reliability. Acceptable reliability could be achieved with different combinations of the number of MCQs on each exam and the number of exam occasions. Implications: G-Theory provided reliability evidence, which is critical validation evidence for high-stakes decision-making. Final course-grades appeared quite reliable after combining multiple course exams, though this reliability could and should be improved. Notably, more exam occasions allowed fewer items per exam and fewer items over all the exams. Thus, one added benefit of more exam occasions is that educators need to develop fewer items per exam and fewer items over all exams.
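For readers unfamiliar with the KR-20 values quoted above, the following is a minimal sketch of how a single-exam KR-20 could be computed from a students-by-items matrix of 0/1 scores. It is not the authors' G-Theory analysis (they report using G_String IV). The function names and the toy data are assumptions made for illustration, and the Spearman-Brown projection is shown only as a classical, single-exam analogue of asking how reliability changes with test length.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a students x items matrix of 0/1 item scores.

    Uses the population variance of total scores so that it is
    consistent with the item variances p * (1 - p).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students  # proportion correct
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


def spearman_brown(reliability, length_factor):
    """Projected reliability if the test were length_factor times as long."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical toy data: 4 students, 3 items (not from the study).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy), 2))                # 0.75
print(round(spearman_brown(0.69, 2), 2))  # 0.82
```

The second call takes the reported Exam1 KR-20 of 0.69 and projects a doubled exam length under classical assumptions. A G-Theory decision-study addresses a related question while also modelling exam occasions, so its estimates need not match this projection.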

2021 ◽  
Vol 12 (1) ◽  
pp. 17
Author(s):  
Michael Peeters ◽  
M Kenneth Cor ◽  
Erik Maki

Description of the Problem: High-stakes decision-making should have sound validation evidence, and reliability is vital to this. A short exam may not be very reliable on its own within didactic courses, so supplementing it with quizzes might help. But how much? This study’s objective was to understand how much reliability (for the overall module-grades) could be gained by adding quiz data to traditional exam data in a clinical-science module. The Innovation: In didactic coursework, quizzes are a common instructional strategy. However, individual contexts/instructors can vary quiz use formatively and/or summatively. Second-year PharmD students took a clinical-science course, wherein a 5-week module focused on cardiovascular therapeutics. Generalizability Theory (G-Theory) combined seven quizzes leading to an exam into one module-level reliability, based on a model where students were crossed with items nested in eight fixed testing occasions (mGENOVA used). Furthermore, G-Theory decision-studies were planned to illustrate changes in module-grade reliability, where the number of quiz-items and relative-weighting of quizzes were altered. Critical Analysis: One hundred students took seven quizzes and one exam. Individually, the exam had 32 multiple-choice questions (MCQ) (KR-20 reliability=0.67), while quizzes had a total of 50 MCQ (5-9 MCQ each), with most individual quiz KR-20s less than or equal to 0.54. After combining the quizzes and exam using G-Theory, estimated reliability of module-grades was 0.73, an improvement over the exam alone. Doubling the quiz-weight, from the syllabus’ 18% quizzes and 82% exam, increased the composite-reliability of module-grades to 0.77. Reliability of 0.80 was achieved with equal-weight for quizzes and exam. Next Steps: As expected, more items led to higher reliability. However, using quizzes predominantly formatively had little impact on reliability, while using quizzes more summatively (i.e., increasing their relative-weight in module-grade) improved reliability further. Thus, depending on use, quizzes can add to a course’s rigor.
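The effect of relative weighting on a composite grade can be illustrated with the classical formula for the reliability of a weighted composite, which assumes uncorrelated measurement errors across components. This is only a sketch under that assumption and is not the multivariate G-Theory model the authors fitted in mGENOVA. The function name and every input value below except the exam KR-20 of 0.67 are hypothetical.

```python
def composite_reliability(weights, variances, reliabilities, covariances):
    """Reliability of a weighted sum of components (classical formula).

    covariances[i][j] is the observed-score covariance between
    components i and j; its diagonal entries are ignored and the
    values in variances are used instead.
    """
    k = len(weights)
    composite_var = 0.0
    for i in range(k):
        for j in range(k):
            cov = variances[i] if i == j else covariances[i][j]
            composite_var += weights[i] * weights[j] * cov
    # Error variance of each component is variance * (1 - reliability).
    error_var = sum(
        w * w * v * (1 - r)
        for w, v, r in zip(weights, variances, reliabilities)
    )
    return 1 - error_var / composite_var


# Hypothetical inputs: standardized scores (variance 1), an assumed
# quiz-total reliability of 0.70 and an assumed quiz-exam covariance of 0.5.
variances = [1.0, 1.0]        # [quizzes, exam]
reliabilities = [0.70, 0.67]  # quiz value is assumed; exam value is the reported KR-20
covariances = [[1.0, 0.5], [0.5, 1.0]]

for quiz_weight in (0.18, 0.36, 0.50):
    weights = [quiz_weight, 1 - quiz_weight]
    value = composite_reliability(weights, variances, reliabilities, covariances)
    print(quiz_weight, round(value, 3))
```

Because the quiz reliability and the covariance are assumed, the printed values illustrate only the direction of the weighting effect and should not be read as a reproduction of the reported results.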


2010 ◽  
Vol 24 (1) ◽  
pp. 30-34 ◽  
Author(s):  
Marc P. McRae

Purpose: Organic chemistry has been shown to correlate with academic success in the preclinical years of medicine, dentistry, and graduate physiology. The purpose of this study is to examine the relationship between undergraduate organic chemistry grades and first-semester biochemistry grades at a Midwest chiropractic doctoral program. Methods: Students enrolled in a first-semester biochemistry course who had completed the prerequisite courses in organic chemistry offered at this same institution were entered into the study. The total grade for each of the three courses was calculated using the midterm and final exam raw scores with a weighting of 50% each. Analysis consisted of obtaining correlation coefficients between the total grades of organic 1 with biochemistry and organic 2 with biochemistry. Using the biochemistry total grade, the students were divided into quartiles and course grades for both organic chemistry 1 and 2 were calculated. Results: For the 109 students in the study, the correlation coefficients between biochemistry and organic chemistry 1 and between biochemistry and organic chemistry 2 were r = 0.744 and r = 0.725, respectively. Organic chemistry grades for students in the first and fourth biochemistry quartiles were 63.2% and 86.9% for organic chemistry 1 (p < .001) and 60.9% and 79.4% for organic chemistry 2 (p < .001). Conclusion: This study shows that organic chemistry can be used as an indicator of future academic success in a chiropractic biochemistry course. Knowledge of such a relationship could prove useful to identify students who may potentially run into academic difficulty with first-year biochemistry.
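The grade calculation and correlation described in the Methods can be sketched as follows. This is a minimal illustration, not the author's analysis script: the function names and the four-student dataset are hypothetical, and the only details taken from the abstract are the 50/50 weighting of midterm and final scores and the use of a correlation coefficient.

```python
from math import sqrt


def total_grade(midterm, final):
    """Course total with midterm and final weighted 50% each."""
    return 0.5 * midterm + 0.5 * final


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical (midterm, final) percentages for four students.
organic_scores = [(55, 65), (68, 72), (78, 82), (88, 92)]
biochem_scores = [(60, 70), (70, 74), (84, 86), (86, 90)]

organic = [total_grade(m, f) for m, f in organic_scores]
biochem = [total_grade(m, f) for m, f in biochem_scores]
print(round(pearson_r(organic, biochem), 3))
```

With real data, the same totals could then be sorted by biochemistry grade and split into four equal groups to compare organic chemistry grades across quartiles.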


Author(s):  
Carol Hulls ◽  
Chris Rennick

In the first year programming course given to ME and MTE students at uWaterloo, four hours of traditional classroom instruction have been replaced with a series of short online mini-lectures that deliver some of the basic facts necessary to be able to code programs. The students’ comprehension of this content is assessed online by quizzes and on the midterm exam. This approach was used in a course which was not otherwise delivered online. The goal was to front-load the course to make space for a design project later in the term. The online mini-lectures were designed to be “lecture-time neutral”. The accelerated start of term allowed threshold concepts to show up on assignments a week earlier than with the traditional approach, giving students an additional week of practice with these topics. This led to noticeable gains in understanding on the final exam. Survey data was collected, and focus groups were run, to capture student feedback on the approach; additionally, course grades were analyzed to assess impact on student knowledge of course material.


2019 ◽  
Vol 29 (4) ◽  
pp. 555-578 ◽  
Author(s):  
John Zerilli ◽  
Alistair Knott ◽  
James Maclaurin ◽  
Colin Gavaghan

The danger of human operators devolving responsibility to machines and failing to detect cases where they fail has been recognised for many years by industrial psychologists and engineers studying the human operators of complex machines. We call it “the control problem”, understood as the tendency of the human within a human–machine control loop to become complacent, over-reliant or unduly diffident when faced with the outputs of a reliable autonomous system. While the control problem has been investigated for some time, up to this point its manifestation in machine learning contexts has not received serious attention. This paper aims to fill that gap. We argue that, except in certain special circumstances, algorithmic decision tools should not be used in high-stakes or safety-critical decisions unless the systems concerned are significantly “better than human” in the relevant domain or subdomain of decision-making. More concretely, we recommend three strategies to address the control problem, the most promising of which involves a complementary (and potentially dynamic) coupling between highly proficient algorithmic tools and human agents working alongside one another. We also identify six key principles which all such human–machine systems should reflect in their design. These can serve as a framework both for assessing the viability of any such human–machine system as well as guiding the design and implementation of such systems generally.


2015 ◽  
Vol 22 (02) ◽  
pp. 159-162
Author(s):  
Naeem Yaqoob ◽  
Saeed Akram Bhatti ◽  
Muhammad Shahid Javed

One of the criteria for admission to the Annual Professional Exam for medical students in Pakistan is to have 75% attendance during the session, and the other is to pass the send-up exam with 50% marks. Objectives: To assess the usefulness of send-up exams in predicting the annual marks by comparing their results with annual University examination results for preclinical medical students. Study Design: A cross-sectional analytical study. Methodology: A total of 173 preclinical students of 1st and 2nd year MBBS passing the First Professional Exam in first attempt were included in the study. The send-up result of each student during the year was entered as a percentage and compared with the percentage total marks of the same students in their 1st professional exam conducted by the University. Data maintained by the Physiology Department was entered and analysed in SPSS 21. Descriptive statistics in the form of numbers and percentages were used, and data were further analyzed using Pearson Correlation and Paired T Test of Significance. A p value of < 0.05 was considered significant. Results: A total of 173 students (81 from 1st Year and 92 from 2nd Year) who had passed the annual exam in first attempt were included in the study. Of these, 132 (76.3%) were females and 41 (23.7%) males. All students were within the age group of 18-24 years, mean age being 21.06 years. Mean send-up score in the subject of Physiology was 57.37 (Range=33-78). Percentage total marks in all subjects (Anatomy, Physiology, Biochemistry) in the annual exam had a mean of 69.46 (Range=55-84). Send-up result in percentage was directly assessed against the percentage marks obtained in the annual exam for each student. The send-up marks were significantly related to the marks in the final exam (p=0.01). The strength of association was the same as that for average test marks with annual marks. Girls performed better than boys during the send-up (Mean 58.35 vs 54.21) as well as in the annual exam (Mean 70.12 vs 67.33). There was also a difference amongst different classes, with 2nd year performing better than first year in both send-up (Mean 58.60 vs 55.98) and annual exams (Median 70.83 vs 67.91). Conclusions: Send-up results may be good predictors of academic performance in the professional examination in preclinical years in a medical college. Female students perform better than their counterparts during both send-up and annual exams.


2008 ◽  
Vol 63 (3) ◽  
pp. 607-608
Author(s):  
Csaba Pléh

Erős Ferenc, Lénárd Kata and Bókay Antal (eds.): Typus Budapestiensis. Tanulmányok a pszichoanalízis budapesti iskolájának történetéről és hatásáról. Thalassa, Budapest, 2008, 447 pp.
Hargittai István: Doktor DNS. Őszinte beszélgetések James D. Watsonnal. Vince Kiadó, Budapest, 2008, 223 pp.
Kutrovátz Gábor, Láng Benedek and Zemplén Gábor: A tudomány határa. Typotex, Budapest, 2008, 376 pp.
Engel, C. and Singer, W. (eds.): Better than conscious? Decision making, the human mind, and implications for institutions. MIT Press, Cambridge, 2008, xiv + 449 pp.
Kondor, Zsuzsanna: Embedded thinking. Multimedia and the new rationality. Peter Lang, Frankfurt am Main, 2008, xi + 169 pp.
Síklaki István (ed.): Szóbeli befolyásolás. I–II. Typotex, Budapest,


2020 ◽  
Vol 13 (5) ◽  
pp. 884-892
Author(s):  
Sartaj Ahmad ◽  
Ashutosh Gupta ◽  
Neeraj Kumar Gupta

Background: Online shopping has become popular, and shoppers routinely consult feedback or reviews before buying. These reviews help customers decide whether to buy a product or use a service. In a country like India, online shopping is growing rapidly as awareness and use of the internet increase day by day. As a result, the numbers of customers and of their reviews are also increasing, which makes it impractical to read all reviews manually. A computerized mechanism is therefore needed that gives customers a summary without their spending time reading every review. Besides the large number of reviews, another problem is that reviews are unstructured. Objective: In this paper, we design, implement and compare two algorithms against a manual approach for cross-domain product reviews. Methods: A lexicon-based model is used, and different types of reviews are tested and analyzed to check the performance of these algorithms. Results: An algorithm based on opinions and an algorithm based on feature-based opinions are designed, implemented, applied and compared with the manual results; algorithm #2 performs better than algorithm #1 and comes close to the manual results. Conclusion: Algorithm #2 performs better on reviews of different products; it has yet to be applied to reviews of other products to extend its scope. Ultimately, it will help automate the existing manual process.
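The abstract does not describe either algorithm in detail, so the following is only a generic sketch of what a lexicon-based polarity scorer can look like. It is not the authors' algorithm #1 or #2. The word lists, the function name and the sample reviews are all hypothetical.

```python
# Hypothetical, deliberately tiny opinion lexicon.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def review_polarity(review):
    """Label a review by counting lexicon hits (positive minus negative)."""
    words = [w.strip(".,!?;:").lower() for w in review.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(review_polarity("Great phone, excellent battery!"))  # positive
print(review_polarity("Terrible screen and poor sound."))  # negative
print(review_polarity("It arrived on Monday."))            # neutral
```

A feature-based variant would additionally attach each opinion word to the product feature it describes (battery, screen, and so on) before aggregating, which this sketch does not attempt.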


2019 ◽  
Author(s):  
Andrew Mwila

BACKGROUND The Copperbelt University is the second public university in Zambia. The School of Medicine has four major programs, namely Bachelor of Medicine and Surgery, Bachelor of Dental Surgery, Bachelor of Clinical Medicine and Bachelor of Biomedical Sciences. The Copperbelt University School of Medicine runs a five-year training program for both the BDS and the MB ChB programs. Students are admitted into the Medical school after successfully completing their first year at the Main campus in the School of Natural Sciences with an average of 4 B grades or higher (B grade is a mark of 65 to 74%). OBJECTIVE The study was done to determine the association between admission criteria and academic performance among preclinical students. Hence, the study compares the academic performance among preclinical students admitted into the Bachelor of Dental Surgery and Bachelor of Medicine and Surgery at the Copperbelt University School of Medicine. METHODS This is a retrospective cohort study conducted at the Michael Chilufya Sata School of Medicine campus. A pilot study was conducted with 30 BDS and 30 MB ChB students, and the obtained information helped determine the sample size. SPSS was used to analyze the data. The study period lasted approximately 7 weeks at a cost of K1621. RESULTS In 2014, there was an improvement in average performance between 2nd and 3rd year for each program. An average score of 15.4 (SD 4.2) was obtained in 3rd year compared to 12.8 (SD 4.9) in 2nd year (p<0.001). Meanwhile, the 3rd-year MB ChB mean score was 12.6 (SD 3.7) compared to 10.7 (SD 3.6) in 2nd year (p<0.05). However, in 2016, 3rd-year mean scores were lower than 2nd-year scores in both programs (MB ChB 2nd-year mean score was 12.0 (SD 4.3) compared to a 3rd-year mean score of 9.5 (SD 4.5), p<0.001; BDS 2nd-year mean score was 10.6 (SD 4.0) compared to a 3rd-year mean score of 8.2 (SD 3.4), p<0.01). On average, MB ChB students performed better than BDS students in all the years (p<0.05), except in 2016 when the results were comparable. CONCLUSIONS Results from the study show that entry criteria correlate with academic performance, as students admitted with higher grades perform much better than those with lower grades.


Author(s):  
Bahador Bahrami

Evidence for and against the idea that “two heads are better than one” is abundant. This chapter considers the contextual conditions and social norms that predict madness or wisdom of crowds to identify the adaptive value of collective decision-making beyond increased accuracy. Similarity of competence among members of a collective impacts collective accuracy, but interacting individuals often seem to operate under the assumption that they are equally competent even when direct evidence suggests the opposite and dyadic performance suffers. Cross-cultural data from Iran, China, and Denmark support this assumption of similarity (i.e., equality bias) as a sensible heuristic that works most of the time and simplifies social interaction. Crowds often trade off accuracy for other collective benefits such as diffusion of responsibility and reduction of regret. Consequently, two heads are sometimes better than one, but no-one holds the collective accountable, not even for the most disastrous of outcomes.

