Validating human and automated scoring of essays against “True” scores

Issues regarding tolerance and confidence intervals are discussed within the context of educational measurement, and conceptual distinctions are drawn between these two types of intervals. Points are raised about the advantages of tolerance intervals when the focus is on a particular observed score rather than a particular examinee. Because tolerance intervals depend on strong true score models, a practical implication of the study is that true score tolerance intervals are fairly insensitive to differences in assumptions among the five models studied.

Elementary Teachers’ Perceptions of Automated Feedback and Automated Scoring: Transforming the Teaching and Learning of Writing Using Automated Writing Evaluation

Computers & Education ◽

10.1016/j.compedu.2021.104208 ◽

2021 ◽

pp. 104208

Author(s):

Joshua Wilson ◽

Cristina Ahrendt ◽

Emily A. Fudge ◽

Alexandria Raiche ◽

Gaysha Beard ◽

...

Keyword(s):

Elementary Teachers ◽

Teaching and Learning ◽

Automated Scoring ◽

Teachers’ Perceptions ◽

Writing Evaluation ◽

Automated Writing Evaluation ◽

Automated Feedback

Handbook of Automated Scoring: Theory into Practice. Duanli Yan, André A. Rupp and Peter W. Foltz. Chapman & Hall/CRC, 2020, 580 pages, £120, hardcover. ISBN: 978‐1‐1385‐7827‐2

International Statistical Review ◽

10.1111/insr.12455 ◽

2021 ◽

Author(s):

Magdalen Beiting‐Parrish ◽

Jay Verkuilen

Keyword(s):

Automated Scoring

High-throughput behavioral screen in C. elegans reveals Parkinson’s disease drug candidates

Communications Biology ◽

10.1038/s42003-021-01731-z ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Salman Sohrabi ◽

Danielle E. Mor ◽

Rachel Kaletsky ◽

William Keyes ◽

Coleen T. Murphy

Keyword(s):

Neural Network ◽

Parkinson's Disease ◽

High Throughput ◽

Potential Candidate ◽

Automated Scoring ◽

C. elegans ◽

Drug Candidates ◽

Branched Chain ◽

Approved Drugs

We recently linked branched-chain amino acid transferase 1 (BCAT1) dysfunction with the movement disorder Parkinson’s disease (PD), and found that RNAi-mediated knockdown of neuronal bcat-1 in C. elegans causes abnormal spasm-like ‘curling’ behavior with age. Here we report the development of a machine learning-based workflow and its application to the discovery of potentially new therapeutics for PD. In addition to simplifying quantification and maintaining a low data overhead, our simple segment-train-quantify platform enables fully automated scoring of image stills upon training of a convolutional neural network. We have trained a highly reliable neural network for the detection and classification of worm postures in order to carry out high-throughput curling analysis without the need for user intervention or post-inspection. In a proof-of-concept screen of 50 FDA-approved drugs, enasidenib, ethosuximide, metformin, and nitisinone were identified as candidates for potential late-in-life intervention in PD. These findings point to the utility of our high-throughput platform for automated scoring of worm postures and in particular, the discovery of potential candidate treatments for PD.

Automated scoring of fear-related behavior using EthoVision software

Journal of Neuroscience Methods ◽

10.1016/j.jneumeth.2008.12.021 ◽

2009 ◽

Vol 178 (2) ◽

pp. 323-326 ◽

Cited By ~ 38

Author(s):

Jon Pham ◽

Sara M. Cabrera ◽

Carles Sanchis-Segura ◽

Marcelo A. Wood

Keyword(s):

Automated Scoring

A Review of Strategies for Validating Computer-Automated Scoring

Applied Measurement in Education ◽

10.1207/s15324818ame1504_04 ◽

2002 ◽

Vol 15 (4) ◽

pp. 391-412 ◽

Cited By ~ 49

Author(s):

Yongwei Yang ◽

Chad W. Buckendahl ◽

Piotr J. Juszkiewicz ◽

Dennison S. Bhola

Keyword(s):

Automated Scoring

Automated scoring of rehabilitative tests with singular spectrum analysis

2015 23rd European Signal Processing Conference (EUSIPCO) ◽

10.1109/eusipco.2015.7362849 ◽

2015 ◽

Cited By ~ 5

Author(s):

Tracey K. M. Lee ◽

K. H. Leo ◽

Saeid Sanei ◽

Effie Chew

Keyword(s):

Spectrum Analysis ◽

Singular Spectrum Analysis ◽

Automated Scoring ◽

Singular Spectrum

Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability

Language Testing ◽

10.1177/0265532210364406 ◽

2010 ◽

Vol 27 (3) ◽

pp. 335-353 ◽

Cited By ~ 29

Author(s):

Sara Cushing Weigle

Keyword(s):

Validity Evidence ◽

Educational Testing ◽

Automated Scoring ◽

Self Assessment ◽

Writing Ability ◽

Testing Service ◽

Independent Writing ◽

Complex Skills ◽

TOEFL iBT ◽

Educational Testing Service

Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study approaches validity by comparing human and automated scores on responses to TOEFL® iBT Independent writing tasks with several non-test indicators of writing ability: student self-assessment, instructor assessment, and independent ratings of non-test writing samples. Automated scores were produced using e-rater®, developed by Educational Testing Service (ETS). Correlations between both human and e-rater scores and non-test indicators were moderate but consistent, providing criterion-related validity evidence for the use of e-rater along with human scores. The implications of the findings for the validity of automated scores are discussed.
