Similarity measures in automated essay scoring systems: A ten-year review

Author(s):
Vidasha Ramnarain-Seetohul, Vandana Bassoo, Yasmine Rosunally

Author(s):
Dougal Hutchison

This chapter gives a relatively non-technical introduction to computer programs for marking of essays, generally known as Automated Essay Scoring (AES) systems. It identifies four stages in the process, which may be distinguished as training, summarising mechanical and structural aspects, describing content, and scoring, and describes how these are carried out in a number of commercially available programs. It considers how the validity of the process may be assessed, and reviews some of the evidence on how successful they are. It also discusses some of the ways in which they may fall down and describes some research investigating this. The chapter concludes with a discussion of possible future developments, and offers a number of searching questions for administrators considering the possibility of introducing AES in their own schools.
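The four stages the chapter identifies map naturally onto a small scoring pipeline. The sketch below is a hypothetical illustration only, assuming simple surface features, a TF-IDF content description, and a linear regression scorer; it is not the method of any particular commercial system described in the chapter.

    # Hypothetical four-stage AES sketch: training, mechanical/structural
    # features, content description, and scoring. Feature choices are
    # illustrative assumptions, not taken from the chapter.
    import re
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LinearRegression

    def mechanical_features(essay):
        """Summarise mechanical/structural aspects: word count, sentence count, average word length."""
        words = essay.split()
        sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
        avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
        return [len(words), len(sentences), avg_word_len]

    class SimpleAES:
        def __init__(self):
            self.vectorizer = TfidfVectorizer(max_features=200)  # describes content
            self.model = LinearRegression()                      # produces the score

        def fit(self, essays, human_scores):
            """Training stage: learn weights from essays already scored by humans."""
            content = self.vectorizer.fit_transform(essays).toarray()
            X = [mechanical_features(e) + list(c) for e, c in zip(essays, content)]
            self.model.fit(X, human_scores)

        def score(self, essay):
            """Scoring stage: combine mechanical and content features for a new essay."""
            content = self.vectorizer.transform([essay]).toarray()[0]
            return float(self.model.predict([mechanical_features(essay) + list(content)])[0])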


2019, Vol. 5, pp. e208
Author(s):
Mohamed Abdellatif Hussein, Hesham Hassan, Mohammad Nassef

Background: Writing composition is a significant factor in measuring test-takers’ ability in any language exam. However, the assessment (scoring) of these writing compositions or essays is a very challenging process in terms of reliability and time. The demand for objective and quick scores has created a need for computer systems that can automatically grade essay questions targeting specific prompts. Automated Essay Scoring (AES) systems address the challenges of scoring writing tasks by using Natural Language Processing (NLP) and machine learning techniques. The purpose of this paper is to review the literature on AES systems used for grading essay questions.

Methodology: We reviewed the existing literature using Google Scholar, EBSCO and ERIC, searching for the terms “AES”, “Automated Essay Scoring”, “Automated Essay Grading”, or “Automatic Essay” for essays written in English. Two categories were identified: handcrafted-feature and automatically featured AES systems. Systems in the former category are closely tied to the quality of the designed features. Systems in the latter category, by contrast, learn the features and the relations between an essay and its score automatically, without any handcrafted features. We reviewed the systems of both categories in terms of their primary focus, the technique(s) used, the need for training data, instructional application (feedback system), and the correlation between e-scores and human scores. The paper includes three main sections. First, we present a structured literature review of the available handcrafted-feature AES systems. Second, we present a structured literature review of the available automatic-featuring AES systems. Finally, we draw a set of discussions and conclusions.

Results: AES models were found to utilize a broad range of manually tuned shallow and deep linguistic features. AES systems have clear strengths: they reduce labor-intensive marking activities, apply scoring criteria consistently, and keep scoring objective. Although many techniques have been implemented to improve AES systems, three primary challenges remain: the lack of a human rater’s personal judgment, the potential for the systems to be deceived into giving an essay a lower or higher score than it deserves, and a limited ability to assess the creativity of ideas and propositions and to evaluate their practicality. So far, most techniques have addressed only the first two challenges.
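Since the review compares systems partly by the correlation between e-scores and human scores, the following minimal sketch shows how such agreement is commonly quantified, here with Pearson’s r and quadratically weighted kappa; the score lists are invented for illustration, and the reviewed systems may report other statistics.

    # Illustrative agreement check between machine (e-) scores and human scores.
    # Both metrics are standard in AES evaluation; the data are made up.
    from scipy.stats import pearsonr
    from sklearn.metrics import cohen_kappa_score

    human_scores   = [4, 3, 5, 2, 4, 3, 5, 1]   # human rubric scores (invented)
    machine_scores = [4, 3, 4, 2, 5, 3, 5, 2]   # corresponding e-scores (invented)

    r, _ = pearsonr(human_scores, machine_scores)
    qwk = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
    print(f"Pearson r = {r:.3f}, quadratic weighted kappa = {qwk:.3f}")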


2019, Vol. 58 (4), pp. 771-790
Author(s):
Leyi Qian, Yali Zhao, Yan Cheng

Automated writing scoring can provide not only holistic scores but also instant, corrective feedback on L2 learners’ writing quality, and its use has been increasing throughout China and internationally. Given these advantages, the past several years have witnessed the emergence and growth of writing evaluation products in China. To the best of our knowledge, no previous studies have examined the validity of China’s automated essay scoring systems. Drawing on the four major categories of the argument-based validity framework proposed by Kane (scoring, generalization, extrapolation, and implication), this article evaluates the performance of one of China’s automated essay scoring systems, iWrite, against human scores. The results show that iWrite fails to be a valid tool for assessing L2 writing and predicting human scores. Therefore, iWrite should currently be restricted to nonconsequential uses and cannot be employed as an alternative to, or a substitute for, human raters.
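A scoring-inference check of the kind this article performs is often reported as exact and adjacent agreement between machine and human scores, alongside correlations. The sketch below is a hypothetical illustration with invented data; it is not the article’s actual analysis.

    # Exact and adjacent (within one point) agreement between human and
    # machine scores; the scores below are invented for demonstration.
    def agreement_rates(human, machine, tolerance=1):
        pairs = list(zip(human, machine))
        exact = sum(h == m for h, m in pairs) / len(pairs)
        adjacent = sum(abs(h - m) <= tolerance for h, m in pairs) / len(pairs)
        return exact, adjacent

    human_scores   = [12, 9, 14, 10, 8, 13]   # invented 15-point scale scores
    machine_scores = [11, 9, 12, 10, 9, 14]

    exact, adjacent = agreement_rates(human_scores, machine_scores)
    print(f"exact = {exact:.2f}, adjacent (within 1 point) = {adjacent:.2f}")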


2018
Author(s):
Arab World English Journal, Beata Lewis Sevcikova

In the last 30 years, numerous scholars have described possible changes in the marking of writing assignments. This paper reflects those developments as it charts the paths recently taken in the field, evaluates automated and human essay scoring systems in academic environments, and analyzes the implications of both. In recent years, the ways and opportunities for giving feedback have changed as computer programs have become more widely used in assessing students’ writing. Numerous researchers have studied computerized feedback and its potential, analyzing problems such as the quality, validity, and reliability of this type of feedback. This critical review examines two major types of academic writing support. The objective of the study, based on the literature review, is to examine the potential support that human and automated proofreaders offer for teaching and learning purposes.

