Automatic Essay Scoring System Using N-Gram and Cosine Similarity for Gamification Based E-Learning

Author(s):  
M. Ali Fauzi ◽  
Djoko Cahyo Utomo ◽  
Budi Darma Setiawan ◽  
Eko Sakti Pramukantoro
2018 ◽  
Vol 7 (4.44) ◽  
pp. 156
Author(s):  
Faisal Rahutomo ◽  
Trisna Ari Roshinta ◽  
Erfan Rohadi ◽  
Indrazno Siradjuddin ◽  
Rudy Ariyanto ◽  
...  

This paper presents open problems in Indonesian automated essay scoring. A previous study compared several similarity metrics for automated essay scoring in Indonesian: Cosine Similarity, Euclidean Distance, and Jaccard. The research used about 2,000 texts obtained from 50 students who answered 40 questions on politics, sports, lifestyle, and technology. The study also evaluated the effect of stemming on system performance; the difference between each method with and without stemming is around 4-9%. The results show that Jaccard is the best metric both with and without stemming, giving the lowest percentage error of the three. The politics category has a higher average similarity score than lifestyle, sports, and technology. With stemming, the percentage error of Jaccard is 52.31%, of Cosine Similarity 59.49%, and of Euclidean Distance 332.90%. Without stemming, Jaccard is again the best: its percentage error is 56.05%, against 57.99% for Cosine Similarity and 339.41% for Euclidean Distance. However, these percentage errors, all above 50%, are too high for a functional essay grading system. This paper therefore explores several open problems in this area. The openly available dataset can be used to develop approaches that improve on the standard similarity metrics, ranging across feature extraction, similarity metrics, learning algorithms, implementation environment, and performance evaluation.
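As a rough illustration of the three metrics compared in the abstract above, the following sketch (an assumption, not the authors' code) computes Cosine Similarity, Jaccard, and Euclidean Distance between term-frequency vectors of a reference answer and a student answer. The example sentences and the plain whitespace tokenizer are placeholders; the original work used Indonesian essays with an optional Indonesian stemmer applied before comparison.

import math
from collections import Counter

def tokenize(text):
    # placeholder tokenizer; the study worked on Indonesian text, optionally stemmed
    return text.lower().split()

def cosine_similarity(a, b):
    # dot product of term-frequency vectors divided by the product of their norms
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def jaccard_similarity(a, b):
    # overlap of term sets divided by their union
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0

def euclidean_distance(a, b):
    # straight-line distance between the two term-frequency vectors
    terms = set(a) | set(b)
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in terms))

reference = Counter(tokenize("the commission announced the election results on monday"))
answer = Counter(tokenize("on monday the election results were announced"))

print("cosine   :", cosine_similarity(reference, answer))
print("jaccard  :", jaccard_similarity(reference, answer))
print("euclidean:", euclidean_distance(reference, answer))

Higher cosine and Jaccard values indicate closer answers, while Euclidean works in the opposite direction as a distance; the study's percentage error then measures how far such raw scores deviate from human grades.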


2018 ◽  
Vol 164 ◽  
pp. 01037
Author(s):  
Rico Setiadi Citawan ◽  
Viny Christanti Mawardi ◽  
Bagus Mulyawan

In education, an e-learning system can be used to support the educational process and is commonly used by educators to evaluate learners' outcomes. In this evaluation process, the question types most often used are multiple choice and short answer. Essay questions are rarely used because grading them is subjective and time consuming. This design aims to create an automatic essay scoring feature for an e-learning system to support the learning process. The method used for automatic essay scoring is Latent Semantic Analysis (LSA) with n-gram features. Evaluation of the automatic essay scoring feature showed average accuracies of 78.65%, 58.89%, 14.91%, 71.37%, and 64.49% for LSA with unigram, bigram, trigram, unigram + bigram, and unigram + bigram + trigram features, respectively.
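The following is a minimal sketch of the LSA-with-n-gram-features idea described above, under the assumption of a scikit-learn pipeline; the library choice, the reference texts, and the scoring rule are illustrative, not the paper's implementation. Word n-grams are turned into a TF-IDF matrix, reduced with truncated SVD, and a student answer is scored by cosine similarity against reference answers in the latent space.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

reference_answers = [
    "e-learning systems support the evaluation of learning outcomes",
    "essay questions measure understanding but are costly to grade manually",
    "latent semantic analysis maps documents into a low-dimensional space",
]
student_answer = ["grading essays by hand is expensive, so systems automate it"]

# word n-gram features; (1, 2) corresponds to the unigram + bigram setting
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X_ref = vectorizer.fit_transform(reference_answers)

# keep fewer latent dimensions than documents
svd = TruncatedSVD(n_components=2, random_state=0)
Z_ref = svd.fit_transform(X_ref)
Z_student = svd.transform(vectorizer.transform(student_answer))

# take the highest similarity against any reference answer as the raw score
scores = cosine_similarity(Z_student, Z_ref)
print("raw LSA score:", scores.max())

In a real grader the raw similarity would still need to be mapped onto the teacher's grading scale before reporting accuracy figures such as those above.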


2014 ◽  
Vol 22 (2) ◽  
pp. 291-319 ◽  
Author(s):  
SHUDONG HAO ◽  
YANYAN XU ◽  
DENGFENG KE ◽  
KAILE SU ◽  
HENGLI PENG

Writing in language tests is regarded as an important indicator for assessing language skills of test takers. As Chinese language tests become popular, scoring a large number of essays becomes a heavy and expensive task for the organizers of these tests. In the past several years, some efforts have been made to develop automated simplified Chinese essay scoring systems, reducing both costs and evaluation time. In this paper, we introduce a system called SCESS (automated Simplified Chinese Essay Scoring System) based on Weighted Finite State Automata (WFSA) and using Incremental Latent Semantic Analysis (ILSA) to deal with a large number of essays. First, SCESS uses an n-gram language model to construct a WFSA to perform text pre-processing. At this stage, the system integrates a Confusing-Character Table, a Part-Of-Speech Table, beam search and heuristic search to perform automated word segmentation and correction of essays. Experimental results show that this pre-processing procedure is effective, with a Recall Rate of 88.50%, a Detection Precision of 92.31% and a Correction Precision of 88.46%. After text pre-processing, SCESS uses ILSA to perform automated essay scoring. We have carried out experiments to compare the ILSA method with the traditional LSA method on the corpora of essays from the MHK test (the Chinese proficiency test for minorities). Experimental results indicate that ILSA has a significant advantage over LSA, in terms of both running time and memory usage. Furthermore, experimental results also show that SCESS is quite effective, with a scoring performance of 89.50%.
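The incremental aspect of such a system can be illustrated with the common folding-in technique for LSA; this is a sketch of the general approach, an assumption rather than the SCESS implementation of ILSA. The expensive SVD is computed once on the existing essay corpus, and each newly submitted essay is projected into the stored latent space instead of triggering a full recomputation, which is what saves running time and memory as the corpus grows.

import numpy as np

rng = np.random.default_rng(0)

# term-document matrix for the essays already processed (terms x documents);
# random numbers stand in for real term frequencies
A = rng.random((500, 200))

# one-time (expensive) truncated SVD of the existing corpus
k = 50
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, S_k_inv = U[:, :k], np.diag(1.0 / s[:k])

def fold_in(new_doc_term_vector):
    # project a new essay into the k-dimensional LSA space: d = q^T U_k S_k^{-1}
    return new_doc_term_vector @ U_k @ S_k_inv

# a newly submitted essay, represented over the same 500-term vocabulary
q = rng.random(500)
d = fold_in(q)    # cheap: no SVD recomputation needed
print(d.shape)    # (50,) latent representation used for scoring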


2021 ◽  
Author(s):  
Jinghua Gao ◽  
Qichuan Yang ◽  
Yang Zhang ◽  
Liuxin Zhang ◽  
Siyun Wang
