Automated Essay Scoring Using Transformer Models

Automated essay scoring (AES) is gaining increasing attention in the education sector as it significantly reduces the burden of manual scoring and allows ad hoc feedback for learners. Natural language processing based on machine learning has been shown to be particularly suitable for text classification and AES. While many machine-learning approaches for AES still rely on a bag of words (BOW) approach, we consider a transformer-based approach in this paper, compare its performance to a logistic regression model based on the BOW approach, and discuss their differences. The analysis is based on 2088 email responses to a problem-solving task that were manually labeled in terms of politeness. Both transformer models considered in the analysis outperformed the regression-based model without any hyperparameter tuning. We argue that, for AES tasks such as politeness classification, the transformer-based approach has significant advantages, while a BOW approach suffers from not taking word order into account and from reducing words to their stems. Further, we show how such models can help increase the accuracy of human raters, and we provide detailed instructions on how to implement transformer-based models for one’s own purposes.
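The word-order limitation of a BOW representation noted in the abstract can be illustrated with a minimal sketch. The two example sentences and the `bag_of_words` helper are hypothetical and are not taken from the paper; stemming is left out for brevity.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase, split on whitespace, and count tokens; word order is discarded."""
    return Counter(text.lower().split())


# Hypothetical sentences: identical words, opposite meaning.
first = "this is not rude it is polite"
second = "this is not polite it is rude"

# A BOW model receives exactly the same features for both sentences.
same_features = bag_of_words(first) == bag_of_words(second)
print(same_features)
```

Because both sentences map to the same token counts, any classifier built only on these counts must assign them the same label, which is the weakness a sequence-aware model avoids.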

Text Classification Algorithms: A Survey

Information ◽

10.3390/info10040150 ◽

2019 ◽

Vol 10 (4) ◽

pp. 150 ◽

Cited By ~ 93

Author(s):

Kowsari ◽

Jafari Meimandi ◽

Heidarysafa ◽

Mendu ◽

Barnes ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Language Processing ◽

Text Classification ◽

Classification Algorithms ◽

Learning Approaches ◽

Machine Learning Methods ◽

Linear Relationships ◽

Reduction Methods ◽

Complex Models

In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine learning approaches have achieved surpassing results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extractions, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application in real-world problems are discussed.

A Comparison of Traditional Machine Learning Approaches for Supervised Feedback Classification in Bahasa Indonesia

International Journal of New Media Technology ◽

10.31937/ijnmt.v1i1.1485 ◽

2020 ◽

Vol 7 (1) ◽

pp. 28-32

Author(s):

Andre Rusli ◽

Alethea Suryadibrata ◽

Samiaji Bintang Nusantara ◽

Julio Christian Young

Keyword(s):

Machine Learning ◽

Language Processing ◽

Text Classification ◽

Weighted Average ◽

Supervised Machine Learning ◽

Learning Approaches ◽

K Nearest Neighbors ◽

Machine Learning Classification ◽

Logistic Regression ◽

Learning Machine

The advancement of machine learning and natural language processing techniques holds essential opportunities to improve existing software engineering activities, including requirements engineering. Instead of manually reading all submitted user feedback to understand the evolving requirements of their product, developers could use the help of an automatic text classification program to reduce the required effort. Many supervised machine learning approaches have already been used in many fields of text classification and show promising results in terms of performance. This paper aims to implement NLP techniques for basic text preprocessing, followed by traditional (non-deep learning) machine learning classification algorithms: Logistic Regression, Decision Tree, Multinomial Naïve Bayes, K-Nearest Neighbors, Linear SVC, and Random Forest. Finally, the performance of each algorithm in classifying the feedback in our dataset into several categories is evaluated using three F1 score metrics: the macro-, micro-, and weighted-average F1 score. Results show that, generally, Logistic Regression is the most suitable classifier in most cases, followed by Linear SVC. However, the performance gap is not large, and with different configurations and requirements, other classifiers could perform equally well or even better.
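The three F1 averages named in the abstract differ only in how per-class results are combined. The sketch below is a plain-Python illustration under assumed data: the labels `bug`, `feature`, and `other` and the `f1_scores` helper are hypothetical and do not come from the paper.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro, micro, weighted) F1 for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per class
    per_class = {}
    tp_total = fp_total = fn_total = 0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
        tp_total, fp_total, fn_total = tp_total + tp, fp_total + fp, fn_total + fn
    macro = sum(per_class.values()) / len(labels)                # every class counts equally
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)  # every instance counts equally
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, micro, weighted


# Hypothetical feedback categories, for illustration only.
y_true = ["bug", "bug", "bug", "feature", "feature", "other"]
y_pred = ["bug", "bug", "feature", "feature", "other", "other"]
macro, micro, weighted = f1_scores(y_true, y_pred)
print(round(macro, 3), round(micro, 3), round(weighted, 3))
```

Macro averaging treats rare and frequent categories alike, while the weighted average scales each class by its support, which is why the three numbers can diverge on imbalanced feedback data.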

More efficient processes for creating automated essay scoring frameworks: A demonstration of two algorithms

Language Testing ◽

10.1177/0265532220937830 ◽

2020 ◽

pp. 026553222093783

Author(s):

Jinnie Shin ◽

Mark J. Gierl

Keyword(s):

Machine Learning ◽

Language Processing ◽

Model Development ◽

Weighted Kappa ◽

Neural Model ◽

Support Vector ◽

High Stakes ◽

Automated Essay Scoring ◽

Educational Assessments ◽

Essay Scoring

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness and the performance of two AES frameworks, each based on machine learning with deep language features, or complex language features, and deep neural algorithms. More specifically, support vector machines (SVMs) in conjunction with Coh-Metrix features were used for a traditional AES model development, and the convolutional neural networks (CNNs) approach was used for more contemporary deep-neural model development. Then, the strengths and weaknesses of the traditional and contemporary models under different circumstances (e.g., types of the rubric, length of the essay, and the essay type) were tested. The results were evaluated using the quadratic weighted kappa (QWK) score and compared with the agreement between the human raters. The results indicated that the CNNs model performs better, meaning that it produced more comparable results to the human raters than the Coh-Metrix + SVMs model. Moreover, the CNNs model also achieved state-of-the-art performance in most of the essay sets with a high average QWK score.
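The quadratic weighted kappa used as the evaluation metric penalizes disagreements by the squared distance between the two scores. The sketch below is a plain-Python version of the standard formula; the function name, the 1 to 4 score range, and the example scores are assumptions for illustration and are not data from the study.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two integer score lists, with squared-distance penalties."""
    n = max_score - min_score + 1
    total = len(rater_a)
    # Observed confusion matrix between the two raters.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    weighted_observed = weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total  # count expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical human and model scores on a 1-4 rubric.
human = [1, 2, 3, 4, 2, 3]
model = [1, 2, 3, 4, 3, 3]
qwk = quadratic_weighted_kappa(human, model, 1, 4)
print(round(qwk, 4))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; the sketch assumes at least two score levels and that the raters do not both give one identical constant score, since the denominator would then be zero.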

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

10.26434/chemrxiv.5513581.v1 ◽

2017 ◽

Author(s):

Sabrina Jaeger ◽

Simone Fulle ◽

Samo Turk

Keyword(s):

Machine Learning ◽

Language Processing ◽

Supervised Machine Learning ◽

Learning Approach ◽

Learning Approaches ◽

Unsupervised Machine Learning ◽

Feature Representations ◽

Machine Learning Approach ◽

The Individual ◽

Vector Representations

Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
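The encoding step described in the abstract, in which a compound vector is the sum of its substructure vectors, can be sketched as follows. The embedding values, the identifiers `sub_a` to `sub_c`, and the `encode_compound` helper are made up for illustration; this is not the Mol2vec package API, and real embeddings would come from the pre-trained model.

```python
def encode_compound(substructure_ids, embeddings):
    """Sum the embedding vectors of a compound's substructures."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for sub_id in substructure_ids:
        for k, value in enumerate(embeddings[sub_id]):
            total[k] += value
    return total


# Toy 3-dimensional embeddings (hypothetical values).
embeddings = {
    "sub_a": [0.5, -1.0, 0.25],
    "sub_b": [1.5, 0.5, 0.0],
    "sub_c": [-0.5, 2.0, 1.0],
}

# A compound is treated as a "sentence" of substructure identifiers;
# a substructure that occurs twice contributes twice to the sum.
compound = ["sub_a", "sub_b", "sub_a"]
vector = encode_compound(compound, embeddings)
print(vector)
```

The resulting fixed-length vector has the same dimensionality regardless of compound size, so it can be passed directly to a supervised model as a feature vector.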

A Comprehensive Study of Artificial Intelligence and Machine Learning Approaches in Confronting the Coronavirus (COVID-19) Pandemic

International Journal of Health Services ◽

10.1177/00207314211017469 ◽

2021 ◽

pp. 002073142110174

Author(s):

Md Mijanur Rahman ◽

Fatema Khatun ◽

Ashik Uzzaman ◽

Sadia Islam Sami ◽

Md Al-Amin Bhuiyan ◽

...

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Health Care ◽

Language Processing ◽

Probabilistic Models ◽

Health Care Systems ◽

Learning Approaches ◽

Care Systems ◽

Novel Coronavirus ◽

Comprehensive Study

The novel coronavirus disease (COVID-19) has spread over 219 countries of the globe as a pandemic, creating alarming impacts on health care, socioeconomic environments, and international relationships. The principal objective of the study is to provide the current technological aspects of artificial intelligence (AI) and other relevant technologies and their implications for confronting COVID-19 and preventing the pandemic’s dreadful effects. This article presents AI approaches that have significant contributions in the fields of health care, then highlights and categorizes their applications in confronting COVID-19, such as detection and diagnosis, data analysis and treatment procedures, research and drug development, social control and services, and the prediction of outbreaks. The study addresses the link between the technologies and the epidemics as well as the potential impacts of technology in health care with the introduction of machine learning and natural language processing tools. It is expected that this comprehensive study will support researchers in modeling health care systems and drive further studies in advanced technologies. Finally, we propose future directions in research and conclude that persuasive AI strategies, probabilistic models, and supervised learning are required to tackle future pandemic challenges.

Tagging terms in text

Terminology ◽

10.1075/term.21010.rig ◽

2022 ◽

Author(s):

Ayla Rigouts Terryn ◽

Véronique Hoste ◽

Els Lefever

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Conditional Random Fields ◽

Traditional Approach ◽

Learning Approaches ◽

Term Extraction ◽

The Neural Network ◽

Predicted Probability ◽

Automatic Term Extraction

As with many tasks in natural language processing, automatic term extraction (ATE) is increasingly approached as a machine learning problem. So far, most machine learning approaches to ATE broadly follow the traditional hybrid methodology, by first extracting a list of unique candidate terms, and classifying these candidates based on the predicted probability that they are valid terms. However, with the rise of neural networks and word embeddings, the next development in ATE might be towards sequential approaches, i.e., classifying each occurrence of each token within its original context. To test the validity of such approaches for ATE, two sequential methodologies were developed, evaluated, and compared: one feature-based conditional random fields classifier and one embedding-based recurrent neural network. An additional comparison was added with a machine learning interpretation of the traditional approach. All systems were trained and evaluated on identical data in multiple languages and domains to identify their respective strengths and weaknesses. The sequential methodologies were proven to be valid approaches to ATE, and the neural network even outperformed the more traditional approach. Interestingly, a combination of multiple approaches can outperform all of them separately, showing new ways to push the state-of-the-art in ATE.

Machine Learning Approaches for Bangla Statistical Machine Translation

Technical Challenges and Design Issues in Bangla Language Processing ◽

10.4018/978-1-4666-3970-6.ch004 ◽

2013 ◽

pp. 79-95

Author(s):

Maxim Roy

Keyword(s):

Machine Learning ◽

Active Learning ◽

Machine Translation ◽

Language Processing ◽

Statistical Machine Translation ◽

Low Density ◽

Learning Approaches ◽

Translation Quality ◽

Selection Strategies ◽

Translation Accuracy

Machine Translation (MT) from Bangla to English has recently become a priority task for the Bangla Natural Language Processing (NLP) community. Statistical Machine Translation (SMT) systems require a significant amount of bilingual data between language pairs to achieve significant translation accuracy. However, because Bangla is a low-density language, such resources are not available for it. In this chapter, the authors discuss how machine learning approaches can help to improve translation quality within an SMT system without requiring a huge increase in resources. They provide a novel semi-supervised learning and active learning framework for SMT, which utilizes both labeled and unlabeled data. The authors discuss sentence selection strategies in detail and perform detailed experimental evaluations on the sentence selection methods. In semi-supervised settings, the reversed model approach outperformed all other approaches for Bangla-English SMT, and in the active learning setting, the geometric 4-gram and geometric phrase sentence selection strategies proved most useful based on BLEU score results over baseline approaches. Overall, in this chapter, the authors demonstrate that for a low-density language like Bangla, these machine-learning approaches can improve translation quality.

Detection of Economy-Related Turkish Tweets Based on Machine Learning Approaches

10.4018/978-1-7998-8413-2.ch008 ◽

2022 ◽

pp. 171-195

Author(s):

Jale Bektaş

Keyword(s):

Machine Learning ◽

Text Mining ◽

Text Classification ◽

Integration Method ◽

Classification Problem ◽

Feature Representation ◽

Learning Approaches ◽

Machine Learning Methods ◽

Linguistic Approach ◽

Turkish Language

Conducting NLP for Turkish is a lot harder than for other Latin-based languages such as English. In this study, text mining techniques are used to build a pre-processing framework in which TF-IDF values are calculated in accordance with a linguistic approach on 7,731 tweets shared by 13 famous economists in Turkey, retrieved from Twitter. Then, the classification results of four common machine learning methods (SVM, Naive Bayes, LR, and an integration of LR with SVM) are compared. The TF-IDF features are tested with different n-gram settings. The findings show that success on a text classification problem depends on the feature representation method, and that SVM outperforms the other ML methods with unigram feature representation. The best results are obtained via the integration method of SVM with LR, with an accuracy of 82.9%. These results show that these methodologies are satisfactory for the Turkish language.
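TF-IDF weighting over n-grams, as used for the feature representation here, can be sketched in plain Python. The abstract does not state the exact TF-IDF variant or tokenization, so the formula below (relative term frequency times the natural log of inverse document frequency), the whitespace tokenizer, and the English example texts are all assumptions for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of contiguous n-grams, each joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: weight} dict per document for n-gram terms."""
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for terms in tokenized for term in set(terms))
    vectors = []
    for terms in tokenized:
        counts = Counter(terms)
        vectors.append({
            term: (count / len(terms)) * math.log(len(docs) / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical short texts standing in for tweets.
docs = ["inflation rises again", "interest rates rise", "inflation and interest rates"]
unigram_vectors = tfidf(docs, n=1)
bigram_vectors = tfidf(docs, n=2)
print(unigram_vectors[0])
```

Terms that appear in fewer documents receive higher weights, and switching `n` changes which units are counted, which is the knob the study varies when comparing unigram features with longer n-grams.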

Text classification to streamline online wildlife trade analyses

PLoS ONE ◽

10.1371/journal.pone.0254007 ◽

2021 ◽

Vol 16 (7) ◽

pp. e0254007

Author(s):

Oliver C. Stringham ◽

Stephanie Moncayo ◽

Katherine G. W. Hill ◽

Adam Toomes ◽

Lewis Mitchell ◽

...

Keyword(s):

Machine Learning ◽

Sensitivity Analysis ◽

Language Processing ◽

Text Classification ◽

Model Performance ◽

Wildlife Trade ◽

Online Data ◽

Vast Number ◽

Pet Birds ◽

Text Classifiers

Automated monitoring of websites that trade wildlife is increasingly necessary to inform conservation and biosecurity efforts. However, e-commerce and wildlife trading websites can contain a vast number of advertisements, an unknown proportion of which may be irrelevant to researchers and practitioners. Given that many wildlife-trade advertisements have an unstructured text format, automated identification of relevant listings has not traditionally been possible, nor attempted. Other scientific disciplines have solved similar problems using machine learning and natural language processing models, such as text classifiers. Here, we test the ability of a suite of text classifiers to extract relevant advertisements from wildlife trade occurring on the Internet. We collected data from an Australian classifieds website where people can post advertisements of their pet birds (n = 16.5k advertisements). We found that text classifiers can predict, with a high degree of accuracy, which listings are relevant (ROC AUC ≥ 0.98, F1 score ≥ 0.77). Furthermore, in an attempt to answer the question ‘how much data is required to have an adequately performing model?’, we conducted a sensitivity analysis by simulating decreases in sample sizes to measure the subsequent change in model performance. From our sensitivity analysis, we found that text classifiers required a minimum sample size of 33% (c. 5.5k listings) to accurately identify relevant listings (for our dataset), providing a reference point for future applications of this sort. Our results suggest that text classification is a viable tool that can be applied to the online trade of wildlife to reduce time dedicated to data cleaning. However, the success of text classifiers will vary depending on the advertisements and websites, and will therefore be context dependent. Further work to integrate other machine learning tools, such as image classification, may provide better predictive abilities in the context of streamlining data processing for wildlife trade related online data.

Text classification to streamline online wildlife trade analyses

10.32942/osf.io/593ve ◽

2021 ◽

Author(s):

Oliver C. Stringham ◽

Stephanie Moncayo ◽

Katherine G.W. Hill ◽

Adam Toomes ◽

Lewis Mitchell ◽

...

Keyword(s):

Machine Learning ◽

Sensitivity Analysis ◽

Language Processing ◽

Text Classification ◽

Model Performance ◽

Wildlife Trade ◽

Online Data ◽

Vast Number ◽

Pet Birds ◽

Text Classifiers

1. Automated monitoring of websites that trade wildlife is increasingly necessary to inform conservation and biosecurity efforts. However, e-commerce and wildlife trading websites can contain a vast number of advertisements, an unknown proportion of which may be irrelevant to researchers and practitioners. Given that many of these advertisements have an unstructured text format, automated identification of relevant listings has not traditionally been possible, nor attempted. Other scientific disciplines have solved similar problems using machine learning and natural language processing models, such as text classifiers. 2. Here, we test the ability of a suite of text classifiers to extract relevant advertisements from an Australian classifieds website where people can post advertisements of their pet birds (n = 16.5k advertisements). Furthermore, in an attempt to answer the question ‘how much data is required to have an adequately performing model?’, we conducted a sensitivity analysis by simulating decreases in sample sizes to measure the subsequent change in model performance. 3. We found that text classifiers can predict, with a high degree of accuracy, which listings are relevant (ROC AUC ≥ 0.98, F1 score ≥ 0.77). From our sensitivity analysis, we found that text classifiers required a minimum sample size of 33% (c. 5.5k listings) to accurately identify relevant listings (for our dataset), providing a reference point for future applications of this sort. 4. Our results suggest that text classification is a viable tool that can be applied to the online trade of wildlife to reduce time dedicated to data cleaning. However, the success of text classifiers will vary depending on the advertisements and websites, and will therefore be context dependent. Further work to integrate other machine learning tools, such as image classification, may provide better predictive abilities in the context of streamlining data processing for wildlife trade related online data.
