Machine Learning Based Essay Grading System

Ojasvi Daga

doi:10.22214/ijraset.2021.36180

Machine Learning Based Essay Grading System

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36180 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 5465-5472

Author(s):

Ojasvi Daga

Keyword(s):

Machine Learning ◽

Language Processing ◽

Regression Models ◽

Grading System ◽

Maximum Correlation ◽

Essay Writing ◽

Machine Learning Model ◽

Human Effort ◽

Essay Grading ◽

Best Fit

Machine learning and automation have progressed immensely over the years and tend to simplify human lives by reducing the effort and time spent on tasks that a machine can perform instead. One such task is grading essays. Essay writing is an integral part of learning a language or skill, or of simply expressing one’s thoughts and ideas on a topic, which is why essay grading is important. When a piece of work is scored against defined parameters, there is scope for improvement. Hence, when essays are graded and feedback is provided, the writer is guided to analyse the work and to better understand the topic in general. However, manual grading can produce discrepancies, because different graders perceive the same content differently. It also consumes a lot of human time and effort. Automatic grading of essays can address both problems. In this project, we build a machine learning model which grades essays based on various features extracted using Natural Language Processing. We also test the model’s performance using several regression models, such as Linear, Lasso, and Ridge, and methods such as an Artificial Neural Network, to find the best fit giving the maximum correlation with human grades.
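
As a rough illustration of the pipeline this abstract describes, the sketch below extracts a few surface features from an essay, fits an ordinary least-squares line on one feature, and measures the Pearson correlation between predicted and human grades. The feature set, the toy essays, the grades, and all function names are assumptions made for illustration; the abstract does not say which features or settings the paper uses.

```python
import math
import re


def extract_features(essay):
    """Return a few surface features of an essay (hypothetical feature set)."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_word_length": avg_len,
    }


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy essays and human grades, invented for this sketch.
    essays = [
        "Short essay.",
        "This essay is a little longer than the first.",
        "This essay is the longest one. It has two sentences and more words.",
    ]
    human_grades = [2.0, 3.0, 5.0]
    word_counts = [extract_features(e)["word_count"] for e in essays]
    slope, intercept = fit_line(word_counts, human_grades)
    predicted = [slope * x + intercept for x in word_counts]
    print("correlation with human grades:", pearson(predicted, human_grades))
```

Lasso and Ridge add a penalty on the coefficients to the same least-squares objective, so comparing them against this baseline amounts to repeating the last step with each fitted model and keeping the one with the highest correlation.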

Word prediction in computational historical linguistics

Journal of Language Modelling ◽

10.15398/jlm.v8i2.268 ◽

2021 ◽

Vol 8 (2) ◽

Author(s):

Peter Dekker ◽

Willem Zuidema

Keyword(s):

Machine Learning ◽

Language Processing ◽

Historical Linguistics ◽

Data Representation ◽

Target Language ◽

Prediction Methods ◽

Word Prediction ◽

Tree Reconstruction ◽

Source Language ◽

Machine Learning Model

In this paper, we investigate how the prediction paradigm from machine learning and Natural Language Processing (NLP) can be put to use in computational historical linguistics. We propose word prediction as an intermediate task, where the forms of unseen words in some target language are predicted from the forms of the corresponding words in a source language. Word prediction allows us to develop algorithms for phylogenetic tree reconstruction, sound correspondence identification and cognate detection, in ways close to attested methods for linguistic reconstruction. We will discuss different factors, such as data representation and the choice of machine learning model, that have to be taken into account when applying prediction methods in historical linguistics. We present our own implementations and evaluate them on different tasks in historical linguistics.

Weather prediction using random forest machine learning model

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v22.i2.pp1208-1215 ◽

2021 ◽

Vol 22 (2) ◽

pp. 1208

Author(s):

R. Meenal ◽

Prawin Angel Michael ◽

D. Pamela ◽

E. Rajasekaran

Keyword(s):

Machine Learning ◽

Wind Speed ◽

Random Forest ◽

Solar Radiation ◽

Regression Models ◽

Tamil Nadu ◽

Weather Prediction ◽

Learning Model ◽

Statistical Regression ◽

Machine Learning Model

Complex numerical climate models pose a big challenge for scientists in weather prediction, especially for tropical systems. This paper focuses on presenting the importance of weather prediction using machine learning (ML) techniques. Recently, many researchers have suggested that machine learning models can produce sensible weather predictions despite having no precise knowledge of atmospheric physics. In this work, global solar radiation (GSR) in MJ/m²/day and wind speed in m/s are predicted for Tamil Nadu, India, using a random forest ML model. The random forest ML model is validated with measured wind and solar radiation data collected from IMD, Pune. The prediction results based on the random forest ML model are compared with statistical regression models and an SVM ML model. Overall, the random forest machine learning model has the lowest error, with an MSE of 0.750 and an R² score of 0.97. Compared to the regression models and the SVM ML model, the prediction results of the random forest ML model are more accurate. Thus, this study removes the need for an expensive measuring instrument in all potential locations to acquire the solar radiation and wind speed data.
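
The two metrics quoted in this abstract, mean squared error (MSE) and the R² score, can be computed as in the minimal sketch below. The function names and the sample numbers are assumptions for illustration and are not taken from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Invented measurements and predictions, used only to exercise the functions.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print("MSE:", mean_squared_error(measured, predicted))
    print("R2:", r2_score(measured, predicted))
```

A lower MSE and an R² closer to 1 both indicate predictions that track the measurements more closely, which is the sense in which the abstract ranks the random forest model above the alternatives.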

Fenix: A Semantic Search Engine Based on an Ontology and a Model Trained with Machine Learning to Support Research

10.5121/csit.2021.110709 ◽

2021 ◽

Author(s):

Felipe Cujar-Rosero ◽

David Santiago Pinchao Ortiz ◽

Silvio Ricardo Timaran Pereira ◽

Jimmy Mateo Guerrero Restrepo

Keyword(s):

Machine Learning ◽

Virtual Environment ◽

Search Engine ◽

Language Processing ◽

Machine Learning Algorithms ◽

Semantic Search ◽

Research Projects ◽

Machine Learning Model ◽

The University ◽

Semantic Search Engine

This paper presents the final results of a research project that aimed to build a Semantic Search Engine that uses an Ontology and a model trained with Machine Learning to support the semantic search of research projects in the System of Research of the University of Nariño. To construct FENIX, as this Engine is called, a methodology was used that comprises the following stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the Semantic Search Engine. The work produced three main results: a) the complete construction of the Ontology with classes, object properties (predicates), data properties (attributes) and individuals (instances) in Protégé, SPARQL queries with Apache Jena Fuseki, and the respective coding with Owlready2 using Jupyter Notebook with Python within the Anaconda virtual environment; b) the successful training of the model, for which Machine Learning and specifically Natural Language Processing tools were used, such as SpaCy, NLTK, Word2vec and Doc2vec, also in Jupyter Notebook with Python within the Anaconda virtual environment and with Elasticsearch; and c) the creation of FENIX, which manages and unifies the queries for the Ontology and for the Machine Learning model. The tests showed that FENIX returned satisfactory results in all the searches that were carried out.

Natural language processing and entrustable professional activity text feedback in surgery: A machine learning model of resident autonomy

The American Journal of Surgery ◽

10.1016/j.amjsurg.2020.11.044 ◽

2020 ◽

Author(s):

Christopher C. Stahl ◽

Sarah A. Jung ◽

Alexandra A. Rosser ◽

Aaron S. Kraut ◽

Benjamin H. Schnapp ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Model ◽

Professional Activity ◽

Entrustable Professional Activity ◽

Machine Learning Model

Development of a Machine Learning Model for Knowledge Acquisition, Relationship Extraction and Discovery in Domain Ontology Engineering using Jaccord Relationship Extraction and Neural Network

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6362.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 7809-7817

Keyword(s):

Neural Network ◽

Machine Learning ◽

Knowledge Acquisition ◽

Language Processing ◽

Learning Model ◽

Heterogeneous Data ◽

Relationship Extraction ◽

Machine Learning Model ◽

Proposed Model ◽

Domain Independent

Creating a fast, domain-independent ontology through knowledge acquisition is a key problem to be addressed in the domain of knowledge engineering. Updating and validation are impossible without the intervention of domain experts, which is an expensive and tedious process. Therefore, an automatic system to model the ontology has become essential. This manuscript presents a machine learning model based on heterogeneous data from multiple domains, including agriculture, health care, food and banking. The proposed model creates a complete domain-independent process that helps populate the ontology automatically by extracting text from multiple sources and applying natural language processing and various data extraction techniques. The ontology instances are classified based on the domain. A Jaccord Relationship extraction process and the Neural Network Approval for Automated Theory are used for data retrieval, automated indexing, mapping, knowledge discovery and rule generation. The results show that the proposed model can construct the ontology automatically and efficiently.

Automated En Masse Machine Learning Model Generation Shows Comparable Performance as Classic Regression Models for Predicting Delayed Graft Function in Renal Allografts

Transplantation Journal ◽

10.1097/tp.0000000000003640 ◽

2021 ◽

Vol Publish Ahead of Print ◽

Cited By ~ 1

Author(s):

Kuang-Yu Jen ◽

Samer Albahra ◽

Felicia Yen ◽

Junichiro Sageshima ◽

Ling-Xin Chen ◽

...

Keyword(s):

Machine Learning ◽

Regression Models ◽

Graft Function ◽

Learning Model ◽

Delayed Graft Function ◽

Model Generation ◽

Renal Allografts ◽

Machine Learning Model ◽

Comparable Performance

Optimal Value Estimation of Intentional-Value-Substitution for Learning Regression Models

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2021.p0153 ◽

2021 ◽

Vol 25 (2) ◽

pp. 153-161

Author(s):

Takuya Fukushima ◽

Tomoharu Nakashima ◽

Taku Hasegawa ◽

Vicenç Torra ◽

...

Keyword(s):

Machine Learning ◽

Prediction Error ◽

Regression Models ◽

Missing Values ◽

Target Function ◽

Training Dataset ◽

Machine Learning Model ◽

Optimal Value ◽

Value Estimation ◽

Feature Values

This paper focuses on a method to train a regression model that can handle incomplete input values. It is assumed in this paper that there are no missing values in the training dataset, while missing values exist during the prediction phase using the trained model. Under this assumption, we propose Intentional-Value-Substitution (IVS) training to obtain a machine learning model that makes the prediction error as small as possible. In IVS training, a model is trained to approximate the target function using a modified training dataset in which some feature values are substituted with a certain value even though their values are not missing. It is shown through a series of computational experiments that the substitution values calculated from a mathematical analysis help the models correctly predict outputs for inputs with missing values.
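
The data-modification step behind IVS training might be sketched as follows: feature values in a complete training set are randomly overwritten with a fixed substitution value, and the same value fills in features that are missing at prediction time. The substitution rate, the placeholder substitution values, and the function names are all assumptions here; the paper derives its substitution values from a mathematical analysis that this sketch does not reproduce.

```python
import random


def substitute_values(rows, substitution, rate, seed=0):
    """Return a copy of rows where each feature value is replaced by
    substitution[j] with probability `rate` (hypothetical IVS-style step)."""
    rng = random.Random(seed)
    return [
        [substitution[j] if rng.random() < rate else value
         for j, value in enumerate(row)]
        for row in rows
    ]


def fill_missing(row, substitution):
    """At prediction time, replace missing features (None) with the same values."""
    return [substitution[j] if value is None else value
            for j, value in enumerate(row)]


if __name__ == "__main__":
    training_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    # Placeholder substitution values: the per-feature means of the toy data.
    substitution = [2.0, 20.0]
    modified = substitute_values(training_rows, substitution, rate=0.3, seed=1)
    print("modified training rows:", modified)
    print("filled query:", fill_missing([None, 25.0], substitution))
```

A regression model would then be trained on the modified rows with the original targets, so that it has already seen the substitution value in place of real features before it meets missing inputs.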

Week 3–4 Prediction of Wintertime CONUS Temperature Using Machine Learning Techniques

Frontiers in Climate ◽

10.3389/fclim.2021.697423 ◽

2021 ◽

Vol 3 ◽

Author(s):

Paul Buchmann ◽

Timothy DelSole

Keyword(s):

Machine Learning ◽

Regression Models ◽

Large Scale ◽

Climate Model ◽

Dynamical Model ◽

Learning Models ◽

Machine Learning Model ◽

Climate Model Output ◽

Machine Learning Models ◽

Better Than

This paper shows that skillful week 3–4 predictions of a large-scale pattern of 2 m temperature over the US can be made based on the Nino3.4 index alone, where skillful is defined to be better than climatology. To find more skillful regression models, this paper explores various machine learning strategies (e.g., ridge regression and lasso), including those trained on observations and on climate model output. It is found that regression models trained on climate model output yield more skillful predictions than regression models trained on observations, presumably because of the larger training sample. Nevertheless, the skill of the best machine learning models is only modestly better than ordinary least squares based on the Nino3.4 index. Importantly, this fact is difficult to infer from the parameters of the machine learning model because very different parameter sets can produce virtually identical predictions. For this reason, attempts to interpret the source of predictability from the machine learning model can be very misleading. The skill of the machine learning models is also compared to that of a fully coupled dynamical model, CFSv2. The results depend on the skill measure: for mean square error, the dynamical model is slightly worse than the machine learning models; for correlation skill, the dynamical model is only modestly better than machine learning models or the Nino3.4 index. In summary, the best predictions of the large-scale pattern come from machine learning models trained on long climate simulations, but the skill is only modestly better than predictions based on the Nino3.4 index alone.
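
To make "better than climatology" concrete, the sketch below fits a one-predictor regression with an optional ridge penalty and scores it with a mean-square-error skill score relative to a constant climatological forecast. The skill-score definition (1 minus the ratio of model MSE to climatology MSE), the toy numbers, and the function names are assumptions chosen for illustration and may differ from the measures used in the paper.

```python
def center(values):
    """Subtract the mean; returns (centered values, mean)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values], mean


def ridge_slope(xs, ys, penalty=0.0):
    """Slope minimising sum((y - b*x)^2) + penalty * b^2 on centered data.
    With penalty=0 this is ordinary least squares."""
    cx, _ = center(xs)
    cy, _ = center(ys)
    sxx = sum(x * x for x in cx)
    sxy = sum(x * y for x, y in zip(cx, cy))
    return sxy / (sxx + penalty)


def mse_skill_score(y_true, y_pred, climatology):
    """Positive when the forecast beats a constant climatological forecast."""
    n = len(y_true)
    mse_model = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mse_clim = sum((t - climatology) ** 2 for t in y_true) / n
    return 1.0 - mse_model / mse_clim


if __name__ == "__main__":
    # Invented index values and temperature anomalies (already zero-mean).
    index = [-1.0, 0.0, 1.0]
    anomaly = [-2.0, 0.0, 2.0]
    for penalty in (0.0, 2.0):
        slope = ridge_slope(index, anomaly, penalty)
        forecast = [slope * x for x in index]
        print(penalty, slope, mse_skill_score(anomaly, forecast, climatology=0.0))
```

The penalty shrinks the slope toward zero, which trades a little bias for lower variance when the training sample is small; this is one reason regularised models are candidates when only a short observational record is available.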

SENTIMENT ANALYSIS OF CUSTOMER REVIEWS

Azerbaijan Journal of High Performance Computing ◽

10.32010/26166127.2021.4.1.113.125 ◽

2021 ◽

Vol 4 (1) ◽

pp. 113-125

Author(s):

Syed Rashiq Nazar ◽

Tapalina Bhattasali

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Binary Classification ◽

Classification Problem ◽

Supervised Machine Learning ◽

Frequency Model ◽

Customer Reviews ◽

Machine Learning Model ◽

Logistic Regression Algorithm

Sentiment analysis is a process in which we classify text data as positive, negative, or neutral, or into some other category, which helps in understanding the sentiment behind the data. This process mainly combines machine learning and natural language processing methods. One can find customer sentiment in reviews, tweets, comments, etc. A company needs to evaluate the sentiment behind the reviews of its product. Customer sentiment can be a valuable asset to the company. This ultimately helps the company make better decisions regarding its product marketing and improve product quality. This paper focuses on the sentiment analysis of customer reviews from Amazon. The reviews contain textual feedback along with a rating system. The aim is to build a supervised machine learning model to classify each review as positive or negative. As the reviews are in text format, the text must be vectorized into a numerical format for the computer to process the data. To do this, we use the Bag-of-words model and the TF-IDF (Term Frequency-Inverse Document Frequency) model. These two models are related to each other, and the aim is to find which model performs better in our case. Because this is a binary classification problem, the logistic regression algorithm is used. Finally, the performance of the model is calculated using a metric called the F1 score.
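
The two vectorization schemes and the evaluation metric named in this abstract can be sketched in a few lines. The whitespace tokenizer, the particular TF-IDF variant (term frequency normalised by document length, inverse document frequency as log(N / df)), and the toy reviews are assumptions; libraries and papers use several slightly different TF-IDF formulas.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors), one vector per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length
    and idf = log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weighted = []
    for row in counts:
        total = sum(row) or 1
        weighted.append([(c / total) * math.log(n_docs / doc_freq[j])
                         for j, c in enumerate(row)])
    return vocab, weighted


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reviews = ["good product", "bad product"]  # invented examples
    print(bag_of_words(reviews))
    print(tf_idf(reviews))
```

In this variant a word that appears in every review, such as "product" above, gets a TF-IDF weight of zero, whereas Bag-of-words keeps its raw count; that difference is what the comparison between the two models turns on.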

Development of a machine-learning model to assess terminal ileum Endoscopic healing in pediatric Crohn's disease from Magnetic Resonance Enterography data

10.1101/2021.08.29.21262424 ◽

2021 ◽

Author(s):

Itai Guez ◽

Gili Focht ◽

Mary-Louise C. Greer ◽

Ruth Cytter-Kuint ◽

Li-tal Pratt ◽

...

Keyword(s):

Machine Learning ◽

Magnetic Resonance ◽

Linear Regression ◽

Regression Models ◽

Magnetic Resonance Enterography ◽

Linear Regression Models ◽

Learning Models ◽

Machine Learning Model ◽

Relevant Variables ◽

Machine Learning Models

Background and Aims: Endoscopic healing (EH) is a major treatment goal for Crohn's disease (CD). However, terminal ileum (TI) intubation failure is common, especially in children. We evaluated the added value of machine-learning models in imputing a TI Simple Endoscopic Score for CD (SES-CD) from Magnetic Resonance Enterography (MRE) data of pediatric CD patients. Methods: This is a sub-study of the prospective ImageKids study. We developed machine-learning and baseline linear-regression models to predict the TI SES-CD score from the Magnetic Resonance Index of Activity (MaRIA) and the Pediatric Inflammatory Crohn's MRE Index (PICMI) variables. We assessed the accuracy of the TI SES-CD predictions for intubated patients with a stratified 2-fold validation experimental setup, repeated 50 times. We determined clinical impact by imputing TI SES-CD in patients with ileal intubation failure during ileocolonoscopy. Results: A total of 223 children were included (mean age 14.1 ± 2.5 years), of whom 132 had all relevant variables (107 with TI intubation and 25 with TI intubation failure). The combination of a machine-learning model with the PICMI variables achieved the lowest SES-CD prediction error compared to a baseline MaRIA-based linear regression model for the intubated patients (N=107, 11.7 (10.5-12.5) vs. 12.1 (11.4-12.9), p<0.05). The PICMI-based models suggested a higher rate of patients with TI disease among the non-intubated patients compared to a baseline MaRIA-based linear regression model (N=25, up to 25/25 (100%) vs. 23/25 (92%)). Conclusions: Machine-learning models with clinically relevant variables as input are more accurate than linear-regression models in predicting TI SES-CD and EH when using the same MRE-based variables.
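
The validation setup mentioned in the Methods, a stratified 2-fold split repeated 50 times, can be sketched as below. The stratification labels, the seed, and the function names are assumptions; the abstract does not state how patients were grouped for stratification.

```python
import random
from collections import defaultdict


def stratified_two_fold(labels, rng):
    """Split sample indices into two folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = ([], [])
    for indices in by_class.values():
        rng.shuffle(indices)
        # Alternate members of each class between the two folds.
        for position, idx in enumerate(indices):
            folds[position % 2].append(idx)
    return folds


def repeated_stratified_two_fold(labels, repeats=50, seed=0):
    """Yield (train_indices, test_indices) pairs: two per repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        fold_a, fold_b = stratified_two_fold(labels, rng)
        yield fold_a, fold_b
        yield fold_b, fold_a


if __name__ == "__main__":
    # Hypothetical stratification classes for ten patients.
    labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
    splits = list(repeated_stratified_two_fold(labels, repeats=50))
    print("number of train/test splits:", len(splits))
```

Each repeat reshuffles within classes, so the 100 resulting train/test pairs give a distribution of prediction errors instead of a single estimate, which is what allows ranges such as those reported in the Results to be computed.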
