Automated Essay Scoring using Ontology Generator and Natural Language Processing with Question Generator based on Blooms Taxonomy’s Cognitive Level

Essay examinations are a commonly used learning activity at all levels of education and across disciplines. They are advantageous for evaluating students' learning outcomes because they give students the chance to exhibit their knowledge and skills freely. For these reasons, many researchers have turned their interest to automated essay scoring (AES), one of the most remarkable innovations in text mining using Natural Language Processing and machine learning algorithms. The purpose of this study is to develop an automated essay scoring system that uses ontology and Natural Language Processing. Different learning algorithms have shown agreeing prediction outcomes, but a regression algorithm that incorporates the proper features may produce a more accurate essay score. This study aims to increase the accuracy, reliability and validity of AES by implementing Gradient ridge regression with the domain ontology and other features. Linear regression, linear lasso regression and ridge regression were also used in conjunction with the extracted features. The extracted features are the domain concepts, average word length, orthography (spelling mistakes), grammar and sentiment score. The first dataset, the ASAP dataset from the Kaggle website, is used to train and test the machine learning algorithms, which consist of linear regression, linear lasso regression, ridge regression and gradient boosting regression, together with the identified features. The second dataset was extracted from students' essay exams in a Human Computer Interaction course. The results show that Gradient Boosting Regression has the highest variance and kappa scores. However, the Linear, Ridge and Lasso regressions perform similarly, owing to the ASAP dataset used.
Furthermore, the results were evaluated using the Cohen Weighted Kappa (CWA) score to compare agreement with the human raters. The CWA result is 0.659, which can be interpreted as a strong level of agreement between the human grader and the automated essay score. Therefore, the proposed AES has a 64-81% reliability level.
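
The agreement measure mentioned above can be illustrated with a minimal sketch of a weighted Cohen's kappa between two lists of integer scores. The abstract does not say which weighting scheme the authors used; quadratic weights are assumed here because they are common in essay-scoring evaluations, and the function name `weighted_kappa` and its parameters are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Quadratic weighted Cohen's kappa for two lists of integer scores.

    Assumption: quadratic weights; the source abstract does not specify them.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n_cat = max_score - min_score + 1
    if n_cat < 2:
        raise ValueError("at least two score categories are required")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Penalty grows with the squared distance between the two scores.
            weight = ((i - j) ** 2) / ((n_cat - 1) ** 2)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * hist_a[i] * hist_b[j] / (n * n)

    if expected_disagreement == 0.0:
        # Kappa is undefined when both raters use one identical score;
        # treating that as perfect agreement is a choice made in this sketch.
        return 1.0
    return 1.0 - observed_disagreement / expected_disagreement


human = [1, 2, 3, 4, 4]      # illustrative scores, not data from the study
machine = [1, 2, 3, 3, 4]
print(round(weighted_kappa(human, machine, 1, 4), 3))
```

A value of 1 means the two raters agree on every essay, 0 means agreement no better than chance, and negative values mean systematic disagreement.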

Computerized Answer Grading

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35044 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 618-619

Author(s):

Anurag Langan

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Computer Technology ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Grade Student ◽

Processing Techniques

Grading student answers is a tedious and time-consuming task. One study found that, on average, around 25% of a teacher's time is spent scoring students' answer sheets. This time could be put to much better use if computer technology were used to score answers. This system aims to grade student answers using the various Natural Language Processing techniques and machine learning algorithms available today.

Natural language processing and recurrent network models for identifying genomic mutation-associated cancer treatment change from patient progress notes

JAMIA Open ◽

10.1093/jamiaopen/ooy061 ◽

2019 ◽

Vol 2 (1) ◽

pp. 139-149 ◽

Cited By ~ 9

Author(s):

Meijian Guan ◽

Samuel Cho ◽

Robin Petro ◽

Wei Zhang ◽

Boris Pasche ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Free Text ◽

Treatment Change ◽

Progress Notes

Objectives: Natural language processing (NLP) and machine learning approaches were used to build classifiers to identify genomic-related treatment changes in the free-text visit progress notes of cancer patients. Methods: We obtained 5889 deidentified progress reports (2439 words on average) for 755 cancer patients who had undergone clinical next-generation sequencing (NGS) testing at Wake Forest Baptist Comprehensive Cancer Center for our data analyses. An NLP system was implemented to process the free-text data and extract NGS-related information. Three types of recurrent neural network (RNN), namely gated recurrent unit, long short-term memory (LSTM), and bidirectional LSTM (LSTM_Bi), were applied to classify documents into the treatment-change and no-treatment-change groups. Further, we compared the performance of the RNNs to 5 machine learning algorithms: Naive Bayes, K-nearest Neighbor, Support Vector Machine for classification, Random Forest, and Logistic Regression. Results: Our results suggested that, overall, RNNs outperformed traditional machine learning algorithms, and LSTM_Bi showed the best performance among the RNNs in terms of accuracy, precision, recall, and F1 score. In addition, pretrained word embedding can improve the accuracy of LSTM by 3.4% and reduce the training time by more than 60%. Discussion and Conclusion: NLP and RNN-based text mining solutions have demonstrated advantages in information retrieval and document classification tasks for unstructured clinical progress notes.
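
For readers unfamiliar with the four metrics named in the results, the following is a minimal sketch of how accuracy, precision, recall, and F1 are computed for a two-class document classifier. The function name `binary_metrics` and the label encoding (1 for treatment-change, 0 for no-treatment-change) are assumptions made for illustration and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classifier.

    Assumption: `positive` marks the treatment-change class (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many are found
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)           # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only, not data from the study.
truth = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(binary_metrics(truth, predicted))
```

Precision and recall are reported alongside accuracy because the two classes need not be balanced, in which case accuracy alone can look good for a classifier that rarely predicts the minority class.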

An Analysis of Machine Learning Algorithms and Deep Neural Networks for Email Spam Classification using Natural Language Processing

10.1109/soli54607.2021.9672398 ◽

2021 ◽

Author(s):

Md. Mohidul Hasan ◽

Syed Mahbubuz Zaman ◽

Md. Asif Talukdar ◽

Ayesha Siddika ◽

Md. Golam Rabiul Alam

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Email Spam

Answer Script Evaluation using Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35070 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 849-852

Author(s):

Dr. K. Suresh

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Computational Methods ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Text Extraction ◽

Processing Techniques

The current way of checking answer scripts is laborious for the college: staff must check the answers manually and allocate marks to the students. Our proposed system uses Machine Learning and Natural Language Processing techniques to address this. Machine learning algorithms use computational methods to learn directly from data without relying on predetermined rules. NLP algorithms identify specific entities within the text, search for key elements in a document, run a contextual search for synonyms, detect misspelled words or similar entries, and more. Our algorithm performs similarity checking and also counts the question-related words that match exactly between two documents. It also checks whether grammar is used correctly in the student's answer. Our proposed system performs text extraction and evaluation of marks by applying Machine Learning and Natural Language Processing techniques.
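
The similarity check and exact word matching described above can be sketched as follows. The abstract does not name the similarity measure, so cosine similarity over word counts is assumed here; the helper names `tokenize`, `cosine_similarity`, and `matched_keywords`, and the sample texts, are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words counts (assumed measure)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matched_keywords(keywords, reference, answer):
    """Question-related keywords found verbatim in both documents."""
    ref_words = set(tokenize(reference))
    ans_words = set(tokenize(answer))
    lowered = {kw.lower() for kw in keywords}
    return sorted(kw for kw in lowered if kw in ref_words and kw in ans_words)


# Illustrative texts only, not data from the paper.
reference = "A stack is LIFO; a queue is FIFO."
answer = "The stack is last in, first out."
print(cosine_similarity(reference, answer))
print(matched_keywords(["stack", "queue"], reference, answer))
```

A marking scheme could combine the similarity value with the number of matched keywords, but how the two are weighted is not stated in the abstract and is left open here.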

Classifying lymphoma and tuberculosis case reports using machine learning algorithms

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i5.3132 ◽

2021 ◽

Vol 10 (5) ◽

pp. 2857-2865

Author(s):

Moanda Diana Pholo ◽

Yskandar Hamam ◽

Abdel Baset Khalaf ◽

Chunling Du

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Performance Metrics ◽

Case Reports ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Tuberculosis Case ◽

Starting Point

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis is due to the fact that the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using Natural Language Processing tools and evaluate the use of machine learning to differentiate between the two diseases. As a starting point in the study, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming other algorithms in terms of correctly classifying the different case reports.

Automated Essay Scoring: A Survey of the State of the Art

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/879 ◽

2019 ◽

Cited By ~ 2

Author(s):

Zixuan Ke ◽

Vincent Ng

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

The State ◽

Educational Values ◽

Automated Essay Scoring ◽

Research Challenges ◽

Essay Scoring ◽

Made In

Despite being investigated for over 50 years, the task of automated essay scoring is far from being solved. Nevertheless, it continues to draw a lot of attention in the natural language processing community in part because of its commercial and educational values as well as the associated research challenges. This paper presents an overview of the major milestones made in automated essay scoring research since its inception.

Citation Classification Prediction Implying Text Features Using Natural Language Processing and Supervised Machine Learning Algorithms

Communications in Computer and Information Science - Recent Trends in Image Processing and Pattern Recognition ◽

10.1007/978-981-16-0507-9_46 ◽

2021 ◽

pp. 540-552

Author(s):

Priya Porwal ◽

Manoj H. Devare

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Text Features ◽

Classification Prediction

Grammatical categories determination for Turkish and Kazakh languages based on machine learning algorithms and fulfilling dictionaries of link grammar parser

Eastern-European Journal of Enterprise Technologies ◽

10.15587/1729-4061.2021.238743 ◽

2021 ◽

Vol 5 (2 (113)) ◽

pp. 55-65

Author(s):

Aigerim Yerimbetova ◽

Madina Tussupova ◽

Madina Sambetbayeva ◽

Mussa Turdalyuly ◽

Bakzhan Sakenov

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Parts Of Speech ◽

Grammatical Categories ◽

Learning Techniques

This research aims to identify the parts of speech for the Kazakh and Turkish languages in an information retrieval system. The proposed algorithms are based on machine learning techniques. In this paper, we consider the binary classification of words according to parts of speech, using the most popular and well-known machine learning algorithms. We defined 7 dictionaries and tagged 135 million words in Kazakh, and 9 dictionaries and 50 million words in Turkish. The main problem considered in the paper is to create algorithms for filling the dictionaries of the so-called Link Grammar Parser (LGP) system, in particular for the Kazakh and Turkish languages, using machine learning techniques. The focus of the research is on the review and comparison of machine learning algorithms and methods that have achieved results on various natural language processing tasks, such as grammatical category determination. For the operation of the LGP system, a dictionary is created in which a connector is indicated for each word, that is, the type of connection that can be created using this word. The authors considered methods of filling in LGP dictionaries using machine learning. The complexities of natural language processing do not exclude the possibility of identifying narrower tasks that can already be solved algorithmically, for example determining parts of speech or splitting texts into logical groups. However, some features of natural languages significantly reduce the effectiveness of these solutions. Thus, taking into account all word forms for each word in the Kazakh and Turkish languages increases the complexity of text processing by an order of magnitude.
