Sentiment Analysis on E-Learning Using Machine Learning Classifiers in Python

Predicting student dropout early and precisely from available educational data is a widely studied topic in learning analytics. Despite the volume of research already carried out, progress has not been significant, and the problem persists at all levels of educational data. Although various features have already been examined, it remains an open question which features are appropriate for different machine learning classifiers applied to the typically scarce educational data available at the e-learning course level. The main goal of this research is therefore to emphasize the importance of the data understanding and data gathering phases, to stress the limitations of available educational datasets, to compare the performance of several machine learning classifiers, and to show that even a limited set of features available to teachers in an e-learning course can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. Data collected over four academic years were analyzed. The features selected in this study proved applicable to predicting course completers and non-completers. Prediction accuracy varied between 77% and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was compared to counter the effect of the limited dataset size on the high values obtained for the performance metrics. The results showed that several machine learning algorithms can be successfully applied to a scarce set of educational data. At the same time, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.
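The caution about performance metrics can be illustrated with a minimal sketch. It is not the authors' evaluation code: the labels, the function names, and the ten-student sample are hypothetical, chosen only to show how accuracy alone can look acceptable on a small, imbalanced course dataset while the dropout class is missed entirely.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive="dropout"):
    """Return accuracy, precision, recall and F1 for the positive class.

    Ratios with a zero denominator are reported as 0.0 (a convention
    assumed here for simplicity).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical course: 8 completers, 2 dropouts.
y_true = ["completed"] * 8 + ["dropout"] * 2
# A trivial model that predicts "completed" for every student.
y_pred = ["completed"] * 10

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this made-up sample the trivial model is right for eight of ten students, yet it identifies none of the students who dropped out, so recall and F1 for the dropout class are zero. Reading several metrics together is what exposes this.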

Myers-Briggs Personality Prediction and Sentiment Analysis of Twitter using Machine Learning Classifiers and BERT

International Journal of Information Technology and Computer Science ◽ 10.5815/ijitcs.2021.06.04 ◽ 2021 ◽ Vol 13 (6) ◽ pp. 48-60

Author(s): Prajwal Kaushal ◽ Nithin Bharadwaj B P ◽ Pranav M S ◽ Koushik S ◽ ...

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Analysis Model ◽ Research Recruitment ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Myers Briggs ◽ Personality Prediction ◽ Twitter Users ◽ Psychological Instruments

Twitter is one of the most sophisticated social networking platforms, its user base is growing exponentially, and terabytes of data are generated every day. Technology giants invest billions of dollars in drawing insights from these tweets, yet this huge amount of data remains underutilized. The main aim of this paper is to solve two tasks. The first is to build a sentiment analysis model using BERT (Bidirectional Encoder Representations from Transformers) that analyses tweets and predicts the sentiments of the users. The second is to build a personality prediction model using various machine learning classifiers under the umbrella of the Myers-Briggs Personality Type Indicator (MBTI). MBTI is one of the most widely used psychological instruments in the world. Using it, we intend to predict the traits and qualities of people based on their posts and interactions on Twitter. The model succeeds in predicting the personality traits and qualities of Twitter users. In future, we intend to use the analyzed results in applications such as market research, recruitment, psychological tests, and consulting.

A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis

PLoS ONE ◽ 10.1371/journal.pone.0245909 ◽ 2021 ◽ Vol 16 (2) ◽ pp. e0245909

Author(s): Furqan Rustam ◽ Madiha Khalid ◽ Waqar Aslam ◽ Vaibhav Rupapara ◽ Arif Mehmood ◽ ...

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Short Term Memory ◽ Performance Comparison ◽ Supervised Machine Learning ◽ Accuracy Score ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Analysis Technique ◽ Realistic Assessment

The spread of Covid-19 has resulted in worldwide health concerns. Social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to utilize resources optimally and appropriately. In this research, we perform sentiment analysis of Covid-19 tweets using a supervised machine learning approach. Identifying Covid-19 sentiments from tweets would allow informed decisions for better handling of the current pandemic situation. The dataset is extracted from Twitter using tweet IDs provided by IEEE DataPort. Tweets are extracted by an in-house crawler that uses the Tweepy library. The dataset is cleaned using preprocessing techniques, and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set. This set is formed by concatenating bag-of-words and term frequency-inverse document frequency (TF-IDF) features. Tweets are classified as positive, neutral, or negative. The performance of the classifiers is evaluated on accuracy, precision, recall, and F1 score. For completeness, the dataset is further investigated using the Long Short-Term Memory (LSTM) deep learning architecture. The results show that the Extra Trees classifier outperforms all other models, achieving an accuracy score of 0.93 with our proposed concatenated feature set. The LSTM achieves lower accuracy than the machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the VADER sentiment analysis technique based on the GloVe feature extraction approach.
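The concatenated feature set can be pictured with a small standard-library sketch. It is an illustration under stated assumptions, not the paper's pipeline: the tokenizer is a plain whitespace split, the IDF uses one common smoothed variant, no vector normalisation is applied, and the three sample tweets and all function names are made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real tweet cleaning would do more.
    return text.lower().split()


def build_vocabulary(docs):
    # Sorted so that the column order is deterministic.
    return sorted({token for doc in docs for token in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag-of-words: raw count of each vocabulary term in the document.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def idf_weights(docs, vocab):
    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1 (assumed variant).
    n = len(docs)
    doc_tokens = [set(tokenize(doc)) for doc in docs]
    weights = []
    for term in vocab:
        df = sum(1 for tokens in doc_tokens if term in tokens)
        weights.append(math.log((1 + n) / (1 + df)) + 1.0)
    return weights


def concatenated_features(docs):
    # Each row is the BoW vector followed by the TF-IDF vector,
    # so it has 2 * len(vocab) columns.
    vocab = build_vocabulary(docs)
    idf = idf_weights(docs, vocab)
    rows = []
    for doc in docs:
        bow = bow_vector(doc, vocab)
        tfidf = [count * weight for count, weight in zip(bow, idf)]
        rows.append(bow + tfidf)
    return vocab, rows


# Hypothetical sample tweets, for illustration only.
tweets = [
    "stay safe everyone",
    "lockdown is hard",
    "stay home stay safe",
]
vocab, features = concatenated_features(tweets)
print(len(vocab), len(features[0]))
```

Each row could then be passed to any classifier that accepts numeric feature vectors. With scikit-learn, a comparable matrix could likely be assembled from `CountVectorizer` and `TfidfVectorizer` outputs joined with `scipy.sparse.hstack`, though the defaults there (tokenisation, normalisation) differ from this sketch.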
