An Ensemble Machine Learning Approach to Understanding the Effect of a Global Pandemic on Twitter Users’ Attitudes

Author(s):  
Bokang Jia ◽  
Domnica Dzitac ◽  
Samridha Shrestha ◽  
Komiljon Turdaliev ◽  
Nurgazy Seidaliev

It is thought that the COVID-19 outbreak has significantly fuelled racism and discrimination, especially towards Asian individuals [10]. To test this hypothesis, we build upon existing work to classify racist tweets posted before and after COVID-19 was declared a global pandemic. To overcome the linguistically difficult and unbalanced nature of the classification task, we combine an ensemble of machine learning techniques: Linear Support Vector Classifiers, Logistic Regression models, and Deep Neural Networks. We fill a gap in the existing literature by (1) using a combined machine learning approach to understand the effect of COVID-19 on Twitter users’ attitudes and (2) improving on the performance of automatic racism detectors. We show that there has not been a sharp increase in racism towards Asian people on Twitter, and that users who posted racist tweets before the pandemic were prone to post an approximately equal amount during the outbreak. Previous research on racism during other virus outbreaks suggests that racism towards communities associated with the virus’s region of origin is not exclusively attributable to the outbreak but is rather a continued symptom of deep-rooted biases towards minorities [13]. Our research supports these previous findings. We conclude that the COVID-19 outbreak is an additional outlet for discrimination against Asian people, rather than its main cause.
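The abstract names an ensemble of Linear Support Vector Classifiers, Logistic Regression models, and Deep Neural Networks but does not specify how their outputs are combined. A minimal sketch of one common combination scheme, hard majority voting, is below; the three stand-in classifiers are hypothetical keyword rules used only to illustrate the interface, not the paper's trained models.

```python
# Minimal sketch of hard (majority) voting over several trained classifiers.
# The three "models" below are stand-ins for a Linear SVC, a logistic
# regression model, and a deep neural network; any callable mapping a
# tweet to a label (1 = racist, 0 = not racist) fits this interface.
from collections import Counter

def majority_vote(models, tweet):
    """Return the label predicted by the most models."""
    votes = [model(tweet) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Illustrative stand-in classifiers (hypothetical keyword rules, not real models).
svc_like = lambda t: 1 if "slur" in t else 0
logreg_like = lambda t: 1 if "slur" in t or "hate" in t else 0
dnn_like = lambda t: 0

models = [svc_like, logreg_like, dnn_like]
print(majority_vote(models, "a tweet containing a slur"))  # two of three vote 1
print(majority_vote(models, "an ordinary tweet"))          # all three vote 0
```

Voting ensembles of this kind are often used on unbalanced tasks because individual models' errors tend to be partly uncorrelated, so the majority is more stable than any single classifier.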

Author(s):  
Mokhtar Al-Suhaiqi ◽  
Muneer A. S. Hazaa ◽  
Mohammed Albared

Due to the rapid growth of research articles in various languages, the cross-lingual plagiarism detection problem has received increasing interest in recent years. Cross-lingual plagiarism detection is a more challenging task than monolingual plagiarism detection. This paper addresses cross-lingual plagiarism detection (CLPD) by proposing a method that combines keyphrase extraction, monolingual detection methods, and machine learning approaches. The research methodology used in this study facilitated the design, development, and implementation of an efficient Arabic–English cross-lingual plagiarism detection system. The paper empirically evaluates five monolingual plagiarism detection methods, namely i) n-gram similarity, ii) longest common subsequence, iii) Dice coefficient, iv) fingerprint-based Jaccard similarity, and v) fingerprint-based containment similarity. In addition, three machine learning approaches, namely i) naïve Bayes, ii) Support Vector Machine (SVM), and iii) linear logistic regression classifiers, are used for Arabic–English cross-language plagiarism detection. Several experiments were conducted to evaluate the performance of the keyphrase extraction methods, and further experiments investigated which machine learning technique performs best for Arabic–English cross-language plagiarism detection. In these experiments, the highest result, an F-measure of 92%, was obtained using the SVM classifier, and all classifiers achieved their best results when most of the monolingual plagiarism detection methods were used.
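Four of the monolingual similarity measures listed above can be sketched directly over tokenized text; the versions below are minimal word-level implementations, leaving out the fingerprinting and the translation step that the full Arabic–English CLPD pipeline would require.

```python
# Word-level sketches of four monolingual similarity measures:
# n-gram similarity, longest common subsequence, Dice, and Jaccard.

def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_similarity(a, b, n=2):
    """Overlap of word n-grams: |A ∩ B| / |A ∪ B|."""
    A, B = ngrams(a, n), ngrams(b, n)
    return len(A & B) / len(A | B) if A | B else 0.0

def lcs_length(a, b):
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

def dice(a, b):
    A, B = set(a), set(b)
    return 2 * len(A & B) / (len(A) + len(B))

def jaccard(a, b):
    A, B = set(a), set(b)
    return len(A & B) / len(A | B)

s1 = "the cat sat on the mat".split()
s2 = "the cat lay on the mat".split()
print(jaccard(s1, s2))  # 4 shared words out of 6 distinct -> 0.666...
print(dice(s1, s2))     # 2*4 / (5+5) -> 0.8
```

In a learning-based CLPD system, scores like these typically become the feature vector fed to the naïve Bayes, SVM, or logistic regression classifier.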


Author(s):  
Erick Omuya ◽  
George Okeyo ◽  
Michael Kimwele

Social media has been embraced by different people as a convenient and official medium of communication. People write messages and attach images and videos on Twitter, Facebook, and other social media platforms, which they then share. Social media therefore generates a lot of data that is rich in sentiment from these updates. Sentiment analysis has been used to determine the opinions of clients, for instance, relating to a particular product or company. Knowledge-based and machine learning approaches are among the strategies that have been used to analyze these sentiments. The performance of sentiment analysis is, however, degraded by noise, the curse of dimensionality, the data domains, and the size of the data used for training and testing. This research aims at developing a model for sentiment analysis in which dimensionality reduction and the use of different parts of speech improve sentiment analysis performance. It uses natural language processing for filtering, storing, and performing sentiment analysis on data from social media. The model is tested using Naïve Bayes, Support Vector Machines, and K-Nearest Neighbor machine learning algorithms, and its performance is compared with that of two other sentiment analysis models. Experimental results show that the model improves sentiment analysis performance using machine learning techniques.
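The two ideas the model combines, reducing the feature space before classification and one of the named classifiers, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a hypothetical stop-word filter stands in for the part-of-speech-based reduction, and the corpus is invented.

```python
# Sketch: dimensionality reduction (stop-word filtering) followed by a
# tiny multinomial Naive Bayes classifier with Laplace smoothing.
import math
from collections import Counter

STOPWORDS = {"the", "a", "is", "was", "it", "this", "and"}

def reduce_dims(text):
    """Drop stop words so the feature space shrinks to content-bearing terms."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def train_nb(docs):  # docs: list of (text, label)
    counts, priors = {}, Counter()
    for text, label in docs:
        priors[label] += 1
        counts.setdefault(label, Counter()).update(reduce_dims(text))
    vocab = {w for c in counts.values() for w in c}
    return counts, priors, vocab

def predict_nb(model, text):
    counts, priors, vocab = model
    total = sum(priors.values())
    best, best_lp = None, -math.inf
    for label, c in counts.items():
        lp = math.log(priors[label] / total)          # log prior
        denom = sum(c.values()) + len(vocab)          # smoothed denominator
        for w in reduce_dims(text):
            lp += math.log((c[w] + 1) / denom)        # smoothed likelihood
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("the product is great and useful", "pos"),
        ("great service", "pos"),
        ("it was a terrible waste", "neg"),
        ("terrible and useless", "neg")]
model = train_nb(docs)
print(predict_nb(model, "a great product"))   # -> "pos"
print(predict_nb(model, "terrible product"))  # -> "neg"
```

Filtering the vocabulary before training is the simplest form of the dimensionality reduction the abstract describes; a part-of-speech tagger would refine the same step by keeping only sentiment-bearing word classes.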


2021 ◽  
Vol 5 (1) ◽  
pp. 566-576
Author(s):  
Azeez A. Nureni ◽  
Victor E. Ogunlusi ◽  
Emmanuel Junior Uloko

Sentiment analysis involves techniques used to analyze texts in order to identify the sentiment and emotion dominant in them and classify them accordingly. These techniques include, but are not limited to, preprocessing of texts and the use of a machine learning or lexicon-based approach to classify them. In this research, a machine learning approach was adopted to classify tweets on Covid-19, which is considered a global pandemic. To achieve this objective, a cross-dataset approach was applied to train four machine learning classification algorithms: Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), and K-Nearest Neighbors (KNN). The final result not only identifies the best-performing algorithm but also helps create awareness of Covid-19, with the objective of destigmatizing patients through the analysis of sentiments and emotions on Covid-19 and, ultimately, of using the same results to help contain the spread of the pandemic.
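A cross-dataset approach means training each classifier on one corpus and scoring it on a different one. A minimal sketch of such an evaluation harness is below; the two classifiers (a majority-class baseline and a 1-nearest-neighbour rule on word overlap) are simple stand-ins for the SVM, RF, NB, and KNN models, and the tweets are invented.

```python
# Sketch of a cross-dataset evaluation harness: fit on one corpus,
# score on another, and compare classifiers by accuracy.
from collections import Counter

def majority_fit(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(lbl for _, lbl in train).most_common(1)[0][0]
    return lambda text: label

def knn_fit(train):
    """1-NN by word overlap: copy the label of the closest training tweet."""
    def predict(text):
        words = set(text.lower().split())
        return max(train, key=lambda d: len(words & set(d[0].lower().split())))[1]
    return predict

def cross_dataset_accuracy(fit, train_set, test_set):
    predict = fit(train_set)
    hits = sum(predict(text) == label for text, label in test_set)
    return hits / len(test_set)

train = [("vaccine rollout brings hope", "pos"),
         ("lockdown fatigue and fear", "neg"),
         ("recovered patients share hope", "pos")]
test = [("new hope as cases fall", "pos"),
        ("fear of another lockdown", "neg")]

for name, fit in [("majority", majority_fit), ("1-NN", knn_fit)]:
    print(name, cross_dataset_accuracy(fit, train, test))
```

Running each of the four algorithms through the same harness is what lets the study rank them on equal footing.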


Energies ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 2328 ◽  
Author(s):  
Md Shafiullah ◽  
M. Abido ◽  
Taher Abdel-Fattah

Precise information on fault location plays a vital role in expediting the restoration process after a power distribution grid is subjected to any kind of fault. This paper proposes a Stockwell transform (ST) based, optimized machine learning approach to locate faults and identify faulty sections in distribution grids. The research employs the ST to extract useful features from recorded three-phase current signals and feeds them as inputs to different machine learning tools (MLT), including multilayer perceptron neural networks (MLP-NN), support vector machines (SVM), and extreme learning machines (ELM). The proposed approach employs the constriction-factor particle swarm optimization (CF-PSO) technique to optimize the parameters of the SVM and ELM for better generalization performance. The results obtained on the test datasets are compared in terms of selected statistical performance indices, including the root mean squared error (RMSE), mean absolute percentage error (MAPE), percent bias (PBIAS), RMSE-observations to standard deviation ratio (RSR), coefficient of determination (R2), Willmott’s index of agreement (WIA), and Nash–Sutcliffe model efficiency coefficient (NSEC), to confirm the effectiveness of the developed fault location scheme. The satisfactory values of the statistical performance indices indicate the superiority of the optimized machine learning tools over the non-optimized ones in locating faults. In addition, the research confirms the efficacy of the faulty section identification scheme based on overall accuracy. Furthermore, the presented results validate the robustness of the developed approach against measurement noise and uncertainties associated with the pre-fault loading condition, fault resistance, and inception angle.
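The statistical indices listed above all compare observed fault locations y against predictions yhat, and most can be sketched in a few lines each. The formulas below follow the common textbook definitions; the sample vectors are illustrative, not data from the paper.

```python
# Sketches of the statistical performance indices named in the abstract,
# computed between observed values y and predictions yhat.
import math

def rmse(y, yhat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mape(y, yhat):
    return 100 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, yhat))

def pbias(y, yhat):
    return 100 * sum(a - b for a, b in zip(y, yhat)) / sum(y)

def rsr(y, yhat):
    mean = sum(y) / len(y)
    sd = math.sqrt(sum((a - mean) ** 2 for a in y) / len(y))
    return rmse(y, yhat) / sd

def r2(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def wia(y, yhat):
    """Willmott's index of agreement (1 = perfect agreement)."""
    mean = sum(y) / len(y)
    num = sum((a - b) ** 2 for a, b in zip(y, yhat))
    den = sum((abs(b - mean) + abs(a - mean)) ** 2 for a, b in zip(y, yhat))
    return 1 - num / den

# NSEC shares the 1 - SS_res / SS_tot form used for r2 above.
y = [1.0, 2.0, 3.0, 4.0]
yhat = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(y, yhat), 4), round(r2(y, yhat), 4))  # 0.1581 0.98
```

Reporting several indices together guards against any single metric (e.g. R2 alone) masking a systematic bias, which is exactly why PBIAS and WIA appear alongside RMSE in the study.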


Author(s):  
Kasper van Mens ◽  
Sascha Kwakernaak ◽  
Richard Janssen ◽  
Wiepke Cahn ◽  
Joran Lokkerbol ◽  
...  

A mental healthcare system in which scarce resources are equitably and efficiently allocated benefits from a predictive model of expected service use. The skewness in service use is a challenge for such models. In this study, we applied a machine learning approach to forecast expected service use as a starting point for agreements between financiers and suppliers of mental healthcare. The study used administrative data from a large mental healthcare organization in the Netherlands. A training set was selected using records from 2017 (N = 10,911), and a test set was selected using records from 2018 (N = 10,201). A baseline model and three random forest models were created from different types of input data to predict (the remainder of) individual treatment hours. A visual analysis was performed on the individual predictions. Patients consumed 62 h of mental healthcare on average in 2018. The model that best predicted service use had a mean error of 21 min at the insurance group level and a mean absolute error of 28 h at the patient level. There was a systematic underprediction of service use for high-use patients. The application of machine learning techniques to mental healthcare data is useful for predicting expected service use at the group level. The results indicate that these models could support financiers and suppliers of healthcare in the planning and allocation of resources. Nevertheless, uncertainty in the prediction of high-cost patients remains a challenge.
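The two error figures reported, a small mean error at the group level alongside a much larger absolute error per patient, follow from how the metrics aggregate: signed errors cancel in the group total while absolute errors do not. A minimal sketch with invented numbers:

```python
# Group-level (signed) mean error vs patient-level mean absolute error
# for predicted vs actual treatment hours. Numbers are illustrative only.

def group_mean_error(actual, predicted):
    """Signed error of the group total, averaged per patient (hours)."""
    return (sum(predicted) - sum(actual)) / len(actual)

def patient_mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual    = [10.0, 80.0, 40.0, 250.0]  # hours of care consumed
predicted = [15.0, 70.0, 55.0, 180.0]  # the high user is underpredicted

print(group_mean_error(actual, predicted))             # -15.0 hours
print(patient_mean_absolute_error(actual, predicted))  # 25.0 hours
```

The skewed fourth patient drives most of the absolute error while the overpredictions on the others partly offset it in the group total, mirroring the study's finding that high-use patients are systematically underpredicted even when group-level figures look accurate.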

