Exploration of the best performance method of emotions classification for Arabic tweets

Author(s):  
Mohammed Abdullah Al-Hagery ◽  
Manar Abdullah Al-assaf ◽  
Faiza Mohammad Al-kharboush

The number of Arab social media users has increased significantly, creating more opportunities to extract knowledge from areas of life such as trade, education, and psychological health services. The active Arab presence on Twitter motivates many researchers to classify and analyze Arabic tweets from numerous aspects. This study aimed to explore the best-performing scenarios for classifying the emotions conveyed in Arabic tweets. Various experiments were conducted to investigate the effects of feature extraction techniques and the N-gram model on the performance of three supervised machine learning algorithms: Support Vector Machine (SVM), Naïve Bayes (NB), and Logistic Regression (LR). The general method of the experiments was based on five steps: data collection, preprocessing, feature extraction, emotion classification, and evaluation of results. To implement these experiments, a real-world Twitter dataset was gathered. The best result, achieved by the SVM classifier when using a bag-of-words (BoW) weighting scheme (with unigrams and bigrams, or with unigrams, bigrams, and trigrams), exceeded the best performance results of the other algorithms.
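The bag-of-words features with unigrams and bigrams mentioned above can be illustrated with a minimal sketch. It assumes whitespace tokenization of already-preprocessed text; the function names and the transliterated example tweet are hypothetical and are not taken from the study, whose actual preprocessing and weighting are not described here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bow_features(text, max_n=2):
    """Count every n-gram of order 1..max_n in a whitespace-tokenized text."""
    tokens = text.split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical transliterated tweet, used only for illustration.
features = bow_features("ana saeed jiddan alyawm", max_n=2)
```

Setting `max_n=3` adds trigrams to the same count vector, which corresponds to the second configuration named in the abstract. A classifier such as an SVM would then be trained on these counts after mapping them to a fixed vocabulary.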

2021 ◽  
Vol 36 (1) ◽  
pp. 713-720
Author(s):  
S.K.L. Sameer ◽  
P. Sriramya

Aim: The objective of this research is to use two machine learning algorithms, Decision Tree (DT) and Support Vector Machine (SVM), to detect heart disease at earlier stages and give more accurate predictions. Materials and methods: Heart disease is predicted using two machine learning classifiers, Decision Tree and Support Vector Machine. A decision tree is a predictive modeling approach used in machine learning; it is a type of supervised machine learning. Support vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. The significance value for calculating accuracy was found to be 0.005. Results and discussion: During testing, 10 iterations were run for each classification algorithm. The experimental results show that the decision tree algorithm achieved a mean accuracy of 80.257%, compared with 75.337% for the SVM classifier. Conclusion: Based on the results achieved, the Decision Tree classification algorithm predicts heart disease better than the SVM classifier algorithm.


2021 ◽  
Vol 11 (10) ◽  
pp. 4443
Author(s):  
Rokas Štrimaitis ◽  
Pavel Stefanovič ◽  
Simona Ramanauskaitė ◽  
Asta Slotkienė

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the text of news portal articles. Sentiment analysis methods for English texts and for financial texts exist and are accurate, whereas work on the more complex Lithuanian language has mostly concentrated on sentiment analysis of comment texts and does not provide high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to assign sentiment to financial-context news gathered from Lithuanian-language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset with the multinomial Naive Bayes algorithm (71.1%). The accuracies of the other algorithms were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
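Grid search, as used above for hyperparameter optimization, evaluates every combination of candidate values and keeps the best-scoring one. The sketch below shows only that mechanism; the grid, the parameter names, and the stand-in scoring function are assumptions for illustration, and in practice the scoring function would train a classifier and return its validation accuracy.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every parameter combination; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; not the settings used in the paper.
grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}


def toy_score(params):
    """Stand-in for 'train the model and return validation accuracy'."""
    bonus = 0.5 if params["kernel"] == "linear" else 0.0
    return -abs(params["C"] - 1) + bonus


best_params, best_score = grid_search(grid, toy_score)
```

The cost grows with the product of the grid sizes, so six combinations are scored here.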


Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the selection of the feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts’ natures. One major contribution of this work is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, this paper assumes that the documents have been successfully segmented into characters and extracts features from these isolated Myanmar characters. These features are extracted using structural analysis of the Myanmar script. The experiments were carried out using the Support Vector Machine (SVM) classifier, and the results are compared with the previously proposed feature extraction method.


2018 ◽  
Vol 10 (7) ◽  
pp. 1123 ◽  
Author(s):  
Yuhang Zhang ◽  
Hao Sun ◽  
Jiawei Zuo ◽  
Hongqi Wang ◽  
Guangluan Xu ◽  
...  

Aircraft type recognition plays an important role in remote sensing image interpretation. Traditional methods suffer from poor generalization performance, while deep learning methods require large amounts of data with type labels, which are quite expensive and time-consuming to obtain. To overcome these problems, we propose an aircraft type recognition framework based on conditional generative adversarial networks (GANs). First, we design a new method to precisely detect aircraft keypoints, which are used to generate aircraft masks and locate the positions of the aircraft. Second, a conditional GAN with a region of interest (ROI)-weighted loss function is trained on unlabeled aircraft images and their corresponding masks. Third, an ROI feature extraction method is carefully designed to extract multi-scale features from the GAN in the regions of the aircraft. After that, a linear support vector machine (SVM) classifier is adopted to classify each sample using its features. Benefiting from the GAN, we can learn features that are strong enough to represent aircraft based on a large unlabeled dataset. Additionally, the ROI-weighted loss function and the ROI feature extraction method make the features more related to the aircraft than to the background, which improves the quality of the features and increases the recognition accuracy significantly. Thorough experiments were conducted on a challenging dataset, and the results demonstrate the effectiveness of the proposed aircraft type recognition framework.


2021 ◽  
Author(s):  
Lamya Alderywsh ◽  
Aseel Aldawood ◽  
Ashwag Alasmari ◽  
Farah Aldeijy ◽  
Ghadah Alqubisy ◽  
...  

BACKGROUND Fake news spread via deceptive machine-generated text poses a serious threat in technologically advanced societies, including those in the Arab world. In the last decade, Arabic fake news identification has gained increased attention, and numerous detection approaches have shown some ability to find fake news across various data sources. Nevertheless, many existing approaches overlook recent advancements in fake news detection, specifically the incorporation of machine learning algorithms. OBJECTIVE The Tebyan project aims to address the problem of fake news by developing a detection system that employs machine learning algorithms to determine whether news is fake or real in the context of the Arab world. METHODS The project went through numerous phases using an iterative methodology to develop the system and to contextualize fake news with respect to society's information. It consists of implementing the machine learning system in Python and collecting genuine and fake news datasets. The study also assesses how information-exchanging behaviors can minimize and find the optimal source of authentication of emergent news through system testing approaches. RESULTS The main deliverable of this project is the Tebyan system, which allows users in the community to check the credibility of news in Arabic newspapers. The SVM classifier, on average, exhibited the highest performance, reaching 90% in every performance measure of sources. Moreover, the results indicate that the second-best algorithm is the linear SVC, since it reached 90% in the performance measure with the societies' typical type of fake information.
CONCLUSIONS The study concludes that building a system with machine learning algorithms in the Python programming language allows rapid measurement of users' perceptions, letting them comment on and rate the credibility result and subscribe to news email services.


2018 ◽  
Vol 28 (02) ◽  
pp. 1750036 ◽  
Author(s):  
Shuqiang Wang ◽  
Yong Hu ◽  
Yanyan Shen ◽  
Hanxiong Li

In this study, we propose an automated framework that combines diffusion tensor imaging (DTI) metrics with machine learning algorithms to accurately classify control groups and groups with cervical spondylotic myelopathy (CSM) in the spinal cord. A comparison between selected voxel-based classification and mean value-based classification was performed. A support vector machine (SVM) classifier using a selected voxel-based dataset produced an accuracy of 95.73%, a sensitivity of 93.41%, and a specificity of 98.64%. The efficacy of each diffusion index for classification was also evaluated. Using the proposed approach, myelopathic areas in CSM are detected to provide an accurate reference to assist spine surgeons in surgical planning in complicated cases.
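The three reported measures follow from the counts of a binary confusion matrix: accuracy is (TP + TN) / total, sensitivity is TP / (TP + FN), and specificity is TN / (TN + FP). A minimal sketch is given below; the labels are made-up illustrative values, not data from the study, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Illustrative labels only: 1 = patient, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

This sketch assumes both classes are present in `y_true`; otherwise one of the denominators is zero.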


2015 ◽  
Vol 11 (6) ◽  
pp. 4 ◽  
Author(s):  
Xianfeng Yuan ◽  
Mumin Song ◽  
Fengyu Zhou ◽  
Yugang Wang ◽  
Zhumin Chen

Support Vector Machines (SVMs) are a set of popular machine learning algorithms that have been successfully applied in diverse areas, but for large training data sets the processing time and computational costs are prohibitive. This paper presents a novel fast training method for SVMs, which is applied to the fault diagnosis of a service robot. First, sensor data are sampled under different running conditions of the robot, and the samples are divided into training sets and testing sets. Second, the sampled data are preprocessed and a principal component analysis (PCA) model is established for fault feature extraction. Third, the feature vectors are used to train the SVM classifier, which performs the fault diagnosis of the robot. To speed up the training process of the SVM, on the one hand, sample reduction is done using the proposed support vectors selection (SVS) algorithm, which can ensure good classification accuracy and generalization capability. On the other hand, we take advantage of the excellent parallel computing abilities of the Graphics Processing Unit (GPU) to pre-calculate the kernel matrix, which avoids recalculation during the cross-validation process. Experimental results illustrate that the proposed method can significantly reduce the training time without decreasing the classification accuracy.
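The idea of pre-calculating the kernel matrix can be sketched as follows: the full Gram matrix is computed once, and each cross-validation fold only slices the rows and columns it needs. This is a plain CPU illustration under assumptions, not the paper's GPU implementation; the RBF kernel, the `gamma` value, the toy data, and the helper names are all hypothetical.

```python
import math


def rbf_kernel_matrix(X, gamma):
    """Compute K[i][j] = exp(-gamma * ||x_i - x_j||^2) once for all pairs."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = K[j][i] = math.exp(-gamma * sq_dist)
    return K


def sub_kernel(K, rows, cols):
    """Slice the precomputed matrix instead of recomputing kernel values."""
    return [[K[i][j] for j in cols] for i in rows]


# Toy feature vectors standing in for PCA-reduced sensor samples.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = rbf_kernel_matrix(X, gamma=0.5)

folds = [[0, 1], [2, 3]]
for test_idx in folds:
    train_idx = [i for i in range(len(X)) if i not in test_idx]
    K_train = sub_kernel(K, train_idx, train_idx)  # train-vs-train block
    K_test = sub_kernel(K, test_idx, train_idx)    # test-vs-train block
    # An SVM that accepts a precomputed kernel would be fitted on K_train
    # and evaluated on K_test here.
```

Every fold reuses the same `K`, so the pairwise distance computation is paid once regardless of the number of folds.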


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. Preprocessing, feature extraction, and feature selection for audio signals need expert attention and take much effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work mainly discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, so a machine learning technique classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy, 88.89%, for the kernel-based infant cry classification system.
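The STFT step that turns an audio signal into a spectrogram can be sketched with a naive windowed DFT. The frame length, hop size, Hann window, and synthetic test tone below are assumptions chosen for illustration; the paper's actual STFT settings are not stated here, and a real pipeline would use an FFT library rather than this direct sum.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            mags.append(abs(coeff))
        spectrogram.append(mags)
    return spectrogram


# Synthetic tone with 8 cycles per 64-sample frame, so energy peaks at bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone, frame_len=64, hop=32)
```

Each inner list is one time slice of the spectrogram; stacking the slices as image columns gives the kind of input a CNN would consume.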


2020 ◽  
Vol 17 (4) ◽  
pp. 572-578
Author(s):  
Mohammad Parseh ◽  
Mohammad Rahmanimanesh ◽  
Parviz Keshavarzi

Persian handwritten digit recognition is an important topic in image processing that has received considerable attention from researchers due to its many applications. The most important challenge in Persian handwritten digit recognition is the existence of various patterns in Persian digit writing, which makes the feature extraction step more complicated. Since handcrafted feature extraction methods are complicated processes and their performance is not stable, most recent studies have concentrated on proposing a suitable method for automatic feature extraction. In this paper, an automatic method based on machine learning is proposed for high-level feature extraction from Persian digit images using a Convolutional Neural Network (CNN). After that, a non-linear multi-class Support Vector Machine (SVM) classifier is used for data classification instead of the fully connected layer in the final layer of the CNN. The proposed method has been applied to the HODA dataset and obtained a recognition rate of 99.56%. Experimental results are comparable with previous state-of-the-art methods.


2021 ◽  
Author(s):  
Rejith K.N ◽  
Kamalraj Subramaniam ◽  
Ayyem Pillai Vasudevan Pillai ◽  
Roshini T V ◽  
Renjith V. Ravi ◽  
...  

In this work, Parkinson’s disease (PD) patients and healthy individuals were categorized with machine learning algorithms. EEG signals associated with six different emotions (happiness (E1), sadness (E2), fear (E3), anger (E4), surprise (E5), and disgust (E6)) were used for the study. EEG data were collected from 20 PD patients and 20 normal controls using multimodal stimuli. Different features were used to categorize emotional data. Emotion recognition in PD has been investigated in three domains, namely time, frequency, and time-frequency, using Entropy, Energy-Entropy, and Teager Energy-Entropy features. Three classifiers, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Probabilistic Neural Network (PNN), were used to observe the classification results. Emotional EEG stimuli such as anger, surprise, happiness, sadness, fear, and disgust were used to categorize PD patients and healthy controls (HC). For each EEG signal, frequency features corresponding to the alpha, beta, and gamma bands were obtained for nine feature extraction methods (Entropy, Energy-Entropy, Teager Energy-Entropy, Spectral Entropy, Spectral Energy-Entropy, Spectral Teager Energy-Entropy, STFT Entropy, STFT Energy-Entropy, and STFT Teager Energy-Entropy). From the analysis, it is observed that the entropy feature in the frequency domain performs consistently well (above 80%) for all six emotions with KNN. Classification results show that the selected energy-entropy combination feature in the frequency domain provides the highest accuracy for all emotions except E1 and E2 for the KNN and SVM classifiers, whereas other features give accuracy values above 60% for most emotions. It is also observed that emotion E1 gives above 90% classification accuracy for all classifiers in the time domain. In the frequency domain, emotion E1 also gives above 90% classification accuracy using the PNN classifier.

