Early Detection of Seasonal Outbreaks from Twitter Data Using Machine Learning Approaches

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Samina Amin ◽  
Muhammad Irfan Uddin ◽  
Duaa H. alSaeed ◽  
Atif Khan ◽  
Muhammad Adnan

Seasonal outbreaks occur primarily during winter in temperate regions, while influenza may occur throughout the year in tropical regions, triggering outbreaks more irregularly. Similarly, dengue occurs at the start of the rainy season in early May and reaches its peak in late June. Dengue and flu affected various countries in the years 2017–2019, and streaming Twitter data reveals the status of dengue and flu outbreaks in the most affected regions. This research work shows that Social Media Analysis (SMA) can be used as a detector of epidemic outbreaks and to understand the sentiment of social media users regarding various diseases. Providing awareness about seasonal outbreaks through SMA is an effective approach for researchers and healthcare responders to detect outbreaks early. The proposed model aims to find the sentiment about the disease in tweets, and the seasonal outbreak-related tweets are classified into two classes: disease positive and disease negative. This work proposes a machine-learning-based approach to detect dengue and flu outbreaks on the social media platform Twitter, using four machine learning algorithms: Random Forest (RF), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Decision Tree (DT), with the help of Term Frequency-Inverse Document Frequency (TF-IDF). For experimental analysis, two datasets (dengue and flu) are analyzed individually. The experimental results show that the RF classifier outperformed the comparison models in terms of accuracy, precision, recall, F1-measure, and Receiver Operating Characteristic (ROC) curve. The proposed work offers favorable performance, with overall precision, accuracy, recall, and F1-measure ranging from 84% to 88% for conventional machine learning techniques.
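The TF-IDF weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes the plain textbook form, tf × log(N / df), with whitespace tokenization; the abstract does not state which TF-IDF variant, smoothing, or preprocessing the authors used, and the example tweets and the function name `tf_idf` are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Assumption: lowercase + whitespace split stands in for real tweet cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tweets, not taken from the paper's datasets.
tweets = [
    "i have dengue fever",
    "flu season is here",
    "dengue awareness campaign is here",
]
vectors = tf_idf(tweets)
```

A term that is rare across the collection ("fever") ends up with a higher weight than one shared by several tweets ("dengue"), and a term present in every document gets weight zero. Vectors of this kind would then be fed to a classifier such as RF, KNN, SVM, or DT.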

Author(s):  
Erick Omuya ◽  
George Okeyo ◽  
Michael Kimwele

Social media has been embraced by different people as a convenient and official medium of communication. People write messages and attach images and videos on Twitter, Facebook and other social media, which they then share. Social media therefore generates a lot of data that is rich in sentiments from these updates. Sentiment analysis has been used to determine opinions of clients, for instance, relating to a particular product or company. Knowledge-based and machine learning approaches are among the strategies that have been used to analyze these sentiments. The performance of sentiment analysis is, however, distorted by noise, the curse of dimensionality, the data domains and the size of the data used for training and testing. This research aims at developing a model for sentiment analysis in which dimensionality reduction and the use of different parts of speech improve sentiment analysis performance. It uses natural language processing for filtering, storing and performing sentiment analysis on the data from social media. The model is tested using Naïve Bayes, Support Vector Machine and K-Nearest Neighbor machine learning algorithms, and its performance is compared with that of two other sentiment analysis models. Experimental results show that the model improves sentiment analysis performance using machine learning techniques.


2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, and thus they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection for the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers who share their posts and have similar interests, from a hackers’ community on Twitter. The list is built based on a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is utilized with a machine learning process by applying different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a risk to institutions and individuals in the future, providing early warning of possible attacks.
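One of the network measures named in the abstract, closeness, can be sketched as follows. This is only an illustration under stated assumptions: the graph is treated as undirected and unweighted, the user names and edges are hypothetical, and the abstract does not describe how the authors actually built or scored their network.

```python
from collections import deque


def closeness_centrality(graph, source):
    """(reachable nodes - 1) / sum of shortest-path lengths from source."""
    # Breadth-first search gives shortest hop counts in an unweighted graph.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical follower graph as adjacency lists; names are placeholders.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c"],
}
scores = {user: closeness_centrality(graph, user) for user in graph}
most_central = max(scores, key=scores.get)
```

Users with the highest scores are the ones closest, on average, to everyone else in the network, which is one plausible way to shortlist influential accounts before collecting their tweets.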


2018 ◽  
Vol 34 (3) ◽  
pp. 569-581 ◽  
Author(s):  
Sujata Rani ◽  
Parteek Kumar

In this article, an innovative approach to performing sentiment analysis (SA) is presented. The proposed system handles the issues of Romanized or abbreviated text and spelling variations in the text to perform the sentiment analysis. The training data set of 3,000 movie reviews and tweets has been manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system has been tested on 100 movie reviews and tweets, and it has been observed that SVM performed best in comparison to the other classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural/social benefits like predicting/fighting human riots.


Machine Learning is empowering many aspects of day-to-day life, from filtering content on social networks to suggesting products that we may be looking for. This technology focuses on taking objects as image input to find new observations or show items based on user interest. The major discussion here is of Machine Learning techniques that use supervised learning, where the computer learns from the input/training data and predicts results based on experience. We also discuss the machine learning algorithms Naïve Bayes Classifier, K-Nearest Neighbor, Random Forest, Decision Trees, Boosted Trees, and Support Vector Machine, and use these classifiers on the Malgenome and Drebin datasets, which are Android malware datasets. Android is an operating system that is gaining popularity these days, and with the rise in demand for these devices comes a rise in Android malware. The traditional techniques used to detect malware were unable to detect unknown applications. We have run these datasets on different machine learning classifiers and have recorded the results. The experimental results provide a comparative analysis based on performance, accuracy, and cost.


2020 ◽  
Vol 8 (5) ◽  
pp. 4624-4627

In recent years, a lot of data has been generated about students, which can be utilized for deciding the career path of the student. This paper discusses some of the machine learning techniques which can be used to predict the performance of a student and help to decide his/her career path. Some of the key Machine Learning (ML) algorithms applied in our research work are Linear Regression, Logistic Regression, Support Vector Machine, Naïve Bayes Classifier and K-means Clustering. The aim of this paper is to predict the student career path using Machine Learning algorithms. We compare the efficiencies of different ML classification algorithms on a real dataset obtained from university students.


2021 ◽  
Author(s):  
Nisha Agnihotri

Bipolar disorder, a complex disorder of the brain, has affected many millions of people around the world. This brain disorder is identified by oscillations in the patient’s mood. The mood swings between two states, i.e. depression and mania. This is a result of different psychological and physical features. A set of psycholinguistic features like behavioral changes, mood swings and mental illness are observed to provide feedback on health and wellness. The study is an objective measure for identifying the stress level of the human brain that could considerably reduce the harmful effects associated with it. In the paper, we present a study on predicting the symptoms and behavior of a commonly known mental health illness, bipolar disorder, using Machine Learning techniques. To this end, data extracted from articles and research papers were studied and analyzed using statistical analysis tools and machine learning (ML) techniques. Data is visualized to extract and communicate meaningful information from complex datasets on predicting and optimizing various day-to-day analyses. The study also covers various research papers in which machine learning algorithms and classifiers such as Decision Trees, Random Forest, Support Vector Machine, Naïve Bayes, Logistic Regression and K-Nearest Neighbor are studied and analyzed for identifying the mental state in a target group. The purpose of the paper is mainly to explore the challenges, adequacy and limitations in detecting mental health conditions using Machine Learning techniques.


Nowadays human relations are maintained through social media networks, and traditional relationships are now obsolete. To stay in touch, share ideas, and exchange knowledge, we use social media networking sites such as Twitter, Facebook, and LinkedIn. Through Twitter, users share their opinions, interests, and knowledge with others via messages. At the same time, some users mislead the genuine users. The genuine users are also called solicited users, and the users who mislead them are called spammers. These spammers post unwanted information to non-spam users, who may retweet it to others and follow the spammers. To avoid these spam messages, we propose a methodology using machine learning algorithms. Our approach uses a set of content-based features. In the spam detection model we used the Support Vector Machine (SVM) algorithm and the Naive Bayes classification algorithm. To measure the performance of our model we used the precision, recall, and F-measure metrics.
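The three evaluation metrics named in the abstract follow directly from counts of true positives, false positives, and false negatives. The sketch below assumes a binary spam/ham labelling with "spam" as the positive class; the labels, predictions, and the function name `precision_recall_f1` are hypothetical and are not results from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up ground truth and classifier output, for illustration only.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Precision penalizes genuine tweets flagged as spam, recall penalizes spam that slips through, and the F-measure is their harmonic mean, so a model cannot score well by optimizing only one of the two.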


Author(s):  
Asma Akhtar ◽  
Samia Akhtar ◽  
Birra Bakhtawar ◽  
Ashfaq Ali Kashif ◽  
Nauman Aziz ◽  
...  

The COVID-19 pandemic has seriously affected humankind, with colossal loss of life around the world. There is a critical need for timely and reliable detection of coronavirus patients so that they can receive better and earlier treatment and the spread of the infection can be prevented. Recent research has revealed some critical benefits of utilizing complete blood count tests for early detection of COVID-19 positive individuals. In this research we employed different machine learning algorithms using full blood count for the prediction of COVID-19. These algorithms include: “K Nearest Neighbor, Radial Basis Function, Naive Bayes, kStar, PART, Random Forest, Decision Tree, OneR, Support Vector Machine and Multi-Layer Perceptron”. Further, “Accuracy, Recall, Precision, and F-Measure” are the performance evaluation measures utilized in this study.


ACTA IMEKO ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 66
Author(s):  
Domenico Luca Carnì ◽  
Eulalia Balestrieri ◽  
Ioan Tudosa ◽  
Francesco Lamonaca

In this article, an automatic Analog Modulation Classifier based on Empirical mode decomposition and Machine learning approaches (AMC-EM) is proposed. The AMC-EM operates without a priori information and can recognise typical analog modulation schemes: amplitude modulation, phase modulation, frequency modulation, and single sideband modulation. The AMC-EM uses Empirical Mode Decomposition (EMD) to evaluate the features of the signal for the successive classification by using Machine Learning (ML). In the design of the AMC-EM, the selection of the specific ML technique is performed by comparing, with numerical tests, the performance of the (i) Support Vector Machine (SVM), (ii) k-nearest neighbor classifier, and (iii) adaptive boosting, since they are commonly used in the field of signal classification. The tests have highlighted that the SVM, specifically the quadratic SVM, permits the best possible performance concerning classification accuracy, by considering different noise intensities superimposed on the signal. To assess the advantages of the proposal, a comparison with other classifiers available in the literature has been undertaken through numerical tests. Finally, the AMC-EM is experimentally evaluated, and the experimental results agree with those of the simulation.

