Regional Influenza Prediction with Sampling Twitter Data and PDE Model

Author(s):  
Yufang Wang ◽  
Kuai Xu ◽  
Yun Kang ◽  
Haiyan Wang ◽  
Feng Wang ◽  
...  

The large volume of geotagged Twitter streaming data on flu epidemics gives researchers the chance to explore, model, and predict flu trends in a timely manner. However, the explosive growth of social media data makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on real-time tweet data from social media; the method supports real-time prediction and is applicable to sampled data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects of the sampling process: it requires less historical data yet achieves stronger prediction results, with a relative accuracy of over 90% on the 1% sampled data. Even for more aggressive sampling ratios such as 0.1% and 0.01%, our model still achieves relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model to predict temporal–spatial patterns of flu trends even with small samples of Twitter data.
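As a rough illustration of the sampling step and the accuracy figures mentioned in the abstract above, the sketch below thins a series of daily tweet counts at a given sampling ratio and scores a rescaled estimate against the full counts. It does not reproduce the paper's PDE model, which the abstract does not specify. The function names, the made-up counts, and the definition of relative accuracy as one minus the mean relative error are all assumptions made for this example.

```python
import random


def sample_counts(counts, ratio, seed=0):
    """Keep each tweet independently with probability `ratio` (binomial thinning).

    This is an assumed sampling scheme, not necessarily the one used in the paper.
    """
    rng = random.Random(seed)
    return [sum(1 for _ in range(n) if rng.random() < ratio) for n in counts]


def relative_accuracy(predicted, observed):
    """Return 1 minus the mean relative error (assumed definition).

    Entries with an observed count of zero are skipped to avoid division by zero.
    """
    pairs = [(p, o) for p, o in zip(predicted, observed) if o != 0]
    if not pairs:
        raise ValueError("no non-zero observations to compare against")
    mean_error = sum(abs(p - o) / o for p, o in pairs) / len(pairs)
    return 1.0 - mean_error


# Hypothetical daily flu-tweet counts for one region (made-up numbers).
full = [12000, 15000, 21000, 26000, 24000, 18000, 13000]
ratio = 0.01

sampled = sample_counts(full, ratio)
rescaled = [s / ratio for s in sampled]  # naive estimate of the full volume

print("sampled counts:", sampled)
print("relative accuracy of rescaled estimate:", relative_accuracy(rescaled, full))
```

Lowering `ratio` to 0.001 or 0.0001 in this toy setup makes the rescaled estimate noisier, which is the data reduction effect that a predictive model has to cope with.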

The rise of social media platforms like Twitter, and their increasing adoption by people who want to stay connected, provides a large source of data for analyzing trends, events and even personalities. Such analysis also provides insight into a person’s likes and inclinations in real time, independent of the data size. Several techniques have been created to retrieve such data; however, the most efficient technique is clustering. This paper provides an overview of the algorithms behind various clustering methods and examines their efficiency in determining trending information. The clustered data may be further classified by topic for real-time analysis of a large dynamic data set. In this paper, data classification is performed and analyzed for flaws, followed by another classification on the same data set.


2020 ◽  
Vol 12 (23) ◽  
pp. 10175
Author(s):  
Fatima Abdullah ◽  
Limei Peng ◽  
Byungchul Tak

The volume of streaming sensor data from various environmental sensors continues to increase rapidly due to wider deployments of IoT devices at much greater scales than ever before. This, in turn, causes a massive increase in fog and cloud network traffic, which leads to heavily delayed network operations. In streaming data analytics, the ability to obtain real-time data insight is crucial for the computational sustainability of many IoT-enabled applications, such as environmental monitoring, pollution and climate surveillance, traffic control, or even e-commerce. However, such network delays prevent us from achieving high-quality real-time analytics of environmental information. To address this challenge, we propose the Fog Sampling Node Selector (Fossel) technique, which can significantly reduce IoT network and processing delays by algorithmically selecting an optimal subset of fog nodes to perform the sensor data sampling. In addition, our technique executes simple queries within the fog nodes to further reduce network delays by processing the data near the data-producing devices. Our extensive evaluations show that the Fossel technique outperforms the state of the art in terms of latency reduction as well as bandwidth consumption, network usage and energy consumption.


BMJ Open ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. e024018 ◽  
Author(s):  
Xiaolei Huang ◽  
Michael C Smith ◽  
Amelia M Jamison ◽  
David A Broniatowski ◽  
Mark Dredze ◽  
...  

Introduction: The Centers for Disease Control and Prevention (CDC) spends significant time and resources to track influenza vaccination coverage each influenza season using national surveys. Emerging data from social media provide an alternative solution to surveillance at both national and local levels of influenza vaccination coverage in near real time.

Objectives: This study aimed to characterise and analyse the vaccinated population from temporal, demographical and geographical perspectives using automatic classification of vaccination-related Twitter data.

Methods: In this cross-sectional study, we continuously collected tweets containing both influenza-related terms and vaccine-related terms covering four consecutive influenza seasons from 2013 to 2017. We created a machine learning classifier to identify relevant tweets, then evaluated the approach by comparing to data from the CDC’s FluVaxView. We limited our analysis to tweets geolocated within the USA.

Results: We assessed 1 124 839 tweets. We found a strong correlation of 0.799 between monthly Twitter estimates and CDC data, with correlations as high as 0.950 in individual influenza seasons. We also found that our approach obtained geographical correlations of 0.387 at the US state level and 0.467 at the regional level. Finally, we found a higher level of influenza vaccine tweets among female users than male users, also consistent with the results of CDC surveys on vaccine uptake.

Conclusion: Significant correlations between Twitter data and CDC data show the potential of using social media for vaccination surveillance. Temporal variability is captured better than geographical and demographical variability. We discuss potential paths forward for leveraging this approach.
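To make the comparison between monthly Twitter estimates and survey data concrete, the sketch below computes a correlation coefficient between two monthly series. The abstract does not say which correlation measure the study used, so Pearson's r is an assumption here. The function name and both series are made up for illustration and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values for one season (made-up numbers):
# share of classified "vaccinated" tweets vs. survey-based vaccination coverage.
twitter_monthly = [0.8, 2.1, 4.9, 3.2, 1.5, 0.9, 0.6]
survey_monthly = [3.0, 12.5, 20.1, 9.8, 4.2, 2.5, 1.1]

print("correlation:", pearson(twitter_monthly, survey_monthly))
```

The same function could be applied per season, or across states instead of months, to mirror the temporal and geographical comparisons described in the abstract.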


2020 ◽  
pp. 1377-1390
Author(s):  
Srinidhi Hiriyannaiah ◽  
G.M. Siddesh ◽  
K.G. Srinivasa

This article describes how recent advances in computing have increased the generation of data in fields such as social media, medicine, power and others. With the rapid increase in internet users, social media has become a rich source for sentiment analysis, or opinion mining. Storing, querying and analyzing such data is highly challenging. This article aims to provide a solution for storing, querying and analyzing streaming data, using Apache Kafka as the platform and Twitter data as an example for analysis. A three-way classification method is proposed for sentiment analysis of Twitter data that combines knowledge-based and machine-learning approaches in three stages: emotion classification, word classification and sentiment classification. The hybrid three-way classification approach was evaluated using a sample of five query strings on Twitter and compared with an existing emotion classifier, polarity classifier and Naïve Bayes classifier for sentiment analysis. The proposed approach achieved higher accuracy than the existing approaches.


Author(s):  
Renya N. Nath ◽  
N. Priya ◽  
C.R. Rene Robin

Social media has become an inseparable part of everyday life. People use social media platforms such as Facebook and Twitter to express their feelings, which is why organizations use social media information to infer the behavior of their users. The recent ChennaiRains2015 event and the Chennai flood that followed showed the reach of social media, as most people used it to convey their status and requirements. Many people also used social media to express their willingness to provide help (food, shelter, evacuation and medical) to the flood victims. Connecting such people to those in need in a timely manner can make the disaster management process more efficient. In this paper, the authors highlight (1) the design of Apache Storm based real-time analytics of Twitter data for extracting the location and status of flood-affected areas and (2) the development of an optimized map connecting the volunteers (people ready to help flood victims) and the flood victims who have raised their requests via social media.


2021 ◽  
Vol 1 (2) ◽  
Author(s):  
Dilmini Rathnayaka ◽  
Pubudu K.P.N Jayasena ◽  
Iraj Ratnayake

Sentiment analysis mainly supports sorting out polarity and provides valuable information from raw data on social media platforms. Many fields, such as health, business, and security, require real-time data analysis for instant decision-making. Since Twitter is a popular social media platform from which data can be collected easily, this paper considers methods for analyzing Twitter data, including real-time analysis based on geo-location. Twitter data classification and analysis can be done with diverse algorithms, and the most appropriate algorithm can be chosen by implementing and testing them. This paper describes sentiment analysis, data collection methods, data pre-processing, feature extraction, and sentiment analysis methods related to Twitter data. Real-time analysis has become a major method of analyzing data available online, and the real-time Twitter data analysis process is described throughout this paper. Several methods of classifying polarized Twitter data are discussed, and a proposed Twitter data analysis algorithm is presented. Location-based Twitter data analysis is another crucial aspect of sentiment analysis that enables data to be sorted by geo-location, and this paper describes how to analyze Twitter data on that basis. Further, a comparison of several sentiment analysis algorithms used by previous researchers is reported, followed by a conclusion.

