Spatio-Temporal Machine Learning Analysis of Social Media Data and Refugee Movement Statistics

2021 ◽  
Vol 10 (8) ◽  
pp. 498
Author(s):  
Clemens Havas ◽  
Lorenz Wendlinger ◽  
Julian Stier ◽  
Sahib Julka ◽  
Veronika Krieger ◽  
...  

In 2015, within only a few months, more than a million people made their way from Turkey to Central Europe in the wake of the Syrian civil war. At the time, public authorities and relief organisations struggled with the admission, transfer, care, and accommodation of refugees because they lacked information about ongoing refugee movements. We therefore propose an approach that uses machine learning methods and publicly available data to provide more information about refugee movements. The approach combines methods for analysing the textual, temporal and spatial features of social media data with the numbers of arriving refugees recorded in historical refugee movement statistics, in order to provide relevant and up-to-date information about refugee movements and expected numbers. The results include spatial patterns and factual information about collective refugee movements extracted from social media data that match actual movement patterns. Furthermore, our approach enables us to forecast and simulate refugee movements, both to anticipate an increase or decrease in the number of incoming refugees and to analyse potential future scenarios. We demonstrate that the approach proposed in this article benefits refugee management and vastly improves the status quo.

2021 ◽  
Vol 40 (5) ◽  
pp. 9361-9382 ◽  
Author(s):  
Naeem Iqbal ◽  
Rashid Ahmad ◽  
Faisal Jamil ◽  
Do-Hyeun Kim

Quality prediction plays an essential role in the business outcome of a product. Because of its business relevance, it has been studied extensively in the last few years. Given advances in machine learning (ML) techniques and the advent of robust and sophisticated ML algorithms, it is necessary to analyze the factors influencing the success of movies. This paper presents a hybrid features prediction model based on pre-released and social media data features, using multiple ML techniques to predict the quality of pre-released movies for effective business resource planning. This study aims to integrate pre-released and social media data features to form a hybrid features-based movie quality prediction (MQP) model. The proposed model comprises two experimental models: (i) predicting movie quality using the original set of features and (ii) deriving a subset of features with principal component analysis to predict the movie success class. This work implements several ML-based classification models, namely Decision Tree (DT), Support Vector Machines with linear and quadratic kernels (L-SVM and Q-SVM), Logistic Regression (LR), Bagged Tree (BT) and Boosted Tree (BOT), to predict the quality of the movies. The classifiers are evaluated with several performance measures: Accuracy (AC), Precision (PR), Recall (RE), and F-Measure (FM). The experimental results reveal that the BT and BOT classifiers achieved higher accuracy than the other classifiers (DT, LR, L-SVM, and Q-SVM). The BT and BOT classifiers achieved accuracies of 90.1% and 89.7%, respectively, which shows the efficiency of the proposed MQP model compared to other state-of-the-art techniques. The proposed work is also compared with existing prediction models, and the experimental results indicate that the proposed MQP model performed slightly better than other models. The experimental results will help the movie industry to plan business resources effectively, such as investment, number of screens, and release date.
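The four performance measures named in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `classification_metrics`, the binary "success" labels, and the toy data are assumptions made only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy (AC), precision (PR), recall (RE) and F-measure (FM)
    for a single positive class, computed from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {"AC": accuracy, "PR": precision, "RE": recall, "FM": f_measure}


# Hypothetical toy labels: 1 = "successful movie", 0 = "not successful".
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For a multi-class success label, the same counts would be computed per class and then averaged; which averaging scheme the paper uses is not stated in the abstract.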


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yasmeen George ◽  
Shanika Karunasekera ◽  
Aaron Harwood ◽  
Kwan Hui Lim

A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning of accidents, protests, elections or breaking news. However, neither the list of events nor the resolution of event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the challenge of the unknown spatial resolution of events, a quad-tree method is used to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach involving a Poisson distribution and a smoothing method is applied to highlight regions with an unexpected density of social posts. Further, event duration is estimated by merging events happening in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated on Twitter and Flickr datasets for four cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms, one based on a fixed split of geographical space and one on a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.
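The two core steps described above, a density-driven quad-tree split and a Poisson test for unexpectedly busy regions, might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' system: the function names, the `capacity`, `max_depth` and `alpha` parameters, and the toy coordinates are all hypothetical, and the smoothing step mentioned in the abstract is omitted.

```python
import math


def quadtree_leaves(points, box, capacity=4, max_depth=6, depth=0):
    """Recursively split box = (xmin, ymin, xmax, ymax) into four quadrants
    until a region holds at most `capacity` points or `max_depth` is reached.
    Returns a list of (box, points_in_box) leaves."""
    xmin, ymin, xmax, ymax = box
    # Half-open intervals ensure each point falls into exactly one quadrant.
    inside = [(x, y) for x, y in points if xmin <= x < xmax and ymin <= y < ymax]
    if len(inside) <= capacity or depth >= max_depth:
        return [(box, inside)]
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax),
    ]
    leaves = []
    for quadrant in quadrants:
        leaves.extend(quadtree_leaves(inside, quadrant, capacity, max_depth, depth + 1))
    return leaves


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def is_unexpected(observed, expected_rate, alpha=0.01):
    """Flag a region whose post count is improbably high under the
    historical rate (assumed significance level: alpha)."""
    return poisson_tail(observed, expected_rate) < alpha


# Hypothetical geotagged posts in a unit square: one dense cluster plus noise.
posts = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13), (0.13, 0.12),
         (0.14, 0.14), (0.12, 0.15), (0.8, 0.2), (0.7, 0.9), (0.3, 0.7)]
leaves = quadtree_leaves(posts, (0.0, 0.0, 1.0, 1.0))
```

Dense areas end up covered by small cells and sparse areas by large ones, which is what lets the detector work at several spatial resolutions at once. In a streaming setting, `expected_rate` would be estimated per region from earlier time windows.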


2021 ◽  
Author(s):  
Alexey Bessudnov ◽  
Denis Tarasov ◽  
Viacheslav Panasovets ◽  
Veronica Kostenko ◽  
Ivan Smirnov ◽  
...  

In this paper we develop a machine learning classifier that predicts perceived ethnicity from data on personal names for major ethnic groups populating Russia. We collect data from VK, the largest Russian social media website. Ethnicity was determined from the languages spoken by users and their geographical location, with the data manually cleaned by crowd workers. The classifier achieves an accuracy of 0.82 for a scheme with 24 ethnic groups and 0.92 for 15 aggregated ethnic groups. It can be used for research on ethnicity and ethnic relations in Russia, in particular with VK and other social media data.


2020 ◽  
pp. 193-201 ◽  
Author(s):  
Hayder A. Alatabi ◽  
Ayad R. Abbas

In recent years, social media has achieved widespread use worldwide; statistics indicate that more than three billion people are on social media, generating large quantities of data online. To analyze these data, a classification method known as sentiment analysis is used. This paper presents a new sentiment analysis system based on machine learning techniques, which aims to extract the polarity of social media texts. Sentiment analysis based on machine learning techniques has achieved great success around the world. This paper investigates this topic and proposes a sentiment analysis system built on the Bayesian Rough Decision Tree (BRDT) algorithm. The experimental results show that the system achieves an accuracy of more than 95% on social media data.


In the social media era, it is estimated that over 5 billion people use smartphones, of whom over 1.5 billion are active users. Most of us belong to this group, and before opening a message we are curious about its content and hope that it is a good one. Sentiment analysis of social media data is therefore seen by many as an effective tool for monitoring user preferences and inclinations. We propose a scalable machine learning model that analyzes the polarity of a communicative text using the Bernoulli Naive Bayes classifier. This paper considers only two polarities: whether a sentence is positive or negative. The Bernoulli classifier is used because it is best suited to binary inputs, which in turn raises the accuracy to as much as 97%.
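To make the "binary inputs" point concrete, here is a minimal from-scratch sketch of a Bernoulli Naive Bayes polarity classifier. It is not the authors' model: the class name `TinyBernoulliNB`, the whitespace tokenisation, the Laplace smoothing constants and the four training sentences are assumptions for illustration, and the sketch makes no claim about the reported accuracy.

```python
import math
from collections import defaultdict


class TinyBernoulliNB:
    """Bernoulli Naive Bayes over word presence/absence features."""

    def fit(self, docs, labels):
        self.vocab = sorted({w for d in docs for w in d.lower().split()})
        self.classes = sorted(set(labels))
        n_docs = defaultdict(int)
        n_with_word = {c: defaultdict(int) for c in self.classes}
        for doc, label in zip(docs, labels):
            n_docs[label] += 1
            # Each word counts once per document: the feature is binary.
            for word in set(doc.lower().split()):
                n_with_word[label][word] += 1
        self.log_prior = {c: math.log(n_docs[c] / len(docs)) for c in self.classes}
        # Laplace-smoothed P(word present | class); never exactly 0 or 1.
        self.p_present = {
            c: {w: (n_with_word[c][w] + 1) / (n_docs[c] + 2) for w in self.vocab}
            for c in self.classes
        }
        return self

    def predict(self, doc):
        present = set(doc.lower().split())
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for w in self.vocab:
                p = self.p_present[c][w]
                # Unlike multinomial NB, absent words also contribute evidence.
                score += math.log(p) if w in present else math.log(1.0 - p)
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical toy training set with two polarities.
train_docs = ["good great message", "great happy news",
              "bad sad message", "terrible bad news"]
train_labels = ["pos", "pos", "neg", "neg"]
model = TinyBernoulliNB().fit(train_docs, train_labels)
prediction = model.predict("great good news")
```

The defining feature of the Bernoulli variant is visible in `predict`: every vocabulary word contributes, either as "present" or as "absent", so short texts are scored on what they leave out as well as what they contain.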


Author(s):  
F. O. Ostermann ◽  
H. Huang ◽  
G. Andrienko ◽  
N. Andrienko ◽  
C. Capineri ◽  
...  

The increasing availability of Geo-Social Media (e.g. Facebook, Foursquare and Flickr) has led to the accumulation of large volumes of social media data. These data, especially geotagged ones, contain information about the perception of and experiences in various environments. Harnessing these data can provide a better understanding of the semantics of places. We are interested in the similarities and differences between different Geo-Social Media in the description of places. This extended abstract presents the results of a first step towards a more in-depth study of the semantic similarity of places. In particular, we took places extracted through spatio-temporal clustering from one data source (Twitter) and examined whether their structure is reflected semantically in another data set (Flickr). Based on that, we analyse how the semantic similarity between places varies over space and scale, and how Tobler's first law of geography holds with regard to scale and places.


Author(s):  
Afiq Izzudin A. Rahim ◽  
Mohd Ismail Ibrahim ◽  
Kamarul Imran Musa ◽  
Sook-Ling Chua ◽  
Najib Majdi Yaacob

Social media is emerging as a new avenue for hospitals and patients to solicit input on the quality of care. However, social media data is unstructured and enormous in volume. Moreover, no empirical research on the use of social media data and perceived hospital quality of care based on patient online reviews has been performed in Malaysia. The purpose of this study was to investigate the determinants of positive sentiment expressed in hospital Facebook reviews in Malaysia, as well as the association between hospital accreditation and sentiments expressed in Facebook reviews. We retrieved comments posted from 2017 to 2019 on the official Facebook pages of 48 public hospitals. We used machine learning to build a sentiment analyzer and a service quality (SERVQUAL) classifier that automatically classify the sentiment and SERVQUAL dimensions of each review. We then used logistic regression analysis to address the study aims. We evaluated a total of 1852 reviews, of which our machine learning sentiment analyzer classified 72.1% as positive and 27.9% as negative. Using our machine learning SERVQUAL classifier, we classified 240 reviews as tangible, 1257 reviews as trustworthy, 125 reviews as responsive, 356 reviews as assurance, and 1174 reviews as empathy. After adjusting for hospital characteristics, all SERVQUAL dimensions except Tangible were associated with positive sentiment. However, no significant relationship between hospital accreditation and online sentiment was discovered. Facebook reviews powered by machine learning algorithms provide valuable, real-time data that may be missed by traditional hospital quality assessments. Additionally, online patient reviews offer a hitherto untapped indication of quality that may benefit all healthcare stakeholders. Our results confirm prior studies and support the use of Facebook reviews as an adjunct method for assessing the quality of hospital services in Malaysia.


2020 ◽  
Author(s):  
Stevie Chancellor ◽  
Steven A Sumner ◽  
Corinne David-Ferdon ◽  
Tahirah Ahmad ◽  
Munmun De Choudhury

BACKGROUND: Online communities provide support for individuals looking for help with suicidal ideation and crisis. As community data are increasingly used to devise machine learning models to infer who might be at risk, there have been limited efforts to identify both risk and protective factors in web-based posts. These annotations can enrich and augment computational assessment approaches to identify appropriate intervention points, which are useful to public health professionals and suicide prevention researchers.

OBJECTIVE: This qualitative study aims to develop a valid and reliable annotation scheme for evaluating risk and protective factors for suicidal ideation in posts in suicide crisis forums.

METHODS: We designed a valid, reliable, and clinically grounded process for identifying risk and protective markers in social media data. This scheme draws on prior work on construct validity and the social sciences of measurement. We then applied the scheme to annotate 200 posts from r/SuicideWatch, a Reddit community focused on suicide crisis.

RESULTS: We documented our results on producing an annotation scheme that is consistent with leading public health information coding schemes for suicide and advances attention to protective factors. Our study showed high internal validity, and we have presented results that indicate that our approach is consistent with findings from prior work.

CONCLUSIONS: Our work formalizes a framework that incorporates construct validity into the development of annotation schemes for suicide risk on social media. This study furthers the understanding of risk and protective factors expressed in social media data. This may help public health programming to prevent suicide and computational social science research and investigations that rely on the quality of labels for downstream machine learning tasks.

