Sentiment Analysis of Users’ Reviews on COVID-19 Contact Tracing Apps with a Benchmark Dataset (Preprint)

2021 ◽  
Author(s):  
Kashif Ahmad ◽  
Firoj Alam ◽  
Junaid Qadir ◽  
Basheer Qolomany ◽  
Imran Khan ◽  
...  

BACKGROUND Contact tracing has been globally adopted in the fight to control the infection rate of COVID-19. Thanks to digital technologies, such as smartphones and wearable devices, contacts of COVID-19 patients can be easily traced and informed about their potential exposure to the virus. To this aim, several interesting mobile applications have been developed. However, there are ever-growing concerns over the working mechanism and performance of these applications. The literature already provides some interesting exploratory studies on the community’s response to the applications by analyzing information from different sources, such as news and users’ reviews of the applications. However, to the best of our knowledge, there is no existing solution that automatically analyzes users’ reviews and extracts the evoked sentiments. OBJECTIVE In this paper, we analyze how AI models can help in automatically extracting and classifying the polarity of users’ sentiments and propose a sentiment analysis framework to automatically analyze users’ reviews on COVID-19 contact tracing mobile applications. METHODS We propose a pipeline starting from manual annotation via a crowd-sourcing study and concluding with the development and training of AI models for automatic sentiment analysis of users’ reviews. In detail, we collected and annotated a large-scale dataset of Android and iOS mobile application users’ reviews for COVID-19 contact tracing. After manually analyzing and annotating users’ reviews, we employed both classical (i.e., Naïve Bayes, SVM, Random Forest) and deep learning (i.e., fastText, and different transformers) methods for classification experiments. This resulted in eight different classification models. RESULTS We employed eight different methods on three different tasks, achieving an average F1-score of up to 94.8%, indicating the feasibility of automatic sentiment analysis of users’ reviews on the COVID-19 contact tracing applications. 
Moreover, the crowd-sourcing activity resulted in a large-scale benchmark dataset composed of 34,534 manually annotated reviews from the contact tracing applications of 46 distinct countries. CONCLUSIONS The existing literature mostly relies on the manual/exploratory analysis of users’ reviews on the application, which is a tedious and time-consuming process. Moreover, existing studies generally analyze data from fewer applications. In this work, we showed that automatic sentiment analysis can help in analyzing users’ responses to the application more quickly with significant accuracy. Moreover, we also provided a large-scale benchmark dataset composed of 34,534 reviews from 47 different applications. We believe the presented analysis and the dataset will support future research on the topic.

2022 ◽  
Author(s):  
Kashif Ahmad ◽  
Firoj Alam ◽  
Junaid Qadir ◽  
Basheer Qolomany ◽  
Imran Khan ◽  
...  

BACKGROUND Contact tracing has been globally adopted in the fight to control the infection rate of COVID-19. Thanks to digital technologies, such as smartphones and wearable devices, contacts of COVID-19 patients can be easily traced and informed about their potential exposure to the virus. To this aim, several mobile applications have been developed. However, there are ever-growing concerns over the working mechanism and performance of these applications. The literature already provides some interesting exploratory studies on the community’s response to the applications by analyzing information from different sources, such as news and users’ reviews of the applications. However, to the best of our knowledge, there is no existing solution that automatically analyzes users’ reviews and extracts the evoked sentiments. We believe such solutions combined with a user-friendly interface can be used as a rapid surveillance tool to monitor how effective an application is and to make immediate changes without going through an intense participatory design method, which, although optimal in normal circumstances, is not optimal in emergency situations where a mobile device needs to be deployed immediately with little to no user input from the beginning for the greater public good. OBJECTIVE In this paper, we aim to analyze the efficacy of AI models and Natural Language Processing (NLP) techniques in automatically extracting and classifying the polarity of users’ sentiments by proposing a sentiment analysis framework to automatically analyze users’ reviews on COVID-19 contact tracing mobile applications. We also aim to provide a large-scale annotated benchmark dataset to facilitate future research in the domain. As a proof of concept, we also develop a potential web application, based on the proposed solutions, with a user-friendly interface to automatically analyze and classify users’ reviews on the COVID-19 contact tracing applications. 
The proposed framework combined with the interface is expected to help the community in quickly analyzing users’ perception of such mobile applications and can be used as a rapid surveillance tool to monitor the effectiveness of mobile applications and to make immediate changes without going through an intense participatory design method in emergency situations. METHODS We propose a pipeline starting from manual annotation via a crowd-sourcing study and concluding with the development and training of AI models for automatic sentiment analysis of users’ reviews. In detail, we collected and annotated a large-scale dataset of Android and iOS mobile application users’ reviews for COVID-19 contact tracing. After manually analyzing and annotating users’ reviews, we employed both classical (i.e., Naïve Bayes, SVM, Random Forest) and deep learning (i.e., fastText, and different transformers) methods for classification experiments. This resulted in eight different classification models. RESULTS We employed eight different methods on three different tasks, achieving an average F1-score of up to 94.8%, indicating the feasibility and applicability of automatic sentiment analysis of users’ reviews on the COVID-19 contact tracing applications. Moreover, the crowd-sourcing activity resulted in a large-scale benchmark dataset composed of 34,534 manually annotated reviews from the contact tracing applications of 46 distinct countries. The resulting dataset is also made publicly available for research usage. CONCLUSIONS The existing literature mostly relies on the manual/exploratory analysis of users’ reviews on the application, which is a tedious and time-consuming process. Moreover, existing studies generally analyze data from fewer applications. 
In this work, we showed that AI and NLP techniques provide good results in analyzing and classifying the polarity of users’ sentiments, and that automatic sentiment analysis can help in analyzing users’ responses to the application more quickly with significant accuracy. Moreover, we also provided a large-scale benchmark dataset composed of 34,534 reviews from 47 different applications. We believe the presented analysis, dataset, and the proposed solutions combined with a user-friendly interface can be used as a rapid surveillance tool to analyze and monitor mobile applications deployed in emergency situations, leading to rapid changes in the applications without going through an intense participatory design method.


2017 ◽  
Vol 58 (2) ◽  
pp. 175-191 ◽  
Author(s):  
Ali Reza Alaei ◽  
Susanne Becken ◽  
Bela Stantic

Advances in technology have fundamentally changed how information is produced and consumed by all actors involved in tourism. Tourists can now access different sources of information, and they can generate their own content and share their views and experiences. Tourism content shared through social media has become a very influential information source that impacts tourism in terms of both reputation and performance. However, the volume of data on the Internet has reached a level that makes manual processing almost impossible, demanding new analytical approaches. Sentiment analysis is rapidly emerging as an automated process of examining semantic relationships and meaning in reviews. In this article, different sentiment analysis approaches applied in tourism are reviewed and assessed in terms of the datasets used and performances on key evaluation metrics. The article concludes by outlining future research avenues to further advance sentiment analysis in tourism as part of a broader Big Data approach.


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7115
Author(s):  
Amin Muhammad Sadiq ◽  
Hyunsik Ahn ◽  
Young Bok Choi

The rapid growth of social networks and the propensity of users to communicate their physical activities, thoughts, expressions, and viewpoints in text, visual, and audio material have opened up new possibilities and opportunities in sentiment and activity analysis. Although sentiment and activity analysis of text streams has been extensively studied in the literature, evaluating sentiment and physical activities together from visuals such as photographs and videos is relatively recent and challenging. This paper emphasizes human sentiment in a socially crucial field, namely social media disaster/catastrophe analysis, with associated physical activity analysis. We suggest a multi-tagging sentiment and associated activity analyzer fused with a deep human count tracker, a pragmatic technique for multiple object tracking and counting in occluded circumstances with a reduced number of identity switches in disaster-related videos and images. A crowd-sourcing study has been conducted to analyze and annotate human activity and sentiments towards natural disasters and related images in social networks. The crowd-sourcing study resulted in a large-scale benchmark dataset with three annotation sets, each addressing a distinct task. The presented analysis and dataset will anchor a baseline for future research in the domain. We believe that the proposed system will contribute to more viable communities by benefiting different stakeholders, such as news broadcasters, emergency relief organizations, and the public in general.


Author(s):  
Dilip Kumar Sharma ◽  
Sonal Garg

Spotting fake news is a critical problem nowadays. Social media are responsible for propagating fake news. Fake news propagated over digital platforms generates confusion and induces biased perspectives in people. Detection of misinformation over digital platforms is essential to mitigate its adverse impact. Many approaches have been implemented in recent years. Despite the productive work, fake news identification poses many challenges due to the lack of a comprehensive publicly available benchmark dataset. There is no large-scale dataset that consists of Indian news only. So, this paper presents the IFND (Indian fake news dataset). The dataset consists of both text and images. The majority of the content in the dataset is about events from the year 2013 to the year 2021. Dataset content is scraped using the Parsehub tool. To increase the amount of fake news in the dataset, an intelligent augmentation algorithm is used, which generates meaningful fake news statements. The latent Dirichlet allocation (LDA) technique is employed for topic modelling to assign categories to news statements. Various machine learning and deep-learning classifiers are implemented on the text and image modalities to observe performance on the proposed IFND dataset. A multi-modal approach is also proposed, which considers both textual and visual features for fake news detection. The proposed IFND dataset achieved satisfactory results. This study affirms that the accessibility of such a huge dataset can actuate research in this laborious exploration issue and lead to better prediction models.


2021 ◽  
Vol 230 ◽  
pp. 110519
Author(s):  
Mingyang Qian ◽  
Da Yan ◽  
Tianzhen Hong ◽  
Hua Liu

Data ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 87 ◽  
Author(s):  
Viktoriia Shubina ◽  
Sylvia Holcer ◽  
Michael Gould ◽  
Elena Simona Lohan

Some of the recent developments in data science for worldwide disease control have involved research of large-scale feasibility and usefulness of digital contact tracing, user location tracking, and proximity detection on users’ mobile devices or wearables. A centralized solution relying on collecting and storing user traces and location information on a central server can provide more accurate and timely actions than a decentralized solution in combating viral outbreaks, such as COVID-19. However, centralized solutions are more prone to privacy breaches and privacy attacks by malevolent third parties than decentralized solutions, storing the information in a distributed manner among wireless networks. Thus, it is of timely relevance to identify and summarize the existing privacy-preserving solutions, focusing on decentralized methods, and analyzing them in the context of mobile device-based localization and tracking, contact tracing, and proximity detection. Wearables and other mobile Internet of Things devices are of particular interest in our study, as not only privacy, but also energy-efficiency, targets are becoming more and more critical to the end-users. This paper provides a comprehensive survey of user location-tracking, proximity-detection, and digital contact-tracing solutions in the literature from the past two decades, analyses their advantages and drawbacks concerning centralized and decentralized solutions, and presents the authors’ thoughts on future research directions in this timely research field.


2020 ◽  
Vol 6 (23) ◽  
pp. eaaz0286 ◽  
Author(s):  
Elena Miu ◽  
Ned Gulley ◽  
Kevin N. Laland ◽  
Luke Rendell

Human technology is characterized by cumulative cultural knowledge gain, yet researchers have limited knowledge of the mix of copying and innovation that maximizes progress. Here, we analyze a unique large-scale dataset originating from collaborative online programming competitions to investigate, in a setting of real-world complexity, how individual differences in innovation, social-information use, and performance generate technological progress. We find that cumulative knowledge gain is primarily driven by pragmatists, willing to copy, innovate, explore, and take risks flexibly, rather than by pure innovators or habitual copiers. Our study also reveals a key role for prestige in information transfer.


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1133
Author(s):  
Zenun Kastrati ◽  
Lule Ahmedi ◽  
Arianit Kurti ◽  
Fatbardh Kadriu ◽  
Doruntina Murtezaj ◽  
...  

During the pandemic, when people needed to physically distance, social media platforms have been one of the outlets where people expressed their opinions, thoughts, sentiments, and emotions regarding the pandemic situation. The core objective of this research study is the sentiment analysis of people’s opinions expressed on Facebook regarding the current pandemic situation in low-resource languages. To do this, we have created a large-scale dataset comprising 10,742 manually classified comments in the Albanian language. Furthermore, in this paper we report our efforts on the design and development of a sentiment analyser that relies on deep learning. As a result, we report the experimental findings obtained from our proposed sentiment analyser using various classifier models with static and contextualized word embeddings, that is, fastText and BERT, trained and validated on our collected and curated dataset. Specifically, the findings reveal that combining the BiLSTM with an attention mechanism achieved the highest performance on our sentiment analysis task, with an F1 score of 72.09%.


2021 ◽  
Vol 15 ◽  
Author(s):  
Chao He ◽  
Jialu Liu ◽  
Yuesheng Zhu ◽  
Wencai Du

Classification of electroencephalogram (EEG) signals is a key approach to measuring the rhythmic oscillations of neural activity and is one of the core technologies of brain-computer interface systems (BCIs). However, extraction of features from non-linear and non-stationary EEG signals is still a challenging task for current algorithms. With the development of artificial intelligence, various advanced algorithms have been proposed for signal classification in recent years. Among them, deep neural networks (DNNs) have become the most attractive type of method due to their end-to-end structure and powerful ability of automatic feature extraction. However, it is difficult to collect large-scale datasets in practical applications of BCIs, which may lead to overfitting or weak generalizability of the classifier. To address these issues, a promising technique has been proposed to improve the performance of the decoding model based on data augmentation (DA). In this article, we investigate recent studies and development of various DA strategies for EEG classification based on DNNs. The review consists of three parts: what kinds of EEG-based BCI paradigms are used, what types of DA methods are adopted to improve the DNN models, and what kind of accuracy can be obtained. Our survey summarizes the current practices and performance outcomes, with the aim of promoting or guiding the deployment of DA for EEG classification in future research and development.

