Identification of Risk Factors and Symptoms of COVID-19: Analysis of Biomedical Literature and Social Media Data

10.2196/20509 ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. e20509
Author(s):  
Jouhyun Jeon ◽  
Gaurav Baruah ◽  
Sarah Sarabadani ◽  
Adam Palanica

Background In December 2019, the COVID-19 outbreak started in China and rapidly spread around the world. Lack of a vaccine or optimized intervention raised the importance of characterizing risk factors and symptoms for the early identification and successful treatment of patients with COVID-19. Objective This study aims to investigate and analyze biomedical literature and public social media data to understand the association of risk factors and symptoms with the various outcomes observed in patients with COVID-19. Methods Through semantic analysis, we collected 45 retrospective cohort studies, which evaluated 303 clinical and demographic variables across 13 different outcomes of patients with COVID-19, and 84,140 Twitter posts from 1036 COVID-19–positive users. Machine learning tools to extract biomedical information were introduced to identify mentions of uncommon or novel symptoms in tweets. We then examined and compared the two data sets to expand our landscape of risk factors and symptoms related to COVID-19. Results From the biomedical literature, approximately 90% of clinical and demographic variables showed inconsistent associations with COVID-19 outcomes. Consensus analysis identified 72 risk factors that were specifically associated with individual outcomes. From the social media data, 51 symptoms were characterized and analyzed. By comparing social media data with biomedical literature, we identified 25 novel symptoms that were specifically mentioned in tweets but have not previously been well characterized. Furthermore, there were certain combinations of symptoms that were frequently mentioned together in social media. Conclusions Identified outcome-specific risk factors, symptoms, and combinations of symptoms may serve as surrogate indicators to identify patients with COVID-19 and predict their clinical outcomes in order to provide appropriate treatments.
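The abstract does not describe how co-mentioned symptoms were counted. As a minimal sketch, assuming each user's tweets have already been reduced to a set of symptom mentions, pairwise co-occurrence could be tallied as follows. The function name `symptom_pair_counts` and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def symptom_pair_counts(users_symptoms):
    """Count how many users mention each unordered pair of symptoms.

    users_symptoms: iterable of sets, one set of symptom strings per user.
    (Hypothetical input shape; the study's actual pipeline is not described.)
    """
    counts = Counter()
    for symptoms in users_symptoms:
        # Sort so that ("cough", "fever") and ("fever", "cough") share one key.
        for pair in combinations(sorted(symptoms), 2):
            counts[pair] += 1
    return counts


# Made-up example data, for illustration only.
sample = [
    {"fever", "cough"},
    {"fever", "cough", "anosmia"},
    {"anosmia", "fatigue"},
]
pair_counts = symptom_pair_counts(sample)
print(pair_counts.most_common(1))
```

Counting per user rather than per tweet is a design choice made here so that one prolific user does not dominate the tallies; the source does not say which unit the authors used.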


2020 ◽  
Author(s):  
Jouhyun Jeon ◽  
Gaurav Baruah ◽  
Sarah Sarabadani ◽  
Adam Palanica

Background In December 2019, the coronavirus disease 2019 (COVID-19) outbreak started in China and rapidly spread around the world. Lack of any vaccine or optimized intervention raised the importance of characterizing risk factors and symptoms for the early identification and successful treatment of COVID-19 patients. Methods We systematically integrated and analyzed published biomedical literature and public social media data to expand our landscape of clinical and demographic variables of COVID-19. Through semantic analysis, 45 retrospective cohort studies, which evaluated 303 clinical and demographic variables across 13 different outcomes of COVID-19, and 84,140 tweets from 1,036 COVID-19 positive users were collected. In total, 59 symptoms were identified across both datasets. Findings Approximately 90% of clinical and demographic variables showed inconsistency across outcomes of COVID-19. From the consensus analysis, we identified clinical and demographic variables that were specific for individual outcomes of COVID-19. We also identified 25 novel symptoms that were mentioned in social media but have not previously been well characterized. Furthermore, we observed that certain combinations of symptoms were frequently mentioned together among COVID-19 patients. Interpretation Identified outcome-specific clinical and demographic variables, symptoms, and combinations of symptoms may serve as surrogate indicators to identify COVID-19 patients and predict their clinical outcomes in order to provide appropriate treatments.



2015 ◽  
Vol 23 (3) ◽  
pp. 644-648 ◽  
Author(s):  
Hopin Lee ◽  
James H McAuley ◽  
Markus Hübscher ◽  
Heidi G Allen ◽  
Steven J Kamper ◽  
...  

Background Back pain is a global health problem. Recent research has shown that risk factors that are proximal to the onset of back pain might be important targets for preventive interventions. Rapid communication through social media might be useful for delivering timely interventions that target proximal risk factors. Identifying individuals who are likely to discuss back pain on Twitter could provide useful information to guide online interventions. Methods We used a case-crossover study design for a sample of 742 028 tweets about back pain to quantify the risks associated with a new tweet about back pain. Results The odds of tweeting about back pain just after tweeting about selected physical, psychological, and general health factors were 1.83 (95% confidence interval [CI], 1.80-1.85), 1.85 (95% CI, 1.83-1.88), and 1.29 (95% CI, 1.27-1.30), respectively. Conclusion These findings give directions for future research that could use social media for innovative public health interventions.
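The abstract reports odds ratios with 95% confidence intervals but does not state which estimator was used. As a rough illustration only, the sketch below uses the standard estimator for a 1:1 matched design, in which the odds ratio is the ratio of discordant pairs and the interval is a Wald interval on the log scale. The function name and the counts are hypothetical and are not the study's data.

```python
import math


def matched_pair_odds_ratio(exposed_hazard_only, exposed_control_only, z=1.96):
    """Odds ratio and Wald CI from discordant pairs in a 1:1 matched design.

    exposed_hazard_only: pairs exposed in the hazard window but not the control window.
    exposed_control_only: pairs exposed in the control window but not the hazard window.
    """
    b, c = exposed_hazard_only, exposed_control_only
    if b <= 0 or c <= 0:
        raise ValueError("both discordant counts must be positive")
    odds_ratio = b / c
    # Standard error of log(OR) for matched pairs.
    se_log = math.sqrt(1 / b + 1 / c)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up counts, not from the study.
or_, lo, hi = matched_pair_odds_ratio(180, 100)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

In a case-crossover design each person serves as their own control, which is why only the discordant windows contribute to the estimate.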



2022 ◽  
pp. 188-205
Author(s):  
Erkan Çiçek ◽  
Uğur Gündüz

Social media has become so embedded in daily life that global pandemics, which now shape much of our lives, inevitably play out on these networks and among their users. The purpose of this hashtag analysis is to reveal differences in discourse and language in Twitter data and to evaluate the effects of a global pandemic crisis on language, messaging, and crisis management using social media data. This form of analysis is typically carried out by collecting textual data and then examining the “sentiment” it conveys. Within the scope of the study, 11,300 Twitter messages posted with the #stayhome hashtag between 30 May 2020 and 6 June 2020 were examined. The impact and reliability of social media in disaster management can be assessed through a content analysis based on the semantic analysis of the Twitter posts, including the phrases used and their frequencies.



2019 ◽  
Vol 3 (3) ◽  
pp. 38 ◽  
Author(s):  
Stefan Spettel ◽  
Dimitrios Vagianos

Social media are heavily used to shape political discussions. Thus, it is valuable for corporations and political parties to be able to analyze the content of those discussions. This is exemplified by the work of Cambridge Analytica, in support of the 2016 presidential campaign of Donald Trump. One of the most straightforward metrics is the sentiment of a message, that is, whether it is considered positive or negative. There are many commercial and/or closed-source tools available which make it possible to analyze social media data, including sentiment analysis (SA). However, to our knowledge, not many publicly available tools have been developed that allow for analyzing social media data and help researchers around the world to enter this quickly expanding field of study. In this paper, we provide a thorough description of implementing a tool that can be used for performing sentiment analysis on tweets. In an effort to underline the necessity for open tools and additional monitoring on the Twittersphere, we propose an implementation model based exclusively on publicly available open-source software. The resulting tool is capable of downloading tweets in real time based on hashtags or account names and stores the sentiment for replies to specific tweets. It is therefore capable of measuring the average reaction to one tweet by a person or a hashtag, which can be represented with graphs. Finally, we tested our open-source tool within a case study based on a data set of Twitter accounts and hashtags referring to the Syrian war, covering a short time window of one week in the spring of 2018. The results show that while the high accuracy of commercial or other complicated tools may not be achieved, our proposed open-source tool makes it possible to get a good overview of the overall replies to specific tweets, as well as a practical perception of tweets related to specific hashtags, identifying them as positive or negative.
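The idea of measuring the average reaction to a tweet can be sketched in a few lines. The example below is not the authors' tool: it swaps in a toy word-list scorer as a stand-in for whatever sentiment method the paper uses, and the names `score`, `average_reply_sentiment`, and the sample replies are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy lexicon, an assumption for illustration; a real tool would use a
# trained model or a full sentiment lexicon.
POSITIVE = {"good", "great", "love", "hope"}
NEGATIVE = {"bad", "awful", "hate", "fear"}


def score(text):
    """Return +1, -1, or 0 depending on which lexicon dominates the text."""
    words = text.lower().split()
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (balance > 0) - (balance < 0)


def average_reply_sentiment(replies):
    """Average the sentiment of replies per parent tweet.

    replies: iterable of (parent_tweet_id, reply_text) pairs.
    """
    by_parent = defaultdict(list)
    for parent_id, text in replies:
        by_parent[parent_id].append(score(text))
    return {parent_id: mean(scores) for parent_id, scores in by_parent.items()}


# Made-up replies, for illustration only.
replies = [
    ("t1", "great news love it"),
    ("t1", "this is awful"),
    ("t1", "good"),
    ("t2", "I hate this"),
]
averages = average_reply_sentiment(replies)
print(averages)
```

Grouping by parent tweet ID keeps the scorer separate from the aggregation, so a stronger sentiment model could replace `score` without changing the rest.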



10.2196/18767 ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. e18767
Author(s):  
Jooyun Lee ◽  
Hyeoun-Ae Park ◽  
Seul Ki Park ◽  
Tae-Min Song

Background Analysis of posts on social media is effective in investigating health information needs for disease management and identifying people’s emotional status related to disease. An ontology is needed for semantic analysis of social media data. Objective This study was performed to develop a cancer ontology with terminology containing consumer terms and to analyze social media data to identify health information needs and emotions related to cancer. Methods A cancer ontology was developed using social media data, collected with a crawler, from online communities and blogs between January 1, 2014 and June 30, 2017 in South Korea. The relative frequencies of posts containing ontology concepts were counted and compared by cancer type. Results The ontology had 9 superclasses, 213 class concepts, and 4061 synonyms. Ontology-driven natural language processing was performed on the text from 754,744 cancer-related posts. Colon, breast, stomach, cervical, lung, liver, pancreatic, and prostate cancer; brain tumors; and leukemia appeared most in these posts. At the superclass level, risk factor was the most frequent, followed by emotions, symptoms, treatments, and dealing with cancer. Conclusions Information needs and emotions differed according to cancer type. The observations of this study could be used to provide tailored information to consumers according to cancer type and care process. Attention should be paid to provision of cancer-related information to not only patients but also their families and the general public seeking information on cancer.
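The comparison of relative frequencies by cancer type can be illustrated with a small sketch. It assumes, hypothetically, that each post has already been tagged with a cancer type and with the set of ontology concepts it contains; the ontology-driven tagging itself is not shown, and the function name and sample posts are not from the study.

```python
from collections import Counter, defaultdict


def relative_concept_frequency(posts):
    """Share of posts per cancer type that contain each concept.

    posts: iterable of (cancer_type, concepts) pairs, where concepts is a
    collection of concept labels found in that post (hypothetical input shape).
    Returns {cancer_type: {concept: fraction of that type's posts}}.
    """
    totals = Counter()
    hits = defaultdict(Counter)
    for cancer_type, concepts in posts:
        totals[cancer_type] += 1
        # Count each concept at most once per post.
        for concept in set(concepts):
            hits[cancer_type][concept] += 1
    return {
        ctype: {c: n / totals[ctype] for c, n in hits[ctype].items()}
        for ctype in totals
    }


# Made-up posts, for illustration only.
posts = [
    ("breast", {"risk factor", "emotions"}),
    ("breast", {"risk factor"}),
    ("lung", {"symptoms"}),
]
freqs = relative_concept_frequency(posts)
print(freqs["breast"])
```

Dividing by the number of posts for each cancer type, rather than by the overall total, makes the shares comparable across types with very different post volumes.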



2017 ◽  
Vol 7 (3) ◽  
pp. 201-213 ◽  
Author(s):  
Peng Yan

Social media is playing an increasingly important role in reporting major events happening in the world. However, detecting events from social media is challenging due to the huge magnitude of the data and the complex semantics of the language being processed. This paper proposes MASEED (MapReduce and Semantics Enabled Event Detection), a novel event detection framework that effectively addresses the following problems: 1) traditional data mining paradigms cannot work for big data; 2) data preprocessing requires significant human effort; 3) domain knowledge must be gained before the detection; 4) semantic interpretation of events is overlooked; 5) detection scenarios are limited to specific domains. In this work, we overcome these challenges by embedding semantic analysis into temporal analysis for capturing the salient aspects of social media data, and parallelizing the detection of potential events using the MapReduce methodology. We evaluate the performance of our method using real Twitter data. The results demonstrate that the proposed system outperforms most of the state-of-the-art methods in terms of accuracy and efficiency.
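The abstract names MapReduce but gives no implementation details, so the following is only a generic sketch of the map, shuffle, and reduce pattern applied to per-hour term counts, which is one plausible building block for temporal analysis. It is not MASEED, it does not model the semantic analysis, and it runs in a single process. All function names and the assumption that timestamps are Unix seconds are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit ((hour_bucket, term), 1) for each distinct term in one post."""
    timestamp, text = record
    bucket = timestamp // 3600  # assumption: timestamps are Unix seconds
    for term in set(text.lower().split()):
        yield (bucket, term), 1


def shuffle(pairs):
    """Group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one (hour_bucket, term) key."""
    return key, sum(values)


def term_counts_by_hour(records):
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Made-up posts, for illustration only.
records = [
    (0, "earthquake downtown"),
    (120, "earthquake felt here"),
    (4000, "quiet evening"),
]
counts = term_counts_by_hour(records)
print(counts[(0, "earthquake")])
```

Because each map call depends on a single record and each reduce call on a single key, both phases can be distributed across workers, which is the property the MapReduce approach relies on for large volumes of posts.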



Author(s):  
Yunzhe Wang ◽  
George Baciu ◽  
Chenhui Li

This article focuses on the cognitive exploration of photo-sharing data, which contain information about the location where each photo was taken and potentially some description of the photo. From these data, the features of photo-spots can be deduced. Spots with similar features constitute a region of cognitive interest. The objective is to identify these regions and allow users to explore regions of interest through a cognitive understanding of their features. The authors propose an approach that makes use of semantic analysis, data clustering, and cognitive visualization. In this article, the authors introduce the design of an interactive visualization interface which projects photo-sharing data onto cognitive social activity map components. The contributions are two-fold. First, the authors put forward a novel social media data classification method. Second, the authors suggest a new method to explore social activity maps by discovering regions of cognitive interest. Experiments are performed on the Flickr dataset.



Author(s):  
Koustuv Saha ◽  
Ted Grover ◽  
Stephen M. Mattingly ◽  
Vedant Das swain ◽  
Pranshu Gupta ◽  
...  

Personalized predictions have shown promise in various disciplines, but they are fundamentally constrained in their ability to generalize across individuals. These models are often trained on limited datasets which do not represent the fluidity of human functioning. In contrast, generalized models capture normative behaviors between individuals but lack precision in predicting individual outcomes. This paper aims to balance the tradeoff between one-for-each and one-for-all models by clustering individuals on mutable behaviors and conducting cluster-specific predictions of psychological constructs in a multimodal sensing dataset of 754 individuals. Specifically, we situate our modeling on social media, which has exhibited capability in inferring psychosocial attributes. We hypothesize that complementing social media data with offline sensor data can help to personalize and improve predictions. We cluster individuals on physical behaviors captured via Bluetooth, wearables, and smartphone sensors. We build contextualized models predicting psychological constructs trained on each cluster's social media data and compare their performance against generalized models trained on all individuals' data. The comparison reveals no difference in predicting affect and a decline in predicting cognitive ability, but an improvement in predicting personality, anxiety, and sleep quality. We construe that our approach improves predictions of psychological constructs that share theoretical associations with physical behavior. We also find how social media language associates with offline behavioral contextualization. Our work bears implications for understanding the nuanced strengths and weaknesses of personalized predictions, and how their effectiveness may vary by multiple factors. This work reveals the importance of taking a critical stance on evaluating the effectiveness before investing efforts in personalization.


