Analyzing the Epidemiological Outbreak of COVID-19: Real-time, Visual Data Analysis, Short-term Forecasting, and Risk Factor Identification

2021 ◽  
Vol 2 (3) ◽  
pp. 246-261
Author(s):  
Jiawei Long

The COVID-19 outbreak was initially reported in Wuhan, China, and was declared a Public Health Emergency of International Concern (PHEIC) by WHO on 30 January 2020. It has since spread to over 180 countries and evolved into a worldwide pandemic, endangering global public health and posing a serious threat to the global community. To combat and prevent the spread of the disease, all individuals should be well-informed of the rapidly changing state of COVID-19. To accomplish this objective, I have built a website to analyze and deliver the latest state of the disease and relevant analytical insights. The website is designed for a general audience, and it aims to communicate insights through straightforward and concise data visualizations that are supported by sound statistical methods, accurate data modeling, state-of-the-art natural language processing techniques, and reliable data sources. This paper discusses the major methodologies used to generate the insights displayed on the website, which include an automatic data ingestion pipeline, normalization techniques, moving average computation, ARIMA time-series forecasting, and logistic regression models. In addition, the paper highlights key discoveries about COVID-19 derived using these methodologies. DOI: 10.28991/HIJ-2021-02-03-09
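
The abstract above names normalization and moving average computation among its methods without giving details. The following is a minimal sketch of what such steps commonly look like, assuming a trailing 7-day window and a per-100,000 rate; the function names, the window length, and the sample counts are illustrative assumptions, not taken from the paper.

```python
from collections import deque


def moving_average(values, window=7):
    """Trailing moving average; early points average over the data seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    smoothed = []
    for v in values:
        if len(recent) == window:
            running_sum -= recent[0]  # value about to fall out of the window
        recent.append(v)
        running_sum += v
        smoothed.append(running_sum / len(recent))
    return smoothed


def per_100k(count, population):
    """Normalize a raw count to a rate per 100,000 residents."""
    return count * 100_000 / population


# Hypothetical daily new-case counts (illustrative only, not real data).
daily_cases = [10, 20, 30, 40, 50, 60, 70, 80]
smoothed = moving_average(daily_cases, window=7)
rate = per_100k(50, 1_000_000)
```

Smoothing damps day-of-week reporting artifacts, and per-capita rates make regions of different sizes comparable; both choices here are common defaults rather than the author's confirmed settings.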

Author(s):  
Jiawei Long

The COVID-19 outbreak was reported to have originated in Wuhan, China; it was declared a Public Health Emergency of International Concern (PHEIC) by WHO on 30 January 2020 and had spread to over 180 countries by the time this paper was composed. As the disease has spread around the globe, it has evolved into a worldwide pandemic, endangering global public health and posing a serious threat to the global community. To combat and prevent the spread of the disease, all individuals should be well-informed of the rapidly changing state of COVID-19. To accomplish this objective, a COVID-19 real-time analytical tracker has been built to provide the latest status of the disease and relevant analytical insights. The real-time tracker is designed for a general audience without advanced statistical training. It aims to communicate insights through straightforward and concise data visualizations that are supported by sound statistical foundations and reliable data sources. This paper discusses the major methodologies used to generate the insights displayed on the real-time tracker, which include real-time data retrieval, normalization techniques, ARIMA time-series forecasting, and logistic regression models. In addition to introducing the details and motivations of these methodologies, the paper features some key discoveries about COVID-19 derived using them.


Author(s):  
Dave Carter ◽  
Marta Stojanovic ◽  
Berry De Bruijn

Objective: To rebuild the software that underpins the Global Public Health Intelligence Network using modern natural language processing techniques to support recent and future improvements in situational awareness capability. Introduction: The Global Public Health Intelligence Network is a non-traditional all-hazards multilingual surveillance system introduced in 1997 by the Government of Canada in collaboration with the World Health Organization.¹ GPHIN software collects news articles, media releases, and incident reports and analyzes them for information about communicable diseases, natural disasters, product recalls, radiological events and other public health crises. Since 2016, the Public Health Agency of Canada (PHAC) and National Research Council Canada (NRC) have collaborated to replace GPHIN with a modular platform that incorporates modern natural language processing techniques to support more ambitious situational awareness goals. Methods: The updated GPHIN platform assembles several natural language processing tools to annotate incoming data in order to support situational awareness; broadly, GPHIN aims to extract knowledge from data. Data are collected in 10 languages and are machine translated to English. Several of the machine translation models use high performance neural networks. Language models are updated regularly and support external dictionaries for handling emerging domain-specific terms that might not yet exist in the parallel corpora used to train the models. All incoming documents are assigned a relevance score. Machine learning models estimate a score based on similarity to sets of known high-relevance and known low-relevance documents. PHAC analysts provide feedback on the scoring from time to time in the course of their work, and this feedback is used to periodically retrain scoring models. Documents are assigned keywords using two ontologies: an all-hazards multilingual taxonomy hand-compiled at PHAC, and the U.S. National Library of Medicine’s Unified Medical Language System (UMLS). Categories are assigned probabilistically to incoming articles (e.g., human infectious diseases, animal infectious diseases, substance abuse, environmental hazards), largely using affinity scores that correspond to keywords. Dates and times are resolved to canonical forms, so that mentions like "last Tuesday" get resolved to specific dates; this makes it possible to sequence data about a single event that are released at varying frequencies and with varying timeliness. Cities, states/provinces, and countries are identified in the documents, and gaps in the hierarchical geographic relationships are filled in. Locations are disambiguated based on collocations; the system distinguishes between and correctly resolves Ottawa, KS vs. Ottawa, ON, Canada, for example. Countries are displayed with their socio-economic population statistics (Gini coefficient, human development index, median age, and so on). The system attempts to detect and reconcile near-duplicate articles in order to handle instances where one article is released on a newswire and subsequently gets lightly edited and syndicated in dozens or hundreds of local papers; this improves the signal-to-noise ratio of the data in GPHIN for better productivity. Template-based reports (where the same document may get re-issued with a new number of cases but no other changes, for example) are still a challenge, but whitelisting tools reduce the false positive rate. The system provides tools for constructing arbitrarily detailed searches, tied to colour-coded maps and graphs that update on-the-fly, and offers short extractive summaries of each search result for easy filtering. GPHIN also generates topical knowledge graphs about sets of articles that seek to reveal surprising correlations in the data; for example, graphically reconciling and highlighting cases where several neighbouring countries all have reports of a mysterious disease and where a particular mosquito is mentioned. Next steps in the ongoing rejuvenation involve collating discrete articles and documents into narrative timelines that track an ongoing event: collecting all data about the spread of an infectious disease outbreak or perhaps the aftermath of an earthquake in the developing world. Our research is focussing on how to build line lists from such a stream of news articles about an event and how to detect important change points in the ongoing narrative. Results: The new GPHIN platform was launched in August 2016 in order to support syndromic surveillance activities for the Rio 2016 Olympics, and has been updated incrementally since then to offer further capabilities to professional users in 30 countries. Its modular construction supports current situational awareness activities as well as further research into advanced natural language processing techniques. Conclusions: We improved (and continue to improve) GPHIN with modern natural language processing techniques, including better translations, relevance scoring, categorization, near-duplicate detection, and improved data visualization tools, all towards the goal of more productive and more trustworthy situational awareness.
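
The abstract above mentions near-duplicate detection for lightly edited, syndicated newswire articles but does not say how it is done. One common textbook approach, shown here purely as an illustrative sketch and not as GPHIN's actual method, compares sets of overlapping word sequences ("shingles") using Jaccard similarity. The shingle length, the threshold, and the sample sentences are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, threshold=0.6, k=3):
    """Flag two documents as near-duplicates when their shingle overlap is high."""
    return jaccard(shingles(doc_a, k), shingles(doc_b, k)) >= threshold


# Hypothetical headlines (illustrative only).
wire = "Health officials report ten new cases of an unknown illness in the region"
syndicated = "Health officials report ten new cases of an unknown illness in the area"
unrelated = "Local team wins the regional football championship after extra time"

similarity = jaccard(shingles(wire), shingles(syndicated))
```

A pairwise comparison like this scales quadratically with the number of documents; production systems typically add a hashing step (for example MinHash) to find candidate pairs first. Template-based reports that differ only in a case count would still score as duplicates under this scheme, which matches the difficulty the abstract describes.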


2020 ◽  
Author(s):  
Hala Hamadah ◽  
Barrak Alahmad ◽  
Mohammad Behbehani ◽  
Sarah Al-Youha ◽  
Sulaiman Almazeedi ◽  
...  

Background: In light of the COVID-19 pandemic, many have flagged racial and ethnic differences in health outcomes in western countries as an urgent global public health priority. Kuwait has a unique demographic profile with two-thirds of the population consisting of non-nationals, most of whom are migrant workers. Objective: We aimed to explore whether there is a significant difference in health outcomes between non-Kuwaiti and Kuwaiti patients diagnosed with COVID-19. Methods: We used a prospective COVID-19 registry of all patients (symptomatic and asymptomatic) in Kuwait who tested positive from February 24th to April 20th, 2020, collected from Jaber Al-Ahmad Al-Sabah Hospital, the officially-designated COVID-19 healthcare facility in the country. We ran separate logistic regression models comparing non-Kuwaitis to Kuwaitis for death, intensive care unit (ICU) admission, acute respiratory distress syndrome (ARDS) and pneumonia. Results: The first 1,123 COVID-19 positive patients in Kuwait were all recruited in the study. About 26% were Kuwaitis and 73% were non-Kuwaiti. After adjustment for age, gender, smoking and selected co-morbidities, non-Kuwaitis had a two-fold increase in the odds of death or being admitted to the intensive care unit compared to Kuwaitis (OR: 2.14, 95% CI 1.12-4.32). Non-Kuwaitis also had higher odds of ARDS (OR: 2.44, 95% CI 1.23-5.09) and pneumonia (OR: 2.24, 95% CI 1.27-4.12). Conclusion: This is the first study to report on COVID-19 outcomes between Kuwaiti and non-Kuwaiti patients. The current pandemic may have amplified the differences in health outcomes among marginalized subpopulations. A number of socioeconomic and environmental factors could explain this health disparity. More research is needed to advance the understanding of policymakers in Kuwait in order to make urgent public health interventions.
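
The odds ratios and confidence intervals reported above are the usual way to present logistic regression coefficients. As a minimal sketch of that conversion, assuming a Wald-type interval (the abstract does not state which interval method the authors used), an odds ratio is the exponential of the fitted log-odds coefficient, and the interval bounds are the exponentials of the coefficient plus or minus 1.96 standard errors. The coefficient and standard error below are hypothetical values chosen for illustration, not figures from the study.

```python
import math


def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper).

    Uses a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )


# Hypothetical fitted values (illustrative only, not taken from the study).
beta = math.log(2.0)   # a coefficient corresponding to a doubling of the odds
std_err = 0.25
or_, lo, hi = odds_ratio_with_ci(beta, std_err)
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which is why reported intervals such as 1.12 to 4.32 extend further above the point estimate than below it.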


2020 ◽  
Author(s):  
Marichi Gupta ◽  
Adity Bansal ◽  
Bhav Jain ◽  
Jillian Rochelle ◽  
Atharv Oak ◽  
...  

Objective: The potential ability of weather to affect SARS-CoV-2 transmission has been an area of controversial discussion during the COVID-19 pandemic. Individuals' perceptions of the impact of weather can inform their adherence to public health guidelines; however, there is no measure of their perceptions. We quantified Twitter users' perceptions of the effect of weather and analyzed how they evolved with respect to real-world events and time. Materials and Methods: We collected 166,005 tweets posted between January 23 and June 22, 2020 and employed machine learning/natural language processing techniques to filter for relevant tweets, classify them by the type of effect they claimed, and identify topics of discussion. Results: We identified 28,555 relevant tweets and estimated that 40.4% indicate uncertainty about weather's impact, 33.5% indicate no effect, and 26.1% indicate some effect. We tracked changes in these proportions over time. Topic modeling revealed major latent areas of discussion. Discussion: There is no consensus among the public on weather's potential impact. Earlier months were characterized by tweets that were uncertain of weather's effect or claimed no effect; later, the portion of tweets claiming some effect of weather increased. Tweets claiming no effect of weather comprised the largest class by June. Major topics of discussion included comparisons to influenza's seasonality, President Trump's comments on weather's effect, and social distancing. Conclusion: There is a major gap between scientific evidence and public opinion of weather's impacts on COVID-19. We provide evidence of the public's misconceptions and topics of discussion, which can inform public health communications.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Hala Hamadah ◽  
Barrak Alahmad ◽  
Mohammad Behbehani ◽  
Sarah Al-Youha ◽  
Sulaiman Almazeedi ◽  
...  

Background: In light of the COVID-19 pandemic, many have flagged racial and ethnic differences in health outcomes in western countries as an urgent global public health priority. Kuwait has a unique demographic profile with two-thirds of the population consisting of non-nationals, most of whom are migrant workers. We aimed to explore whether there is a significant difference in health outcomes between non-Kuwaiti and Kuwaiti patients diagnosed with COVID-19. Methods: We used a prospective COVID-19 registry of all patients (symptomatic and asymptomatic) in Kuwait who tested positive from February 24th to April 20th, 2020, collected from Jaber Al-Ahmad Al-Sabah Hospital, the officially-designated COVID-19 healthcare facility in the country. We ran separate logistic regression models comparing non-Kuwaitis to Kuwaitis for death, intensive care unit (ICU) admission, acute respiratory distress syndrome (ARDS) and pneumonia. Results: The first 1123 COVID-19 positive patients in Kuwait were all recruited in the study. About 26% were Kuwaitis and 73% were non-Kuwaiti. After adjustment for age, gender, smoking and selected co-morbidities, non-Kuwaitis had a two-fold increase in the odds of death or being admitted to the intensive care unit compared to Kuwaitis (OR: 2.14, 95% CI 1.12–4.32). Non-Kuwaitis also had higher odds of ARDS (OR: 2.44, 95% CI 1.23–5.09) and pneumonia (OR: 2.24, 95% CI 1.27–4.12). Conclusion: This is the first study to report on COVID-19 outcomes between Kuwaiti and non-Kuwaiti patients. The current pandemic may have amplified the differences in health outcomes among marginalized subpopulations. A number of socioeconomic and environmental factors could explain this health disparity. More research is needed to advance the understanding of policymakers in Kuwait in order to make urgent public health interventions.


2020 ◽  
Author(s):  
Jameson D. Voss ◽  
Martin Skarzynski ◽  
Erin M. McAuley ◽  
Ezekiel J. Maier ◽  
Thomas Gibbons ◽  
...  

Introduction: The coronavirus disease 2019 (COVID-19) pandemic is a global public health emergency causing a disparate burden of death and disability around the world. The molecular characteristics of the virus that predict better or worse outcome are largely still being discovered. Methods: We downloaded 155,958 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from GISAID and evaluated whether variants improved prediction of reported severity beyond age and region. We also evaluated specific variants to determine the magnitude of association with severity and the frequency of these variants among the genomes. Results: Logistic regression models that included viral genomic variants outperformed other models (AUC=0.91 as compared with 0.68 for age and gender alone; p<0.001). Among individual variants, we found 17 single nucleotide variants in SARS-CoV-2 have more than two-fold greater odds of being associated with higher severity and 67 variants associated with ≤ 0.5 times the odds of severity. The median frequency of associated variants was 0.15% (interquartile range 0.09%-0.45%). Altogether 85% of genomes had at least one variant associated with patient outcome. Conclusion: Numerous SARS-CoV-2 variants have two-fold or greater association with odds of mild or severe outcome and collectively, these variants are common. In addition to comprehensive mitigation efforts, public health measures should be prioritized to control the more severe manifestations of COVID-19 and the transmission chains linked to these severe cases.
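
The results above compare models by AUC, the area under the ROC curve. As a minimal sketch of what that number measures, AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as half. The labels and scores below are hypothetical and only illustrate the calculation; they are not data from the study, and the study's own AUC computation may have used a different implementation.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical severity labels (1 = severe) and model scores (illustrative only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(labels, scores)
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which gives a scale for reading the reported 0.68 and 0.91. The pairwise loop is quadratic in the sample size; for datasets on the scale of the one described, a rank-based implementation would be the practical choice.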


2020 ◽  
Vol 11 (SPL1) ◽  
pp. 469-471 ◽  
Author(s):  
Bhagyashri Vijay Chaudhari ◽  
Priya P. Chawle

“A lesson learned the hard way is a lesson learned for a lifetime.” Every bad situation hurts; however, it does teach us a lesson. In the same manner, history is now observing the novel COVID-19, a horrible and strange situation created by the fight against a microscopic enemy. On 11 February 2020, WHO announced COVID-19 as the name of the new disease; it declared the disease a global public health emergency and subsequently a pandemic because of its wide spread. The outbreak began in December 2019 in Wuhan, People's Republic of China, and emerged as a public health emergency of international concern. Coronaviruses are a group of viruses with a non-segmented, single-stranded, positive-sense RNA genome. The pandemic has changed people's lives in different ways, and these changes will become life lessons for them. Such lessons should be kept in mind for the safety of living beings and much else. In this narrative review article, references were taken from different articles published in various databases, which include the views of different authors and writers on the "Lessons to be learned from Corona".


2020 ◽  
Author(s):  
Helmi Zakariah ◽  
Fadzilah bt Kamaluddin ◽  
Choo-Yee Ting ◽  
Hui-Jia Yee ◽  
Shereen Allaham ◽  
...  

The current outbreak of coronavirus disease 2019 (COVID-19), caused by the novel coronavirus SARS-CoV-2, has been a major global public health problem threatening many countries and territories. Mathematical modelling is one of the non-pharmaceutical public health measures that plays a crucial role in mitigating the risk and impact of the pandemic. A group of researchers and epidemiologists has developed a machine learning-powered inherent risk of contagion (IRC) analytical framework to georeference COVID-19, together with an operational platform to plan the response and execute mitigation activities. The framework's dataset provides a coherent picture to track and predict the COVID-19 epidemic after lockdown by piecing together preliminary data on publicly available health statistics alongside the area of reported cases, drivers, the vulnerable population, and the number of premises suspected of becoming transmission areas between drivers and the vulnerable population. The main aim of this new analytical framework is to measure the IRC and provide georeferenced data to protect the health system, aid contact tracing, and prioritise the vulnerable.

