A comparison of five epidemiological models for transmission of SARS-CoV-2 in India

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Soumik Purkayastha ◽  
Rupam Bhattacharyya ◽  
Ritwik Bhaduri ◽  
Ritoban Kundu ◽  
Xuelin Gu ◽  
...  

Abstract Background Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). Methods Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model. Results For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and ICM model yields 4.54 for the same quantity. 
For total (sum of reported and unreported) cumulative deaths, the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases, while the projections (in millions) are 8.71 (95% credible interval: 8.63–8.80) from the baseline model, 8.35 (7.19–9.60) from eSIR, 8.17 (7.90–8.52) from SAPHIRE and 8.51 (8.18–8.85) from SEIR-fansy. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy. Conclusions In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case counts against observed data on a test period. The largest variability across models is observed in predicting the “total” number of infections including reported and unreported cases (on which we have no validation data). The degree of under-reporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.
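The two summary quantities used in this abstract can be illustrated with a short sketch. It assumes one common definition of SMAPE (absolute error divided by the mean of the absolute observed and predicted values, averaged over days and expressed in percent) and takes the underreporting factor to be the ratio of estimated total counts to reported counts; the abstract does not spell out either formula, so both are assumptions, and the function names are hypothetical.

```python
def smape(observed, predicted):
    """Symmetric mean absolute prediction error, in percent (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    total = 0.0
    for obs, pred in zip(observed, predicted):
        scale = (abs(obs) + abs(pred)) / 2
        # Treat a 0/0 term as zero error rather than dividing by zero.
        total += abs(pred - obs) / scale if scale > 0 else 0.0
    return 100 * total / len(observed)


def underreporting_factor(total_estimated, reported):
    """Estimated total (reported + unreported) count divided by the reported count.

    This ratio is an assumed reading of "underreporting factor".
    """
    return total_estimated / reported
```

As a single-day illustration, `smape([8.18], [8.71])` compares the observed October 31 count with the baseline projection quoted above; the SMAPE values in the abstract are averages over the whole test period and cannot be recovered from one day.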

2021 ◽  
Author(s):  
Soumik Purkayastha ◽  
Rupam Bhattacharyya ◽  
Ritwik Bhaduri ◽  
Ritoban Kundu ◽  
Xuelin Gu ◽  
...  

Abstract Background Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). Methods Using COVID-19 data for India from March 15 to June 18 to train the models, we generate predictions from each of the five models from June 19 to July 18. To compare prediction accuracy with respect to reported cumulative and active case counts and cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. Results For active case counts, SMAPE values are 0.72 (SEIR-fansy) and 33.83 (eSIR). For cumulative case counts, SMAPE values are 1.76 (baseline), 23.10 (eSIR), 2.07 (SAPHIRE) and 3.20 (SEIR-fansy). For cumulative death counts, the SMAPE values are 7.13 (SEIR-fansy) and 26.30 (eSIR). For cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported COVID-19 counts agree. Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) counts as well. We compute underreporting factors as of June 30 and note that the SEIR-fansy model reports the highest underreporting factor for active cases (6.10) and cumulative deaths (3.62), while the SAPHIRE model reports the highest underreporting factor for cumulative cases (27.79). Conclusions In this comparative paper we describe five different models used to study the transmission of SARS-CoV-2 in India. 
While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case counts against observed data on a test period. Prediction of the daily number of active cases shows appreciable variation across models. The largest variability across models is observed in predicting the “total” number of infections including reported and unreported cases. The degree of under-reporting has been a major concern in India.


2020 ◽  
Author(s):  
Soumik Purkayastha ◽  
Rupam Bhattacharyya ◽  
Ritwik Bhaduri ◽  
Ritoban Kundu ◽  
Xuelin Gu ◽  
...  

Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures and other non-pharmaceutical interventions. We compare five epidemiological models for forecasting and assessing the course of the pandemic. We compare how the models analyze case-recovery-death count data in India, the country with the second-highest reported case counts in a world where a large proportion of infections remain undetected. A baseline curve-fitting model is introduced, in addition to three compartmental models: an extended SIR (eSIR) model, an expanded SEIR model developed to account for infectiousness of asymptomatic and pre-symptomatic cases (SAPHIRE), and another SEIR model developed to handle a high false negative rate and symptom-based administration of tests (SEIR-fansy). A semi-mechanistic Bayesian hierarchical model developed at Imperial College London (ICM) is also examined. Using COVID-19 data for India from March 15 to June 18 to train the models, we generate predictions from each of the five models from June 19 to July 18. To compare prediction accuracy with respect to reported cumulative and active case counts and cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) and mean squared relative prediction error (MSRPE) for each of the five models. For active case counts, SEIR-fansy yields an SMAPE value of 0.72, and the eSIR model yields a value of 33.83. For cumulative case counts, SMAPE values are 1.76 for the baseline model, 23.10 for eSIR, 2.07 for SAPHIRE and 3.20 for SEIR-fansy. For cumulative death counts, the SEIR-fansy model performs the best, with an SMAPE of 7.13, as compared to 26.30 for the eSIR model. 
Judged by the Pearson correlation coefficient and the Lin concordance correlation coefficient, the baseline model exhibits the highest correlation for cumulative case counts (on both coefficients), while for cumulative death counts, projections from SEIR-fansy exhibit the best performance. For cumulative cases, the correlation coefficients computed for the baseline model are 1 (Pearson) and 0.991 (Lin). For eSIR, those values are 0.985 (Pearson) and 0.316 (Lin). For SAPHIRE, we compute 1 (Pearson) and 0.975 (Lin). Finally, for SEIR-fansy those values are 1 (Pearson) and 0.965 (Lin). Similarly, for cumulative deaths, the correlation coefficients computed for eSIR are 0.978 (Pearson) and 0.206 (Lin), and for SEIR-fansy those values are 0.999 (Pearson) and 0.742 (Lin). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) counts as well. We compute underreporting factors on two specific dates (June 30 and July 10) and note that on both dates, the SEIR-fansy model reports the highest underreporting factor for active cases (June 30: 6.10 and July 10: 6.24) and cumulative deaths (June 30: 3.62 and July 10: 3.99), while the SAPHIRE model reports the highest underreporting factor for cumulative cases (June 30: 27.79 and July 10: 26.74).
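The gap between the Pearson and Lin coefficients reported above (for example 0.985 versus 0.316 for eSIR) reflects what each one measures: Pearson's coefficient rewards any linear relationship, whereas Lin's concordance coefficient also penalizes departures from the line of equality. The sketch below uses the standard formulas with population (1/n) moments; the function names are hypothetical and it is not the authors' code.

```python
import math


def _moments(x, y):
    """Means, population variances and population covariance of two series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return mean_x, mean_y, var_x, var_y, cov


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    _, _, var_x, var_y, cov = _moments(x, y)
    return cov / math.sqrt(var_x * var_y)


def lin_ccc(x, y):
    """Lin's concordance correlation: agreement with the line y = x."""
    mean_x, mean_y, var_x, var_y, cov = _moments(x, y)
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)
```

With the toy series `[1, 2, 3]` and `[2, 3, 4]`, the projection tracks the shape of the observations exactly but is shifted by a constant, so `pearson` is 1 while `lin_ccc` drops to 4/7.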


Animals ◽  
2020 ◽  
Vol 10 (6) ◽  
pp. 1071
Author(s):  
Kathrin Büttner ◽  
Joachim Krieter

Besides the direct transport of animals, indirect transmission routes, e.g., contact via contaminated vehicles, also have to be considered. In this study, the transmission routes of a German pig trade network were represented as a monopartite animal movements network and two bipartite networks that include information on the transport company and the feed producer; the bipartite networks were projected onto farm level (n = 866) to enable a comparison. The networks were investigated with the help of network analysis and formed the basis for epidemiological models to evaluate the impact of different transmission routes on network structure as well as on potential epidemic sizes. The number of edges increased immensely from the monopartite animal movements network to both projected networks. The median centrality parameters revealed clear differences between the three representations. Furthermore, moderate correlation coefficients ranging from 0.55 to 0.68 between the centrality values of the animal movements network and the projected transportation network were obtained. The epidemiological models revealed significantly more infected farms for both projected networks (70% to 100%) compared to the animal movements network (1%). The inclusion of indirect transmission routes had an immense impact on the outcome of centrality parameters as well as on the results of the epidemiological models.
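Projecting a bipartite network onto farm level can be sketched as follows: two farms become linked whenever they share a partner such as a transport company. The farm and haulier names are invented for illustration, and the study's own projection rules (for example edge direction or time windows) are not described in the abstract, so this is only the simplest undirected variant.

```python
from itertools import combinations


def project_to_farms(farm_to_partners):
    """Return undirected farm-farm edges for farms sharing at least one partner."""
    # Invert the mapping: partner -> set of farms it serves.
    partner_to_farms = {}
    for farm, partners in farm_to_partners.items():
        for partner in partners:
            partner_to_farms.setdefault(partner, set()).add(farm)

    # Every pair of farms served by the same partner gets one edge.
    edges = set()
    for farms in partner_to_farms.values():
        for farm_a, farm_b in combinations(sorted(farms), 2):
            edges.add((farm_a, farm_b))
    return edges
```

A single partner serving k farms contributes k(k-1)/2 edges, which is one plausible reason the projected networks contain so many more edges than the animal movements network.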


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Fiona Teltscher ◽  
Sophie Bouvaine ◽  
Gabriella Gibson ◽  
Paul Dyer ◽  
Jennifer Guest ◽  
...  

Abstract Background Mosquito-borne diseases are a global health problem, causing hundreds of thousands of deaths per year. Pathogens are transmitted by mosquitoes feeding on the blood of an infected host and then feeding on a new host. Monitoring mosquito host-choice behaviour can help in many aspects of vector-borne disease control. Currently, it is possible to determine the host species and an individual human host from the blood meal of a mosquito by using genotyping to match the blood profile of local inhabitants. Epidemiological models generally assume that mosquito biting behaviour is random; however, numerous studies have shown that certain characteristics, e.g. genetic makeup and skin microbiota, make some individuals more attractive to mosquitoes than others. Analysing blood meals and illuminating host-choice behaviour will help re-evaluate and optimise disease transmission models. Methods We describe a new blood meal assay that identifies the sex of the person that a mosquito has bitten. The amelogenin locus (AMEL), a sex marker located on both X and Y chromosomes, was amplified by polymerase chain reaction in DNA extracted from blood-fed Aedes aegypti and Anopheles coluzzii. Results AMEL could be successfully amplified up to 24 h after a blood meal in 100% of An. coluzzii and 96.6% of Ae. aegypti, revealing the sex of humans that were fed on by individual mosquitoes. Conclusions The method described here, developed using mosquitoes fed on volunteers, can be applied to field-caught mosquitoes to determine the host species and the biological sex of human hosts on which they have blood fed. Two important vector species were tested successfully in our laboratory experiments, demonstrating the potential of this technique to improve epidemiological models of vector-borne diseases. 
This viable and low-cost approach has the capacity to improve our understanding of vector-borne disease transmission, specifically gender differences in exposure and attractiveness to mosquitoes. The data gathered from field studies using our method can be used to shape new transmission models and aid in the implementation of more effective and targeted vector control strategies by enabling a better understanding of the drivers of vector-host interactions.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5768 ◽  
Author(s):  
Camilo Saavedra

Mortality is one of the most important parameters for the study of population dynamics. One of the main sources of information for calculating the mortality of cetaceans is the observed age structure of stranded animals. A method based on an adaptation of a Heligman-Pollard model is proposed. A freely accessible package of functions (strandCet) has been created to apply this method in the statistical software R. Total, natural, and anthropogenic mortality-at-age is estimated using only data from stranded cetaceans whose age is known. Bayesian melding estimation with Incremental Mixture Importance Sampling is used for fitting this model. This approach accounts for uncertainty and eases the estimation of credible intervals. The package also includes functions to construct life tables, Siler mortality models to calculate total mortality-at-age and Leslie matrices to derive population projections. Estimated mortalities can be tested under different scenarios. Population parameters such as population growth, net production or generation time can be derived from population projections. The strandCet R package provides a new analytical framework to assess mortality in cetacean populations and to explore the consequences of management decisions using only stranding-derived data.
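A Leslie matrix projection of the kind mentioned above can be sketched in a few lines. This is a generic textbook construction written in Python, not the strandCet API; the function names and the two-age-class example values are hypothetical.

```python
def leslie_matrix(fecundity, survival):
    """Build a Leslie matrix from age-specific fecundity and survival.

    fecundity: offspring per individual for each of k age classes (first row).
    survival:  probability of surviving from age class i to i + 1 (k - 1 values).
    """
    k = len(fecundity)
    if len(survival) != k - 1:
        raise ValueError("need exactly one survival value per age transition")
    matrix = [[0.0] * k for _ in range(k)]
    matrix[0] = [float(f) for f in fecundity]
    for age, s in enumerate(survival):
        matrix[age + 1][age] = float(s)  # sub-diagonal holds survival
    return matrix


def project(matrix, population, steps):
    """Apply n(t + 1) = L n(t) for the given number of time steps."""
    current = [float(v) for v in population]
    for _ in range(steps):
        current = [
            sum(row[j] * current[j] for j in range(len(current)))
            for row in matrix
        ]
    return current
```

For instance, `project(leslie_matrix([0, 2], [0.5]), [10, 0], 1)` moves half of the ten juveniles into the adult class; quantities such as the population growth rate would then be derived from repeated projections.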


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Chang Chen ◽  
Shixue Sun ◽  
Zhixin Cao ◽  
Yan Shi ◽  
Baoqing Sun ◽  
...  

Abstract Sample entropy is a powerful tool for analyzing the complexity and irregularity of physiological signals, which may be associated with human health. Nevertheless, the sophistication of its calculation hinders its universal application. As of today, the R language provides multiple open-source packages for calculating sample entropy, all of which, however, are designed for different scenarios. Therefore, when searching for a proper package, investigators may be confused about parameter settings and the selection of algorithms. To ease their selection, we have explored the functions of five existing R packages for calculating sample entropy and have compared their computing capability in several dimensions. We used four published datasets on respiratory and heart rate to study their input parameters, types of entropy, and program running time. In summary, NonlinearTseries and CGManalyzer can provide the analysis of sample entropy with different embedding dimensions and similarity thresholds. CGManalyzer is a good choice for calculating multiscale sample entropy of physiological signals because it not only shows sample entropy of all scales simultaneously but also provides various visualization plots. MSMVSampEn is the only package that can calculate multivariate multiscale entropies. In terms of computing time, NonlinearTseries, CGManalyzer, and MSMVSampEn run significantly faster than the other two packages. Moreover, we identify issues in the MSMVSampEn package. This article provides guidelines for researchers to find a suitable R package for their analysis and applications using sample entropy.
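For readers unfamiliar with the quantity being computed, a minimal sketch of sample entropy follows. It is written in Python for illustration and does not reproduce any of the five R packages; it assumes the usual definition with embedding dimension m, an absolute similarity threshold r (often chosen as a fraction of the series' standard deviation), Chebyshev distance, and a non-strict comparison, all of which vary between implementations.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """SampEn = -ln(A / B), where B counts template pairs of length m that
    match within tolerance r and A counts the same for length m + 1.
    Self-matches are excluded; r is an absolute tolerance here."""
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                distance = max(
                    abs(series[i + k] - series[j + k]) for k in range(length)
                )
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined in the strict sense; reported as infinite
    return -math.log(a / b)
```

A perfectly regular signal, such as a constant or strictly alternating series, yields a sample entropy of zero, and less predictable signals yield larger values. The double loop is quadratic in the series length, which is one reason running time differs so much between implementations.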


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Levente Kriston

Abstract Background Infectious disease prediction models, including virtually all epidemiological models describing the spread of the SARS-CoV-2 pandemic, are rarely evaluated empirically. The aim of the present study was to investigate the predictive accuracy of a prognostic model for forecasting the development of the cumulative number of reported SARS-CoV-2 cases in countries and administrative regions worldwide until the end of May 2020. Methods The cumulative number of reported SARS-CoV-2 cases was forecasted in 251 regions with a horizon of two weeks, one month, and two months using a hierarchical logistic model at the end of March 2020. Forecasts were compared to actual observations by using a series of evaluation metrics. Results On average, predictive accuracy was very high in nearly all regions at the two-week forecast, high in most regions at the one-month forecast, and notable in the majority of the regions at the two-month forecast. Higher accuracy was associated with the availability of more data for estimation and with a more pronounced cumulative case growth from the first case to the date of estimation. In some strongly affected regions, cumulative case counts were considerably underestimated. Conclusions Keeping its limitations in mind, the investigated model may be used for the preparation and distribution of resources during the initial phase of epidemics. Future research should primarily address the model’s assumptions and its scope of applicability. In addition, establishing a relationship with known mechanisms and traditional epidemiological models of disease transmission would be desirable.


Biostatistics ◽  
2017 ◽  
Vol 18 (3) ◽  
pp. 569-585 ◽  
Author(s):  
Panagiota Filippou ◽  
Giampiero Marra ◽  
Rosalba Radice

SUMMARY This article proposes a penalized likelihood method to estimate a trivariate probit model, which accounts for several types of covariate effects (such as linear, nonlinear, random, and spatial effects), as well as error correlations. The proposed approach also addresses the difficulty in estimating accurately the correlation coefficients, which characterize the dependence of binary responses conditional on covariates. The parameters of the model are estimated within a penalized likelihood framework based on a carefully structured trust region algorithm with integrated automatic multiple smoothing parameter selection. The relevant numerical computation can be easily carried out using the SemiParTRIV() function in a freely available R package. The proposed method is illustrated through a case study whose aim is to model jointly adverse birth binary outcomes in North Carolina.


Animals ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 2535
Author(s):  
Jeanette Wentzel ◽  
Cory Gall ◽  
Mark Bourn ◽  
Juan De Beer ◽  
Ferreira du Plessis ◽  
...  

South African protected areas account for 8% of the total landmass according to World Bank indicators. Effective conservation of biodiversity in protected areas requires the development of specific reserve management objectives addressing species and disease management. The primary objective of the current study was to identify predictors of carnivore detection in an effort to inform carnivore species management plans on Andover and Manyeleti nature reserves in South Africa. A limited number of camera traps were placed randomly using a grid system. Species detection data were analysed using mixed-effects logistic regression and Spearman’s correlation coefficients. Deterministic inverse distance weighted distribution maps were used to describe the spatial distribution of carnivore species. Camera traps identified similar species to traditional call-up surveys during the study and would be useful as an adjunct census method. Carnivore detection was associated with several variables, including the presence of specific prey species. The measured intra- and interspecies interactions suggested a risk of disease transmission among species, and vaccination for prevalent diseases should be considered to manage this risk.

