A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data

2022 ◽  
pp. 111779
Author(s):  
Joseph Galasso ◽  
Duy M. Cao ◽  
Robert Hochberg
2021 ◽  
Author(s):  
Joseph Galasso ◽  
Duy M. Cao ◽  
Robert Hochberg

During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics within a population; thus, such models are susceptible to human error, unexpected events, or unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number R(t) from compartmental models. A novel input training feature is case projections generated by aligning the estimated effective reproduction number from a compartmental model with real-time testing data until the two are maximally correlated, which helps our model fit the epidemic's trajectory as ascertained by traditional models. Any poor reliability of R(t) due to flaws in the compartmental model is mitigated by including dynamic population mobility and the prevalence and mortality of non-COVID-19 diseases, which gauge population disease susceptibility. The model was used to generate forecasts for 1, 2, 3, and 4 weeks into the future for each reference week within 11/01/2020 - 01/10/2021 for 3068 counties. Over this time period, it maintained a mean absolute error (MAE) of less than 300 weekly cases/100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it is promising for ensemble modeling because its training feature set can be expanded while maintaining good performance and limited resource utilization.
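The alignment step described in this abstract can be illustrated with a small lag search. This is only a sketch under assumptions: the abstract does not say how the alignment is carried out, so the code assumes it means shifting the R(t) series in time and keeping the shift with the highest Pearson correlation against a testing series. The function names and the toy series are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def best_lag(rt, positivity, max_lag):
    """Return (lag, r) for the forward shift of `rt` that correlates best with `positivity`."""
    best = (0, pearson(rt, positivity))
    for lag in range(1, max_lag + 1):
        # Pair rt[t] with positivity[t + lag], dropping the unmatched ends.
        r = pearson(rt[:-lag], positivity[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Hypothetical toy data: the testing series is the R(t) series delayed by two steps.
rt = [1.0, 1.2, 1.5, 1.1, 0.9, 0.8, 1.0, 1.3]
positivity = [0.0, 0.0] + rt[:-2]
lag, r = best_lag(rt, positivity, max_lag=4)
print(lag, round(r, 6))
```

In a setup like this, the shifted series could then be fed to the forecasting model as one additional feature; whether the authors do exactly that is not stated here.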


2021 ◽  
Author(s):  
Christian Thiele ◽  
Gerrit Hirschfeld ◽  
Ruth von Brachel

Registries of clinical trials are a potential source for scientometric analysis of medical research and serve important functions for the research community and the public at large. Clinical trials that recruit patients in Germany are usually registered in the German Clinical Trials Register (DRKS) or in international registries such as ClinicalTrials.gov. Furthermore, the International Clinical Trials Registry Platform (ICTRP) aggregates trials from multiple primary registries. We queried the DRKS, ClinicalTrials.gov, and the ICTRP for trials with a recruiting location in Germany. Trials that were registered in multiple registries were linked using the primary and secondary identifiers and a Random Forest model based on various similarity metrics. We identified 35,912 trials that were conducted in Germany. The majority of the trials were registered in multiple databases. 32,106 trials were linked using primary IDs, 26 were linked using a Random Forest model, and 10,537 internal duplicates on ICTRP were identified using the Random Forest model after finding pairs with matching primary or secondary IDs. In cross-validation, the Random Forest increased the F1-score from 96.4% to 97.1% compared to a linkage based solely on secondary IDs on a manually labelled data set. 28% of all trials were registered in the German DRKS. 54% of the trials on ClinicalTrials.gov, 43% of the trials on the DRKS and 56% of the trials on the ICTRP were pre-registered. The ratio of pre-registered studies and the ratio of studies that are registered in the DRKS increased over time.
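For readers unfamiliar with how such a linkage is scored, the following sketch shows one possible similarity feature for a candidate record pair and the F1-score of a predicted link set against a manually labelled one. It is an illustration only: the abstract does not list the similarity metrics used, so the `difflib`-based title similarity is an assumption, and the registry IDs below are made up.

```python
from difflib import SequenceMatcher

def title_similarity(a, b):
    """Case-insensitive similarity of two trial titles in [0, 1] (assumed feature)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def f1_score(predicted, actual):
    """F1 of a set of predicted links against a set of labelled true links."""
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(actual)
    return 2 * precision * recall / (precision + recall)

# Hypothetical record pairs; the identifiers are placeholders, not real registrations.
labelled = {("DRKS-A", "NCT-1"), ("DRKS-B", "NCT-2"), ("DRKS-C", "NCT-3")}
predicted = {("DRKS-A", "NCT-1"), ("DRKS-B", "NCT-2"), ("DRKS-D", "NCT-9")}
score = f1_score(predicted, labelled)
print(round(score, 3))
```

A classifier such as a random forest would take several features of this kind per candidate pair and output link or no link; the F1-score then summarizes precision and recall of those decisions.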


2021 ◽  
Vol 10 (8) ◽  
pp. 503
Author(s):  
Hang Liu ◽  
Riken Homma ◽  
Qiang Liu ◽  
Congying Fang

The simulation of future land use can provide decision support for urban planners and decision makers, which is important for sustainable urban development. Using a cellular automata-random forest model, we considered two scenarios to predict intra-land use changes in Kumamoto City from 2018 to 2030: an unconstrained development scenario, and a planning-constrained development scenario that considers disaster-related factors. The random forest was used to calculate the transition probabilities and the importance of driving factors, and cellular automata were used for future land use prediction. The results show that disaster-related factors greatly influence land vacancy, while urban planning factors are more important for medium high-rise residential, commercial, and public facilities. Under the unconstrained development scenario, urban land use tends towards spatially disordered growth while the total area grows steadily, with the largest increase in low-rise residential areas. Under the planning-constrained development scenario that considers disaster-related factors, the urban land area will continue to grow, albeit slowly and with a compact growth trend. This study provides planners with information on the relevant trends in different scenarios of land use change in Kumamoto City. Furthermore, it provides a reference for Kumamoto City’s future post-disaster recovery and reconstruction planning.
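The division of labour between the two components can be sketched as a single cellular automaton update that consumes per-cell transition probabilities. This is a toy illustration, not the study's model: the probability grid stands in for random forest output, and the conversion rule (probability times the share of neighbours already in the target class, compared with a threshold) is an assumption chosen for brevity.

```python
def neighbour_share(grid, r, c, target):
    """Fraction of the Moore neighbours of cell (r, c) that already have class `target`."""
    rows, cols = len(grid), len(grid[0])
    cells = [(r + dr, c + dc)
             for dr in (-1, 0, 1) for dc in (-1, 0, 1)
             if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols]
    if not cells:  # a 1x1 grid has no neighbours
        return 0.0
    return sum(grid[i][j] == target for i, j in cells) / len(cells)

def ca_step(grid, prob, target, threshold):
    """One synchronous update: convert a cell when prob * neighbour share >= threshold."""
    new = [row[:] for row in grid]  # read the old grid, write a copy
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] != target:
                if prob[r][c] * neighbour_share(grid, r, c, target) >= threshold:
                    new[r][c] = target
    return new

# Hypothetical 3x3 map; `prob` plays the role of model-estimated transition probabilities.
grid = [["vacant", "vacant", "res"],
        ["vacant", "res", "res"],
        ["vacant", "vacant", "res"]]
prob = [[0.9] * 3 for _ in range(3)]
nxt = ca_step(grid, prob, "res", 0.4)
for row in nxt:
    print(row)
```

A planning-constrained scenario could be imitated in such a sketch by setting the probability to zero in cells where development is restricted.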


2021 ◽  
pp. 100017
Author(s):  
Xinyu Dou ◽  
Cuijuan Liao ◽  
Hengqi Wang ◽  
Ying Huang ◽  
Ying Tu ◽  
...  

2021 ◽  
Vol 49 (3) ◽  
pp. 030006052199398
Author(s):  
Jinwu Peng ◽  
Zhili Duan ◽  
Yamin Guo ◽  
Xiaona Li ◽  
Xiaoqin Luo ◽  
...  

Objectives: Liver echinococcosis is a severe zoonotic disease caused by Echinococcus (tapeworm) infection, which is epidemic in the Qinghai region of China. Here, we aimed to explore biomarkers and establish a predictive model for the diagnosis of liver echinococcosis. Methods: Microarray profiling followed by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analysis was performed in liver tissue from patients with liver hydatid disease and from healthy controls from the Qinghai region of China. A protein–protein interaction (PPI) network and random forest model were established to identify potential biomarkers and predict the occurrence of liver echinococcosis, respectively. Results: Microarray profiling identified 1152 differentially expressed genes (DEGs), including 936 upregulated genes and 216 downregulated genes. Several previously unreported biological processes and signaling pathways were identified. The FCGR2B and CTLA4 proteins were identified by the PPI networks and random forest model. The random forest model based on FCGR2B and CTLA4 reliably predicted the occurrence of liver hydatid disease, with an area under the receiver operating characteristic curve of 0.921. Conclusion: Our findings give new insight into gene expression in patients with liver echinococcosis from the Qinghai region of China, improving our understanding of hepatic hydatid disease.
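The area under the receiver operating characteristic curve reported above has a simple interpretation: it is the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below computes it that way on made-up labels and scores; it does not reproduce the study's data or its 0.921 result.

```python
def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = disease, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in sample size, which is fine for illustration; library implementations typically use a rank-based computation instead.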

