Spatial Prediction of COVID-19 in China Based on Machine Learning Algorithms and Geographically Weighted Regression

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Qi Shao ◽  
Yongming Xu ◽  
Hanyi Wu

COVID-19 has swept through the world since December 2019 and caused a large number of cases and deaths. Spatial prediction of the spread of the epidemic is of great importance for disease control and management. In this study, we predicted the cumulative confirmed cases (CCCs) from Jan 17 to Mar 1, 2020, in mainland China at the city level, using machine learning algorithms, geographically weighted regression (GWR), and partial least squares regression (PLSR) based on population flow, geolocation, meteorological, and socioeconomic variables. The validation results showed that the machine learning algorithms and GWR performed well. These models could not effectively predict CCCs in Wuhan, the first city that reported COVID-19 cases in China, but performed well in other cities. Random Forest (RF) outperformed the other methods with a CV-R² of 0.84. In this model, the population flow from Wuhan to other cities (WP) was the most important feature, and the other features also made considerable contributions to the prediction accuracy. Compared with RF, GWR performed slightly worse (CV-R² = 0.81) but required fewer spatial independent variables. This study explored the spatial prediction of the epidemic based on multisource spatial independent variables, providing references for the estimation of CCCs in the regions lacking accurate and timely.
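To illustrate what "geographically weighted" means in GWR, the following is a minimal sketch, not the authors' implementation. It fits a one-predictor linear model at a target location, weighting each observation by a Gaussian kernel of its distance to that location. The function name `gwr_local_fit`, the Gaussian kernel, the fixed `bandwidth`, and the toy data are all assumptions made for illustration.

```python
import math

def gwr_local_fit(target, coords, x, y, bandwidth):
    """Fit y = intercept + slope * x at `target` by weighted least squares.

    Assumption: Gaussian kernel weights with a fixed bandwidth; real GWR
    software offers other kernels and adaptive bandwidths.
    """
    # Observations close to the target location get weights near 1,
    # distant observations get weights near 0.
    weights = []
    for cx, cy in coords:
        d = math.hypot(cx - target[0], cy - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))

    sw = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / sw
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: the x-y relationship differs between a western
# and an eastern cluster of locations.
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

west = gwr_local_fit((1, 0), coords, x, y, bandwidth=10.0)
east = gwr_local_fit((101, 0), coords, x, y, bandwidth=10.0)
```

Because the coefficients are re-estimated at every location, the local slope near the western cluster differs from the one near the eastern cluster, which a single global regression could not express.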

2020 ◽  
Author(s):  
Hanna Meyer ◽  
Edzer Pebesma

Spatial mapping is an important task in environmental science to reveal spatial patterns and changes of the environment. In this context, predictive modelling using flexible machine learning algorithms has become very popular. However, looking at the diversity of modelled (global) maps of environmental variables, one might increasingly get the impression that machine learning is a magic tool to map everything. Recently, the reliability of such maps has been increasingly questioned, calling for a reliable quantification of uncertainties.

Though spatial (cross-)validation allows giving a general error estimate for the predictions, models are usually applied to make predictions for a much larger area or might even be transferred to make predictions for an area on which they were not trained. But when predictions are made on heterogeneous landscapes, there will be areas that feature environmental properties that have not been observed in the training data and hence have not been learned by the algorithm. This is problematic, as most machine learning algorithms are weak at extrapolation and can only make reliable predictions for environments with conditions the model has knowledge about. Hence, predictions for environmental conditions that differ significantly from the training data have to be considered uncertain.

To approach this problem, we suggest a measure of uncertainty that allows identifying locations where predictions should be regarded with care. The proposed uncertainty measure is based on distances to the training data in the multidimensional predictor variable space. However, distances are not equally relevant within the feature space: some variables are more important than others in the machine learning model and hence are mainly responsible for prediction patterns. Therefore, we weight the distances by the model-derived importance of the predictors.

As a case study, we use a simulated area-wide response variable for Europe, bio-climatic variables as predictors, as well as simulated field samples. Random Forest is applied as the algorithm to predict the simulated response. The model is then used to make predictions for the whole of Europe. We then calculate the corresponding uncertainty and compare it to the area-wide true prediction error. The results show that the uncertainty map reflects the patterns in the true error very well and considerably outperforms ensemble-based standard deviations of predictions as an indicator of uncertainty.

The resulting map of uncertainty gives valuable insights into spatial patterns of prediction uncertainty, which is important when the predictions are used as a baseline for decision making or subsequent environmental modelling. Hence, we suggest that a map of distance-based uncertainty should be given in addition to prediction maps.
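The distance-based idea can be sketched as follows. This is an illustrative reading of the abstract rather than the authors' exact measure: predictors are standardized with the training mean and standard deviation, each is multiplied by its importance, and the uncertainty proxy is the Euclidean distance to the nearest training sample. The function name, the standardization step, the multiplicative weighting, and the toy numbers are assumptions.

```python
import math

def weighted_distance_to_training(point, train, importance):
    """Distance from `point` to its nearest training sample in
    importance-weighted, standardized predictor space.

    Assumption: weights multiply standardized predictors; the published
    measure may normalize or weight differently.
    """
    n_vars = len(importance)
    n = len(train)
    means = [sum(row[j] for row in train) / n for j in range(n_vars)]
    stds = []
    for j in range(n_vars):
        var = sum((row[j] - means[j]) ** 2 for row in train) / n
        # Guard against constant predictors (zero standard deviation).
        stds.append(math.sqrt(var) or 1.0)

    def transform(row):
        return [importance[j] * (row[j] - means[j]) / stds[j]
                for j in range(n_vars)]

    p = transform(point)
    return min(math.dist(p, transform(row)) for row in train)

# Hypothetical training data with two predictors.
train = [[0.0, 0.0], [2.0, 10.0], [4.0, 20.0]]

near = weighted_distance_to_training([5.0, 10.0], train, [1.0, 1.0])
far = weighted_distance_to_training([10.0, 10.0], train, [1.0, 1.0])
# With zero importance on the second predictor, an extreme value in it
# no longer moves the point away from the training data.
ignored = weighted_distance_to_training([2.0, 1000.0], train, [1.0, 0.0])
```

Locations with a large distance lie outside the conditions the model has seen, so their predictions would be flagged as less trustworthy.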


Author(s):  
Yuri Andrei Gelsleichter ◽  
Lúcia Helena Cunha dos Anjos ◽  
Elias Mendes Costa ◽  
Gabriela Valente ◽  
Paula Debiasi ◽  
...  

Visible and near-infrared reflectance (Vis–NIR) techniques are a plausible method for soil analysis. The main objective of the study was to investigate the capacity to predict the soil properties Al, Ca, K, Mg, Na, P, pH, total carbon (TC), H and N, using different spectral (350–2500 nm) pre-treatments and machine learning algorithms such as Artificial Neural Network (ANN), Random Forest (RF), Partial Least-squares Regression (PLSR) and Cubist (CB). The 300 soil samples were collected in the upper part of the Itatiaia National Park (INP), located in the Southeastern region of Brazil. The models were evaluated with 10-fold cross-validation. The best spectral pre-treatment was the Inverse of Reflectance by a Factor of 10⁴ (IRF4): for TC with CB it gave an R² averaged across the folds of 0.85 and an RMSE of 1.96, and for H an R² of 0.67 and an RMSE of 0.041. Among the individual fold models for TC, the best had an R² of 0.95. These results are relevant for the INP management plan and also for similar environments. The good correlation obtained with Vis–NIR techniques can be used for remote sensing monitoring, especially in areas with very restricted access such as INP.
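The distinction between the R² averaged across folds and the best single-fold R² can be sketched in a few lines. This is a minimal illustration under stated assumptions: a one-predictor least-squares line stands in for the study's models, folds are assigned by index modulo `k`, and the data are made up. None of the names below come from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def kfold_r2(x, y, k=10):
    """Return (mean R² across folds, best single-fold R²)."""
    scores = []
    for fold in range(k):
        # Assumption: deterministic fold assignment by index modulo k.
        test_idx = [i for i in range(len(x)) if i % k == fold]
        train_idx = [i for i in range(len(x)) if i % k != fold]
        intercept, slope = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        preds = [intercept + slope * x[i] for i in test_idx]
        scores.append(r_squared([y[i] for i in test_idx], preds))
    return sum(scores) / k, max(scores)

# Hypothetical data: a linear trend with a small deterministic disturbance.
x = [float(v) for v in range(20)]
y = [2 * v + 1 + (1 if v % 3 == 0 else -1) for v in range(20)]
mean_r2, best_r2 = kfold_r2(x, y, k=10)
```

The best fold is always at least as good as the average, which is why a fold-averaged score is the more conservative figure to report.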


2020 ◽  
Vol 12 (22) ◽  
pp. 9338
Author(s):  
Anna Kopeć ◽  
Paweł Trybała ◽  
Dariusz Głąbicki ◽  
Anna Buczyńska ◽  
Karolina Owczarz ◽  
...  

Mining operations cause negative changes in the environment. Therefore, such areas require constant monitoring, which can benefit from remote sensing data. In this article, research was carried out on the environmental impact of underground hard coal mining in the Bogdanka mine, located in southeastern Poland. For this purpose, spectral indexes, satellite radar interferometry, Geographic Information System (GIS) tools and machine learning algorithms were utilized. Based on optical, radar, geological, hydrological and meteorological data, a spatial model was developed to determine the statistical significance of the selected factors' individual impact on the occurrence of wetlands. The results show that Normalized Difference Vegetation Index (NDVI) change, terrain height, groundwater level and terrain displacement had a considerable influence on the occurrence of wetlands in the research area. Moreover, the machine learning model developed using the Random Forest algorithm allowed for an efficient determination of potential flooding zones based on a set of spatial variables, correctly detecting 76% of the wetland area. Finally, Geographically Weighted Regression (GWR) modelling enabled identification of local anomalies in the selected factors' influence on the occurrence of wetlands, which in turn helped to understand the causes of wetland formation.


Author(s):  
Zouiten Mohammed ◽  
Chaaouan Hanae ◽  
Setti Larbi

Forest fires have caused considerable losses to ecologies, societies and economies worldwide. To minimize these losses and reduce forest fires, modeling and predicting the occurrence of forest fires are meaningful because they can support forest fire prevention and management. In recent years, the convolutional neural network (CNN) has become an important state-of-the-art deep learning algorithm, and its implementation has enriched many fields. Therefore, a competitive spatial prediction model for automatic early detection of wild forest fires using machine learning algorithms can be proposed. This model can help researchers to predict forest fires and identify risk zones. A system applying machine learning algorithms to geodata will be able to notify interested parties and authorities in real time by issuing alerts and displaying them on maps based on geographical processing, for greater effectiveness and better analysis of the situation. This research extends the application of machine learning algorithms for early forest fire prediction to detection and representation in geographical information system (GIS) maps.


2019 ◽  
Vol 11 (21) ◽  
pp. 2520 ◽  
Author(s):  
Hua Wu ◽  
Wangmin Ying

Net surface shortwave radiation (NSSR) is one of the most important fundamental parameters in various land processes. Benefiting from their efficient nonlinear fitting ability, machine learning algorithms have great potential in the retrieval of NSSR. However, few studies have explored the level of accuracy that machine learning algorithms can reach for different land covers on the worldwide scale and what the optimal independent variables are in the machine learning-based NSSR model. To guide the correct use of machine learning algorithms in the retrieval of NSSR, it is necessary to give a comprehensive analysis in terms of algorithm complexity, accuracy, and other aspects. In this study, three classic machine learning algorithms, Random Forest (RF), Artificial Neural Network (ANN), and Support Vector Regression (SVR), were built with tuned hyperparameters to estimate instantaneous NSSR from carefully selected independent variables, including top of atmosphere (TOA) channel spectral reflectance, geographic parameters, surface information, and atmosphere conditions. Global FLUXNET in situ measurements throughout 2014 were used to validate the accuracies of retrieved NSSR over various land cover types. The root mean square error (RMSE) was below 55 W/m², and the error histograms were similarly distributed. Approximately 50% of the absolute errors were within 25 W/m². The performance of NSSR estimation differed among surface types, and the performance of the three machine learning methods also differed within a specific surface type. However, the RF method may be considered the optimal methodology to retrieve NSSR from MODIS data, owing to its relatively better precision and simpler hyperparameter tuning process.
The importance analysis of the proposed independent variables of NSSR retrieval shows that the introduction of geographic information can effectively reduce the error of NSSR retrieval, and that surface information and atmosphere information are not necessary. It was also found that a combination of geographic information and blue band TOA reflectance already achieves good accuracy in NSSR retrieval, which implies that the NSSR model could be transferred to other satellite sensors, especially those with few channels. In short, the NSSR model with machine learning algorithms would be an efficient, concise, and general method in the future.
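The two validation figures quoted above, RMSE and the share of absolute errors within a threshold, can be computed as in the following minimal sketch. The function name `error_summary` and the four observation and retrieval values are hypothetical and are not taken from the study; only the 25 W/m² threshold comes from the abstract.

```python
import math

def error_summary(observed, retrieved, threshold=25.0):
    """Return (RMSE, share of samples with |error| <= threshold).

    Both inputs are assumed to be in the same units (e.g. W/m²).
    """
    errors = [r - o for o, r in zip(observed, retrieved)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    share = sum(1 for e in errors if abs(e) <= threshold) / len(errors)
    return rmse, share

# Hypothetical in situ measurements and retrievals in W/m².
observed = [100.0, 200.0, 300.0, 400.0]
retrieved = [110.0, 170.0, 300.0, 460.0]
rmse, share_within_25 = error_summary(observed, retrieved)
```

RMSE is dominated by the largest errors, while the within-threshold share describes how many retrievals are close, so the two numbers complement each other.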


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2350 ◽  
Author(s):  
Mariusz Pelc ◽  
Yuriy Khoma ◽  
Volodymyr Khoma

In this paper, the possibility of using the ECG signal as an unequivocal biometric marker for authentication and identification purposes is presented. Furthermore, since the ECG signal was acquired from 4 sources that differed in measurement equipment, electrode positioning, number of patients and duration of the ECG record acquisition, we additionally provide an estimate of the extent of information available in the ECG record. To provide a more objective assessment of the credibility of the identification method, selected machine learning algorithms were used in two configurations: with and without compression. The results confirm that the ECG signal can be regarded as a valid biometric marker that is very robust to hardware variations, noise and artifacts, that is stable over time, and that scales to a sizeable number (~100) of users. Our experiments indicate that the most promising algorithms for ECG identification are LDA, KNN and MLP. Moreover, our results show that PCA compression, used as part of data preprocessing, not only fails to bring any noticeable benefit but in some cases might even reduce accuracy.

