scholarly journals Analysis of Factors Influencing Forest Loss in South Korea: Statistical Models and Machine-Learning Model

Forests ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1636
Author(s):  
Jeongmook Park ◽  
Byeoungmin Lim ◽  
Jungsoo Lee

Analyzing the current status of forest loss and its causes is crucial for understanding and preparing for future forest changes and the spatial pattern of forest loss. We investigated spatial patterns of forest loss in South Korea and assessed the effects of various factors on forest loss based on spatial heterogeneity. We used the local Moran’s I to classify forest loss spatial patterns as high–high clusters, low–low clusters, high–low outliers, and high–low outliers. Additionally, to assess the effect of factors on forest loss, two statistical models (i.e., ordinary least squares regression (OLS) and geographically weighted regression (GWR) models) and one machine-learning model (i.e., random forest (RF) model) were used. The accuracy of each model was determined using the R2, RMSE, MAE, and AICc. Across South Korea, the forest loss rate was highest in the Seoul–Incheon–Gyeonggi region. Moreover, high–high spatial clusters were found in the Seoul–Incheon–Gyeonggi and Daejeon–Chungnam regions. Among the models, the GWR model was the most accurate. Notably, according to the GWR model, the main factors driving forest loss were road density, cropland area, number of households, and number of tertiary industry establishments. However, the factors driving forest loss had varying degrees of influence depending on the location. Therefore, our findings suggest that spatial heterogeneity should be considered when developing policies to reduce forest loss.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sarah Quiñones ◽  
Aditya Goyal ◽  
Zia U. Ahmed

AbstractType 2 diabetes mellitus (T2D) prevalence in the United States varies substantially across spatial and temporal scales, attributable to variations of socioeconomic and lifestyle risk factors. Understanding these variations in risk factors contributions to T2D would be of great benefit to intervention and treatment approaches to reduce or prevent T2D. Geographically-weighted random forest (GW-RF), a tree-based non-parametric machine learning model, may help explore and visualize the relationships between T2D and risk factors at the county-level. GW-RF outputs are compared to global (RF and OLS) and local (GW-OLS) models between the years of 2013–2017 using low education, poverty, obesity, physical inactivity, access to exercise, and food environment as inputs. Our results indicate that a non-parametric GW-RF model shows a high potential for explaining spatial heterogeneity of, and predicting, T2D prevalence over traditional local and global models when inputting six major risk factors. Some of these predictions, however, are marginal. These findings of spatial heterogeneity using GW-RF demonstrate the need to consider local factors in prevention approaches. Spatial analysis of T2D and associated risk factor prevalence offers useful information for targeting the geographic area for prevention and disease interventions.


Author(s):  
Yoshiro Suzuki ◽  
Ayaka Suzuki

We built a machine learning model (ML model) which input the number of daily infection cases and the other information related to COVID-19 over the past 24 days in each of 17 provinces in South Korea, and output the total increase in the number of infection cases in each of 17 provinces over the coming 24 days. We employ a combination of XGBoost and MultiOutputRegressor as machine learning model (ML model). For each province, we conduct a binary classification whether our ML model can classify provinces where total infection cases over the coming 24 days is more than 100. The result is Sensitivity = 3/3 = 100%, Specificity = 11/14 = 78.6%, False Positive Rate = 3/11 = 21.4%, Accuracy = 14/17 = 82.4%. Sensitivity = 100% means that we did not overlook the three provinces where the number of COVID-19 infection cases increased by more than100. In addition, as for the provinces where the actual number of new COVID-19 infection cases is less than 100, the ratio (Specificity) that our ML model can correctly estimate was 78.6%, which is relatively high. From the above all, it is demonstrated that there is a sufficient possibility that our ML model can support the following four points. (1) Promotion of behavior modification of residents in dangerous areas, (2) Assistance for decision to resume economic activities in each province, (3) Assistance in determining infectious disease control measures in each province, (4) Search for factors that are highly correlated with the future increase in the number of COVID-19 infection cases.


2018 ◽  
Author(s):  
Steen Lysgaard ◽  
Paul C. Jennings ◽  
Jens Strabo Hummelshøj ◽  
Thomas Bligaard ◽  
Tejs Vegge

A machine learning model is used as a surrogate fitness evaluator in a genetic algorithm (GA) optimization of the atomic distribution of Pt-Au nanoparticles. The machine learning accelerated genetic algorithm (MLaGA) yields a 50-fold reduction of required energy calculations compared to a traditional GA.


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This proposed work is used to develop an improved and robust machine learning model for predicting Myocardial Infarction (MI) could have substantial clinical impact. Objectives: This paper explains how to build machine learning based computer-aided analysis system for an early and accurate prediction of Myocardial Infarction (MI) which utilizes framingham heart study dataset for validation and evaluation. This proposed computer-aided analysis model will support medical professionals to predict myocardial infarction proficiently. Methods: The proposed model utilize the mean imputation to remove the missing values from the data set, then applied principal component analysis to extract the optimal features from the data set to enhance the performance of the classifiers. After PCA, the reduced features are partitioned into training dataset and testing dataset where 70% of the training dataset are given as an input to the four well-liked classifiers as support vector machine, k-nearest neighbor, logistic regression and decision tree to train the classifiers and 30% of test dataset is used to evaluate an output of machine learning model using performance metrics as confusion matrix, classifier accuracy, precision, sensitivity, F1-score, AUC-ROC curve. Results: Output of the classifiers are evaluated using performance measures and we observed that logistic regression provides high accuracy than K-NN, SVM, decision tree classifiers and PCA performs sound as a good feature extraction method to enhance the performance of proposed model. From these analyses, we conclude that logistic regression having good mean accuracy level and standard deviation accuracy compared with the other three algorithms. AUC-ROC curve of the proposed classifiers is analyzed from the output figure.4, figure.5 that logistic regression exhibits good AUC-ROC score, i.e. around 70% compared to k-NN and decision tree algorithm. Conclusion: From the result analysis, we infer that this proposed machine learning model will act as an optimal decision making system to predict the acute myocardial infarction at an early stage than an existing machine learning based prediction models and it is capable to predict the presence of an acute myocardial Infarction with human using the heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent the heart disease.


Author(s):  
Dhaval Patel ◽  
Shrey Shrivastava ◽  
Wesley Gifford ◽  
Stuart Siegel ◽  
Jayant Kalagnanam ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document