Nonparametric machine learning for mapping forest cover and exploring influential factors

2020 ◽  
Vol 35 (7) ◽  
pp. 1683-1699
Author(s):  
Bao Liu ◽  
Lei Gao ◽  
Baoan Li ◽  
Raymundo Marcos-Martinez ◽  
Brett A. Bryan
Computers ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 157
Author(s):  
Daniel Santos ◽  
José Saias ◽  
Paulo Quaresma ◽  
Vítor Beires Nogueira

Traffic accidents are one of the world's most pressing concerns, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. Many factors are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damage and its severity. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims to develop models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis of the accident data. In addition, this study proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing the severity of a road accident. Further, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.


2014 ◽  
Vol 955-959 ◽  
pp. 3803-3812
Author(s):  
Guang Di Li ◽  
Guo Yin Wang ◽  
Xue Rui Zhang ◽  
Wei Hui Deng ◽  
Fan Zhang

Storm is the most popular real-time stream processing platform and can be used for online machine learning. Similar to how Hadoop provides a set of general primitives for batch processing, Storm provides a set of general primitives for real-time computation. SAMOA, which is both a platform and a library, includes distributed algorithms for the most common machine learning tasks, much as Mahout does for Hadoop. In this paper, Forest cover types, a large benchmarking dataset available at the UCI KDD Archive, is used as the data stream source. Vertical Hoeffding Tree, a parallel streaming decision tree induction algorithm for distributed environments that is incorporated in the SAMOA API, is applied on the Storm platform. This study compares the stream processing technique for predicting forest cover types from cartographic variables with traditional machine learning algorithms applied to this dataset. The test-then-train method used in this system differs fundamentally from the traditional train-then-test approach. The results indicate that the output of the stream processing technique is asymptotically nearly identical to that of a conventional learner, but the model derived from this system is fully scalable, real-time, capable of dealing with evolving streams, and insensitive to stream ordering.
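The test-then-train scheme named in the abstract above is often called prequential evaluation: each incoming example is first used to test the current model and only afterwards to update it. The following is a minimal sketch of that loop in plain Python, not the SAMOA or Storm implementation. The `MajorityClassLearner` class and the `prequential_accuracy` function are hypothetical names introduced for illustration, and the toy learner stands in for a real streaming model such as a Hoeffding tree.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any example has been seen there is nothing to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        # Update the label counts with one labelled example.
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: score each example with the current model, then train on it."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test first ...
            correct += 1
        learner.learn(x, y)          # ... then train on the same example
        total += 1
    return correct / total if total else 0.0


# Illustrative stream of (features, label) pairs; the values are made up.
stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
```

Because every example is scored before the model has seen its label, no separate held-out test set is needed, which is what makes the scheme suitable for unbounded streams.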


2021 ◽  
Vol 45 (1) ◽  
pp. 111-124
Author(s):  
Jaehee Cho ◽  
Sehwan Kim ◽  
Gwangjin Jeong ◽  
Chonghye Kim ◽  
Ja-Kyoung Seo

Objectives: In this study, we aimed to identify the factors that influence individuals' use and non-use of fitness and diet apps on smartphones. To this end, we focused on diverse groups of predictors that could significantly affect people's use and non-use of these apps. Methods: Overall, we considered 105 factors as potential predictors and analyzed them using a machine learning algorithm, XGBoost. We selected this algorithm because it is known as one of the most accurate and popular algorithms for predicting consumer behaviors. Results: The accuracy of those factors in predicting people's use and non-use of fitness and diet apps was approximately 71.3%. In particular, the most influential predictors were mainly related to social influence, media use, overeating, social support, health management, and attitudes toward exercise. Conclusion: These findings can help scholars and practitioners develop more practical strategies for implementing fitness and diet apps.


2020 ◽  
Author(s):  
Johannes Kirchebner ◽  
Moritz Günther ◽  
Martina Sonnweber ◽  
Alice King ◽  
Steffen Lau

Abstract Background: Prolonged forensic psychiatric hospitalizations have raised ethical, economic, and clinical concerns. Due to the confounded nature of factors affecting length of stay of psychiatric offender patients, prior research has called for the application of a new statistical methodology better accommodating this data structure. The present study attempts to investigate factors contributing to long-term hospitalization of schizophrenic offenders referred to a Swiss forensic institution, using machine learning algorithms that are better suited than conventional methods to detect nonlinear dependencies between variables. Methods: In this retrospective file and registry study, multidisciplinary notes of 143 schizophrenic offenders were reviewed using a structured protocol on patients’ characteristics, criminal and medical history and course of treatment. Via a forward selection procedure, the most influential factors for length of stay were preselected. Machine learning algorithms then identified the most efficient model for predicting length-of-stay. Results: Two factors have been identified as being particularly influential for a prolonged forensic hospital stay, both of which are related to aspects of the index offense, namely (attempted) homicide and the extent of the victim's injury. The results are discussed in light of previous research on this topic. Conclusions: In this study, length of stay was determined by legal considerations, but not by factors that can be influenced therapeutically. Results emphasize that forensic risk assessments should be based on different evaluation criteria and not merely on legal aspects.


2019 ◽  
Vol 30 (3) ◽  
pp. 344-352
Author(s):  
Saisanjana Kalagara ◽  
Adam E. M. Eltorai ◽  
Wesley M. Durand ◽  
J. Mason DePasse ◽  
Alan H. Daniels

OBJECTIVE: Hospital readmission contributes substantial costs to the healthcare system. The purpose of this investigation was to create a predictive machine learning model to identify lumbar laminectomy patients at risk for postoperative hospital readmission. METHODS: Patients who had undergone a lumbar laminectomy procedure in the period from 2011 to 2014 were isolated from the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) database. Demographic characteristics and clinical factors, including complications, comorbidities, length of stay, age, and body mass index, were analyzed in relation to whether or not the patients had been readmitted to the hospital within 30 days after their procedure by utilizing independent-samples t-tests. Supervised gradient boosting machine learning was then used to create two models to predict readmission: one with all collected patient variables and one with only the variables known prior to hospital discharge. RESULTS: A total of 26,869 patients were evaluated, 5.59% (1501 patients) of whom had an unplanned readmission to the hospital within 30 days of their procedure. Readmitted patients were older and had a greater number of complications and comorbidities, longer operative time, longer hospital stay, higher BMI, and higher work relative value unit (RVU) operation score (p < 0.01). They also had a worse health status prior to surgery (p < 0.01) and were more likely to be sent to a skilled discharge destination postoperatively (p < 0.01). The model with all patient variables accurately identified 49.6% of readmissions with an overall accuracy of 95.33% (area under the curve [AUC] = 0.8059), with postdischarge complications and comorbidities as the most important predictors. The predictive model built with only clinical information known predischarge identified 40.5% of readmitted patients with an accuracy of 79.55% (AUC = 0.6901), with discharge destination, comorbidities, and American Society of Anesthesiologists (ASA) classification as the most influential factors in identifying readmitted patients. CONCLUSIONS: In this study, the authors analyzed hospital readmissions following laminectomy and developed predictive models to identify readmitted patients with an accuracy of over 95% using all variables and over 79% when using only predischarge variables. Using only the variables available predischarge, the authors created a model capable of predicting 40% of the readmitted patients. This study provides data that will assist in the development of predictive models for readmission and the creation of interventions to prevent readmission in high-risk patients.
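As a point of reference for the accuracy figures reported above, the short sketch below computes the accuracy that a trivial classifier would reach by never predicting readmission. It is not part of the authors' analysis. It uses only the cohort size and readmission count stated in the abstract, and the function name `majority_baseline_accuracy` is a hypothetical one chosen for illustration.

```python
def majority_baseline_accuracy(n_total, n_positive):
    """Accuracy of a classifier that always predicts the negative (majority) class."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_total - n_positive) / n_total


# Figures taken from the abstract: 26,869 patients, 1501 of them readmitted.
baseline = majority_baseline_accuracy(26869, 1501)
print(f"always-negative baseline accuracy: {baseline:.4f}")
```

With a readmission rate of about 5.6%, this baseline is roughly 94.4%, which is why the share of readmissions actually identified (49.6% and 40.5% in the abstract) and the AUC are worth reading alongside overall accuracy.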


2021 ◽  
pp. 1-10
Author(s):  
Eduardo Rojas ◽  
Brian R Zutta ◽  
Yessenia K Velazco ◽  
Javier G Montoya-Zumaeta ◽  
Montserrat Salvà-Catarineu

Summary The prevention of tropical forest deforestation is essential for mitigating climate change. We tested the machine learning algorithm Maxent to predict deforestation across the Peruvian Amazon. We used official annual 2001–2019 deforestation data to develop a predictive model and to test the model’s accuracy using near-real-time forest loss data for 2020. Distance from agricultural land and distance from roads were the predictor variables that contributed most to the final model, indicating that a narrower set of variables contributes nearly 80% of the information necessary for prediction at scale. The permutation importance, which indicates variable information not present in the other variables, was also highest for distance from agricultural land and distance from roads, at 40.5% and 14.3%, respectively. The predictive model registered 73.2% of the 2020 early alerts in a high or very high risk category; less than 1% of forest cover in national protected areas was registered as very high risk, but buffer zones were far more vulnerable, with 15% of forest cover being in this category. To our knowledge, this is the first study to use 19 years of annual data for deforestation risk. The open-source machine learning method could be applied to other forest regions, at scale, to improve strategies for reducing future deforestation.


2020 ◽  
Author(s):  
Serge Dolgikh

Abstract Based on a subset of Covid-19 Wave 1 cases at a time point near TZ+3m (April, 2020), we analyze the factors influencing the epidemic's impacts using several different statistical methods. The consistent conclusion of the analysis of the available data is that, apart from policy and management quality, which was the dominant factor, the most influential factors among those considered were current or recent universal BCG immunization and the prevalence of smoking.


2019 ◽  
Vol 119 ◽  
pp. 407-417
Author(s):  
Long Ye ◽  
Lei Gao ◽  
Raymundo Marcos-Martinez ◽  
Dirk Mallants ◽  
Brett A. Bryan
