Data-driven prediction of peak sound levels at long range using sparse, ground-level meteorological measurements and a random forest

Wildfires have plagued the state of California for years, causing economic and environmental losses. In 2018, wildfires in California caused nearly 800 million dollars in economic losses and claimed more than 100 lives. More than 1.6 million acres of land have burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, these studies focused on specific perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters, such as powerlines, terrain, and vegetation, from different perspectives, which improved the spatial and temporal accuracy of wildfire risk prediction, including fire ignition. The combined model uses the spatial and temporal parameters as a single combined dataset for training and prediction, whereas the ensemble model was trained on separate parameters, with the resulting models later stacked to work as a single model. Our experiments show that the combined model was more accurate than the ensemble of random forest models trained on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved an accuracy of 92% in predicting wildfire risk, including ignition, by using regional spatial and temporal data along with standard data parameters in Northern California.
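The abstract lists accuracy and confusion matrices among its evaluation metrics. As a minimal sketch of how those two quantities are computed for a binary risk label, the snippet below uses only the Python standard library. The function names and the label vectors are made up for illustration and are not data or code from the study.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Return a nested dict counts[true_label][predicted_label]."""
    pairs = Counter(zip(y_true, y_pred))
    return {t: {p: pairs[(t, p)] for p in labels} for t in labels}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 1 = fire risk, 0 = no fire risk (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(cm)
print(acc)
```

In this toy case six of the eight predictions are correct, so the accuracy is 0.75, and the off-diagonal entries of the matrix separate missed fire-risk cases from false alarms.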

Data-driven approach for tailoring facilitation strategies to overcome implementation barriers in community pharmacy

Implementation Science ◽

10.1186/s13012-021-01138-8 ◽

2021 ◽

Vol 16 (1) ◽

Author(s):

Lydia Moussa ◽

Shalom Benrimoj ◽

Katarzyna Musial ◽

Simon Kocbek ◽

Victoria Garcia-Cardenas

Keyword(s):

Random Forest ◽

Community Pharmacy ◽

Implementation Research ◽

Theoretical Domains Framework ◽

Professional Services ◽

Data Driven ◽

Machine Learning Techniques ◽

Community Pharmacies ◽

Continuous Change ◽

Implementation Barriers

Background: Implementation research has delved into barriers to implementing change and interventions for the implementation of innovation in practice. A gap remains, however, in connecting implementation barriers to the most effective implementation strategies and in providing a more tailored approach during implementation. This study aimed to explore barriers to the implementation of professional services in community pharmacies and to predict the effectiveness of facilitation strategies to overcome implementation barriers using machine learning techniques. Methods: Six change facilitators facilitated a 2-year change programme aimed at implementing professional services across community pharmacies in Australia. A mixed methods approach was used in which barriers were identified by change facilitators during the implementation study. Change facilitators trialled and recorded tailored facilitation strategies delivered to overcome identified barriers. Barriers were coded according to implementation factors derived from the Consolidated Framework for Implementation Research and the Theoretical Domains Framework. Tailored facilitation strategies were coded into 16 facilitation categories. To predict the effectiveness of these strategies, data mining with random forest was used to provide the highest level of accuracy. A predictive resolution percentage was established for each implementation strategy in relation to the barriers that were resolved by that particular strategy. Results: During the 2-year programme, 1131 barriers and facilitation strategies were recorded by change facilitators. The most frequently identified barriers were a ‘lack of ability to plan for change’, ‘lack of internal supporters for the change’, ‘lack of knowledge and experience’, ‘lack of monitoring and feedback’, ‘lack of individual alignment with the change’, ‘undefined change objectives’, ‘lack of objective feedback’ and ‘lack of time’. The random forest algorithm achieved 96.9% prediction accuracy. The strategy category with the highest predicted resolution rate across the greatest number of implementation barriers was ‘to empower stakeholders to develop objectives and solve problems’. Conclusions: Results from this study have provided a better understanding of implementation barriers in community pharmacy and of how data-driven approaches can be used to predict the effectiveness of facilitation strategies to overcome implementation barriers. Tailored facilitation strategies such as these can increase the rate of real-time implementation of innovations in healthcare, leading to an industry that can confidently and efficiently adapt to continuous change.
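The abstract does not say how the resolution percentage per strategy and barrier was calculated. One simple reading, offered here only as an assumption, is an empirical rate: the share of recorded attempts in which a given strategy resolved a given barrier. The sketch below computes that rate with the standard library. It is not the study's random forest model, and the records and the "provide feedback" strategy name are hypothetical.

```python
from collections import defaultdict


def resolution_rates(records):
    """Map each (strategy, barrier) pair to the percentage of attempts resolved."""
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for strategy, barrier, was_resolved in records:
        totals[(strategy, barrier)] += 1
        if was_resolved:
            resolved[(strategy, barrier)] += 1
    return {key: 100.0 * resolved[key] / totals[key] for key in totals}


# Hypothetical facilitator records: (strategy, barrier, resolved?).
records = [
    ("empower stakeholders", "lack of time", True),
    ("empower stakeholders", "lack of time", True),
    ("empower stakeholders", "lack of time", False),
    ("provide feedback", "lack of time", False),
    ("provide feedback", "undefined change objectives", True),
]

rates = resolution_rates(records)
for (strategy, barrier), pct in sorted(rates.items()):
    print(f"{strategy} / {barrier}: {pct:.1f}%")
```

A table of this kind could also serve as a sanity check against model-predicted resolution rates, since it reflects only what was observed.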

A data-driven approach to forecasting ground-level ozone concentration

International Journal of Forecasting ◽

10.1016/j.ijforecast.2021.07.008 ◽

2021 ◽

Author(s):

Dario Marvin ◽

Lorenzo Nespoli ◽

Davide Strepparava ◽

Vasco Medici

Keyword(s):

Ozone Concentration ◽

Ground Level ◽

Data Driven ◽

Ground Level Ozone ◽

Data Driven Approach

Estimating ground-level PM2.5 using micro-satellite images by a convolutional neural network and random forest approach

Atmospheric Environment ◽

10.1016/j.atmosenv.2020.117451 ◽

2020 ◽

Vol 230 ◽

pp. 117451 ◽

Cited By ~ 2

Author(s):

Tongshu Zheng ◽

Michael H. Bergin ◽

Shijia Hu ◽

Joshua Miller ◽

David E. Carlson

Keyword(s):

Neural Network ◽

Random Forest ◽

Convolutional Neural Network ◽

Satellite Images ◽

Ground Level

A Hybrid Approach: Dynamic Diagnostic Rules for Sensor Systems in Industry 4.0 Generated by Online Hyperparameter Tuned Random Forest

10.20944/preprints202007.0548.v1 ◽

2020 ◽

Author(s):

Ahlam Mallak ◽

Madjid Fathi

Keyword(s):

Random Forest ◽

Random Search ◽

Hybrid Approach ◽

Fault Detection And Diagnosis ◽

Data Driven ◽

Sensor Data ◽

Sensor Systems ◽

Hydraulic Test ◽

Model Based ◽

Detection And Diagnosis

In this work, a hybrid component Fault Detection and Diagnosis (FDD) approach for industrial sensor systems is established and analyzed, providing a hybrid schema that combines the advantages and eliminates the drawbacks of both model-based and data-driven methods of diagnosis. The work also sheds light on a new use of Random Forest (RF) together with model-based diagnosis, beyond its ordinary data-driven application. RF is trained and its hyperparameters are tuned using 3-fold cross-validation over a random grid of parameters using random search, to generate diagnostic graphs as the dynamic, data-driven part of this system. Those graphs are then translated into model-based rules in the form of if-else statements, SQL queries, or semantic queries such as SPARQL, in order to feed the dynamic rules into a structured model essential for further diagnosis. The RF hyperparameters are consistently updated online using newly generated sensor data, in order to maintain the dynamicity and accuracy of the generated graphs and rules. The architecture of the proposed method is demonstrated in a comprehensive manner, and the dynamic rule extraction phase is applied in a case study on condition monitoring of a hydraulic test rig using multivariate time series sensor readings.
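The step of translating a diagnostic graph into if-else rules can be sketched as a walk over a single decision tree that emits one rule per leaf. The snippet below is an illustrative sketch, not the authors' implementation. The nested-dict tree format, the sensor names (pressure_bar, temperature_c), the thresholds, and the fault labels are all hypothetical.

```python
def tree_to_rules(node, conditions=()):
    """Walk a decision tree and return one if-then rule string per leaf."""
    if "label" in node:  # leaf node: emit the accumulated path as a rule
        premise = " and ".join(conditions) if conditions else "True"
        return [f"if {premise}: diagnosis = {node['label']!r}"]
    feature, threshold = node["feature"], node["threshold"]
    left = tree_to_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = tree_to_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


# Hypothetical tree for a hydraulic rig; all names and values are illustrative.
tree = {
    "feature": "pressure_bar", "threshold": 150.0,
    "left": {"label": "pump_leak"},
    "right": {
        "feature": "temperature_c", "threshold": 60.0,
        "left": {"label": "normal"},
        "right": {"label": "cooler_degraded"},
    },
}

rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

The same traversal could emit SQL WHERE clauses or SPARQL filters instead of if-statements by changing only the string templates.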

Comparison of Random Forest and Neural Network in Modelling the Performance and Emissions of a Natural Gas Spark Ignition Engine

Journal of Energy Resources Technology ◽

10.1115/1.4053301 ◽

2021 ◽

pp. 1-20

Author(s):

Jinlong Liu ◽

Qiao Huang ◽

Christopher Ulishney ◽

Cosmin E. Dumitrescu

Keyword(s):

Neural Network ◽

Random Forest ◽

Natural Gas ◽

Internal Combustion Engines ◽

Engine Performance ◽

Spark Ignition ◽

Spark Ignition Engine ◽

Data Driven ◽

Ann Model ◽

Mean Square Errors

Machine learning (ML) models can accelerate the development of efficient internal combustion engines. This study assessed the feasibility of data-driven methods for predicting the performance of a diesel engine modified to natural gas spark ignition, based on a limited number of experiments. As the best ML technique cannot be chosen a priori, the applicability of different ML algorithms for such an engine application was evaluated. Specifically, the performance of two widely used ML algorithms, the random forest (RF) and the artificial neural network (ANN), in forecasting engine responses related to in-cylinder combustion phenomena was compared. The results indicated that both algorithms, with spark timing, mixture equivalence ratio, and engine speed as model inputs, produced acceptable results with respect to predicting engine performance, combustion phasing, and engine-out emissions. Despite requiring more effort in hyperparameter optimization, the ANN model performed better than the RF model, especially for engine emissions, as evidenced by the larger R-squared, smaller root-mean-square errors, and more realistic predictions of the effects of key engine control variables on the engine performance. However, in applications where knowledge of the combustion behavior is limited, it is recommended to use an RF model to quickly determine the appropriate number of model inputs. Consequently, using the RF model to define the model structure and then employing the ANN model to improve the model's predictive capability can help to rapidly build data-driven engine combustion models.
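The comparison above rests on R-squared and root-mean-square error. As a minimal sketch of how the two metrics rank a pair of models, the snippet below computes both with the standard library. The measured values and the two prediction vectors are invented for illustration and are not engine data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measurements and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements and predictions from two unnamed models.
measured = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 1.9, 3.2, 3.8]
model_b = [1.5, 1.5, 3.5, 3.5]

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(rmse(measured, pred), 4), round(r_squared(measured, pred), 4))
```

Here model_a has both the smaller RMSE and the larger R-squared, which is the pattern the abstract cites as evidence for the better-performing model.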

A Data-driven Framework for Long-Range Aircraft Conflict Detection and Resolution

ACM Transactions on Spatial Algorithms and Systems ◽

10.1145/3328832 ◽

2019 ◽

Vol 5 (4) ◽

pp. 1-23

Author(s):

Samet Ayhan ◽

Pablo Costas ◽

Hanan Samet

Keyword(s):

Long Range ◽

Conflict Detection ◽

Data Driven ◽

Conflict Detection And Resolution

A Data-Driven Approach for Winter Precipitation Classification Using Weather Radar and NWP Data

Atmosphere ◽

10.3390/atmos11070701 ◽

2020 ◽

Vol 11 (7) ◽

pp. 701

Author(s):

Bong-Chul Seo

Keyword(s):

Random Forest ◽

Binary Classification ◽

Weather Prediction ◽

Model Development ◽

Winter Precipitation ◽

Ensemble Classification ◽

Supervised Machine Learning ◽

Data Driven ◽

Support Vector ◽

Data Driven Approach

This study describes a framework that provides qualitative weather information on winter precipitation types using a data-driven approach. The framework incorporates the data retrieved from weather radars and the numerical weather prediction (NWP) model to account for relevant precipitation microphysics. To enable multimodel-based ensemble classification, we selected six supervised machine learning models: k-nearest neighbors, logistic regression, support vector machine, decision tree, random forest, and multi-layer perceptron. Our model training and cross-validation results based on Monte Carlo Simulation (MCS) showed that all the models performed better than our baseline method, which applies two thresholds (surface temperature and atmospheric layer thickness) for binary classification (i.e., rain/snow). Among all six models, random forest presented the best classification results for the basic classes (rain, freezing rain, and snow) and the further refinement of the snow classes (light, moderate, and heavy). Our model evaluation, which uses an independent dataset not associated with model development and learning, led to classification performance consistent with that from the MCS analysis. Based on the visual inspection of the classification maps generated for an individual radar domain, we confirmed the improved classification capability of the developed models (e.g., random forest) compared to the baseline one in representing both spatial variability and continuity.
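The baseline described above applies two thresholds, surface temperature and atmospheric layer thickness, to separate rain from snow. The abstract gives neither the threshold values nor how the two tests are combined, so the sketch below makes both up: it assumes snow requires both values to be at or below their thresholds, and the defaults of 0.0 °C and 5400 m are illustrative placeholders rather than the study's settings.

```python
def baseline_precip_type(surface_temp_c, thickness_m,
                         temp_threshold_c=0.0, thickness_threshold_m=5400.0):
    """Return "snow" when both values are at or below their thresholds, else "rain".

    Thresholds and the AND combination are assumptions for illustration.
    """
    is_cold = surface_temp_c <= temp_threshold_c
    is_thin = thickness_m <= thickness_threshold_m
    return "snow" if (is_cold and is_thin) else "rain"


# Hypothetical observations: (surface temperature in deg C, layer thickness in m).
observations = [(-2.0, 5300.0), (3.0, 5300.0), (-1.0, 5500.0)]
labels = [baseline_precip_type(t, h) for t, h in observations]
print(labels)
```

A rule of this kind has no way to represent freezing rain or snow intensity, which is one reason a learned multi-class model can improve on it.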

Random Forest Classification of Alcohol Use Disorder Using fMRI Functional Connectivity, Neuropsychological Functioning, and Impulsivity Measures

Brain Sciences ◽

10.3390/brainsci10020115 ◽

2020 ◽

Vol 10 (2) ◽

pp. 115 ◽

Cited By ~ 3

Author(s):

Chella Kamarajan ◽

Babak A. Ardekani ◽

Ashwini K. Pandey ◽

Sivan Kinreich ◽

Gayathri Pandey ◽

...

Keyword(s):

Functional Connectivity ◽

Random Forest ◽

Alcohol Use ◽

Long Range ◽

Alcohol Use Disorder ◽

Neuropsychological Functioning ◽

Brain Networks ◽

Cingulate Cortex ◽

Anterior Cingulate ◽

Neuropsychological Performance

Individuals with alcohol use disorder (AUD) are known to manifest a variety of neurocognitive impairments that can be attributed to alterations in specific brain networks. The current study aims to identify specific features of brain connectivity, neuropsychological performance, and impulsivity traits that can distinguish adult males with AUD (n = 30) from healthy controls (CTL, n = 30) using the Random Forest (RF) classification method. The predictor variables were: (i) fMRI-based within-network functional connectivity (FC) of the Default Mode Network (DMN), (ii) neuropsychological scores from the Tower of London Test (TOLT) and the Visual Span Test (VST), and (iii) impulsivity factors from the Barratt Impulsiveness Scale (BIS). The RF model, with a classification accuracy of 76.67%, identified fourteen DMN connections, two neuropsychological variables (memory span and total correct scores of the forward condition of the VST), and all impulsivity factors as significantly important for classifying participants into either the AUD or CTL group. Specifically, the AUD group manifested hyperconnectivity across the bilateral anterior cingulate cortex and the prefrontal cortex as well as between the bilateral posterior cingulate cortex and the left inferior parietal lobule, while showing hypoconnectivity in long-range anterior–posterior and interhemispheric connections. Individuals with AUD also showed poorer memory performance and increased impulsivity compared to CTL individuals. Furthermore, there were significant associations among FC, impulsivity, neuropsychological performance, and AUD status. These results confirm previous findings that alterations in specific brain networks, coupled with poor neuropsychological functioning and heightened impulsivity, may characterize individuals with AUD, who can be efficiently identified using classification algorithms such as Random Forest.
