Correction: Schneider et al. A Satellite-Based Spatio-Temporal Machine Learning Model to Reconstruct Daily PM2.5 Concentrations across Great Britain. Remote Sens. 2020, 12, 3803

2021 ◽  
Vol 13 (18) ◽  
pp. 3588
Author(s):  
Rochelle Schneider ◽  
Ana M. Vicedo-Cabrera ◽  
Francesco Sera ◽  
Pierre Masselot ◽  
Massimo Stafoggia ◽  
...  

In the original article [...]

2020 ◽  
Author(s):  
Rochelle Schneider dos Santos ◽  
Ana Vicedo-Cabrera ◽  
Francesco Sera ◽  
Massimo Stafoggia ◽  
Kees de Hoogh ◽  
...  

Epidemiological studies on the health effects of air pollution usually rely on measurements from fixed ground monitors, which provide limited spatio-temporal coverage. Data from satellites, reanalysis, and chemical transport models offer additional information used to reconstruct pollution concentrations at high spatio-temporal resolution. The aim of this study is to develop a multi-stage satellite-based machine learning model to estimate daily fine particulate matter (PM2.5) levels across Great Britain during 2008–2018. This high-resolution model consists of random forest (RF) algorithms applied in four stages. Stage-1 augments monitor-PM2.5 series using co-located PM10 measures. Stage-2 imputes missing satellite aerosol optical depth observations using atmospheric reanalysis models. Stage-3 integrates the output from previous stages with spatial and spatio-temporal variables to build a prediction model for PM2.5. Stage-4 applies the Stage-3 models to estimate daily PM2.5 concentrations over a 1 km grid. The RF architecture performed well in all stages, with results from Stage-3 showing an average cross-validated R2 of 0.767 and minimal bias. The model performed better over the temporal scale than over the spatial component, but both presented good accuracy, with an R2 of 0.795 and 0.658, respectively. The high spatio-temporal resolution and relatively high precision allow this dataset (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposure to PM2.5.


2020 ◽  
Vol 12 (22) ◽  
pp. 3803
Author(s):  
Rochelle Schneider ◽  
Ana M. Vicedo-Cabrera ◽  
Francesco Sera ◽  
Pierre Masselot ◽  
Massimo Stafoggia ◽  
...  

Epidemiological studies on the health effects of air pollution usually rely on measurements from fixed ground monitors, which provide limited spatio-temporal coverage. Data from satellites, reanalysis, and chemical transport models offer additional information used to reconstruct pollution concentrations at high spatio-temporal resolutions. This study aims to develop a multi-stage satellite-based machine learning model to estimate daily fine particulate matter (PM2.5) levels across Great Britain between 2008 and 2018. This high-resolution model consists of random forest (RF) algorithms applied in four stages. Stage-1 augments monitor-PM2.5 series using co-located PM10 measures. Stage-2 imputes missing satellite aerosol optical depth observations using atmospheric reanalysis models. Stage-3 integrates the output from previous stages with spatial and spatio-temporal variables to build a prediction model for PM2.5. Stage-4 applies the Stage-3 models to estimate daily PM2.5 concentrations over a 1 km grid. The RF architecture performed well in all stages, with results from Stage-3 showing an average cross-validated R2 of 0.767 and minimal bias. The model performed better over the temporal scale than over the spatial component, but both presented good accuracy, with an R2 of 0.795 and 0.658, respectively. These findings indicate that direct satellite observations must be integrated with other satellite-based products and geospatial variables to derive reliable estimates of air pollution exposure. The high spatio-temporal resolution and the relatively high precision allow these estimates (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposure to PM2.5.
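The staged design described above can be sketched in code. The following is a minimal illustration on synthetic data: the toy covariates, the tiny feature set, and all variable names are assumptions for illustration only, not the authors' code, data, or predictors.

```python
# Illustrative four-stage RF pipeline on synthetic site-day records.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic covariates standing in for the study's real inputs.
pm10 = rng.uniform(5, 60, n)
reanalysis_aod = rng.uniform(0.05, 0.6, n)
elevation = rng.uniform(0, 300, n)
true_pm25 = 0.6 * pm10 + 10 * reanalysis_aod - 0.01 * elevation + rng.normal(0, 1, n)

# Stage 1: augment sparse PM2.5 monitor series from co-located PM10.
has_pm25 = rng.random(n) < 0.4          # only 40% of records measure PM2.5
stage1 = RandomForestRegressor(n_estimators=50, random_state=0)
stage1.fit(pm10[has_pm25, None], true_pm25[has_pm25])
pm25_full = np.where(has_pm25, true_pm25, stage1.predict(pm10[:, None]))

# Stage 2: impute missing satellite AOD from atmospheric reanalysis.
aod = 0.9 * reanalysis_aod + rng.normal(0, 0.02, n)
aod_missing = rng.random(n) < 0.5       # half the satellite retrievals are lost
stage2 = RandomForestRegressor(n_estimators=50, random_state=0)
stage2.fit(reanalysis_aod[~aod_missing, None], aod[~aod_missing])
aod_filled = np.where(aod_missing, stage2.predict(reanalysis_aod[:, None]), aod)

# Stage 3: predict PM2.5 from gap-filled AOD plus spatial covariates.
X = np.column_stack([aod_filled, pm10, elevation])
stage3 = RandomForestRegressor(n_estimators=100, random_state=0)
stage3.fit(X, pm25_full)

# Stage 4 would apply stage3.predict over every 1 km grid cell and day.
r2 = stage3.score(X, pm25_full)
```

The key idea the sketch preserves is that each stage's gap-filled output becomes an input feature for the next, so the final prediction model sees a complete covariate set even where monitors or satellite retrievals are missing.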


2020 ◽  
Author(s):  
Jihane Elyahyioui ◽  
Valentijn Pauwels ◽  
Edoardo Daly ◽  
Francois Petitjean ◽  
Mahesh Prakash

<p>Flooding is one of the most common and costly natural hazards at the global scale. Flood models are important in supporting flood management, but flood modelling is computationally expensive, due to the high nonlinearity of the equations involved and the complexity of the surface topography. New modelling approaches based on deep learning algorithms have recently emerged for multiple applications.</p><p>This study aims to investigate the capacity of machine learning to achieve spatio-temporal flood modelling. Combining spatial and temporal input data to obtain dynamic results of water levels and flows from a machine learning model, on multiple domains and for applications in flood risk assessment, has not been achieved yet. Here, we develop increasingly complex architectures that interpret the raw input data of precipitation and terrain to generate essential spatio-temporal variables (water level and velocity fields) and derived products (flood maps), training them on hydrodynamic simulations.</p><p>An extensive training dataset is generated by solving the 2D shallow water equations on simplified topographies using Lisflood-FP.</p><p>As a first task, the machine learning model is trained to reproduce the maximum water depth, using as inputs the precipitation time series and the topographic grid. The models combine spatial and temporal information through a combination of 1D and 2D convolutional layers, pooling, merging, and upscaling. Multiple variations of this generic architecture are trained to determine the best one(s). Overall, the trained models return good results in terms of performance indices (mean squared error, mean absolute error, and classification accuracy) but fail to predict the maximum water depths with sufficient precision for practical applications.</p><p>A major limitation of this approach is the availability of training examples.
As a second task, models will be trained to bring the state of the system (spatially distributed water depth and velocity) from one time step to the next, based on the same inputs as before, generating the full solution equivalent to that of a hydrodynamic solver. The training database becomes much larger, as each pair of consecutive time steps constitutes one training example.</p><p>Assuming that a reliable model can be built and trained, this methodology could be used to build models that are faster and less computationally demanding than hydrodynamic models. Indeed, in the synthetic cases shown here, the simulation times of the machine learning models (on the order of seconds) are far shorter than those of the hydrodynamic model (a few minutes at least). These data-driven models could be used for interpolation and forecasting. The potential for extrapolation beyond the range of the training datasets will also be investigated (different topographies and high-intensity precipitation events).</p>
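The shape-handling behind "combine 1D and 2D convolutions, then merge and upscale" can be sketched with fixed numpy kernels. This is a toy stand-in: the actual architecture learns its filters from Lisflood-FP simulations, and the depth formula below is purely illustrative.

```python
# Toy sketch: merge a 1-D precipitation encoding with a 2-D terrain feature
# map, mimicking the data flow of the conv/pool/merge architecture.
# All kernels are fixed here; the real models learn them from simulations.
import numpy as np

rng = np.random.default_rng(1)
rain = rng.uniform(0, 10, 48)          # hourly precipitation series (mm/h)
terrain = rng.uniform(0, 5, (32, 32))  # simplified topography grid (m)

# "1-D conv + pooling": smooth the hyetograph, keep one scalar summary.
kernel_t = np.ones(6) / 6.0
rain_feat = np.convolve(rain, kernel_t, mode="valid").max()

# "2-D conv": a 3x3 mean filter as a stand-in for a learned spatial layer.
pad = np.pad(terrain, 1, mode="edge")
terrain_feat = sum(
    pad[i:i + 32, j:j + 32] for i in range(3) for j in range(3)
) / 9.0

# "Merge": broadcast the temporal feature over the spatial feature map to get
# a per-cell proxy for maximum water depth (deeper where terrain is lower).
max_depth = rain_feat * np.exp(-terrain_feat / terrain_feat.mean())
```

The point of the sketch is only the merge step: a low-dimensional temporal encoding is broadcast across a spatial feature map so that every grid cell's output depends on both the rainfall forcing and the local terrain.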


2018 ◽  
Author(s):  
Steen Lysgaard ◽  
Paul C. Jennings ◽  
Jens Strabo Hummelshøj ◽  
Thomas Bligaard ◽  
Tejs Vegge

A machine learning model is used as a surrogate fitness evaluator in a genetic algorithm (GA) optimization of the atomic distribution of Pt-Au nanoparticles. The machine learning accelerated genetic algorithm (MLaGA) yields a 50-fold reduction of required energy calculations compared to a traditional GA.
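The surrogate-accelerated loop can be sketched as follows. The toy "energy" function (which rewards alternating orderings of a bitstring) and the linear surrogate are illustrative stand-ins for the expensive electronic-structure evaluations and the actual machine learning model; the counter shows where the savings come from.

```python
# Minimal surrogate-accelerated GA on a toy "nanoparticle": one bit per
# site (0 = Pt, 1 = Au). Only the surrogate's top candidates get the
# expensive fitness evaluation, mirroring the MLaGA idea.
import numpy as np

rng = np.random.default_rng(2)
n_sites, pop_size = 20, 16
expensive_calls = 0

def energy(x):                      # stand-in for an expensive calculation
    global expensive_calls
    expensive_calls += 1
    return -np.sum(x[:-1] ^ x[1:])  # lower energy for alternating ordering

pop = rng.integers(0, 2, (pop_size, n_sites))
X_seen = pop.copy()
y_seen = np.array([energy(x) for x in pop])

for gen in range(20):
    # Fit a cheap linear surrogate to every configuration evaluated so far.
    A = np.column_stack([X_seen, np.ones(len(X_seen))])
    w, *_ = np.linalg.lstsq(A, y_seen, rcond=None)

    # Generate many offspring (crossover + mutation), score them cheaply.
    parents = pop[rng.integers(0, pop_size, (64, 2))]
    cut = rng.integers(1, n_sites, 64)
    kids = np.where(np.arange(n_sites) < cut[:, None], parents[:, 0], parents[:, 1])
    kids ^= rng.random(kids.shape) < 0.05
    pred = np.column_stack([kids, np.ones(len(kids))]) @ w

    # Only the surrogate's best few candidates are evaluated expensively.
    best = kids[np.argsort(pred)[:4]]
    y_new = np.array([energy(x) for x in best])
    X_seen = np.vstack([X_seen, best])
    y_seen = np.concatenate([y_seen, y_new])
    pop = X_seen[np.argsort(y_seen)[:pop_size]]
```

Here 64 offspring per generation are screened by the surrogate but only 4 reach the expensive evaluator, which is the mechanism behind the reported reduction in required energy calculations.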


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work aims to develop an improved and robust machine learning model for predicting Myocardial Infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine-learning-based computer-aided analysis system for early and accurate prediction of Myocardial Infarction (MI), using the Framingham heart study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill the missing values in the dataset and then applies principal component analysis (PCA) to extract the optimal features and enhance classifier performance. After PCA, the reduced features are partitioned into a training set (70%) and a test set (30%). The training set is given as input to four widely used classifiers, namely support vector machine, k-nearest neighbor, logistic regression, and decision tree, and the test set is used to evaluate the output of each machine learning model using performance metrics such as the confusion matrix, classification accuracy, precision, sensitivity, F1-score, and the AUC-ROC curve. Results: The outputs of the classifiers were evaluated using these performance measures. We observed that logistic regression provides higher accuracy than the k-NN, SVM, and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has better mean accuracy and standard deviation of accuracy than the other three algorithms. The AUC-ROC curves of the proposed classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, i.e., around 70%, compared with the k-NN and decision tree algorithms.
Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine-learning-based prediction models. It is capable of predicting the presence of acute myocardial infarction from heart disease risk factors, helping to decide when to start lifestyle modification and medical treatment to prevent heart disease.
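The workflow described (mean imputation, PCA feature extraction, a 70/30 split, then a classifier) maps onto a standard scikit-learn pipeline. The synthetic data below is a stand-in for the Framingham dataset, which is not reproduced here; only the pipeline structure reflects the abstract.

```python
# Illustrative pipeline mirroring the described workflow: mean imputation
# -> PCA -> 70/30 split -> logistic regression (the reported best classifier).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
n, p = 400, 10
X = rng.normal(size=(n, p))
X[:, 0] *= 3.0                          # dominant-variance feature carries the label
y = (X[:, 0] + rng.normal(0, 1.0, n) > 0).astype(int)
X[rng.random((n, p)) < 0.05] = np.nan   # inject missing values for the imputer

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = make_pipeline(
    SimpleImputer(strategy="mean"),     # mean imputation of missing values
    PCA(n_components=5),                # feature extraction / dimensionality reduction
    LogisticRegression(),               # best performer reported in the study
)
model.fit(X_tr, y_tr)
accuracy = model.score(X_te, y_te)
```

Wrapping the imputer and PCA inside the pipeline ensures both are fitted on the training split only, so the 30% test set gives an unbiased accuracy estimate.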


Author(s):  
Dhaval Patel ◽  
Shrey Shrivastava ◽  
Wesley Gifford ◽  
Stuart Siegel ◽  
Jayant Kalagnanam ◽  
...  
