Developing an Advanced PM2.5 Exposure Model in Lima, Peru

2019 ◽  
Vol 11 (6) ◽  
pp. 641 ◽  
Author(s):  
Bryan Vu ◽  
Odón Sánchez ◽  
Jianzhao Bi ◽  
Qingyang Xiao ◽  
Nadia Hansel ◽  
...  

It is well recognized that exposure to fine particulate matter (PM2.5) adversely affects health, yet few studies from South America have documented such associations because PM2.5 measurements are sparse. Lima’s topography and aging vehicular fleet result in severe air pollution, and the city has too few monitors to quantify PM2.5 levels effectively for epidemiologic studies. We developed an advanced machine learning model to estimate daily PM2.5 concentrations at a 1 km2 spatial resolution in Lima, Peru from 2010 to 2016. We combined aerosol optical depth (AOD), meteorological fields from the European Centre for Medium-Range Weather Forecasts (ECMWF), parameters from the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), and land use variables to fit a random forest model against ground measurements from 16 monitoring stations. Overall cross-validation R2 (and root mean square prediction error, RMSE) for the random forest model was 0.70 (5.97 μg/m3). Mean PM2.5 for ground measurements was 24.7 μg/m3 while mean estimated PM2.5 was 24.9 μg/m3 in the cross-validation dataset. The mean difference between ground and predicted measurements was −0.09 μg/m3 (Std.Dev. = 5.97 μg/m3), with 94.5% of observations falling within 2 standard deviations of the difference, indicating good agreement between ground measurements and predicted estimates. Surface downwards solar radiation, temperature, relative humidity, and AOD were the most important predictors, while percent urbanization, albedo, and cloud fraction were the least important predictors. Comparison of monthly mean measurements between ground and predicted PM2.5 shows good precision and accuracy from our model. Furthermore, mean annual maps of PM2.5 show consistently lower concentrations along the coast and higher concentrations in the mountains, resulting from prevailing coastal winds blowing from the Pacific Ocean in the west. 
Our model allows for construction of long-term historical daily PM2.5 measurements at 1 km2 spatial resolution to support future epidemiological studies.
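
The agreement statistics quoted in this abstract (cross-validation R2, RMSE, the mean and standard deviation of the ground-minus-predicted difference, and the share of observations within 2 standard deviations of that difference) can be computed along the following lines. This is a minimal sketch, not the authors' code: the function names and the five sample values are illustrative assumptions.

```python
import math
import statistics


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(statistics.fmean(squared))


def agreement(observed, predicted):
    """Mean difference, its sample SD, and the share of differences
    lying within 2 SD of the mean difference."""
    diffs = [o - p for o, p in zip(observed, predicted)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)
    within = sum(abs(d - mean_diff) <= 2 * sd_diff for d in diffs)
    return mean_diff, sd_diff, within / len(diffs)


# Made-up values in ug/m3, for illustration only (not data from the study).
ground = [20.0, 25.0, 30.0, 22.0, 28.0]
model = [21.0, 24.0, 29.0, 23.0, 27.0]
print(r2_score(ground, model), rmse(ground, model), agreement(ground, model))
```

In a cross-validation setting, `predicted` would hold the out-of-fold predictions, so the statistics describe performance on data the model was not fitted to.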

2018 ◽  
Vol 10 (9) ◽  
pp. 1393 ◽  
Author(s):  
Nicole DeLuca ◽  
Benjamin Zaitchik ◽  
Frank Curriero

Total suspended solids (TSS) is an important environmental parameter to monitor in the Chesapeake Bay due to its effects on submerged aquatic vegetation, pathogen abundance, and habitat damage for other aquatic life. Chesapeake Bay is home to an extensive and continuous network of in situ water quality monitoring stations that include TSS measurements. Satellite remote sensing can address the limited spatial and temporal extent of in situ sampling and has proven to be a valuable tool for monitoring water quality in estuarine systems. Most algorithms that derive TSS concentration in estuarine environments from satellite ocean color sensors utilize only the red and near-infrared bands due to the observed correlation with TSS concentration. In this study, we investigate whether utilizing additional wavelengths from the Moderate Resolution Imaging Spectroradiometer (MODIS) as inputs to various statistical and machine learning models can improve satellite-derived TSS estimates in the Chesapeake Bay. After optimizing the best-performing multispectral model, a Random Forest regression, we compare its results to those from a widely used single-band algorithm for the Chesapeake Bay. We find that the Random Forest model modestly outperforms the single-band algorithm on a holdout cross-validation dataset and offers particular advantages under high TSS conditions. We also find that both methods are similarly generalizable throughout various partitions of space and time. The multispectral Random Forest model is, however, more data intensive than the single-band algorithm, so the objectives of the application will ultimately determine which method is more appropriate.


2021 ◽  
Vol 5 (7 (113)) ◽  
pp. 59-65
Author(s):  
Nadia Moneem Al-Abdaly ◽  
Salwa R. Al-Taai ◽  
Hamza Imran ◽  
Majed Ibrahim

Because of the incorporation of discontinuous fibers, steel fiber-reinforced concrete (SFRC) outperforms regular concrete. However, due to its complexity and limited available data, the development of SFRC strength prediction techniques is still in its infancy compared with that of standard concrete. In this paper, the compressive strength of SFRC was predicted from different variables using a random forest (RF) model. A total of 133 samples were used for this purpose. To design and validate the models, we generated training and testing datasets. The proposed models were developed using ten important material parameters for SFRC characterization. To minimize bias from the training/testing split, the approach was validated using 10-fold cross-validation. Grid search cross-validation was used to determine the optimal hyperparameters for the random forest algorithm. The root mean square error (RMSE), coefficient of determination (R2), and mean absolute error (MAE) between measured and estimated values were used to validate and compare the models. The random forest model achieved RMSE = 5.66, R2 = 0.88, and MAE = 3.80. Compared with a traditional linear regression model, the random forest model produced better predictions of the compressive strength of SFRC. The findings show that hyperparameter tuning with grid search and cross-validation is an efficient way to find the optimal parameters for the RF method. RF also produces good results and offers an alternative way to predict the compressive strength of SFRC.
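
The combination of k-fold cross-validation and grid search described in this abstract can be sketched as follows. To stay dependency-free, a simple nearest-neighbour regressor stands in for the random forest, and its neighbour count plays the role of the tuned hyperparameter. All function names, the seed, and the toy data are assumptions made for illustration, not the authors' setup.

```python
import math
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Stand-in model (NOT a random forest): mean target of the nearest points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return statistics.fmean([train_y[i] for i in order[:n_neighbors]])


def cv_rmse(xs, ys, n_neighbors, k=10):
    """Pooled out-of-fold RMSE for one hyperparameter value."""
    squared_errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        for i in fold:
            pred = knn_predict(train_x, train_y, xs[i], n_neighbors)
            squared_errors.append((ys[i] - pred) ** 2)
    return math.sqrt(statistics.fmean(squared_errors))


def grid_search(xs, ys, grid, k=10):
    """Score every candidate by cross-validated RMSE and keep the lowest."""
    scores = {value: cv_rmse(xs, ys, value, k) for value in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Toy one-feature data, for illustration only.
xs = [float(i) for i in range(30)]
ys = [2.0 * x for x in xs]
best, scores = grid_search(xs, ys, grid=[1, 3, 5])
print(best, scores)
```

Every candidate is scored on the same folds, so differences in cross-validated RMSE reflect the hyperparameter value and not a particular train/test split.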


2019 ◽  
Vol 11 (6) ◽  
pp. 722 ◽  
Author(s):  
Xiaofang Sun ◽  
Guicai Li ◽  
Meng Wang ◽  
Zemeng Fan

Accurate estimation of forest aboveground biomass (AGB) is important for carbon accounting. Forest AGB estimation has been conducted with a variety of data sources and prediction methods, but many uncertainties still exist. In this study, six prediction methods, including Gaussian processes, stepwise linear regression, nonlinear regression using a logistic model, partial least squares regression, random forest, and support vector machines, were used to estimate forest AGB in Jiangxi Province, China, by combining Geoscience Laser Altimeter System (GLAS) data, Moderate Resolution Imaging Spectroradiometer (MODIS) data, and field measurements. We compared the effect of three factors (prediction methods, sample sizes of field measurements, and cross-validation settings) on the predictive quality of the methods. The results showed that the prediction methods had the most considerable effect on the prediction quality. In most cases, random forest produced more accurate estimates than the other methods. The sample sizes had an obvious effect on accuracy, especially for the random forest model. The accuracy increased with increasing sample sizes. The random forest algorithm, with a large number of field measurements, was the most precise (coefficient of determination (R2) = 0.73, root mean square error (RMSE) = 23.58 Mg/ha). Increasing the number of folds within the cross-validation settings improved the R2 values. However, no apparent change occurred in RMSE for different numbers of folds. Finally, the wall-to-wall forest AGB map over the study area was generated using the random forest model.


2021 ◽  
Vol 13 (18) ◽  
pp. 3657
Author(s):  
Chau-Ren Jung ◽  
Wei-Ting Chen ◽  
Shoji F. Nakayama

Satellite-based models for estimating concentrations of particulate matter with an aerodynamic diameter less than 2.5 μm (PM2.5) have seldom been developed for islands with complex topography in the monsoon area, where, unlike in continental regions, the transport of PM2.5 is influenced by both synoptic-scale winds and local-scale circulations. We validated Multi-Angle Implementation of Atmospheric Correction (MAIAC) aerosol optical depth (AOD) with ground observations in Japan and developed a 1-km-resolution national-scale model between 2011 and 2016 to estimate daily PM2.5 concentrations. A two-stage random forest model integrating MAIAC AOD with meteorological variables and land use data was applied to develop the model. The first-stage random forest model was used to impute the missing AOD values. The second-stage random forest model was then utilised to estimate ground PM2.5 concentrations. Ten-fold cross-validation was performed to evaluate the model performance. There was good consistency between MAIAC AOD and ground truth in Japan (correlation coefficient = 0.82 and 74.62% of data falling within the expected error). For model training, the model showed a training coefficient of determination (R2) of 0.98 and a root mean square error (RMSE) of 1.22 μg/m3. For the 10-fold cross-validation, the cross-validation R2 and RMSE of the model were 0.86 and 3.02 μg/m3, respectively. A subsite validation was used to validate the model at the grids overlapping with the AERONET sites, and the model performance was excellent at these sites with a validation R2 (RMSE) of 0.94 (1.78 μg/m3). Additionally, model performance improved as AOD coverage increased. The ten most important predictors for estimating ground PM2.5 concentrations were day of the year, temperature, AOD, relative humidity, 10-m-height zonal wind, 10-m-height meridional wind, boundary layer height, precipitation, surface pressure, and population density. 
MAIAC AOD showed high retrieval accuracy in Japan. The performance of the satellite-based model was excellent, which showed that PM2.5 estimates derived from the model were reliable and accurate. These estimates can be used to assess both the short-term and long-term effects of PM2.5 on health outcomes in epidemiological studies.
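
The two-stage structure described in this abstract (first impute missing AOD, then estimate PM2.5 from the gap-filled AOD) can be sketched as below. Both stages use a one-variable least-squares line as a stand-in for the random forests, with humidity as the single stage-one covariate. The field names, the choice of covariate, and the synthetic records are all assumptions made for illustration.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_estimate(records):
    """records: dicts with 'humidity', 'aod' (None if not retrieved)
    and 'pm25' (None where there is no ground monitor)."""
    # Stage 1: learn AOD from the covariate where AOD exists, then fill gaps.
    seen = [r for r in records if r["aod"] is not None]
    s1, i1 = fit_line([r["humidity"] for r in seen], [r["aod"] for r in seen])
    filled = [
        r["aod"] if r["aod"] is not None else s1 * r["humidity"] + i1
        for r in records
    ]
    # Stage 2: learn PM2.5 from gap-filled AOD at monitored cells,
    # then predict for every cell.
    pairs = [(a, r["pm25"]) for a, r in zip(filled, records) if r["pm25"] is not None]
    s2, i2 = fit_line([a for a, _ in pairs], [p for _, p in pairs])
    return filled, [s2 * a + i2 for a in filled]


# Synthetic records, exactly linear by construction (illustration only).
records = [
    {"humidity": 40.0, "aod": 0.4, "pm25": 25.0},
    {"humidity": 60.0, "aod": 0.6, "pm25": 35.0},
    {"humidity": 80.0, "aod": 0.8, "pm25": 45.0},
    {"humidity": 50.0, "aod": None, "pm25": None},
]
filled_aod, pm25_estimates = two_stage_estimate(records)
print(filled_aod[-1], pm25_estimates[-1])
```

The design point is that stage two never sees a missing AOD value, so predictions are available for every grid cell and day, including those where the satellite retrieval failed.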


2018 ◽  
Vol 52 (7) ◽  
pp. 4173-4179 ◽  
Author(s):  
Cole Brokamp ◽  
Roman Jandarov ◽  
Monir Hossain ◽  
Patrick Ryan

2021 ◽  
Author(s):  
Christian Thiele ◽  
Gerrit Hirschfeld ◽  
Ruth von Brachel

Registries of clinical trials are a potential source for scientometric analysis of medical research and serve important functions for the research community and the public at large. Clinical trials that recruit patients in Germany are usually registered in the German Clinical Trials Register (DRKS) or in international registries such as ClinicalTrials.gov. Furthermore, the International Clinical Trials Registry Platform (ICTRP) aggregates trials from multiple primary registries. We queried the DRKS, ClinicalTrials.gov, and the ICTRP for trials with a recruiting location in Germany. Trials that were registered in multiple registries were linked using the primary and secondary identifiers and a Random Forest model based on various similarity metrics. We identified 35,912 trials that were conducted in Germany. The majority of the trials were registered in multiple databases. 32,106 trials were linked using primary IDs, 26 were linked using a Random Forest model, and 10,537 internal duplicates on ICTRP were identified using the Random Forest model after finding pairs with matching primary or secondary IDs. In cross-validation, the Random Forest increased the F1-score from 96.4% to 97.1% compared to a linkage based solely on secondary IDs on a manually labelled data set. 28% of all trials were registered in the German DRKS. 54% of the trials on ClinicalTrials.gov, 43% of the trials on the DRKS and 56% of the trials on the ICTRP were pre-registered. The ratio of pre-registered studies and the ratio of studies that are registered in the DRKS increased over time.
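
The linkage ingredients named in this abstract (matching on primary and secondary identifiers, similarity metrics between records, and the F1-score used for evaluation) can be illustrated roughly as follows. The record layout, the identifier strings, and the choice of title similarity as the example metric are hypothetical. In the study, a Random Forest combined several such metrics, which this sketch does not attempt to reproduce.

```python
from difflib import SequenceMatcher


def share_identifier(a, b):
    """True if two registry records have any primary/secondary ID in common."""
    ids_a = {a["primary_id"], *a["secondary_ids"]}
    ids_b = {b["primary_id"], *b["secondary_ids"]}
    return bool(ids_a & ids_b)


def title_similarity(a, b):
    """One possible similarity feature: case-insensitive title ratio in [0, 1]."""
    return SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()


def f1_score(predicted_links, true_links):
    """Harmonic mean of precision and recall over sets of linked pairs."""
    true_positives = len(predicted_links & true_links)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_links)
    recall = true_positives / len(true_links)
    return 2 * precision * recall / (precision + recall)


# Hypothetical records; the ID formats are invented for this example.
rec_a = {"primary_id": "REG-A-001", "secondary_ids": ["REG-B-777"],
         "title": "Exercise therapy in chronic pain"}
rec_b = {"primary_id": "REG-B-777", "secondary_ids": [],
         "title": "Exercise Therapy in Chronic Pain"}
rec_c = {"primary_id": "REG-B-900", "secondary_ids": [],
         "title": "Vitamin D in older adults"}

print(share_identifier(rec_a, rec_b), title_similarity(rec_a, rec_b))
print(f1_score({(1, 2), (3, 4)}, {(1, 2), (5, 6)}))
```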


2021 ◽  
Vol 10 (8) ◽  
pp. 503
Author(s):  
Hang Liu ◽  
Riken Homma ◽  
Qiang Liu ◽  
Congying Fang

The simulation of future land use can provide decision support for urban planners and decision makers, which is important for sustainable urban development. Using a cellular automata-random forest model, we considered two scenarios to predict intra-land use changes in Kumamoto City from 2018 to 2030: an unconstrained development scenario, and a planning-constrained development scenario that considers disaster-related factors. The random forest was used to calculate the transition probabilities and the importance of driving factors, and cellular automata were used for future land use prediction. The results show that disaster-related factors greatly influence land vacancy, while urban planning factors are more important for medium high-rise residential, commercial, and public facilities. Under the unconstrained development scenario, urban land use grows steadily in total amount but tends towards spatially disordered growth, with the largest increase in low-rise residential areas. Under the planning-constrained development scenario that considers disaster-related factors, the urban land area will continue to grow, albeit slowly and with a compact growth trend. This study provides planners with information on the relevant trends in different scenarios of land use change in Kumamoto City. Furthermore, it provides a reference for Kumamoto City’s future post-disaster recovery and reconstruction planning.
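
The division of labour described in this abstract, where a fitted model supplies per-cell transition probabilities and a cellular automaton applies them, can be sketched with a single update step. The two-state grid, the transition rule (probability multiplied by the developed share of the neighbourhood), the threshold value, and the `blocked` mask standing in for planning constraints are all assumptions made for illustration. They are not the rule used in the study.

```python
def ca_step(grid, transition_prob, threshold, blocked=None):
    """One synchronous update of a two-state grid (0 = undeveloped, 1 = developed).

    transition_prob[r][c] is assumed to come from a separately fitted model;
    blocked[r][c] marks cells that a constrained scenario keeps undeveloped.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 or (blocked is not None and blocked[r][c]):
                continue
            # Moore neighbourhood, clipped at the grid edge.
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            developed_share = sum(neighbours) / len(neighbours)
            if transition_prob[r][c] * developed_share >= threshold:
                new_grid[r][c] = 1
    return new_grid


# Toy 3x3 example with a uniform transition probability.
grid = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
prob = [[0.9] * 3 for _ in range(3)]
no_build = [[False, False, False],
            [False, True, False],
            [False, False, False]]

unconstrained = ca_step(grid, prob, threshold=0.32)
constrained = ca_step(grid, prob, threshold=0.32, blocked=no_build)
print(unconstrained, constrained)
```

Running the same step with and without the mask mirrors the comparison between an unconstrained and a constrained scenario: the probabilities are identical and only the set of cells allowed to change differs.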

