Spatial statistics and random forest approaches for traffic crash hot spot identification and prediction

Author(s):  
Eskindir Ayele Atumo ◽  
Tuo Fang ◽  
Xinguo Jiang
Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 256
Author(s):  
Pengfei Han ◽  
Han Mei ◽  
Di Liu ◽  
Ning Zeng ◽  
Xiao Tang ◽  
...  

Pollutant gases such as CO, NO2, O3, and SO2 affect human health, and low-cost sensors are an important complement to regulatory-grade instruments in pollutant monitoring. Previous studies focused on one or several species, while comprehensive assessments of multiple sensors remain limited. We conducted a 12-month field evaluation of four Alphasense sensors in Beijing and used single linear regression (SLR), multiple linear regression (MLR), random forest regressor (RFR), and neural network (long short-term memory, LSTM) methods to calibrate and validate the measurements against nearby reference measurements from national monitoring stations. Sensor performance ranked CO > O3 > NO2 > SO2 in terms of both the coefficient of determination (R2) and the root mean square error (RMSE). Compared with the SLR, the MLR did not increase the R2 after accounting for temperature and relative humidity (the R2 remained at approximately 0.6 for O3 and 0.4 for NO2). However, the RFR and LSTM models significantly improved performance for O3, NO2, and SO2, with the R2 increasing from 0.3–0.5 to >0.7 for O3 and NO2, and the RMSE decreasing from 20.4 to 13.2 ppb for NO2. The SLR showed relatively large biases, while the LSTMs maintained a mean relative bias close to zero (e.g., <5% for O3 and NO2), indicating that these sensors combined with LSTMs are suitable for hot spot detection. We highlight that the LSTM performs better than the random forest and linear methods. This study assessed four electrochemical air quality sensors and different calibration models, and the methodology and results can benefit assessments of other low-cost sensors.
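The simplest calibration named above, single linear regression scored by R2 and RMSE, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the sensor and reference readings are made up for the example.

```python
import math

def fit_slr(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the measurements."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical raw sensor readings and co-located reference values (ppb).
raw = [10.0, 20.0, 30.0, 40.0, 50.0]
ref = [12.0, 21.0, 33.0, 39.0, 52.0]

a, b = fit_slr(raw, ref)
calibrated = [a + b * v for v in raw]
print(f"intercept={a:.2f} slope={b:.2f}")
print(f"R2={r2_score(ref, calibrated):.3f} RMSE={rmse(ref, calibrated):.2f} ppb")
```

An MLR, RFR, or LSTM calibration would replace `fit_slr` with a model that also takes temperature and relative humidity as inputs; the two scoring functions would stay the same.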


2021 ◽  
Vol 11 (19) ◽  
pp. 8828
Author(s):  
Alamirew Mulugeta Tola ◽  
Tamene Adugna Demissie ◽  
Fokke Saathoff ◽  
Alemayehu Gebissa

The reduction of traffic crashes and their socio-economic consequences has drawn the attention of safety professionals and transportation agencies. The most important activity for effective road safety practice is to identify hazardous roadway areas based on a spatial pattern analysis of crashes and an evaluation of the spatial relations of crashes with neighboring areas and other relevant factors. For decades, safety researchers have adopted several techniques to analyze historical road traffic crash (RTC) information using advanced GIS-based hot spot analysis. The objective of this study is to present a GIS technique for identifying crash hot spots based on spatial autocorrelation analysis, using four years (2014–2017) of crash data across Ethiopian regions, as well as zones and towns in the Oromia region. The study considered the corresponding severity values of RTCs for the analysis and ranking of crash hot spot areas. The spatial autocorrelation tool in ArcGIS 10.5 was used to analyze the spatial patterns of RTCs, and the Getis-Ord Gi* statistics tool was then used to identify high and low crash severity cluster zones. The results showed that the methods used in this analysis, which combined Moran’s I spatial autocorrelation of crash incidents, Getis-Ord Gi*, and a crash severity index, were effective for identifying and ranking crash hot spots. The identified crash hot spot areas lie along the entrances to and exits from Addis Ababa, Ethiopia’s capital city, so the responsible bodies and traffic management agencies should give these areas top priority and conduct a thorough study to reduce the socio-economic effect of RTCs.
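The two statistics named in this abstract can be sketched in plain Python. This is a minimal illustration using the standard textbook formulas for global Moran's I and Getis-Ord Gi*, not the ArcGIS implementation. The five zones in a row, their severity values, and the helper `line_weights` are assumptions made for the example.

```python
import math

def line_weights(n, include_self):
    """Binary contiguity weights for n zones arranged in a row (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 or (include_self and i == j) else 0.0
             for j in range(n)] for i in range(n)]

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

def getis_ord_gi_star(values, weights):
    """Gi* z-score per zone; weights should include the zone itself (w_ii > 0)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for w in weights:
        sw = sum(w)
        sw2 = sum(wij * wij for wij in w)
        num = sum(wij * v for wij, v in zip(w, values)) - mean * sw
        den = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append(num / den)
    return scores

# Made-up severity-weighted crash values for five adjacent zones.
severity = [9.0, 8.0, 2.0, 1.0, 1.0]
print("Moran's I:", round(morans_i(severity, line_weights(5, False)), 3))
gi = getis_ord_gi_star(severity, line_weights(5, True))
print("Gi* z-scores:", [round(z, 2) for z in gi])
```

A positive Moran's I indicates that similar severity values cluster in space. Large positive Gi* scores mark candidate hot spots and large negative scores mark cold spots. A real analysis would also test these scores for statistical significance.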


2018 ◽  
Vol 11 (1) ◽  
pp. 160 ◽  
Author(s):  
Zeyang Cheng ◽  
Zhenshan Zu ◽  
Jian Lu

Road traffic safety is a key concern of transport management, as traffic crashes have severely restricted China's economic and social development. To prevent and reduce road traffic crashes, this study proposes a comprehensive spatiotemporal analysis method that integrates time-space cube analysis, spatial autocorrelation analysis, and emerging hot spot analysis to explore how traffic crashes evolve and to identify crash hot spots. These analyses are all conducted with the corresponding toolboxes of ArcGIS 10.5. A small-sized city in China (Wujiang) is selected as the case study, and historical data on traffic crashes at Wujiang's road intersections in 2016 are analyzed with the proposed method. The analysis identifies the locations with a high incidence of traffic crashes, then presents the spatial change trend and statistical significance of the crash locations. Finally, different types of crash hot spots, as well as their evolution patterns over time, are determined. The results show that the traffic crash hot spots at road intersections are primarily distributed in the northeast of Wujiang’s major urban area, while the crash cold spots are concentrated in the southwest of Wujiang, which indicates where crash prevention should focus. In addition, the findings have potential engineering applications and are of great significance to the sustainable development of Wujiang.
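The abstract does not say how temporal trends are evaluated. Esri's documentation for the Emerging Hot Spot Analysis tool describes a Mann-Kendall trend test applied to each location's time series, so the sketch below shows that test under the assumption that it applies here. It omits the tie correction to the variance, and the monthly crash counts are made up.

```python
import math

def mann_kendall(series):
    """Return (S, z) for the Mann-Kendall trend test, without tie correction."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical monthly crash counts at one intersection over a year.
counts = [2, 3, 3, 5, 4, 6, 7, 7, 9, 8, 10, 12]
s, z = mann_kendall(counts)
print(f"S={s}, z={z:.2f}")  # a large positive z suggests an increasing trend
```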


2021 ◽  
Vol 10 (8) ◽  
pp. 519
Author(s):  
Zechun Huang

Unlike previous regionalized studies on a worldwide crisis, this study aims to analyze spatial distribution patterns and evolution characteristics of the COVID-19 pandemic, using space-time aggregation and spatial statistics from a global perspective. Hence, various spatial statistical methods, such as the heat map, global Moran’s I, geographic mean center, and emerging hot spot analysis were utilized comprehensively to mine and analyze spatiotemporal evolution patterns. The main findings were as follows: Overall, the spatial autocorrelation of confirmed cases gradually increased from the initial outbreak until September 2020 and then decreased slightly. The geographic centroid migration ranges of the pandemic in Asia, Europe, and Africa are wider than those in South America, Oceania, and North America. The spatiotemporal evolution pattern of the global pandemic mainly consisted of oscillating hot spots, intensifying cold spots, persistent cold spots, and diminishing cold spots. This study provides auxiliary decision-making information for pandemic prevention and control.
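One of the statistics listed above, the geographic mean center, is a case-weighted average of coordinates. The sketch below assumes planar (projected) coordinates and made-up case counts. It is not the study's code, and averaging raw latitude and longitude this way would be inaccurate over large extents.

```python
def weighted_mean_center(points, weights):
    """Weighted mean of (x, y) coordinates, e.g. weighted by confirmed cases."""
    total = sum(weights)
    x = sum(p[0] * w for p, w in zip(points, weights)) / total
    y = sum(p[1] * w for p, w in zip(points, weights)) / total
    return x, y

# Hypothetical region centroids (projected units) and case counts in two periods.
centroids = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
cases_early = [100, 10, 10]
cases_late = [100, 200, 300]

print("early center:", weighted_mean_center(centroids, cases_early))
print("late center:", weighted_mean_center(centroids, cases_late))
```

Comparing the centers across periods gives the migration of the centroid over time.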


2018 ◽  
Vol 7 (10) ◽  
pp. 388 ◽  
Author(s):  
Grant McKenzie ◽  
Zheng Liu ◽  
Yingjie Hu ◽  
Myeong Lee

Neighborhoods are vaguely defined, localized regions that share similar characteristics. They are most often defined, delineated and named by the citizens that inhabit them rather than municipal government or commercial agencies. The names of these neighborhoods play an important role as a basis for community and sociodemographic identity, geographic communication and historical context. In this work, we take a data-driven approach to identifying neighborhood names based on the geospatial properties of user-contributed rental listings. Through a random forest ensemble learning model applied to a set of spatial statistics for all n-grams in listing descriptions, we show that neighborhood names can be uniquely identified within urban settings. We train a model based on data from Washington, DC, and test it on listings in Seattle, WA, and Montréal, QC. The results indicate that a model trained on housing data from one city can successfully identify neighborhood names in another. In addition, our approach identifies less common neighborhood names and suggestions of alternative or potentially new names in each city. These findings represent a first step in the process of urban neighborhood identification and delineation.
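The abstract does not list which spatial statistics are computed per n-gram. As one plausible example, the sketch below extracts word n-grams from listing text and computes the standard distance (spatial dispersion) of the listings that mention each one. The listings, coordinates, and function names are hypothetical.

```python
import math
from collections import defaultdict

def ngrams(text, n):
    """Lowercased word n-grams of a listing description."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return math.sqrt(sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points) / n)

def ngram_dispersion(listings, n):
    """Map each n-gram to the standard distance of the listings that contain it."""
    locations = defaultdict(list)
    for text, xy in listings:
        for gram in set(ngrams(text, n)):  # count each listing once per n-gram
            locations[gram].append(xy)
    return {gram: standard_distance(pts) for gram, pts in locations.items()}

# Made-up listings: (description, projected x/y location).
listings = [
    ("Sunny room in Capitol Hill", (0.0, 0.0)),
    ("Capitol Hill studio", (0.0, 2.0)),
    ("Sunny room downtown", (10.0, 10.0)),
]
dispersion = ngram_dispersion(listings, 2)
print(dispersion["capitol hill"], dispersion["sunny room"])
```

The intuition is that an n-gram naming a neighborhood should be spatially concentrated, while a generic phrase should be spread across the city. Features like this could then be fed to a random forest classifier.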


2018 ◽  
Vol 19 (3) ◽  
pp. 444 ◽  
Author(s):  
PAOLO VASSALLO ◽  
CHIARA PAOLI ◽  
JESSICA ALESSI ◽  
ALBERTA MANDICH ◽  
MAURIZIO WÜRTZ ◽  
...  

The distribution of four top predators in the Tyrrhenian Sea, a sub-basin of the Mediterranean Sea, was investigated by means of random forest (RF) regression, considering depth, distance from the coast, seafloor slope, and distance from seamounts as habitat descriptors on a 2 × 2 nautical mile regular grid. The RF results were processed to estimate variable importance and model performance. The random forest architecture reached optimal sensitivity and specificity, thus providing a consistent support tool for identifying suitable habitats. The considered species have patchy suitable habitats, with a number of hot-spot areas where the habitats of the different species overlap. The locations of these hot-spot areas correspond to those of specific seamounts, which points to an attraction effect of these topographic structures. The mean features typifying the most attractive seamounts were investigated and found to be shallow peak and base depths combined with a wide base area and high relative elevation.
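Sensitivity and specificity, the two performance measures cited above, can be computed from presence and absence labels as sketched below. The abstract does not state how regression outputs were turned into presence or absence, so the 0.5 threshold, the function names, and the data are assumptions made for illustration.

```python
def to_presence(scores, threshold=0.5):
    """Convert continuous suitability scores into 1 (present) or 0 (absent)."""
    return [1 if s >= threshold else 0 for s in scores]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical observed presence per grid cell and predicted suitability scores.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7, 0.4, 0.6]

sens, spec = sensitivity_specificity(observed, to_presence(scores))
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```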


2010 ◽  
Author(s):  
Jason Wyatt ◽  
Michael Alexander