Pitfalls of Using R2 to Evaluate Goodness of Fit of Accident Prediction Models

Author(s):  
Shaw-Pin Miaou ◽  
An Lu ◽  
Harry S. Lum

In developing statistical models of traffic accidents, flow, and roadway design, the R2 goodness-of-fit measure has been used for many years to (a) determine the overall quality and usability of the model, (b) select covariates for inclusion in the model, (c) make decisions as to whether it would be worthwhile to collect additional covariates, and (d) compare the relative quality of models developed from different studies. The pitfalls of using R2 to make these decisions and comparisons are demonstrated through computer simulations of commonly used accident prediction models, including the Poisson and negative binomial regression models. Because accident prediction models are nonnormal and functional forms are typically nonlinear, it is shown that R2 is not an appropriate measure to make any of the decisions and comparisons mentioned. Also, three properties are identified as desirable for any alternative measure to appropriately evaluate these models: (a) it should be bounded between 0 and 1—a value of 0 if no covariate is included in the model and a value of 1 if all the necessary covariates are included; (b) it should increase proportionally as equally important, independent covariates are added to the model one at a time, regardless of their order of selection; and (c) it should be invariant with respect to the mean (i.e., the value of the measure should not change by simply increasing or decreasing the value of the intercept term). Finally, two recent research efforts aimed at developing alternative measures with such properties are briefly reported.
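The dependence of R2 on the mean can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the single covariate, the slope of 0.5, the two intercept values, and the function names are all assumptions chosen for illustration. It scores the true Poisson mean (that is, a model containing every necessary covariate) against the simulated counts and shows that R2 still falls well short of 1 and changes when only the intercept changes.

```python
import numpy as np

def r_squared(y, y_hat):
    """Ordinary R2: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def simulate_r2(intercept, slope=0.5, n=20000, seed=0):
    """R2 of the TRUE mean function for Poisson counts (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)               # one synthetic covariate
    mu = np.exp(intercept + slope * x)   # log-linear mean, as in Poisson regression
    y = rng.poisson(mu)                  # simulated accident counts
    return r_squared(y, mu)              # predictions are the true means

# Same covariate effect, different intercepts (low vs. high mean counts).
r2_low = simulate_r2(intercept=-1.0)
r2_high = simulate_r2(intercept=2.0)
print(f"R2 with low mean:  {r2_low:.2f}")
print(f"R2 with high mean: {r2_high:.2f}")
```

Because the Poisson variance equals the mean, the unexplained share of the variance depends on the intercept, so the same covariate effect yields a much lower R2 when counts are small.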

2006 ◽  
Vol 33 (9) ◽  
pp. 1115-1124 ◽  
Author(s):  
Z Sawalha ◽  
T Sayed

Accident prediction models are invaluable tools that have many applications in road safety analysis. However, there are certain statistical issues related to accident modeling that either deserve further attention or have not been dealt with adequately in the road safety literature. This paper discusses and illustrates how to deal with two statistical issues related to modeling accidents using Poisson and negative binomial regression. The first issue is that of model building or deciding which explanatory variables to include in an accident prediction model. The study differentiates between applications for which it is advisable to avoid model over-fitting and other applications for which it is desirable to fit the model to the data as closely as possible. It then suggests procedures for developing parsimonious models, i.e., models that are not over-fitted, and best-fit models. The second issue discussed in the paper is that of outlier analysis. The study suggests a procedure for the identification and exclusion of extremely influential outliers from the development of Poisson and negative binomial regression models. The procedures suggested for model building and conducting outlier analysis are more straightforward to apply in the case of Poisson regression models because of an added complexity presented by the shape parameter of the negative binomial distribution. The paper, therefore, presents flowcharts detailing the application of the procedures when modeling is carried out using negative binomial regression. The described procedures are then applied in the development of negative binomial accident prediction models for the urban arterials of the cities of Vancouver and Richmond located in the province of British Columbia, Canada. Key words: accident prediction models, overfitting, parsimony, outlier analysis, Poisson regression, negative binomial regression.


Author(s):  
Alireza Hadayeghi ◽  
Amer S. Shalaby ◽  
Bhagwant Persaud

A series of macrolevel prediction models that would estimate the number of accidents in planning zones in the city of Toronto, Ontario, Canada, as a function of zonal characteristics were developed. A generalized linear modeling approach was used in which negative binomial regression models were developed separately for total accidents and for severe (fatal and nonfatal injury) accidents as a function of socio-economic and demographic, traffic demand, and network data variables. The variables that had significant effects on accident occurrence were the number of households, the number of major road kilometers, the number of vehicle kilometers traveled, intersection density, posted speed, and volume-capacity ratio. The geographically weighted regression approach was used to test spatial variations in the estimated parameters from zone to zone. Mixed results were obtained from that analysis.


Author(s):  
Monsuru O Popoola ◽  
Oladapo S Abiola ◽  
Simeon O Odunfa

Road safety engineering involves identifying the factors that cause traffic crashes from accident data, carrying out detailed accident studies at different locations, and implementing relevant remedial measures. This study was carried out to establish the relationship between traffic accident characteristics (frequency and severity) and traffic and road design characteristics on a two-lane highway. The statistical models applied in traffic accident modeling were Poisson regression, negative binomial (NB) regression, and zero-inflated negative binomial (ZINB) regression. Traffic flow and road geometry variables were the independent variables of the models. Using the Ilesha-Akure-Owo highway in South-West Nigeria, accident prediction models were developed on the basis of accident data obtained from the Federal Road Safety Commission (FRSC) for a 4-year monitoring period from 2012 to 2015. Curve radius (CR), lane width (LW), shoulder factor (SF), access road (CHAR), average annual daily traffic (AADT), percentage of heavy goods vehicles (HGV), and traffic signs posted (TSP) were identified as the factors affecting crash occurrence probability. Finally, a comparison of the three models favored the ZINB model over the traditional Poisson and NB models. Keywords: traffic accidents, single carriageway, accident prediction model, road geometric characteristics.


2021 ◽  
Author(s):  
Yesuf Abdela Mustefa ◽  
Addis Belayhun

Abstract. Background: Road traffic accidents are a major public health and economic challenge, rated the eighth leading cause of death. The severity is higher in developing countries, and Ethiopia is among the most affected countries in the world. We used Ethiopian Toll Roads Enterprise data to provide insights into, and model the significant determinants of, accidents involving injuries and fatalities. Besides using a recent dataset, we applied the most appropriate, but previously neglected, statistical model. Moreover, we examined the significance of the effects of drivers’ age and gender, which have not been examined in the literature. Methods: We provided descriptive insights using graphs from integrated traffic accident and flow datasets. For inferential analysis, we tested for the presence of over-dispersion in a total of 1824 observations of accident data recorded from September 2014 to December 2019. Finally, we modeled the effects of significant variables on the number of injuries using the negative binomial regression model. Results: We found that the number of injuries in accidents was significantly determined by type of vehicle, ownership status of the vehicle, accident-time weather condition, driver-vehicle relationship, drivers’ level of education, and drivers’ age. Conclusions: Heavy trucks were likely to cause more injuries than medium or small vehicles. Hot and windy weather conditions were associated with a higher number of injuries. The number of injuries was likely to be lower when the driver owns the vehicle, when the driver’s level of education is above secondary school, and when the driver is between 18 and 23 years old. Moreover, due attention needs to be given to road traffic rules.
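The abstract does not say which over-dispersion test was used. As a minimal sketch of the idea, the variance-to-mean ratio (dispersion index) of the counts can be compared with 1, the value expected under a Poisson distribution. The data below are synthetic, not the Ethiopian Toll Roads Enterprise records, and the distribution parameters and the function name are assumptions for illustration only.

```python
import numpy as np

def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson counts."""
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

rng = np.random.default_rng(42)

# Synthetic Poisson counts: variance equals the mean (both 3).
poisson_counts = rng.poisson(lam=3.0, size=5000)

# Synthetic negative binomial counts with mean 3 and variance 7.5
# (numpy parameterisation: n successes, success probability p).
nb_counts = rng.negative_binomial(n=2, p=0.4, size=5000)

ratio_poisson = dispersion_index(poisson_counts)
ratio_nb = dispersion_index(nb_counts)
print(f"Poisson dispersion index:           {ratio_poisson:.2f}")
print(f"Negative binomial dispersion index: {ratio_nb:.2f}")
```

A ratio well above 1 suggests that a negative binomial model may be more suitable than a Poisson model. A formal analysis would apply the check to the residuals of a fitted regression model, not to raw counts.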


Author(s):  
Ian Hamilton ◽  
Scott Himes ◽  
R. J. Porter ◽  
Eric Donnell

Design consistency in the context of highway and street design refers to the conformance of highway geometry to driver expectancy. Existing design policies provide guidance related to horizontal alignment design consistency. While design consistency has safety implications and is intuitively linked to roadway departure crashes, the authors are only aware of a few studies that sought to link measures of design consistency to safety performance. This study explores relationships between alternative measures of horizontal alignment design consistency and the expected number of roadway departure crashes along horizontal curves on rural, two-lane, two-way roads. The authors analyzed 854 horizontal curves on rural two-lane highways in Indiana and Pennsylvania using data obtained from the SHRP 2 Roadway Information Database (RID) 2.0. Relationships between measures of design consistency and the expected number of roadway departure crashes were explored using a negative binomial regression modeling approach. The results indicate a relationship between the frequency of roadway departure crashes on a study curve and the radii of upstream and downstream curves. The ratio of the length of upstream and downstream tangents relative to a study curve radius was also statistically significant in Pennsylvania. Such findings are intuitive given the concept of design consistency and represent an advancement to existing predictive methods in the AASHTO Highway Safety Manual, which estimate the expected number of crashes on a segment as a function of the characteristics of only that segment.


2003 ◽  
Vol 1856 (1) ◽  
pp. 125-135 ◽  
Author(s):  
Sravanthi Konduri ◽  
Samuel Labi ◽  
Kumares C. Sinha

Incident prediction models are presented for the Interstate 80/Interstate 94 (Borman Expressway in northwestern Indiana) and Interstate 465 (northeastern Indianapolis, Indiana) freeway sections developed as a function of traffic volume, truck percentage, and weather. Separate models were developed for all incidents and noncrash incidents. Three model types were considered (Poisson regression, negative binomial regression, and nonlinear regression), and the results were compared based on magnitudes and signs of model parameter estimates and t-statistics. Least-squares estimation and maximum-likelihood methods were used to estimate the model parameters. Data from the Indiana Department of Transportation and the Indiana Climatology Database were used to establish the relationships. For a given section and incident category, the results from the Poisson and negative binomial models were found to be consistent. It was observed that, unlike section length, traffic volume is nonlinearly related to incidents, and therefore these two variables have to be considered as separate terms in the modeling process. Truck percentage was found to be a statistically significant factor affecting incident occurrence. It was also found that the weather variable (rain and snow) was negatively correlated to incidents. The freeway incident models developed constitute a useful decision support tool for implementation of new freeway patrol systems or for expansion of existing ones. They are also useful for simulating incident occurrences with a view to identifying elements of cost-effective freeway patrol strategies (patrol deployment policies, fleet size, crew size, and beat routes).


Author(s):  
Andrew P. Tarko ◽  
Natalie M. Villwock ◽  
Nicolas Blond

Although median barriers are an absolute means of preventing drivers from crossing road medians and colliding with vehicles moving in the opposite direction, they may cause additional crashes. This perhaps complex safety effect of median barriers has not been investigated well. Being able to predict the safety impact of most types of median barriers on rural freeways is becoming more desirable because some state departments of transportation plan to expand many of their four-lane rural freeways to six lanes to accommodate increases in traffic volume. Realistic crash prediction models sensitive to the median design would provide the needed guidance useful in designing adequate median treatments on widened freeways. The impact of median designs on crash frequency was investigated in this study through negative binomial regression and before-and-after studies based on data collected in eight participating states. The impact on crash severity was investigated with a logit model. The separate effects of changes in median geometry were quantified for single-vehicle, multiple-vehicle same direction, and multiple-vehicle opposite direction crashes. The results were significantly different and indicated that reducing the median width without adding barriers (the remaining median width is still reasonably wide) increases the severity of crashes, particularly opposite direction crashes. Further, reducing the median and installing concrete barriers eliminates opposite direction crashes but doubles the frequency of single-vehicle crashes and tends to lessen the frequency of same direction crashes. The crash severity also tends to increase.


Author(s):  
Fedy Ouni ◽  
Mounir Belloumi

The purpose of the present study is to explore the linkage between crash counts at hazardous road locations and a variety of geometric characteristics, roadway characteristics, traffic flow characteristics, and spatial features in the region of Sousse, Tunisia. For this purpose, collision data were collected from 52 hazardous road sections, comprising 1397 crash records for an 11-year monitoring period from January 1, 2004 to December 31, 2014, obtained from the National Observatory for Information, Training, Documentation and Studies on Road Safety in Tunisia (NOITDRS). A Pearson correlation matrix was used to avoid including pairs of variables that were highly correlated. Both the Random Effects Negative Binomial model and the Negative Binomial model were estimated. The Random Effects Negative Binomial model improved the goodness of fit compared with the Negative Binomial model. Average Daily Traffic volume, Curved alignment, Presence of public lighting, Visibility, Number of lanes, Presence of vertical/horizontal signs, Presence of rural segment, Presence of drainage system, Roadway surface condition, Presence of paved shoulder, and Presence of major road were found to be significant variables influencing accident occurrences. Overall, the current research contributes to the literature from empirical and methodological modeling standpoints, since it is the first study conducted in Tunisia to use crash prediction models for hazardous road locations and it portrays the Tunisian reality. The research findings offer useful insights on hazardous road locations in the region of Sousse, Tunisia, and provide useful planning tools for public authorities in Tunisia.
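The correlation screening step can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract gives no cut-off, so the 0.8 threshold is hypothetical, as are the function name, the variable names, and the synthetic data.

```python
import numpy as np

def highly_correlated_pairs(data, names, threshold=0.8):
    """Return pairs of variables whose absolute Pearson correlation
    exceeds the (hypothetical) threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j]))
    return pairs

rng = np.random.default_rng(1)
n = 1000

# Synthetic explanatory variables (illustrative only).
adt = rng.normal(size=n)                    # traffic volume, standardised
lanes = adt + 0.1 * rng.normal(size=n)      # built to track adt closely
lighting = rng.normal(size=n)               # independent of the others

names = ["adt", "lanes", "lighting"]
data = np.column_stack([adt, lanes, lighting])

flagged = highly_correlated_pairs(data, names)
print(flagged)
```

Only one variable from each flagged pair would then be kept as a candidate for the count model, which limits multicollinearity among the explanatory variables.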

