scholarly journals Prediction of Fatal and Major Injury of Drivers, Cyclists, and Pedestrians in Collisions

2020 ◽  
Vol 32 (1) ◽  
pp. 39-53
Author(s):  
Dalia Shanshal ◽  
Ceni Babaoglu ◽  
Ayşe Başar

Traffic-related deaths and severe injuries may affect every person on the roads, whether driving, cycling or walking. Toronto, the largest city in Canada and the fourth largest in North America, aims to eliminate traffic-related fatalities and serious injuries on city streets. The aim of this study is to build a prediction model using data analytics and machine learning techniques that learn from past patterns, providing additional data-driven decision support for strategic planning. A detailed exploratory analysis is presented, investigating the relationship between the variables and factors affecting collisions in Toronto. A learning-based model is proposed to predict the fatalities and severe injuries in traffic collisions through a comparison of two predictive models: Lasso Regression and Random Forest. Exploratory data analysis results reveal both spatio-temporal and behavioural patterns such as the prevalence of collisions in intersections, in the spring and summer and aggressive driving and inattentive behaviours in drivers. The prediction results show that the best predictor of injury severity for drivers, cyclists and pedestrians is Random Forest with an accuracy of 0.80, 0.89, and 0.80, respectively. The proposed methods demonstrate the effectiveness of machine learning application to traffic and collision data, both for exploratory and predictive analytics.

2020 ◽  
Vol 15 (S359) ◽  
pp. 40-41
Author(s):  
L. M. Izuti Nakazono ◽  
C. Mendes de Oliveira ◽  
N. S. T. Hirata ◽  
S. Jeram ◽  
A. Gonzalez ◽  
...  

AbstractWe present a machine learning methodology to separate quasars from galaxies and stars using data from S-PLUS in the Stripe-82 region. In terms of quasar classification, we achieved 95.49% for precision and 95.26% for recall using a Random Forest algorithm. For photometric redshift estimation, we obtained a precision of 6% using k-Nearest Neighbour.


2017 ◽  
Vol 107 (10) ◽  
pp. 1187-1198 ◽  
Author(s):  
L. Wen ◽  
C. R. Bowen ◽  
G. L. Hartman

Dispersal of urediniospores by wind is the primary means of spread for Phakopsora pachyrhizi, the cause of soybean rust. Our research focused on the short-distance movement of urediniospores from within the soybean canopy and up to 61 m from field-grown rust-infected soybean plants. Environmental variables were used to develop and compare models including the least absolute shrinkage and selection operator regression, zero-inflated Poisson/regular Poisson regression, random forest, and neural network to describe deposition of urediniospores collected in passive and active traps. All four models identified distance of trap from source, humidity, temperature, wind direction, and wind speed as the five most important variables influencing short-distance movement of urediniospores. The random forest model provided the best predictions, explaining 76.1 and 86.8% of the total variation in the passive- and active-trap datasets, respectively. The prediction accuracy based on the correlation coefficient (r) between predicted values and the true values were 0.83 (P < 0.0001) and 0.94 (P < 0.0001) for the passive and active trap datasets, respectively. Overall, multiple machine learning techniques identified the most important variables to make the most accurate predictions of movement of P. pachyrhizi urediniospores short-distance.


Webology ◽  
2021 ◽  
Vol 18 (Special Issue 01) ◽  
pp. 183-195
Author(s):  
Thingbaijam Lenin ◽  
N. Chandrasekaran

Student’s academic performance is one of the most important parameters for evaluating the standard of any institute. It has become a paramount importance for any institute to identify the student at risk of underperforming or failing or even drop out from the course. Machine Learning techniques may be used to develop a model for predicting student’s performance as early as at the time of admission. The task however is challenging as the educational data required to explore for modelling are usually imbalanced. We explore ensemble machine learning techniques namely bagging algorithm like random forest (rf) and boosting algorithms like adaptive boosting (adaboost), stochastic gradient boosting (gbm), extreme gradient boosting (xgbTree) in an attempt to develop a model for predicting the student’s performance of a private university at Meghalaya using three categories of data namely demographic, prior academic record, personality. The collected data are found to be highly imbalanced and also consists of missing values. We employ k-nearest neighbor (knn) data imputation technique to tackle the missing values. The models are developed on the imputed data with 10 fold cross validation technique and are evaluated using precision, specificity, recall, kappa metrics. As the data are imbalanced, we avoid using accuracy as the metrics of evaluating the model and instead use balanced accuracy and F-score. We compare the ensemble technique with single classifier C4.5. The best result is provided by random forest and adaboost with F-score of 66.67%, balanced accuracy of 75%, and accuracy of 96.94%.


Author(s):  
Melika Sajadian ◽  
Ana Teixeira ◽  
Faraz S. Tehrani ◽  
Mathias Lemmens

Abstract. Built environments developed on compressible soils are susceptible to land deformation. The spatio-temporal monitoring and analysis of these deformations are necessary for sustainable development of cities. Techniques such as Interferometric Synthetic Aperture Radar (InSAR) or predictions based on soil mechanics using in situ characterization, such as Cone Penetration Testing (CPT) can be used for assessing such land deformations. Despite the combined advantages of these two methods, the relationship between them has not yet been investigated. Therefore, the major objective of this study is to reconcile InSAR measurements and CPT measurements using machine learning techniques in an attempt to better predict land deformation.


Author(s):  
Ramesh Ponnala ◽  
K. Sai Sowjanya

Prediction of Cardiovascular ailment is an important task inside the vicinity of clinical facts evaluation. Machine learning knowledge of has been proven to be effective in helping in making selections and predicting from the huge amount of facts produced by using the healthcare enterprise. on this paper, we advocate a unique technique that pursuits via finding good sized functions by means of applying ML strategies ensuing in improving the accuracy inside the prediction of heart ailment. The severity of the heart disease is classified primarily based on diverse methods like KNN, choice timber and so on. The prediction version is added with special combos of capabilities and several known classification techniques. We produce a stronger performance level with an accuracy level of a 100% through the prediction version for heart ailment with the Hybrid Random forest area with a linear model (HRFLM).


Author(s):  
Chaudhari Shraddha

Activity recognition in humans is one of the active challenges that find its application in numerous fields such as, medical health care, military, manufacturing, assistive techniques and gaming. Due to the advancements in technologies the usage of smartphones in human lives has become inevitable. The sensors in the smartphones help us to measure the essential vital parameters. These measured parameters enable us to monitor the activities of humans, which we call as human activity recognition. We have applied machine learning techniques on a publicly available dataset. K-Nearest Neighbors and Random Forest classification algorithms are applied. In this paper, we have designed and implemented an automatic human activity recognition system that independently recognizes the actions of the humans. This system is able to recognize the activities such as Laying, Sitting, Standing, Walking, Walking downstairs and Walking upstairs. The results obtained show that, the KNN and Random Forest Algorithms gives 90.22% and 92.70% respectively of overall accuracy in detecting the activities.


2021 ◽  
pp. 289-301
Author(s):  
B. Martín ◽  
J. González–Arias ◽  
J. A. Vicente–Vírseda

Our aim was to identify an optimal analytical approach for accurately predicting complex spatio–temporal patterns in animal species distribution. We compared the performance of eight modelling techniques (generalized additive models, regression trees, bagged CART, k–nearest neighbors, stochastic gradient boosting, support vector machines, neural network, and random forest –enhanced form of bootstrap. We also performed extreme gradient boosting –an enhanced form of radiant boosting– to predict spatial patterns in abundance of migrating Balearic shearwaters based on data gathered within eBird. Derived from open–source datasets, proxies of frontal systems and ocean productivity domains that have been previously used to characterize the oceanographic habitats of seabirds were quantified, and then used as predictors in the models. The random forest model showed the best performance according to the parameters assessed (RMSE value and R2). The correlation between observed and predicted abundance with this model was also considerably high. This study shows that the combination of machine learning techniques and massive data provided by open data sources is a useful approach for identifying the long–term spatial–temporal distribution of species at regional spatial scales.


Author(s):  
Anirudh Reddy Cingireddy ◽  
Robin Ghosh ◽  
Supratik Kar ◽  
Venkata Melapu ◽  
Sravanthi Joginipeli ◽  
...  

Frequent testing of the entire population would help to identify individuals with active COVID-19 and allow us to identify concealed carriers. Molecular tests, antigen tests, and antibody tests are being widely used to confirm COVID-19 in the population. Molecular tests such as the real-time reverse transcription-polymerase chain reaction (rRT-PCR) test will take a minimum of 3 hours to a maximum of 4 days for the results. The authors suggest using machine learning and data mining tools to filter large populations at a preliminary level to overcome this issue. The ML tools could reduce the testing population size by 20 to 30%. In this study, they have used a subset of features from full blood profile which are drawn from patients at Israelita Albert Einstein hospital located in Brazil. They used classification models, namely KNN, logistic regression, XGBooting, naive Bayes, decision tree, random forest, support vector machine, and multilayer perceptron with k-fold cross-validation, to validate the models. Naïve bayes, KNN, and random forest stand out as the most predictive ones with 88% accuracy each.


Sign in / Sign up

Export Citation Format

Share Document