Real-time accident detection: Coping with imbalanced data

Bibliographic Details
Published in: Accident analysis and prevention 2019-08, Vol.129, p.202-210
Main Authors: Parsa, Amir Bahador, Taghipour, Homa, Derrible, Sybil, Mohammadian, Abolfazl (Kouros)
Format: Article
Language: English
Description
Summary:
• SMOTE performs well on highly imbalanced accident/non-accident data.
• An abrupt change in speed is an important sign of accident occurrence.
• Feature engineering is an essential part of training machine learning models.
• The performance of detection models varies across time intervals after an accident occurs.
• Detection models perform best five minutes after an accident occurs.

Detecting accidents is of great importance since they often impose significant delay and inconvenience on road users. This study compares the performance of two popular machine learning models, Support Vector Machine (SVM) and Probabilistic Neural Network (PNN), in detecting the occurrence of accidents on the Eisenhower expressway in Chicago. Since the detection of accidents should be as rapid as possible, seven models are trained and tested for each machine learning technique, using traffic condition data from 1 to 7 min after the actual occurrence. The main sources of data used in this study are weather condition, accident, and loop detector data. Furthermore, to overcome the problem of imbalanced data (i.e., underrepresentation of accidents in the dataset), the Synthetic Minority Oversampling TEchnique (SMOTE) is used. The results show that although SVM achieves higher overall accuracy, PNN outperforms SVM on the Detection Rate (DR) (i.e., percentage of correct accident detections). In addition, while both models perform best at 5 min after the occurrence of accidents, models trained at 3 or 4 min after the occurrence of an accident detect accidents more rapidly while still performing reasonably well. Lastly, a sensitivity analysis of PNN for Time-To-Detection (TTD) reveals that the speed difference between upstream and downstream of the accident location is particularly important for detecting the occurrence of accidents.
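As an illustration of the two ideas the summary relies on, the following is a minimal sketch, not the authors' implementation. It shows SMOTE-style oversampling (interpolating between a minority sample and one of its nearest minority neighbours) and a Detection Rate computed as the share of actual accidents that were flagged, which is an assumed reading of "percentage of correct accident detections". The function names, the two-feature layout (speed difference, occupancy) and the sample values are hypothetical and chosen only for the example.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (SMOTE-style interpolation).

    Assumes `minority` holds at least two numeric feature vectors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def detection_rate(y_true, y_pred):
    """Share of actual accidents (label 1) that the model flagged as accidents."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual = sum(1 for t in y_true if t == 1)
    return hits / actual if actual else 0.0


if __name__ == "__main__":
    # Hypothetical accident records: [upstream-downstream speed difference, occupancy]
    accidents = [[25.0, 0.40], [30.0, 0.35], [28.0, 0.45], [22.0, 0.50]]
    extra = smote_oversample(accidents, n_new=6)
    print(len(extra), "synthetic accident records generated")
    print("DR:", detection_rate([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Because each synthetic record lies on a line segment between two real accident records, oversampling adds no values outside the observed range of the minority class. Reporting DR alongside overall accuracy matters on imbalanced data, since a model can score high accuracy while missing most accidents.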
ISSN: 0001-4575, 1879-2057
DOI: 10.1016/j.aap.2019.05.014