Weighted Ensemble Learning for Accident Severity Classification Using Social Media Data
Published in: | SN computer science 2024-06, Vol.5 (5), p.528, Article 528 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The sharp rise in social media usage, particularly on platforms such as Twitter, has driven growing research interest in automation tools such as accident severity-based tweet classification models. These tools automatically extract severity information from accident tweet content. Prediction models are also essential for estimating the severity of an accident and thereby improving the safety of the road traffic system. The difficulty, however, lies in collecting sufficient labeled data. We propose weighted ensemble-based self-training with decision tree (WESTDT), a semi-supervised method with a dynamic data labeling strategy. The base classifier in this model is a homogeneous weighted ensemble of decision tree classifiers, used to predict pseudo-labels more accurately. We also propose a novel performance measure, called risk factor, to estimate the amount of risk present in an application that uses the prediction model. The proposed model outperformed the state-of-the-art model, reliable semi-supervised ensemble learning (RESSEL), and the baseline models, decision tree (DT) and self-training with decision tree (STDT), on both the traditional and the proposed measures. On all datasets, the proposed framework outperformed all other models in precision, recall, and accuracy by 5–18.3%, 3.9–9.3%, and 6.6%, respectively. These findings can support the development of efficient and sustainable systems for traffic management and safety, and can help government authorities devise prompt, proactive strategies to prevent traffic accidents and enhance road safety. |
---|---|
ISSN: | 2661-8907 2662-995X
DOI: | 10.1007/s42979-024-02870-w |
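
The summary describes self-training in which a weighted ensemble of decision trees assigns pseudo-labels to unlabeled data. The record does not give the algorithm's details, so the following is only a minimal sketch of that general idea and not the authors' WESTDT implementation. Everything specific here is an assumption made for illustration: the depth-1 trees (stumps), the fold-based subsampling, the use of held-out accuracy as the ensemble weight, the confidence threshold of 0.8, and all function names.

```python
from collections import Counter

def fit_stump(X, y):
    """Fit a depth-1 decision tree: the single (feature, threshold) split
    with the highest training accuracy. Falls back to a constant prediction."""
    majority = Counter(y).most_common(1)[0][0]
    best = (-1, 0, float("inf"), majority, majority)  # constant fallback
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring feature values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            correct = left.count(l_lab) + right.count(r_lab)
            if correct > best[0]:
                best = (correct, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)

def stump_predict(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_ensemble(X, y, n_trees):
    """Train one stump per fold on the data outside that fold and weight it
    by its accuracy on the held-out fold (an assumed weighting scheme)."""
    members = []
    for k in range(n_trees):
        train = [i for i in range(len(X)) if i % n_trees != k]
        held = [i for i in range(len(X)) if i % n_trees == k]
        if not train:
            continue
        stump = fit_stump([X[i] for i in train], [y[i] for i in train])
        hits = sum(stump_predict(stump, X[i]) == y[i] for i in held)
        members.append((stump, hits / len(held) if held else 0.0))
    return members

def ensemble_predict(members, row):
    """Weighted vote; returns (label, share of total weight behind it)."""
    votes = Counter()
    for stump, weight in members:
        votes[stump_predict(stump, row)] += weight
    label, score = votes.most_common(1)[0]
    total = sum(votes.values())
    return label, (score / total if total else 0.0)

def self_train(X_lab, y_lab, X_unlab, n_trees=3, threshold=0.8, max_rounds=5):
    """Self-training loop: pseudo-label confident unlabeled rows, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    members = fit_ensemble(X_lab, y_lab, n_trees)
    for _ in range(max_rounds):
        if not pool:
            break
        remaining, added = [], 0
        for row in pool:
            label, conf = ensemble_predict(members, row)
            if conf >= threshold:  # accept the pseudo-label
                X_lab.append(row)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(row)
        pool = remaining
        if added == 0:
            break
        members = fit_ensemble(X_lab, y_lab, n_trees)
    return members
```

Calling `self_train(X_lab, y_lab, X_unlab)` with lists of numeric feature rows and class labels returns the final weighted ensemble, which can then be queried with `ensemble_predict(members, row)`. A realistic pipeline would first turn tweet text into numeric features and would use full decision trees instead of stumps; both steps are outside this sketch.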