
An attention-augmented driven modified two-fold U-net anomaly detection model for video surveillance systems

Bibliographic Details
Published in: Multimedia Tools and Applications 2024-03, Vol. 83 (11), p. 32019-32040
Main Authors: Sharma, Preeti; Gangadharappa, M.
Format: Article
Language: English
Description
Summary: We propose an effective strategy for detecting and localizing anomalous behavior using a modified end-to-end two-stage encoder-decoder U-shaped network. The model is built from scratch for the detection, segmentation, and classification of anomalous events in video sequences. The encoder extracts features, which are fed to the bottleneck block. The resulting feature maps serve as input to the decoder path, which is responsible for transposing the feature maps back to the original image. For precise localization of anomalies, we include augmentation of both the images and their masks in the proposed U-net model. In the two-stage U-net network, the first model detects video frames, while the same U-net model is used in the second stage to augment the frames detected by the first model, which provides segmentation and classification of images. This precise, symmetric, path-based architecture gives good spatial localization of anomalous events. We apply a pixel-based threshold value for the Intersection over Union (IoU) score to distinguish the pixels: pixels with values greater than the threshold are considered anomalous; otherwise they are considered normal, with an IoU score of 0. We evaluated the performance of our two-stage U-net model on three standard benchmark datasets and compared it with the conventional U-net model and with Attention U-net models without augmentation features. Our method combines spatial details and deep features, yielding an improved accuracy of 99.15%, a mean intersection over union score of 82.33, and ROC values of 99%, which are higher than those of the other methods.
ISSN: 1573-7721, 1380-7501
DOI: 10.1007/s11042-023-16728-5
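
The summary mentions a pixel-based threshold and an Intersection over Union (IoU) score. As a rough illustration only, the following minimal Python sketch shows how a per-pixel score map might be thresholded into a binary anomaly mask and compared with a ground-truth mask. The function names `threshold_mask` and `iou`, the threshold of 0.5, and the toy values are assumptions made for this example and are not taken from the article. Returning 0 when both masks are empty is likewise an assumption, chosen to mirror the summary's statement that normal pixels receive an IoU score of 0.

```python
from typing import List

Mask = List[List[int]]


def threshold_mask(scores: List[List[float]], threshold: float) -> Mask:
    """Label a pixel anomalous (1) if its score exceeds the threshold, else normal (0)."""
    return [[1 if value > threshold else 0 for value in row] for row in scores]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over Union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t  # pixel is anomalous in both masks
            union += p | t         # pixel is anomalous in at least one mask
    # Assumption: with no anomalous pixels in either mask, report 0.
    return intersection / union if union else 0.0


# Toy 3x3 score map and ground-truth mask (hypothetical values).
scores = [
    [0.1, 0.8, 0.9],
    [0.2, 0.7, 0.3],
    [0.0, 0.4, 0.6],
]
target = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

pred = threshold_mask(scores, threshold=0.5)
score = iou(pred, target)  # 3 shared pixels out of 5 in the union
```

A mean IoU such as the one reported in the summary would typically be the average of this per-frame score over a dataset, although the article's exact averaging procedure is not stated in this record.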