Efficient abnormality detection using patch-based 3D convolution with recurrent model
Published in: | Machine vision and applications 2023-07, Vol.34 (4), p.54, Article 54 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Intelligent video monitoring systems have received widespread attention for detecting anomalous human behavior in crowded scenes. Because of varying crowd densities, low-resolution video, inter-object occlusion, and complex crowds, detecting abnormalities in human activity is extremely challenging. Automatic analysis of behavioral patterns is therefore necessary to model crowd behavior accurately and to alert human operators to suspicious activity in the scene. To address these concerns, we propose a two-stream multi-scale patch-based pyramidal dilated 3D fully connected network (FCN) with attentive bidirectional long short-term memory (2MPD-3DFCN-AttBiDLSTM) for detecting and locating abnormal activities in a frame. The model captures spatial–temporal features with a dilated convolution network, exploiting motion and optical-flow information from consecutive frames, which improves detection accuracy. We also introduce a parallel weighted skip connection into the residual learning framework, which preserves the rich characteristics of the input data so that effective features are not lost during learning. The attention mechanism in the bidirectional LSTM extracts temporal and global representations in both directions, which enhances the classification of normal and unusual activity in visual sequences. Experiments are performed on two publicly available datasets and evaluated in terms of the equal error rate, the precision–recall curve, the receiver operating characteristic curve, and the area under the curve. The results show that the proposed model outperforms existing models and achieves high detection performance for video surveillance monitoring. |
---|---|
ISSN: | 0932-8092 1432-1769 |
DOI: | 10.1007/s00138-023-01397-z |
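
The summary lists the equal error rate (EER) among its evaluation metrics. As a rough illustration of what that metric measures, the sketch below approximates the EER for a list of per-frame anomaly scores by sweeping decision thresholds and picking the point where the false positive rate and false negative rate are closest. It is not the authors' evaluation code; the function name `equal_error_rate`, the "higher score means more abnormal" convention, and the toy data are all assumptions made for this example.

```python
def equal_error_rate(scores, labels):
    """Approximate the equal error rate (EER) by sweeping thresholds.

    scores: anomaly scores; higher means more abnormal (assumed convention).
    labels: 1 for abnormal samples, 0 for normal samples.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one abnormal and one normal sample")

    best_gap, eer = None, None
    # Use each distinct score (plus +inf) as a threshold:
    # a sample is predicted abnormal when its score >= threshold.
    for t in sorted(set(scores)) + [float("inf")]:
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        false_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fpr = false_pos / negatives   # normal samples flagged as abnormal
        fnr = false_neg / positives   # abnormal samples that were missed
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer


# Hypothetical toy data: eight frames, four normal (0) and four abnormal (1).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(equal_error_rate(toy_scores, toy_labels))
```

A lower EER indicates better separation between normal and abnormal samples. With few samples the two error rates may never be exactly equal, which is why this sketch returns the midpoint at the nearest crossing rather than interpolating the ROC curve.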