Crowd Scene Analysis: Crowd Counting using MCNN based on Self-Supervised training with Attention Mechanism
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Fully supervised learning for crowd counting requires expensive and laborious annotation of training data. To alleviate this burden, it is desirable to explore methods that reduce the need for extensive labeling. Unlabeled images, by contrast, are abundant and far easier to obtain than labeled datasets. This paper proposes a self-supervised learning-based multi-column CNN (M-CNN) framework with an attention mechanism that leverages unlabeled data for pre-training the model. The framework consists of four sub-modules: a data augmentation framework, a self-supervised training network, a multi-column CNN, and an attention mechanism. Input images are first randomly processed using two defined augmentation transformations. The transformed images are then subjected to self-supervised learning and fed to a feature extraction network (FEN). The FEN consists of an M-CNN with five convolutional branches that extract features at multiple scales. The extracted features are then employed in an attention mechanism to focus on the head or shoulder locations of people. To evaluate the effectiveness of the proposed model, experiments are conducted on two public datasets: ShanghaiTech (Part A and Part B) and UCF-QNRF. The experimental results demonstrate that the approach outperforms state-of-the-art semi-supervised methods, showing that it can leverage both unlabeled and limited labeled data for crowd counting tasks. |
---|---|
ISSN: | 2835-8864 |
DOI: | 10.1109/INMIC60434.2023.10465776 |