
Minimizing Parameter Overhead in Self-Supervised Models for Target Task

Bibliographic Details
Published in: IEEE Transactions on Artificial Intelligence, 2024-04, Vol. 5 (4), pp. 1-12
Main Authors: Kishore, Jaydeep; Mukherjee, Snehasis
Format: Article
Language: English
Description
Summary: Supervised deep learning models encounter two significant challenges: the need for labeled training datasets and the parameter overhead, which leads to extensive GPU usage and other computational resource requirements. Several CNN models show state-of-the-art performance while compromising on one of these challenges. Self-supervised models reduce the requirement for labeled training data; however, the problems of parameter overhead and GPU usage are rarely addressed. This paper proposes a method to address both challenges for the image classification task. We introduce a transfer learning approach for a target dataset, in which we take the learned features from a self-supervised model after minimizing its parameters by removing the final layer. The learned features are then fed into a CNN classifier, followed by a multi-layer perceptron (MLP), where the hyperparameters of both the CNN and the MLP are automatically tuned (autotuned) using a Bayesian optimization based technique. Further, we reduce GFLOPs by limiting the search space for the hyperparameters without compromising performance. The proposed approach effectively deals with the above challenges. The first challenge is addressed by utilizing the learned representations from the self-supervised model as a foundation for knowledge transfer in the proposed model. Rather than relying solely on labeled data, we employ the insights from unlabeled data by transferring knowledge from the self-supervised model to the task at hand, hence reducing the cost and effort associated with data annotation. We address the second challenge by utilizing a minimized self-supervised backbone model and constraining the search space. We experiment with a wide variety of benchmark datasets, such as datasets with a large number of small-sized images (CIFAR-10 and CIFAR-100), fewer large-sized images (Oxford-IIIT Pet Dataset and Oxford 102 Flowers), and large-sized images with a variety of classes (Caltech101). The proposed method outperforms the state-of-the-art with fewer parameters and GFLOPs. The code is available at https://github.com/jk452/minimized-parameter-overload
ISSN: 2691-4581
DOI: 10.1109/TAI.2023.3322394
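
The summary above outlines a three-part recipe: a self-supervised backbone with its final layer removed and its remaining parameters frozen, a small CNN classifier followed by an MLP on top of the extracted features, and Bayesian-optimization-based autotuning of the head hyperparameters over a deliberately constrained search space. The sketch below is not the authors' released code (see the GitHub link above for that); it is a minimal illustration of the recipe under stated assumptions: a ResNet-50-style backbone as a stand-in for the self-supervised model, a hypothetical 1-D convolutional head over the 2048-dimensional feature vector, and Optuna's default TPE sampler as a stand-in Bayesian-style tuner.

# Minimal sketch (PyTorch + Optuna), not the paper's implementation.
# Assumptions: a ResNet-50-style backbone stands in for the self-supervised
# model, the head architecture is hypothetical, and evaluate() is a stub.
import torch.nn as nn
import torchvision.models as models
import optuna

def build_backbone():
    # In practice, load self-supervised (e.g., contrastively pretrained)
    # weights here; the final fully connected layer is then removed and the
    # rest is frozen, so only the new head contributes trainable parameters.
    backbone = models.resnet50(weights=None)
    backbone.fc = nn.Identity()          # remove the final layer
    for p in backbone.parameters():
        p.requires_grad = False
    return backbone.eval()

class Head(nn.Module):
    # Small CNN classifier followed by an MLP, applied to backbone features.
    def __init__(self, conv_channels, hidden_dim, num_classes):
        super().__init__()
        self.conv = nn.Sequential(       # light 1-D conv over the feature vector
            nn.Conv1d(1, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.mlp = nn.Sequential(        # two-layer MLP classifier
            nn.Linear(conv_channels, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, feats):            # feats: (batch, feature_dim)
        x = self.conv(feats.unsqueeze(1)).squeeze(-1)
        return self.mlp(x)

def evaluate(backbone, head, lr):
    # Placeholder: a real run would extract features with the frozen backbone,
    # train the head on the target dataset, and return validation accuracy.
    return 0.0

def objective(trial, backbone):
    # Constrained search space: small ranges keep the head (and hence the
    # added parameters/GFLOPs) small. The bounds here are illustrative only.
    conv_channels = trial.suggest_int("conv_channels", 8, 64)
    hidden_dim = trial.suggest_int("hidden_dim", 64, 256)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    head = Head(conv_channels=conv_channels, hidden_dim=hidden_dim,
                num_classes=10)
    return evaluate(backbone, head, lr)

if __name__ == "__main__":
    backbone = build_backbone()          # frozen feature extractor
    study = optuna.create_study(direction="maximize")  # TPE sampler by default
    study.optimize(lambda trial: objective(trial, backbone), n_trials=25)
    print(study.best_params)

Limiting the ranges of conv_channels and hidden_dim is how the sketch mimics the constrained search space mentioned in the summary; the actual bounds, head architecture, and tuner used by the authors are described in the full text.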