
HyperMAML: Few-shot adaptation of deep models with hypernetworks

Bibliographic Details
Published in: Neurocomputing (Amsterdam) 2024-09, Vol. 598, p. 128179, Article 128179
Main Authors: Przewięźlikowski, Marcin, Przybysz, Przemysław, Tabor, Jacek, Zięba, Maciej, Spurek, Przemysław
Format: Article
Language: English
Summary: Few-Shot learning aims to train models that can adapt to previously unseen tasks from small amounts of data. One of the leading Few-Shot learning approaches is Model-Agnostic Meta-Learning (MAML), which learns general meta-model weights that are later adapted to downstream tasks. However, MAML's main limitation is that its update procedure relies on gradient-based optimization, which cannot always modify the weights sufficiently in one or even a few iterations. Moreover, taking many gradient steps makes both optimization and inference time-consuming. In this paper, we propose HyperMAML, a novel generalization of MAML in which the update procedure is itself part of the model. Namely, we replace gradient descent with a trainable hypernetwork that updates the weights. Consequently, the model can generate significant updates whose magnitude is not limited by a fixed number of gradient steps. Experiments show that HyperMAML outperforms MAML in most cases and performs comparably to state-of-the-art techniques on standard Few-Shot learning benchmarks.

Highlights:
• We propose a Few-Shot learning model that produces task-specific parameter updates.
• HyperMAML does not require loss calculation or backpropagation to update parameters.
• Our approach offers Few-Shot accuracy superior to MAML and its numerous variants.
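To make the mechanism described in the summary concrete, below is a minimal PyTorch sketch of a HyperMAML-style update step: a hypernetwork that, given a summary of the support set, predicts the classifier-weight update in a single forward pass, in place of MAML's gradient-based inner loop. The module names, shapes, and the prototype-based support summary (HyperMAMLHead, hypernet, feat_dim) are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a HyperMAML-style adaptation step (illustrative, not
# the paper's exact architecture). The key idea: the task-specific update
# to the classifier weights is *predicted* by a hypernetwork, so no
# inner-loop loss computation or backpropagation is needed at adaptation.
import torch
import torch.nn as nn

class HyperMAMLHead(nn.Module):
    def __init__(self, feat_dim: int = 64, n_way: int = 5):
        super().__init__()
        self.n_way = n_way
        # Universal (meta-learned) classifier weights, shared across tasks.
        self.classifier = nn.Linear(feat_dim, n_way)
        # Hypernetwork: maps a per-class support summary to an update for
        # that class's row of the weight matrix plus its bias term.
        self.hypernet = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, feat_dim + 1),
        )

    def forward(self, support_feats, support_labels, query_feats):
        # Per-class mean of support features (an assumed input summary).
        prototypes = torch.stack([
            support_feats[support_labels == c].mean(dim=0)
            for c in range(self.n_way)
        ])                                          # (n_way, feat_dim)
        # One forward pass yields the full weight update for this task.
        update = self.hypernet(prototypes)          # (n_way, feat_dim + 1)
        dW, db = update[:, :-1], update[:, -1]
        # Task-adapted classifier = universal weights + predicted update.
        W = self.classifier.weight + dW
        b = self.classifier.bias + db
        return query_feats @ W.t() + b              # query logits

# Usage on a toy 5-way task: 5 support examples per class, 10 queries.
head = HyperMAMLHead(feat_dim=64, n_way=5)
support = torch.randn(25, 64)
labels = torch.arange(5).repeat_interleave(5)
queries = torch.randn(10, 64)
logits = head(support, labels, queries)  # (10, 5); meta-trained end to end
```

Because the update comes from a single forward pass of the hypernetwork rather than from backpropagated inner-loop gradients, adaptation cost is fixed and the update magnitude is not tied to a step count, which is the property the highlights emphasize.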
ISSN: 0925-2312 (print); 1872-8286 (electronic)
DOI: 10.1016/j.neucom.2024.128179