
LIPT: Improving Prompt Tuning with Late Inception Reparameterization


Bibliographic Details
Published in: Electronics (Basel), 2024-12, Vol. 13 (23), p. 4741
Main Authors: He, Yawen; Feng, Ao; Gao, Zhengjie; Song, Xinyu
Format: Article
Language: English
Description
Summary: Prompt tuning is a mainstream technique for fine-tuning large language models (LLMs), offering minimal parameter adjustments by learning task-specific prompt vectors. However, it suffers from training costs due to network-wide backpropagation and weaker performance compared to methods like adapters and LoRA, likely due to the limited capacity of soft prompts to encode task-specific information. This study introduces Late Inception Prompt Tuning (LIPT), a novel approach to soft prompt learning that enhances performance and efficiency by shortening backpropagation paths and employing a multidimensional bottleneck network with greater capacity. LIPT surpasses existing prompt tuning techniques on various benchmark tasks, delivering a 1.3% gain over LPT and a 5% improvement compared to standard prompt tuning when applied to RoBERTa-large, while converging more rapidly. It achieves an average accuracy of 90% across ten benchmark datasets. Notably, in certain scenarios, LIPT’s performance approaches that of full-parameter fine-tuning methods. To evaluate parameter-efficient fine-tuning (PEFT) comprehensively, we propose an Efficiency Indicator (EI) that balances accuracy and cost. LIPT is well suited for natural language understanding tasks, like sentiment analysis and text classification, with potential extensions to larger-scale models and tasks like text generation. This framework advances the scalability and practicality of fine-tuning methods for diverse applications.
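Note: The summary contrasts directly learned prompt vectors with a higher-capacity bottleneck network. The following is a rough parameter-count sketch of that contrast, not the authors' architecture. The prompt length, the branch widths, and the two-projection layout of each branch are illustrative assumptions that do not come from the article. The hidden size of 1024 and the roughly 355M backbone parameters are the commonly cited RoBERTa-large figures.

```python
def soft_prompt_params(prompt_len: int, hidden: int) -> int:
    """Trainable parameters when the prompt vectors are learned directly."""
    return prompt_len * hidden


def bottleneck_params(hidden: int, width: int) -> int:
    """One down-projection plus one up-projection, each with a bias term."""
    return (hidden * width + width) + (width * hidden + hidden)


def multi_branch_params(hidden: int, widths: list) -> int:
    """Parallel bottleneck branches of different widths (hypothetical layout)."""
    return sum(bottleneck_params(hidden, w) for w in widths)


HIDDEN = 1024             # RoBERTa-large hidden size
BACKBONE = 355_000_000    # approximate RoBERTa-large parameter count
PROMPT_LEN = 20           # assumed prompt length, not from the article
WIDTHS = [64, 128, 256]   # assumed branch widths, not from the article

direct = soft_prompt_params(PROMPT_LEN, HIDDEN)
branches = multi_branch_params(HIDDEN, WIDTHS)

print(f"direct soft prompt : {direct} params ({direct / BACKBONE:.4%} of backbone)")
print(f"bottleneck branches: {branches} params ({branches / BACKBONE:.4%} of backbone)")
```

Under these assumed sizes the bottleneck branches hold far more trainable parameters than the bare prompt, yet remain well under 1% of the frozen backbone.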
ISSN: 2079-9292
DOI: 10.3390/electronics13234741