LIPT: Improving Prompt Tuning with Late Inception Reparameterization
Published in: Electronics (Basel), 2024-12, Vol. 13 (23), p. 4741
Main Authors:
Format: Article
Language: English
Summary: Prompt tuning is a mainstream technique for fine-tuning large language models (LLMs), offering minimal parameter adjustments by learning task-specific prompt vectors. However, it suffers from high training costs due to network-wide backpropagation and weaker performance than methods such as adapters and LoRA, likely because soft prompts have limited capacity to encode task-specific information. This study introduces Late Inception Prompt Tuning (LIPT), a novel approach to soft prompt learning that improves performance and efficiency by shortening backpropagation paths and employing a multidimensional bottleneck network with greater capacity. LIPT surpasses existing prompt tuning techniques on various benchmark tasks, delivering a 1.3% gain over LPT and a 5% improvement over standard prompt tuning when applied to RoBERTa-large, while converging more rapidly. It achieves an average accuracy of 90% across ten benchmark datasets, and in certain scenarios its performance approaches that of full-parameter fine-tuning. To evaluate parameter-efficient fine-tuning (PEFT) methods comprehensively, we propose an Efficiency Indicator (EI) that balances accuracy and cost. LIPT is well suited to natural language understanding tasks such as sentiment analysis and text classification, with potential extensions to larger-scale models and tasks such as text generation. This framework advances the scalability and practicality of fine-tuning methods for diverse applications.
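The abstract only sketches the architecture, so the following is a minimal PyTorch illustration, not the authors' implementation: it shows one way a multi-branch ("inception"-style) bottleneck could generate a soft prompt that is inserted at a later transformer layer, so that gradients flow only through the upper layers. The class and function names, branch widths, prompt length, and insertion point are assumptions made for illustration.

```python
# Minimal sketch (assumed design, not the paper's code) of a late-inserted,
# multi-branch bottleneck prompt in the spirit of the abstract's description.
import torch
import torch.nn as nn


class InceptionPromptGenerator(nn.Module):
    """Generates prompt vectors from parallel bottlenecks of different widths."""

    def __init__(self, hidden_size=1024, prompt_len=20, branch_dims=(32, 64, 128)):
        super().__init__()
        self.prompt_len = prompt_len
        # Learnable base embedding that seeds every branch.
        self.base = nn.Parameter(torch.randn(prompt_len, hidden_size) * 0.02)
        # Each branch: down-project -> nonlinearity -> up-project (a bottleneck).
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_size, d), nn.ReLU(), nn.Linear(d, hidden_size))
             for d in branch_dims]
        )

    def forward(self, batch_size):
        # Sum the branch outputs with a residual connection to the base prompt.
        prompt = self.base + sum(branch(self.base) for branch in self.branches)
        return prompt.unsqueeze(0).expand(batch_size, -1, -1)


def insert_late_prompt(hidden_states, prompt, attention_mask=None):
    """Prepends the generated prompt to the hidden states of an intermediate
    layer (e.g. a middle layer of RoBERTa-large); only the layers above this
    point receive gradients, which is what shortens the backpropagation path."""
    batch = hidden_states.size(0)
    out = torch.cat([prompt, hidden_states], dim=1)
    if attention_mask is not None:
        pad = torch.ones(batch, prompt.size(1),
                         device=attention_mask.device, dtype=attention_mask.dtype)
        attention_mask = torch.cat([pad, attention_mask], dim=1)
    return out, attention_mask
```

Under these assumptions, only the prompt generator's parameters are trained while the backbone stays frozen, and because the prompt enters at a late layer, backpropagation never reaches the lower half of the network.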
ISSN: 2079-9292
DOI: 10.3390/electronics13234741