Single-Step Extraction of Transformer Attention With Dual-Gated Memtransistor Crossbars
Published in: | IEEE Electron Device Letters, 2024-10, Vol. 45 (10), pp. 2005-2008 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | We discuss how a dual-gated memtransistor crossbar can accelerate the extraction of the Transformer's attention scores. A memtransistor is a novel two-dimensional material-based device that offers non-volatile programmability and gate tunability. Leveraging these attributes, we demonstrate the extraction of quadratic-order products on a single memtransistor and the single-step extraction of attention scores without inferring intermediate query/key vectors. The query/key-free processing of memtransistor-based attention scoring results in 2.37× lower energy with fewer than half the crossbar cells. |
---|---|
ISSN: | 0741-3106 1558-0563 |
DOI: | 10.1109/LED.2024.3435540 |
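
One way to read the "query/key-free" scoring described in the summary, assuming standard dot-product attention: the unscaled score between tokens i and j is (x_i W_Q)(x_j W_K)^T, which equals the quadratic form x_i (W_Q W_K^T) x_j^T. The score can therefore be computed from the inputs and a single merged matrix, without forming the query or key vectors. The sketch below only checks this algebraic identity numerically in plain Python. It does not model the memtransistor crossbar or the paper's energy figures, and every name in it (`scores_two_step`, `scores_single_step`, `w_qk`, the toy dimensions) is illustrative rather than taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def scores_two_step(x, w_q, w_k):
    """Conventional path: form Q = X W_Q and K = X W_K, then Q K^T."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    return matmul(q, transpose(k))


def scores_single_step(x, w_qk):
    """Query/key-free path: score[i][j] = x_i W_QK x_j^T.

    Each term x[i][a] * w_qk[a][b] * x[j][b] is a quadratic-order
    product of two inputs and one stored weight.
    """
    n, d = len(x), len(x[0])
    return [[sum(x[i][a] * w_qk[a][b] * x[j][b]
                 for a in range(d) for b in range(d))
             for j in range(n)]
            for i in range(n)]


# Toy sizes chosen for illustration only (assumption, not from the paper).
rng = random.Random(0)
n_tokens, d_model, d_head = 3, 4, 2

x = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(n_tokens)]
w_q = [[rng.uniform(-1, 1) for _ in range(d_head)] for _ in range(d_model)]
w_k = [[rng.uniform(-1, 1) for _ in range(d_head)] for _ in range(d_model)]

# Merged weight matrix W_QK = W_Q W_K^T, computed once ahead of time.
w_qk = matmul(w_q, transpose(w_k))

two_step = scores_two_step(x, w_q, w_k)
single_step = scores_single_step(x, w_qk)

# Largest absolute disagreement between the two paths.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(two_step, single_step)
               for a, b in zip(row_a, row_b))
print("max difference:", max_diff)
```

The two paths should agree up to floating-point rounding. The sketch omits the 1/sqrt(d) scaling and the softmax, since both apply identically to either path.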