CTRAN: CNN-Transformer-based network for natural language understanding
Intent-detection (ID) and slot-filling (SF) are fundamental tasks for natural language understanding. This study introduces a new encoder–decoder CNN-Transformer-based architecture (CTRAN) designed for ID and SF. The encoder integrates BERT, followed by several convolutional layers with different...
Published in: | Engineering applications of artificial intelligence 2023-11, Vol.126, p.107013, Article 107013 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Intent-detection (ID) and slot-filling (SF) are fundamental tasks for natural language understanding. This study introduces a new encoder–decoder CNN-Transformer-based architecture (CTRAN) designed for ID and SF. The encoder integrates BERT, followed by several convolutional layers with different kernel sizes. We propose using a kernel size of 1 to preserve the one-to-one correspondence between input tokens and output tags. Subsequently, we rearrange the output of the convolutional layer using the window feature sequence and apply stacked Transformer encoders. The ID decoder leverages self-attention and a linear layer, while the SF decoder employs an aligned Transformer decoder with a zero diagonal mask, facilitating alignment between output tags and input tokens. We evaluate our model on the ATIS and SNIPS datasets, achieving F1 scores of 98.46% and 98.30% for the SF task, respectively. These results surpass the previous state-of-the-art by 0.64% and 0.99%. Moreover, we compare two strategies: using a language model as the encoder or as a word embedding. We find that the latter strategy yields better results. |
---|---|
ISSN: | 0952-1976 |
DOI: | 10.1016/j.engappai.2023.107013 |
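
The summary's point about a kernel size of 1 preserving the one-to-one correspondence between input tokens and output tags can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes an unpadded ("valid") 1D convolution with a single output channel, and the function name `conv1d`, the embeddings, and the weights are all hypothetical. It only shows that a kernel of size 1 yields exactly one output per token, whereas a wider unpadded kernel shortens the sequence.

```python
def conv1d(tokens, kernel):
    """Unpadded 1D convolution over a sequence of token vectors.

    tokens: list of n vectors, each of length d
    kernel: list of k weight rows, each of length d (one output channel)
    Returns a list of n - k + 1 scalars, one per window position.
    """
    n, k = len(tokens), len(kernel)
    out = []
    for start in range(n - k + 1):
        window = tokens[start:start + k]
        # Dot product between the kernel and the current window of tokens.
        out.append(sum(w * x
                       for w_row, x_row in zip(kernel, window)
                       for w, x in zip(w_row, x_row)))
    return out


# Five hypothetical token embeddings of dimension 2 (toy values).
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 1.0], [0.0, 2.0]]

k1 = conv1d(tokens, [[1.0, 1.0]])      # kernel size 1: one output per token
k3 = conv1d(tokens, [[1.0, 1.0]] * 3)  # kernel size 3: outputs span 3 tokens

print(len(tokens), len(k1), len(k3))
```

With kernel size 1, each output depends only on the token at the same position, so the output sequence can be tagged position by position. With kernel size 3 and no padding, each output mixes three neighbouring tokens and the sequence is two elements shorter, so the alignment with the input tokens is no longer direct.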