
CTRAN: CNN-Transformer-based network for natural language understanding

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, 2023-11, Vol. 126, p. 107013, Article 107013
Main Authors: Rafiepour, Mehrdad, Sartakhti, Javad Salimi
Format: Article
Language: English
Description
Summary: Intent detection (ID) and slot filling (SF) are fundamental tasks for natural language understanding. This study introduces a new encoder–decoder CNN-Transformer-based architecture (CTRAN) designed for ID and SF. The encoder integrates BERT, followed by several convolutional layers with different kernel sizes. We propose using a kernel size of 1 to preserve the one-to-one correspondence between input tokens and output tags. Subsequently, we rearrange the output of the convolutional layers using the window feature sequence and apply stacked Transformer encoders. The ID decoder leverages self-attention and a linear layer, while the SF decoder employs an aligned Transformer decoder with a zero diagonal mask, facilitating alignment between output tags and input tokens. We evaluate our model on the ATIS and SNIPS datasets, achieving F1 scores of 98.46% and 98.30% on the SF task, respectively. These results surpass the previous state of the art by 0.64% and 0.99%. Moreover, we compare two strategies: using a language model as the encoder or as the word embedding. We find that the latter strategy yields better results.
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2023.107013
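The architecture description in the summary is concrete enough to sketch. The PyTorch snippet below is a rough reconstruction from the abstract alone, not the authors' implementation: channel counts, kernel sizes, the intent-pooling step, and the exact "window feature sequence" rearrangement (taken here as channel-wise concatenation of the parallel convolutions) are all assumptions.

# A minimal, non-authoritative PyTorch sketch of the CTRAN layout described in
# the abstract. Channel counts, kernel sizes, the intent pooling, and the
# "window feature sequence" rearrangement (channel-wise concatenation here)
# are assumptions; only the overall encoder-decoder structure follows the text.
import torch
import torch.nn as nn


class CTRANSketch(nn.Module):
    def __init__(self, n_intents, n_slots, bert_dim=768,
                 kernel_sizes=(1, 3, 5), conv_channels=256,
                 n_encoder_layers=2, n_heads=8):
        super().__init__()
        # Parallel convolutions over the BERT embeddings; the kernel-size-1
        # branch preserves the one-to-one token/tag correspondence.
        self.convs = nn.ModuleList([
            nn.Conv1d(bert_dim, conv_channels, k, padding=k // 2)
            for k in kernel_sizes
        ])
        d_model = conv_channels * len(kernel_sizes)
        # Stacked Transformer encoders over the rearranged convolution output.
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_encoder_layers)
        # Intent decoder: self-attention followed by a linear classifier
        # (pooling the first position is an assumption).
        self.id_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.id_out = nn.Linear(d_model, n_intents)
        # Slot decoder: Transformer decoder aligned with the input tokens; the
        # zero-diagonal mask stops each position from attending to itself.
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.sf_decoder = nn.TransformerDecoder(dec_layer, num_layers=1)
        self.sf_out = nn.Linear(d_model, n_slots)

    def forward(self, bert_embeddings):            # (batch, seq_len, bert_dim)
        x = bert_embeddings.transpose(1, 2)         # (batch, bert_dim, seq_len)
        x = torch.cat([torch.relu(c(x)) for c in self.convs], dim=1)
        x = x.transpose(1, 2)                       # (batch, seq_len, d_model)
        enc = self.encoder(x)
        attn_out, _ = self.id_attn(enc, enc, enc)
        intent_logits = self.id_out(attn_out[:, 0])
        seq_len = enc.size(1)
        zero_diag = torch.eye(seq_len, dtype=torch.bool, device=enc.device)
        slots = self.sf_decoder(enc, enc, tgt_mask=zero_diag)
        slot_logits = self.sf_out(slots)            # one tag per input token
        return intent_logits, slot_logits


# Example with random stand-in BERT embeddings.
model = CTRANSketch(n_intents=10, n_slots=30)
dummy = torch.randn(2, 16, 768)
intent_logits, slot_logits = model(dummy)
print(intent_logits.shape, slot_logits.shape)       # (2, 10) and (2, 16, 30)

In this sketch, the zero-diagonal mask forces each position in the slot decoder to predict its tag from the surrounding tokens rather than its own representation, which is one plausible reading of the abstract's point about keeping output tags aligned with input tokens.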