A Multi-Task Hierarchical Approach for Intent Detection and Slot Filling

Bibliographic Details
Published in:	Knowledge-Based Systems, 2019-11, Vol. 183, Article 104846
Main Authors:	Firdaus, Mauajama; Kumar, Ankit; Ekbal, Asif; Bhattacharyya, Pushpak
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Conditional random fields; Datasets; Dependence; Domains; Hierarchical; Information retrieval; Intent detection; Modules; Multi-task; Neural networks; Recurrent neural networks; Slot filling

Description
Summary:	Spoken language understanding (SLU) is an integral part of every dialogue system. Understanding the user's intention and extracting the information needed to help the user achieve the desired goals is a challenging task. In this work, we propose an end-to-end hierarchical multi-task model that jointly performs intent detection and slot filling on datasets from varying domains. The primary aim is to capture context information in a dialogue so that the SLU module of a dialogue system correctly understands the user and helps the user achieve the desired goals. It is vital for the SLU module to capture past information along with the user's present utterance in order to retrieve the correct information. The dependency and correlation between the two tasks, i.e., intent detection and slot filling, make the multi-task learning framework effective in capturing the information provided by the user. We use a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to capture contextual information for the utterances, and we employ a Conditional Random Field (CRF) to model label dependency. Both character-level and word-level embeddings are provided as input to the models. We create a benchmark corpus for the SLU tasks on the TRAINS and FRAMES datasets to capture more realistic and natural utterances spoken in a human/machine dialogue system. Experimental results on multiple datasets from various domains (ATIS, SNIP, TRAINS and FRAMES) show that our proposed approach is effective compared with the individual models and the state-of-the-art methods.
ISSN:	0950-7051 1872-7409
DOI:	10.1016/j.knosys.2019.07.017
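
Illustrative note: the Summary says a CRF is used to model label dependency in slot filling. The sketch below is not the authors' implementation; it only shows, under assumed toy numbers, why scoring label transitions helps. The label set, the emission scores (standing in for the outputs of a CNN/RNN encoder), the transition scores, and the names `viterbi_decode` and `greedy_decode` are all hypothetical. Decoding each token independently can yield an invalid BIO sequence, while Viterbi decoding over emission plus transition scores picks the best sequence as a whole.

```python
from typing import Dict, List, Tuple

Label = str


def greedy_decode(emissions: List[Dict[Label, float]]) -> List[Label]:
    """Pick the best label for each token independently (no label dependency)."""
    return [max(scores, key=scores.get) for scores in emissions]


def viterbi_decode(
    emissions: List[Dict[Label, float]],
    transitions: Dict[Tuple[Label, Label], float],
    labels: List[Label],
) -> List[Label]:
    """Return the highest-scoring label sequence under a linear-chain score:
    the sum of per-token emission scores and label-to-label transition scores.
    Transition pairs missing from `transitions` are treated as score 0.0."""
    if not emissions:
        return []
    # best[y] = score of the best path that ends in label y at the current token
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[Label, Label]] = []
    for t in range(1, len(emissions)):
        new_best: Dict[Label, float] = {}
        pointers: Dict[Label, Label] = {}
        for y in labels:
            # Best previous label for ending in y at position t
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)
    # Follow the backpointers from the best final label
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy example for the utterance "fly to new york"
LABELS = ["O", "B-city", "I-city"]
EMISSIONS = [
    {"O": 2.0, "B-city": 0.1, "I-city": 0.0},  # fly
    {"O": 2.0, "B-city": 0.1, "I-city": 0.0},  # to
    {"O": 1.0, "B-city": 0.9, "I-city": 0.2},  # new
    {"O": 0.3, "B-city": 0.2, "I-city": 1.5},  # york
]
TRANSITIONS = {
    ("O", "I-city"): -10.0,      # an I- tag must not follow O
    ("B-city", "I-city"): 1.0,   # continuing a slot is encouraged
    ("I-city", "I-city"): 0.5,
}

if __name__ == "__main__":
    print("greedy :", greedy_decode(EMISSIONS))
    print("viterbi:", viterbi_decode(EMISSIONS, TRANSITIONS, LABELS))
```

With these assumed scores, the per-token decoder labels "new" as O and "york" as I-city, which is not a valid BIO sequence, whereas the Viterbi decoder tags "new york" as B-city I-city. In a trained CRF the transition scores are learned rather than set by hand.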