Lower resources of spoken language understanding from voice to semantics
Published in: | Journal of physics. Conference series 2020-04, Vol.1486 (5), p.052033 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Spoken language understanding (SLU) is traditionally designed as a pipeline of multiple components. First, an automatic speech recognition module maps the speech signal to text; a natural language understanding module then converts the recognized text into structured data such as domain, intent, and slot values. These modules are usually trained separately. End-to-end SLU, by contrast, derives the structured data directly from speech with a single model. However, end-to-end SLU that depends on large amounts of training data is difficult to achieve across different domains and different groups of speakers. For this reason, we introduce a low-resource end-to-end SLU model based on pre-training and combine it with capsule vectors. Experimental results show that the model's low-resource SLU performance is robust across different data sets. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/1486/5/052033 |