AISHELL-NER: Named Entity Recognition from Chinese Speech

Bibliographic Details
Main Authors: Chen, Boli, Xu, Guangwei, Wang, Xiaobin, Xie, Pengjun, Zhang, Meishan, Huang, Fei
Format: Conference Proceeding
Language: English
container_end_page 8356
container_start_page 8352
creator Chen, Boli
Xu, Guangwei
Wang, Xiaobin
Xie, Pengjun
Zhang, Meishan
Huang, Fei
description Named Entity Recognition (NER) from speech is a Spoken Language Understanding (SLU) task that aims to extract semantic information from the speech signal. NER from speech is usually performed with a two-step pipeline that consists of (1) processing the audio with an Automatic Speech Recognition (ASR) system and (2) applying an NER tagger to the ASR outputs. Recent work has shown the capability of the End-to-End (E2E) approach, which is essentially entity-aware ASR, for NER from English and French speech. However, because Chinese has many homophones and polyphones, NER from Chinese speech is a more challenging task. In this paper, we introduce a new dataset, AISHELL-NER, for NER from Chinese speech. Extensive experiments are conducted to explore the performance of several state-of-the-art methods. The results demonstrate that performance can be improved by combining entity-aware ASR with a pretrained NER tagger, a combination that can be easily applied to the modern SLU pipeline. The dataset is publicly available at github.com/Alibaba-NLP/AISHELL-NER.
doi_str_mv 10.1109/ICASSP43922.2022.9746955
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 2379-190X
ispartof ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, p.8352-8356
issn 2379-190X
language eng
recordid cdi_ieee_primary_9746955
source IEEE Xplore All Conference Series
subjects Benchmark testing
Conferences
Data mining
Dataset
end-to-end
named entity recognition
Pipelines
Semantics
Signal processing
Speech recognition
transformer
title AISHELL-NER: Named Entity Recognition from Chinese Speech
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T04%3A59%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=AISHELL-NER:%20Named%20Entity%20Recognition%20from%20Chinese%20Speech&rft.btitle=ICASSP%202022%20-%202022%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20(ICASSP)&rft.au=Chen,%20Boli&rft.date=2022-05-23&rft.spage=8352&rft.epage=8356&rft.pages=8352-8356&rft.eissn=2379-190X&rft_id=info:doi/10.1109/ICASSP43922.2022.9746955&rft.eisbn=1665405406&rft.eisbn_list=9781665405409&rft_dat=%3Cieee_CHZPO%3E9746955%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-6722d35f52b6cf2d9c6ffda170a4386a84ff566e8aee06727ffb8b9823fa21653%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9746955&rfr_iscdi=true