Multi-scale-audio indexing for translingual spoken document retrieval
MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advo...
Main Authors: | Hsin-Min Wang; Meng, H.; Schone, P.; Chen, B.; Wai-Kit Lo |
---|---|
Format: | Conference Proceeding |
Language: | English |
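Subjects: | Error analysis; Gold; Indexing; Information retrieval; Information science; Natural languages; Radio broadcasting; Speech recognition; Systems engineering and theory; Vocabulary |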
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 608 vol.1 |
container_issue | |
container_start_page | 605 |
container_title | |
container_volume | 1 |
creator | Hsin-Min Wang ; Meng, H. ; Schone, P. ; Chen, B. ; Wai-Kit Lo |
description | MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) are used in retrieval. The use of subword units can complement the word unit in handling the problems of Chinese word tokenization ambiguity, Chinese homophone ambiguity, and out-of-vocabulary words in audio indexing. This paper focuses on multi-scale audio indexing in MEI. Experiments are based on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3), where we indexed Voice of America Mandarin news broadcasts by speech recognition on both the word and subword scales. We discuss the development of the MEI syllable recognizer, the representations of spoken documents using overlapping subword n-grams and lattice structures. Results show that augmenting words with subwords is beneficial to CL-SDR performance. |
doi_str_mv | 10.1109/ICASSP.2001.940904 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_940904</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>940904</ieee_id><sourcerecordid>940904</sourcerecordid><originalsourceid>FETCH-LOGICAL-i87t-1017375f6c93679844e1239553d59645e3204dbcf5a279a68e79b044b6e08b173</originalsourceid><addsrcrecordid>eNotj8tOwzAURC0eEmnhB7ryD7hcv2J7iapCkYpAahfsKie5QQY3qewEwd8TqaxGszijM4QsOCw5B3f_vHrY7d6WAoAvnQIH6oIUQhrHuIP3SzIDY0EaUFxdkYJrAazkyt2QWc6fAGCNsgVZv4xxCCzXPiLzYxN6GroGf0L3Qds-0SH5LsepjT7SfOq_sKNNX49H7AaacEgBv328Jdetjxnv_nNO9o_r_WrDtq9Pk-eWBWsGxoEbaXRb1k6WxlmlkAvptJaNdqXSKAWopqpb7YVxvrRoXAVKVSWCrSZ2Thbn2YCIh1MKR59-D-fz8g8ANEwP</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Multi-scale-audio indexing for translingual spoken document retrieval</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Hsin-Min Wang ; Meng, H. ; Schone, P. ; Chen, B. ; Wai-Kit Lo</creator><creatorcontrib>Hsin-Min Wang ; Meng, H. ; Schone, P. ; Chen, B. ; Wai-Kit Lo</creatorcontrib><description>MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) are used in retrieval. The use of subword units can complement the word unit in handling the problems of Chinese word tokenization ambiguity, Chinese homophone ambiguity, and out-of-vocabulary words in audio indexing. This paper focuses on multi-scale audio indexing in MEI. 
Experiments are based on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3), where we indexed Voice of America Mandarin news broadcasts by speech recognition on both the word and subword scales. We discuss the development of the MEI syllable recognizer, the representations of spoken documents using overlapping subword n-grams and lattice structures. Results show that augmenting words with subwords is beneficial to CL-SDR performance.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 0780370414</identifier><identifier>ISBN: 9780780370418</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.2001.940904</identifier><language>eng</language><publisher>IEEE</publisher><subject>Error analysis ; Gold ; Indexing ; Information retrieval ; Information science ; Natural languages ; Radio broadcasting ; Speech recognition ; Systems engineering and theory ; Vocabulary</subject><ispartof>2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. 
No.01CH37221), 2001, Vol.1, p.605-608 vol.1</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/940904$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,777,781,786,787,2052,4036,4037,27906,54536,54901,54913</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/940904$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hsin-Min Wang</creatorcontrib><creatorcontrib>Meng, H.</creatorcontrib><creatorcontrib>Schone, P.</creatorcontrib><creatorcontrib>Chen, B.</creatorcontrib><creatorcontrib>Wai-Kit Lo</creatorcontrib><title>Multi-scale-audio indexing for translingual spoken document retrieval</title><title>2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)</title><addtitle>ICASSP</addtitle><description>MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) are used in retrieval. The use of subword units can complement the word unit in handling the problems of Chinese word tokenization ambiguity, Chinese homophone ambiguity, and out-of-vocabulary words in audio indexing. This paper focuses on multi-scale audio indexing in MEI. Experiments are based on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3), where we indexed Voice of America Mandarin news broadcasts by speech recognition on both the word and subword scales. 
We discuss the development of the MEI syllable recognizer, the representations of spoken documents using overlapping subword n-grams and lattice structures. Results show that augmenting words with subwords is beneficial to CL-SDR performance.</description><subject>Error analysis</subject><subject>Gold</subject><subject>Indexing</subject><subject>Information retrieval</subject><subject>Information science</subject><subject>Natural languages</subject><subject>Radio broadcasting</subject><subject>Speech recognition</subject><subject>Systems engineering and theory</subject><subject>Vocabulary</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>0780370414</isbn><isbn>9780780370418</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2001</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotj8tOwzAURC0eEmnhB7ryD7hcv2J7iapCkYpAahfsKie5QQY3qewEwd8TqaxGszijM4QsOCw5B3f_vHrY7d6WAoAvnQIH6oIUQhrHuIP3SzIDY0EaUFxdkYJrAazkyt2QWc6fAGCNsgVZv4xxCCzXPiLzYxN6GroGf0L3Qds-0SH5LsepjT7SfOq_sKNNX49H7AaacEgBv328Jdetjxnv_nNO9o_r_WrDtq9Pk-eWBWsGxoEbaXRb1k6WxlmlkAvptJaNdqXSKAWopqpb7YVxvrRoXAVKVSWCrSZ2Thbn2YCIh1MKR59-D-fz8g8ANEwP</recordid><startdate>2001</startdate><enddate>2001</enddate><creator>Hsin-Min Wang</creator><creator>Meng, H.</creator><creator>Schone, P.</creator><creator>Chen, B.</creator><creator>Wai-Kit Lo</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>2001</creationdate><title>Multi-scale-audio indexing for translingual spoken document retrieval</title><author>Hsin-Min Wang ; Meng, H. ; Schone, P. ; Chen, B. 
; Wai-Kit Lo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i87t-1017375f6c93679844e1239553d59645e3204dbcf5a279a68e79b044b6e08b173</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Error analysis</topic><topic>Gold</topic><topic>Indexing</topic><topic>Information retrieval</topic><topic>Information science</topic><topic>Natural languages</topic><topic>Radio broadcasting</topic><topic>Speech recognition</topic><topic>Systems engineering and theory</topic><topic>Vocabulary</topic><toplevel>online_resources</toplevel><creatorcontrib>Hsin-Min Wang</creatorcontrib><creatorcontrib>Meng, H.</creatorcontrib><creatorcontrib>Schone, P.</creatorcontrib><creatorcontrib>Chen, B.</creatorcontrib><creatorcontrib>Wai-Kit Lo</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hsin-Min Wang</au><au>Meng, H.</au><au>Schone, P.</au><au>Chen, B.</au><au>Wai-Kit Lo</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Multi-scale-audio indexing for translingual spoken document retrieval</atitle><btitle>2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. 
No.01CH37221)</btitle><stitle>ICASSP</stitle><date>2001</date><risdate>2001</risdate><volume>1</volume><spage>605</spage><epage>608 vol.1</epage><pages>605-608 vol.1</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>0780370414</isbn><isbn>9780780370418</isbn><abstract>MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) are used in retrieval. The use of subword units can complement the word unit in handling the problems of Chinese word tokenization ambiguity, Chinese homophone ambiguity, and out-of-vocabulary words in audio indexing. This paper focuses on multi-scale audio indexing in MEI. Experiments are based on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3), where we indexed Voice of America Mandarin news broadcasts by speech recognition on both the word and subword scales. We discuss the development of the MEI syllable recognizer, the representations of spoken documents using overlapping subword n-grams and lattice structures. Results show that augmenting words with subwords is beneficial to CL-SDR performance.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2001.940904</doi></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001, Vol.1, p.605-608 vol.1 |
issn | 1520-6149 ; 2379-190X |
language | eng |
recordid | cdi_ieee_primary_940904 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Error analysis ; Gold ; Indexing ; Information retrieval ; Information science ; Natural languages ; Radio broadcasting ; Speech recognition ; Systems engineering and theory ; Vocabulary |
title | Multi-scale-audio indexing for translingual spoken document retrieval |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A07%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Multi-scale-audio%20indexing%20for%20translingual%20spoken%20document%20retrieval&rft.btitle=2001%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing.%20Proceedings%20(Cat.%20No.01CH37221)&rft.au=Hsin-Min%20Wang&rft.date=2001&rft.volume=1&rft.spage=605&rft.epage=608%20vol.1&rft.pages=605-608%20vol.1&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=0780370414&rft.isbn_list=9780780370418&rft_id=info:doi/10.1109/ICASSP.2001.940904&rft_dat=%3Cieee_6IE%3E940904%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i87t-1017375f6c93679844e1239553d59645e3204dbcf5a279a68e79b044b6e08b173%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=940904&rfr_iscdi=true |
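
The description field above mentions representing spoken documents with overlapping subword n-grams. As a rough illustration only, the sketch below builds overlapping n-grams from a syllable sequence and counts them as index terms. The function names, the toneless pinyin input, and the choice of unigrams plus bigrams are all assumptions made for this example; the record does not describe the actual MEI implementation.

```python
from collections import Counter


def overlapping_ngrams(units, n=2):
    """Return every overlapping n-gram over a sequence of subword units."""
    # A window of size n slides one unit at a time, so adjacent n-grams
    # share n - 1 units. Sequences shorter than n yield an empty list.
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def index_terms(units, max_n=2):
    """Count all n-grams of length 1..max_n as candidate index terms."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(overlapping_ngrams(units, n))
    return counts


# Hypothetical recognizer output: a toneless pinyin syllable sequence.
syllables = ["bei", "jing", "da", "xue"]
bigrams = overlapping_ngrams(syllables, 2)
terms = index_terms(syllables, 2)
print(bigrams)
print(terms)
```

Because the windows overlap, a bigram can be matched without first deciding where word boundaries fall, which is one plausible reading of how subword units sidestep the tokenization ambiguity named in the description.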