Enhancements in automatic Kannada speech recognition system by background noise elimination and alternate acoustic modelling

In this paper, the improvements in the recently implemented Kannada speech recognition system are demonstrated in detail. The Kannada automatic speech recognition (ASR) system consists of ASR models created using Kaldi, an IVRS call flow, and weather and agricultural commodity prices informa...

Bibliographic Details
Published in: International journal of speech technology 2020-03, Vol.23 (1), p.149-167
Main Authors: Thimmaraja Yadava, G., Jayanna, H. S.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c319t-a603be8eb32953bfcf4151c941c721a29d9181d0f36927228886c871988cd1033
cites cdi_FETCH-LOGICAL-c319t-a603be8eb32953bfcf4151c941c721a29d9181d0f36927228886c871988cd1033
container_end_page 167
container_issue 1
container_start_page 149
container_title International journal of speech technology
container_volume 23
creator Thimmaraja Yadava, G.
Jayanna, H. S.
description In this paper, the improvements in the recently implemented Kannada speech recognition system are demonstrated in detail. The Kannada automatic speech recognition (ASR) system consists of ASR models created using Kaldi, an IVRS call flow, and weather and agricultural commodity prices information databases. The task-specific speech data used in the recently developed spoken dialogue system contained a high level of varied background noise. The different types of noise present in the collected speech data had an adverse effect on both online and offline speech recognition performance. Therefore, to improve the speech recognition accuracy of the Kannada ASR system, a noise reduction algorithm is developed which is a fusion of spectral subtraction with voice activity detection (SS-VAD) and a minimum mean square error spectrum power estimator based on zero crossing (MMSE-SPZC). The noise elimination algorithm is added to the system before the feature extraction stage. Alternative ASR models are created using subspace Gaussian mixture model (SGMM) and deep neural network (DNN) modeling techniques. The experimental results show that the fusion of the noise elimination technique and SGMM/DNN-based modeling yields a relative accuracy improvement of 7.68% over the recently developed GMM-HMM based ASR system. The acoustic models with the lowest word error rate (WER) could be used in the spoken dialogue system. The developed spoken query system is tested by Karnataka farmers in an uncontrolled environment.
doi_str_mv 10.1007/s10772-020-09671-5
format article
publisher New York: Springer US
rights Springer Science+Business Media, LLC, part of Springer Nature 2020
fulltext fulltext
identifier ISSN: 1381-2416
ispartof International journal of speech technology, 2020-03, Vol.23 (1), p.149-167
issn 1381-2416
1572-8110
language eng
recordid cdi_proquest_journals_2363167970
source Springer Nature; Linguistics and Language Behavior Abstracts (LLBA)
subjects Acoustic noise
Acoustics
Agricultural commodities
Algorithms
Artificial Intelligence
Automatic speech recognition
Background noise
Deep learning
Engineering
Error analysis
Feature extraction
Kannada language
Modelling
Neural networks
Noise
Noise reduction
Performance enhancement
Pricing
Probabilistic models
Product development
Signal, Image and Speech Processing
Social Sciences
Speech
Speech recognition
Subtraction
Voice activity detectors
Voice recognition
Weather
title Enhancements in automatic Kannada speech recognition system by background noise elimination and alternate acoustic modelling
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T21%3A36%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Enhancements%20in%20automatic%20Kannada%20speech%20recognition%20system%20by%20background%20noise%20elimination%20and%20alternate%20acoustic%20modelling&rft.jtitle=International%20journal%20of%20speech%20technology&rft.au=Thimmaraja%20Yadava,%20G.&rft.date=2020-03-01&rft.volume=23&rft.issue=1&rft.spage=149&rft.epage=167&rft.pages=149-167&rft.issn=1381-2416&rft.eissn=1572-8110&rft_id=info:doi/10.1007/s10772-020-09671-5&rft_dat=%3Cproquest_cross%3E2363167970%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c319t-a603be8eb32953bfcf4151c941c721a29d9181d0f36927228886c871988cd1033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2363167970&rft_id=info:pmid/&rfr_iscdi=true
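
The description in this record reports results in terms of word error rate (WER) and a relative accuracy improvement. As a reading aid, the following is a minimal sketch of how these two quantities are commonly computed. It is not taken from the paper: the function names `word_error_rate` and `relative_improvement`, the whitespace tokenisation, and the sample sentences and accuracy figures are all assumptions made for illustration, and the paper may define its relative improvement differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper: words are split on whitespace, which is an
    assumption, not the scoring procedure used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (improved - baseline) / baseline


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the paper.
    wer = word_error_rate("what is the price of rice today",
                          "what is price of rice to day")
    print(f"WER: {wer:.3f}")
    print(f"Relative improvement: {relative_improvement(80.0, 88.0):.2f}%")
```

Word accuracy is often reported as 1 minus WER, so a lower WER for the SGMM or DNN models corresponds to a higher accuracy than the GMM-HMM baseline.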