Difference in speech analysis results by compression

Bibliographic Details
Main Authors: Omiya, Yasuhiro; Hagiwara, Naoki; Takano, Takeshi; Shinohara, Shuji; Nakamura, Mitsuteru; Higuchi, Masakazu; Mitsuyoshi, Shunji; Tokuno, Shinichi
Format: Conference Proceeding
Language: eng; jpn
Description
Summary: Mental health disorders have become a problem in many developed countries, and screening technology that helps check for depression and stress is being sought as a countermeasure. In a previous study the authors investigated estimating health status from the voice and developed MIMOSYS (Mind Monitoring System). Recorded voice may be compressed for efficient transmission or storage, so the loss of sound quality caused by coding could affect the results of the MIMOSYS analysis. Degradation of sound quality due to audio compression is usually assessed with general signal quality measures such as peak signal-to-noise ratio (PSNR) and mean opinion score (MOS). However, the impact on a health indicator based on voice features needs to be evaluated separately. The purpose of this study is to verify how sound quality degradation caused by compression affects voice-based health state evaluation. In the experiment, we used recordings of 979 subjects reading 17 fixed phrases and applied AAC, MP3, and WMA coding, assuming compression at recording and archiving time. The average PSNR between the original WAV files and the files compressed with AAC, MP3, and WMA coding was 29.58 dB, 55.96 dB, and 29.58 dB, respectively. The audio before and after compression was analyzed, and the resulting degrees of health were compared by correlation evaluation. The results show a strong correlation between the values before and after compression, suggesting that compressed audio may be usable for voice-based health state evaluation.
ISSN: 2189-8723
DOI: 10.1109/ICIIBMS.2017.8279713
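
The two measurements named in the summary, PSNR between original and compressed audio and a correlation between health scores before and after compression, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 16-bit PCM samples (peak value 32767), time-aligned sequences of equal length, and Pearson's correlation coefficient, none of which the summary specifies. The function names `psnr_db` and `pearson_r` are hypothetical.

```python
import math


def psnr_db(original, decoded, peak=32767.0):
    """PSNR in dB between two equal-length sample sequences.

    Assumption: samples are 16-bit PCM, so the peak amplitude is 32767.
    """
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean squared error between the original and decoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: the summary's "correlation evaluation" is taken to mean
    Pearson's r; the source does not name the coefficient.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In use, `psnr_db` would be applied per recording to the decoded compressed file and its original and then averaged per codec, while `pearson_r` would take the per-recording health scores computed from the original audio and from the compressed audio.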