Design of Voice Privacy System using Linear Prediction
Main Authors: | , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | A speaker's identity is the most crucial information exploited (implicitly) by an Automatic Speaker Verification (ASV) system. Numerous attacks can be thwarted simultaneously if the speaker's identity is kept private. The baseline of the Voice Privacy Challenge 2020 at INTERSPEECH uses the Linear Prediction (LP) model of speech and the McAdams coefficient to achieve speaker de-identification. The baseline approach alters only the pole angles using the McAdams coefficient. However, from speech acoustics and digital resonator design, the radius of a pole is associated with various energy losses, and these losses implicitly carry speaker-specific information during speech production. We therefore make fine-tuned changes to both the pole angle and the pole radius, resulting in an 18.98% higher EER on the Vctk-test-com dataset and a 5% lower WER on the Libri-test dataset compared to the baseline. A higher EER means the ASV system finds it harder to verify the speaker, so privacy preservation is indeed improved by our approach, while the lower WER indicates that intelligibility is retained. Furthermore, we exploit the relatively poor spectral resolution of female speech to achieve effective anonymization: gender-based analysis of the results reveals that our approach yields better speaker anonymization for female speakers than for male speakers. |
ISSN: | 2640-0103 |
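
The pole manipulation described in the summary can be illustrated with a minimal sketch. It assumes the commonly used form of the McAdams transformation, in which the angle of each complex pole is raised to the power `alpha`, and it models the radius change as a simple multiplicative factor. The function name `modify_poles`, the parameter `radius_scale`, and the default values are hypothetical choices for illustration; the record does not state how the authors actually tune the radius.

```python
import numpy as np

def modify_poles(lp_coeffs, alpha=0.8, radius_scale=0.975):
    """Shift LP pole angles (McAdams-style) and scale pole radii.

    lp_coeffs    : coefficients [1, a1, ..., ap] of the LP polynomial A(z)
    alpha        : McAdams coefficient; each complex pole angle phi becomes phi**alpha
    radius_scale : hypothetical multiplicative change applied to each complex pole radius
    """
    poles = np.roots(lp_coeffs)
    new_poles = []
    for p in poles:
        if abs(p.imag) > 1e-8:
            r, phi = np.abs(p), np.angle(p)
            # Raise the angle magnitude to the power alpha, keeping its sign
            # so that conjugate pairs remain conjugate.
            new_phi = np.sign(phi) * np.abs(phi) ** alpha
            # Scale the radius, clipping below 1 so the filter stays stable.
            new_r = min(r * radius_scale, 0.999)
            new_poles.append(new_r * np.exp(1j * new_phi))
        else:
            # Real poles are left untouched in this sketch.
            new_poles.append(p)
    # Rebuild the polynomial; imaginary parts are only rounding noise.
    return np.real(np.poly(new_poles))

# Toy third-order example: one conjugate pole pair plus one real pole.
original = np.real(np.poly([0.9 * np.exp(1j * 0.5), 0.9 * np.exp(-1j * 0.5), 0.5]))
anonymized = modify_poles(original, alpha=0.8, radius_scale=0.95)
```

In a full system the modified coefficients would be applied frame by frame, re-synthesizing each frame by passing its LP residual through the new all-pole filter; that part is omitted here.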