
Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization


Bibliographic Details
Published in: arXiv.org 2024-09
Main Authors: Cai, Zexin, Henry Li Xinyuan, Garg, Ashi, García-Perera, Leibny Paola, Duh, Kevin, Khudanpur, Sanjeev, Andrews, Nicholas, Wiesner, Matthew
Format: Article
Language: English
Description
Summary: Advances in speech technology now allow unprecedented access to personally identifiable information through speech. To protect such information, the differential privacy field has explored ways to anonymize speech while preserving its utility, including linguistic and paralinguistic aspects. However, anonymizing speech while maintaining emotional state remains challenging. We explore this problem in the context of the VoicePrivacy 2024 challenge. Specifically, we developed various speaker anonymization pipelines and found that approaches excel either at anonymization or at preserving emotional state, but not at both simultaneously. Achieving both would require an in-domain emotion recognizer. Additionally, we found that it is feasible to train a semi-effective speaker verification system using only emotion representations, demonstrating the challenge of separating these two modalities.
ISSN: 2331-8422