Loading…

Many-to-Many Singing Performance Style Transfer on Pitch and Energy Contours

Singing voice conversion (SVC) aims to convert the singer identity of a singing voice to that of another singer. However, most existing SVC systems only perform the conversion of timbre information, while leaving other information unchanged. This approach does not consider other aspects of singer id...

Full description

Saved in:
Bibliographic Details
Published in:IEEE signal processing letters 2025, Vol.32, p.166-170
Main Authors: Hsu, Yu-Teng, Wang, Jun-You, Jang, Jyh-Shing Roger
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Singing voice conversion (SVC) aims to convert the singer identity of a singing voice to that of another singer. However, most existing SVC systems only perform the conversion of timbre information, while leaving other information unchanged. This approach does not consider other aspects of singer identity, particularly a singer's performance style, which is reflected in the pitch (F0) and the energy (volume dynamics) contours of singing. To address this issue, this paper proposes a many-to-many singing performance style transfer system that converts the pitch and energy contours of one singer's style to another singer's. To achieve this target, we utilize two AutoVC-like autoencoders with an information bottleneck to automatically disentangle performance style from other musical contents, one for the pitch contour while another for the energy contour. Experiment results suggested that the proposed model can perform singing performance style transfer in a many-to-many conversion scenario, resulting in improved singer identity similarity to the target singer.
ISSN:1070-9908
1558-2361
DOI:10.1109/LSP.2024.3506858