Singing Voice Conversion Challenge 2025

Introduction to the Voice Conversion Challenge

Voice conversion (VC) refers to the digital cloning of a person's voice; it can be used to modify an audio waveform so that it appears to be spoken by someone (the target) other than the original speaker (the source). The voice conversion challenge (VCC) series aims to advance and compare different approaches to the core VC technology using a common dataset, metrics, and baseline systems provided by the organizers. Rather than focusing on developing the best-performing system, the core motivation of the VCC series has always been to inform researchers about which methods are currently state of the art, through reproducible systems and experiments.

The latest VCC extended the application to singing voices with the singing voice conversion challenge (SVCC), where the top systems showed impressive naturalness. However, similarity scores were not as high as expected, because singing voices are much more complex to evaluate: the same singer can sing in different styles.

SVCC 2025

With this motivation in mind, we are pleased to announce SVCC 2025, which aims to further advance the state of the art in this research field. This year, we focus on singing style conversion (SSC). Compared to singing voice conversion (SVC), which only converts singer identity, SSC focuses on converting how the singer sings the song: it changes the singing style without changing the linguistic content or the identity of the source singer. SSC is more challenging than VC and SVC, as there are various ways to sing a song in different styles, yet the conversion still needs to follow music theory so that the converted singing voice remains pleasant to listen to. From the research community's point of view, SSC lies at the intersection of speech processing and music processing. SSC is a new and challenging research field, and we hope to attract attention from researchers in both communities to facilitate interdisciplinary research.

How to Participate

Registration is free. To participate, please complete the registration form:

Challenge Registration

We will send the training data and instructions only to registered participants.

Please make sure to read the challenge rules before participating.

Challenge Tasks

Task 1: In-Domain Singing Style Conversion

Convert source singer A's singing style from style 1 to style 2
Source singer A is in the training dataset
Reference singing voice in style 2 from singer A is provided in the training dataset

Task 2: Zero-Shot Singing Style Conversion

Convert source singer B's singing style from style 1 to style 2
Source singer B is NOT in the training dataset
Reference singing voice in style 2 from singer B will not be provided
Participants need to use a reference singing voice in style 2 from a different singer in the training dataset to complete the task

Task	Source	Reference	Conversion
Task 1	Singer A, in style 1	Singer A, in style 2	Singer A, in style 2
Task 2	Singer B, in style 1	Any singer except B, in style 2	Singer B, in style 2
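
The difference between the two tasks comes down to which recordings may serve as the style reference. The following is a minimal sketch of that rule, not an official tool: the `Utterance` record, the singer IDs, the file paths, and the `pick_references` helper are all hypothetical names introduced for illustration, and the actual dataset layout is described in the instructions sent to registered participants.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    singer: str  # hypothetical singer ID, e.g. "A"
    style: str   # one of the challenge styles, e.g. "Breathy"
    path: str    # hypothetical path to a waveform file


def pick_references(pool: List[Utterance], source_singer: str,
                    target_style: str, in_domain: bool) -> List[Utterance]:
    """Return candidate reference recordings in the target style.

    in_domain=True  (Task 1): references come from the source singer.
    in_domain=False (Task 2): references come from any other singer.
    """
    if in_domain:
        return [u for u in pool
                if u.singer == source_singer and u.style == target_style]
    return [u for u in pool
            if u.singer != source_singer and u.style == target_style]


# Toy training pool (made-up entries). Singer B is absent, as in Task 2.
training_pool = [
    Utterance("A", "Control", "A/control_001.wav"),
    Utterance("A", "Breathy", "A/breathy_001.wav"),
    Utterance("C", "Breathy", "C/breathy_001.wav"),
    Utterance("C", "Vibrato", "C/vibrato_001.wav"),
]

task1_refs = pick_references(training_pool, "A", "Breathy", in_domain=True)
task2_refs = pick_references(training_pool, "B", "Breathy", in_domain=False)
```

In this toy pool, Task 1 finds a Breathy reference from singer A, while Task 2 can only draw Breathy references from singers other than B. How to choose among several candidates is left open here, as it is in the challenge.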

Training data

Training data for the Task 1 singer A will be provided (~4.5 hours, in all 7 singing styles).
No training data of the Task 2 singer B will be provided.
Other singers in the training dataset (~70 hours, in all 7 singing styles) will be provided as additional data.
How to choose the target reference style is up to the participants.
Datasets include waveform files and annotated labels (aligned phoneme and MIDI, global and local style labels, transcriptions).
The SVCC 2025 dataset is a subset of the GTSinger dataset. Thus, participants will NOT be allowed to use the GTSinger dataset for training. Please refer to the challenge rules for more details.

Test set details

Participants will be provided with a test set in which each phrase contains 4 source singing styles.
Participants must then convert each phrase into the singing styles specified for that phrase.
Participants will only be provided with waveform files and NOT the annotated labels.

Provided singing styles

The challenge will focus on 7 singing styles:
Breathy, Falsetto, Mixed Voice, Pharyngeal, Glissando, Vibrato, and a Control style.

Subjective evaluation details

Naturalness: 5-point mean opinion score (MOS).
Singer identity similarity: 4-point AB test. Please refer to the SVCC 2023 paper for more details.
Singer style similarity: 4-point XAB test. Please refer to the Baseline 1 paper for more details.
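
For readers unfamiliar with the naturalness metric, a mean opinion score is the arithmetic mean of listener ratings on a 1-to-5 scale. The sketch below shows that computation per system. It assumes ratings arrive as `(system, score)` pairs; the system names and rating values are made up for illustration, and the organizers' actual evaluation protocol may differ.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_opinion_scores(ratings: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average the 1-5 listener ratings for each system."""
    by_system = defaultdict(list)
    for system, score in ratings:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range for {system}: {score}")
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


# Made-up ratings for two hypothetical systems.
example_ratings = [("sysA", 4), ("sysA", 5), ("sysB", 3), ("sysB", 2), ("sysB", 4)]
mos = mean_opinion_scores(example_ratings)
```

Reported MOS values are usually accompanied by confidence intervals over listeners, which this sketch omits.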