Voice Conversion Challenge 2016 Rules
- Please email email@example.com if you want to participate in the challenge.
- There is no fee for registration.
Voice Data Provided in the Challenge
- The organizers will provide a data set consisting of the voices of 5 source and 5 target speakers. Each speaker utters the same 162 sentences, so the data set contains 1,620 utterances in total.
- No transcription is included in the data set. Only waveform files corresponding to the above utterances are included.
- After registration, a password for downloading the data set will be issued.
- A text file (README) giving more detailed information will also be included in the data set. Please read it carefully.
Task of the Challenge
- The task of this challenge is speaker conversion.
- Training step
- Each participant needs to develop voice conversion systems for all pairs of source and target speakers, using the 162 parallel utterance pairs of each speaker pair as training data.
- In total, 25 conversion systems (i.e., 5 sources by 5 targets) will be developed.
- Conversion step
- Another voice data set from the same 5 source speakers will be provided later. It consists of 54 utterances for each source speaker, i.e., 270 utterances in total.
- Each participant needs to convert these source speakers' voice samples into each target speaker's voice with the 25 developed conversion systems.
- In total, 1,350 converted voice samples (54 utterances times 25 speaker pairs) will be generated.
- These converted voice samples will be submitted to the organizers, and then they will be evaluated in the listening tests on naturalness and speaker similarity.
- No manual editing or modification is allowed in the conversion step. Participants can manually optimize individual conversion systems in the training step, but not in the conversion step (e.g., even manual tuning of system parameters is NOT allowed in the conversion step).
- The use of manual transcriptions (such as phoneme information or linguistic information) on both training and evaluation data sets is NOT allowed (but automatic speech recognition systems may be used to generate automatic transcriptions).
- Any acoustic features including suprasegmental and duration features may be transformed.
- To develop a conversion system for a certain speaker-pair, the use of utterances of the other source and target speakers in the data set provided by the organizers is NOT allowed.
- Participants are free to use additional data (for training purposes) other than the data set provided by the organizers.
- Participants are also free to discard some utterances from the data set for training.
- A single participant may not submit multiple entries, because the listening test would become unmanageable. Participants involved in joint projects or consortia who wish to submit multiple systems should ask the organizers in advance to agree to this.
- Participants need to complete a form giving the general technical specification of their developed conversion system, to facilitate cross-system comparisons (e.g., Is it a GMM-based system? Does it convert prosodic features?).
- If you are in any doubt about how to apply these rules, please contact the organizers (firstname.lastname@example.org) immediately.
- More details
- General guidelines for the submission of the entries by the participants:
- Participant questionnaire (the form to give the general technical specification):
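The counts quoted in the training and conversion steps above can be checked with a short script. This is only an illustrative sketch: the speaker labels (`S1` to `S5`, `T1` to `T5`) and the helper names are assumptions made for this example, and the actual speaker names are those given in the data set README.

```python
from itertools import product

# Hypothetical speaker labels (assumption); the real names are in the README.
SOURCES = [f"S{i}" for i in range(1, 6)]
TARGETS = [f"T{i}" for i in range(1, 6)]

TRAIN_UTTS_PER_SPEAKER = 162  # parallel training sentences per speaker
EVAL_UTTS_PER_SOURCE = 54     # evaluation utterances per source speaker


def speaker_pairs(sources, targets):
    """Return every (source, target) pair; one conversion system per pair."""
    return list(product(sources, targets))


def allowed_training_speakers(pair):
    """Speakers of the provided data set that may be used for this pair.

    Per the rules, only the pair's own source and target utterances from the
    organizers' data set may be used to train that pair's system.
    """
    source, target = pair
    return {source, target}


pairs = speaker_pairs(SOURCES, TARGETS)
n_systems = len(pairs)
n_training_utts = (len(SOURCES) + len(TARGETS)) * TRAIN_UTTS_PER_SPEAKER
n_eval_inputs = len(SOURCES) * EVAL_UTTS_PER_SOURCE
n_converted = n_systems * EVAL_UTTS_PER_SOURCE

print(n_systems, n_training_utts, n_eval_inputs, n_converted)
# prints: 25 1620 270 1350
```

Each of the 25 pairs is converted for all 54 evaluation utterances of its source speaker, which gives the 1,350 submitted samples.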
Expert Listeners for Listening Tests
- Each participant needs to recruit several volunteer listeners as expert listeners for each of the evaluation tests (on naturalness and speaker similarity). Native speakers are preferable but not required.
⇒ We have decided to conduct the listening tests with only paid native listeners this time.
- The organizers would also appreciate assistance in advertising the Challenge as widely as possible (e.g., to your students or colleagues).
Retention of Submitted Voice Samples
- Any voice samples that you submit for evaluation will be retained by the Voice Conversion Challenge 2016 organizers for future use.
- When participants submit the converted voices, we will ask all of them to give the organizers permission to publicly distribute the submitted voices and the corresponding listening test results in an anonymized form. We would really appreciate it if all participants approved this consent agreement!
- We would like to ask each participant to submit a paper describing their entry. We are currently trying to arrange an opportunity for participants to present their papers. We will announce this later.
⇒ We are pleased to inform you that our special session proposal to INTERSPEECH has been accepted!
So, please prepare your paper and submit it to our special session, "7.19 Voice Conversion Challenge 2016", as listed on the INTERSPEECH web page.
The paper submission deadline is 23 March 2016.
Please note that the review process is the same as for all other submissions.
Contact information: email@example.com