Voice Conversion Challenge 2018
We are glad to invite you to participate in the 2nd Voice Conversion Challenge to compare different voice conversion systems and approaches using the same voice data.
The previous challenge can be accessed below:
Tasks of the 2nd Challenge
The objective is speaker conversion, which is a well-known basic problem in voice conversion. We have prepared two tasks:
Other voices of the same source speakers will be provided later as test data, consisting of around 50 sentences per speaker. Each participant will generate converted voices from them using the 16 conversion systems they have developed.
The resulting 16 converted voice sets will be evaluated in terms of perceived naturalness and similarity through listening tests.
- Hub task (main task): parallel training
- We will provide voices of 4 source and 4 target speakers (both female and male) from fixed corpora as training data. Each speaker utters the same sentence set, consisting of around 80 sentences.
- Using these parallel data sets, voice conversion systems for all speaker-pair combinations (16 speaker-pairs in total) will be developed by each participant.
- Spoke task (optional task): nonparallel training
- We will also provide voices of 4 other source speakers (both female and male) from fixed corpora as training data. Each speaker utters a different sentence set, also consisting of around 80 sentences. The target speakers are the same as in the hub task; therefore, the sentence set of the source speakers differs from that of the target speakers.
- Using these nonparallel data sets, voice conversion systems for all speaker-pair combinations (16 speaker-pairs in total) will be developed by each participant.
We focus on 22.05 kHz speech and signal-to-signal conversion strategies. No transcriptions will be provided for the test set, and manual annotation of the test set is NOT allowed (see the rules section for more detailed information). Participants are free to use additional data for training purposes.
<< Important changes compared to the 1st challenge >>
There are some important changes regarding rules and listening tests compared to the 2016 challenge:
- In the 2018 challenge you are allowed to mix and combine different source speakers' data to train speaker-independent models.
- In the 2018 challenge you may use orthographic transcriptions of the released training data to train your voice conversion systems. Note that we will not provide orthographic transcriptions of speech data in the evaluation set.
- In the 2018 challenge you may perform manual annotations of the released training data. However, we will not allow you to perform manual annotations of speech data in the evaluation set.
- In the 2018 challenge listening tests will use natural speech at a 22.05 kHz sampling frequency as the reference signal.
The tentative schedule is as follows:
- October 1st: release of training data
- December 1st: release of evaluation data
- December 8th: deadline to submit the converted audio
- January 26th: notification of results
How to Participate?
There is no fee for registration. If you want to participate in the challenge, please register your team at the following page by September 29th.
- Registration page (closed)
Participants must strictly follow the Challenge rules. Please read the following page carefully:
This work was supported in part by
- iFLYTEK (http://www.iflytek.com/en/)
- JSPS KAKENHI Grant Number JP17H06101
- MEXT KAKENHI Grant Numbers (15H01686, 16H06302, 17H04687)
Organizers:
- Junichi Yamagishi & Jaime Lorenzo-Trueba (National Institute of Informatics)
- Tomoki Toda (Nagoya University)
- Daisuke Saito (The University of Tokyo)
- Fernando Villavicencio (ObEN)
- Tomi Kinnunen (University of Eastern Finland)
- Zhenhua Ling (University of Science and Technology of China)
Contact information: vcc2018__at__vc-challenge.org