Voice Conversion Challenge 2020
Registration is now open! Please register your team on this page by February 28th, 2020!
We are glad to invite you to participate in the 3rd Voice Conversion Challenge to compare different voice conversion systems and approaches using the same voice data.
The previous challenges can be accessed below:
Tasks of the 3rd Challenge
The objective is speaker conversion, which is a well-known basic problem in voice conversion. We plan to prepare two tasks based on nonparallel training:
We focus on 24 kHz speech and signal-to-signal conversion strategies. No transcriptions will be provided for the test set, and manual annotation of the test set is NOT allowed. Participants are free to use additional data for training purposes. We basically follow the rules of the 2018 challenge:
- 1st task: voice conversion within the same language
- In training, the sentence set uttered by the source speaker is different from that uttered by the target speaker, but both are in the same language. Moreover, only a small number of sentences are shared between these two sentence sets.
- In conversion, the source speaker's voice is converted as if it is uttered by the target speaker while keeping linguistic contents unchanged.
- We will provide voices of 4 source and 4 target speakers (both female and male) from fixed corpora as training data. Each speaker utters a sentence set consisting of 70 sentences. Only 20 sentences are parallel between the source and target speakers; the other 50 sentences are nonparallel.
- Using these data sets, voice conversion systems for all speaker-pair combinations (16 speaker-pairs in total) will be developed by each participant.
- 2nd task: cross-lingual voice conversion
- In training, the sentence set uttered by the source speaker is totally different from that uttered by the target speaker, because the source speaker's language differs from the target speaker's.
- In conversion, the source speaker's voice in the source language is converted as if it is uttered by the target speaker while keeping linguistic contents unchanged.
- We will also provide voices of 6 additional target speakers (both female and male) from fixed corpora as training data. The source speakers are the same as in the 1st task. Each target speaker utters a different sentence set consisting of around 70 sentences in a different language.
- Using these nonparallel data sets, voice conversion systems for all speaker-pair combinations (24 speaker-pairs in total) will be developed by each participant.
- Additional voices of the same source speakers, consisting of around 25 sentences for each speaker, will be provided later as test data. Each participant will generate converted voices from them using the 16 conversion systems developed for the 1st task or the 24 conversion systems developed for the 2nd task.
- The resulting converted voice sets will be evaluated in terms of perceived naturalness and similarity through listening tests.
- In the 2020 challenge you are allowed to mix and combine different source speakers' data to train speaker-independent models.
- In the 2020 challenge you may use orthographic transcriptions of the released training data to train your voice conversion systems. Note that we will not provide orthographic transcriptions of speech data in the evaluation set.
- In the 2020 challenge you may perform manual annotations of the released training data. However, we will not allow you to perform manual annotations of speech data in the evaluation set.
- In the 2020 challenge listening tests will use natural speech at 24 kHz sampling frequency as the reference signal.
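The system counts above follow directly from the task setup: every source speaker is paired with every target speaker, giving 4 × 4 = 16 systems for the 1st task and 4 × 6 = 24 systems for the 2nd task. A minimal sketch of this enumeration, using hypothetical speaker IDs (the actual IDs will come with the released data):

```python
from itertools import product

# Hypothetical speaker IDs for illustration only.
sources = ["S1", "S2", "S3", "S4"]                      # 4 source speakers, shared by both tasks
task1_targets = ["T1", "T2", "T3", "T4"]                # 4 same-language targets (1st task)
task2_targets = ["T5", "T6", "T7", "T8", "T9", "T10"]   # 6 cross-lingual targets (2nd task)

# One conversion system per (source, target) pair.
task1_pairs = list(product(sources, task1_targets))
task2_pairs = list(product(sources, task2_targets))

print(len(task1_pairs))  # 16 systems for the 1st task
print(len(task2_pairs))  # 24 systems for the 2nd task
```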
The tentative schedule is as follows:
- March 9th, 2020: release of training data
- May 11th, 2020: release of evaluation data
- May 18th, 2020: deadline to submit the converted audio
- July 20th, 2020: notification of results
How to Participate?
There is no fee for registration. Please register your team at the following page by February 28th, 2020 if you want to participate in the challenge.
Organizers
- Tomoki Toda (Nagoya University)
- Junichi Yamagishi (National Institute of Informatics)
- Tomi Kinnunen (University of Eastern Finland)
- Zhenhua Ling (University of Science and Technology of China)
- Rohan Kumar Das (National University of Singapore)
Contact information: vcc2020__at__vc-challenge.org