Samples for "Speaker Adaptation of a Multilingual Acoustic Model for Cross-language Synthesis"
There are two target speakers: the first speaks Mandarin and English fluently, and the second speaks Korean and English fluently. In each table, each row corresponds to a different adaptation dataset used to adapt the base multilingual acoustic model, and each column corresponds to a different target language for synthesis.
A bilingual Mandarin/English target speaker:
Copy-synthesis audio in Mandarin (ZH):
Copy-synthesis audio in English (EN):
Example 1
Development data \ Target Language
Mandarin
English
Japanese
Mandarin data only (ZH)
English data only (EN)
Mixed data Mandarin+English (ZH + EN)
Mixed data Mandarin+extraEnglish (ZH + EN*)
N/A
Mixed data English+extraMandarin (EN + ZH*)
N/A
Example 2
Development data \ Target Language
Mandarin
English
Japanese
Mandarin data only (ZH)
English data only (EN)
Mixed data Mandarin+English (ZH + EN)
Mixed data Mandarin+extraEnglish (ZH + EN*)
N/A
Mixed data English+extraMandarin (EN + ZH*)
N/A
Example 3
Development data \ Target Language
Mandarin
English
Japanese
Mandarin data only (ZH)
English data only (EN)
Mixed data Mandarin+English (ZH + EN)
Mixed data Mandarin+extraEnglish (ZH + EN*)
N/A
Mixed data English+extraMandarin (EN + ZH*)
N/A
A bilingual Korean/English target speaker:
Copy-synthesis audio in Korean (KO):
Copy-synthesis audio in English (EN):