YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone

Author(s):
Casanova, Edresson ; Weber, Julian ; Shulby, Christopher ; Candido Junior, Arnaldo ; Goelge, Eren ; Ponti, Moacir Antonelli ; Chaudhuri, K ; Jegelka, S ; Song, L ; Szepesvari, C ; Niu, G ; Sabato, S
Total number of authors: 12
Document type: Scientific article
Source: INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162; v. N/A, p. 12-pg., 2022-01-01.
Abstract

YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. We achieved state-of-the-art (SOTA) results in zero-shot multi-speaker TTS and results comparable to SOTA in zero-shot voice conversion on the VCTK dataset. Additionally, our approach achieves promising results in a target language with a single-speaker dataset, opening possibilities for zero-shot multi-speaker TTS and zero-shot voice conversion systems in low-resource languages. Finally, it is possible to fine-tune the YourTTS model with less than 1 minute of speech and achieve state-of-the-art results in voice similarity and with reasonable quality. This is important to allow synthesis for speakers with a very different voice or recording characteristics from those seen during training. (AU)

FAPESP grant: 19/07316-0 - Singularity theory and its applications to differential geometry, differential equations and computer vision
Grantee: Farid Tari
Support type: Research Grant - Thematic