

Diversity in Data for Speech Processing in Brazilian Portuguese

Author(s):
Craveiro, Giovana Meloni; Galdino, Julio Cesar
Total number of authors: 2
Document type: Scientific article
Source: INTELLIGENT SYSTEMS, BRACIS 2024, PT IV; v. 15415, p. 15-pg., 2025-01-01.
Abstract

Striving to adhere to AI ethical guidelines is essential when developing and testing AI systems in order to ensure safe and trustworthy applications. However, these guidelines can be too general. The analysis presented here concerns the ethical principle of diversity, discussing its application to the field of speech processing and using the task of prosodic segmentation of spontaneous speech as a case study. In particular, it covers the relevance of including diversity of speaker profiles and regional variants in data used for training and developing AI applications, in the context of Brazilian Portuguese (BP). The contributions of this study are: (i) a discussion of the application of the diversity principle in the context of corpora for speech applications, considering some relevant aspects and the process we formulated to select a diverse sample of speakers to compose our corpus; (ii) a literature review of the current scenario of available corpora for the task of prosodic segmentation of spontaneous speech in BP, focused on the diversity of the data; (iii) a publicly available speech corpus (the corpus is available in our GitHub repository, https://github.com/nilc-nlp/MuPe-Diversidades/, under the CC BY-NC-ND 4.0 license) containing 2 h 32 min 15 s of spontaneous speech audio in BP, with revised transcriptions and automatic prosodic segmentation annotation, designed to cover diversity of age, gender, and accent (17 Brazilian states). (AU)

FAPESP grant: 19/07665-4 - Centro de Inteligência Artificial
Grantee: Fabio Gagliardi Cozman
Support type: Research Grants - eScience and Data Science Program - Engineering Research Centers