Diversity in Data for Speech Processing in Brazilian Portuguese

Craveiro, Giovana Meloni; Galdino, Julio Cesar

Author(s):	Craveiro, Giovana Meloni ; Galdino, Julio Cesar
Total Authors:	2
Document type:	Journal article
Source:	INTELLIGENT SYSTEMS, BRACIS 2024, PT IV; v. 15415, 15 pp., 2025-01-01.
Abstract
Striving to comply with AI ethical guidelines is essential when developing and testing AI systems in order to ensure safe and trustworthy applications. However, these guidelines can be too general. The analysis presented here addresses the ethical principle of diversity by discussing its application to the field of speech processing, using the task of prosodic segmentation of spontaneous speech as a case study. In particular, it covers the relevance of including diverse speaker profiles and regional variants in data used for training and developing AI applications, in the context of Brazilian Portuguese (BP). The contributions of this study are: (i) a discussion of the application of the diversity principle in the context of corpora for speech applications, considering some relevant aspects and the process we formulated to select a diverse sample of speakers to compose our corpus; (ii) a literature review of the corpora currently available for the task of prosodic segmentation of spontaneous speech in BP, focused on the diversity of the data; (iii) a publicly available speech corpus (available in our GitHub repository https://github.com/nilc-nlp/MuPe-Diversidades/ under the CC BY-NC-ND 4.0 license) containing 2 h 32 min 15 s of spontaneous speech audio in BP, together with revised transcriptions carrying automatic prosodic segmentation annotation, designed to cover diversity of age, gender, and accent (17 Brazilian states). (AU)

FAPESP's process:	19/07665-4 - Center for Artificial Intelligence
Grantee:	Fabio Gagliardi Cozman
Support Opportunities:	Research Grants - Research Program in eScience and Data Science - Research Centers in Engineering Program
