Processing of exclamatory and interrogative sentences for Automatic Speech Recognition in Brazilian Portuguese

Grant number: 24/00536-2
Support Opportunities: Scholarships in Brazil - Master
Start date: September 01, 2024
End date: August 31, 2026
Field of knowledge: Linguistics, Literature and Arts - Linguistics - Linguistic Theory and Analysis
Principal Investigator: Flaviane Romani Fernandes Svartman
Grantee: Rian Pereira Fernandes
Host Institution: Faculdade de Filosofia, Letras e Ciências Humanas (FFLCH). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Company: Universidade de São Paulo (USP). Centro de Inovação da USP (INOVA)
Associated research grant: 19/07665-4 - Center for Artificial Intelligence, AP.eScience.CPE

Abstract

This research is linked to the TaRSila project of the Center for Artificial Intelligence (C4AI), an IBM/FAPESP/USP collaboration (FAPESP process 2019/07665-4) (COZMAN, 2019 - current) at the University of São Paulo. It has two main objectives: (i) to identify and descriptively analyze the data containing errors generated by the Whisper model (RADFORD et al., 2022) in the punctuation of exclamatory and interrogative sentences in Brazilian Portuguese (hereinafter, BP); and (ii) to build an Automatic Speech Recognition (hereinafter, ASR) model that performs well when assigning these punctuation marks.

To achieve these objectives, the project seeks to: (a) understand the prosodic behavior of exclamatory and interrogative sentences in BP by studying the intonation of these sentence types, in light of the theoretical framework of Intonational Phonology (cf. 8), in a speech corpus for which automatic punctuation is generated; (b) verify the errors produced by the computational model and compare them with the study in (a), in order to understand what motivates such errors; and (c) study how the wav2vec 2.0 model (BAEVSKI et al., 2020) works, for the subsequent construction of an efficient model for punctuating exclamatory and interrogative sentences in BP.

The hypothesis of this research is that ASR punctuation errors are related, among other factors, to the computational model's difficulty in dealing with the prosody of interrogative and exclamatory sentence types. Our objective is to confirm or refute this hypothesis, based on the analysis of errors in automatic transcriptions, and, aiming to improve these transcriptions, to build a computational model that performs better on this task.
