
Leveraging the potential of automatic speaker recognition using artificial intelligence and data from twins: implications to forensic and speech sciences

Grant number: 24/06797-2
Support Opportunities: Scholarships abroad - Research Internship - Post-doctor
Start date: January 06, 2025
End date: January 05, 2026
Field of knowledge: Linguistics, Literature and Arts - Linguistics
Principal Investigator: Sandra Madureira
Grantee: Julio Cesar Cavalcanti de Oliveira
Supervisor: Gabriel Skantze
Host Institution: Faculdade de Filosofia, Comunicação, Letras e Artes. Pontifícia Universidade Católica de São Paulo (PUC-SP). São Paulo, SP, Brazil
Institution abroad: KTH Royal Institute of Technology, Sweden
Associated with the scholarship: 23/11070-1 - Multidimensional acoustic-phonetic analysis of identical twins and non-genetically related subjects: implications for the forensic speaker comparison in different dialects and speaking styles, BP.PD

Abstract

This research project aims to assess the performance of automatic speaker recognition (ASR) using Artificial Intelligence (AI) and data from speakers with varying degrees of similarity. The open-source SpeechBrain toolkit will be used for the analyses. The participant pool will consist of 80 speakers: 10 identical and 10 non-identical male twin pairs, and 10 identical and 10 non-identical female twin pairs. All participants are Brazilian Portuguese speakers from São Paulo and Campinas. Two speaking styles will be considered: conversational speech (spontaneous dialogue) and directed speech (interview). The system's performance will be evaluated when comparing identical twins, non-identical twins, and all speakers (via cross-pair comparisons). This analysis will allow us to understand the extent to which the ASR system's embeddings, based on cepstral features, depend on genetic versus environmental factors. The impact of the speakers' sex on system performance will be estimated by comparing male and female voices, and the effects of sample duration will also be explored. Two performance metrics widely applied in forensic research will be used: the equal error rate (EER) and the log-likelihood-ratio cost (Cllr). The findings will shed light on the potential applications and limitations of ASR technology in adverse scenarios, with implications for law enforcement and security systems. Ultimately, this research will contribute to strategies for enhancing ASR technologies, particularly for low-resource languages like Brazilian Portuguese.
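
The abstract names the two evaluation metrics but does not say how they will be computed. As an illustration only, the following minimal Python sketch shows one common way to obtain both from lists of same-speaker and different-speaker comparison scores. The function names `equal_error_rate` and `cllr` are hypothetical, the EER is approximated by a plain threshold sweep without interpolation, and the Cllr function assumes the scores are already natural-log likelihood ratios. None of this is taken from the project itself or from the SpeechBrain API.

```python
import math


def equal_error_rate(same_scores, diff_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A comparison is accepted as "same speaker" when its score >= threshold.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest (no interpolation).
    """
    best_gap, best_eer = float("inf"), None
    for t in sorted(set(same_scores) | set(diff_scores)):
        # False rejects: same-speaker comparisons scoring below the threshold.
        frr = sum(s < t for s in same_scores) / len(same_scores)
        # False accepts: different-speaker comparisons at or above it.
        far = sum(s >= t for s in diff_scores) / len(diff_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def cllr(same_llrs, diff_llrs):
    """Log-likelihood-ratio cost, assuming natural-log likelihood ratios.

    Same-speaker comparisons are penalised for low LRs and
    different-speaker comparisons for high LRs.
    """
    same_cost = sum(math.log2(1 + math.exp(-l)) for l in same_llrs) / len(same_llrs)
    diff_cost = sum(math.log2(1 + math.exp(l)) for l in diff_llrs) / len(diff_llrs)
    return 0.5 * (same_cost + diff_cost)
```

Under these assumptions, a system whose log likelihood ratios are all zero (that is, completely uninformative) yields a Cllr of 1, and lower values indicate better performance. In a design like the one described, the same functions could be applied separately to the identical-twin, non-identical-twin, and cross-pair comparison sets to contrast the three conditions.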
