Paraconsistent Feature Analysis of Speech Signals: fighting the voice spoofing attacks

Abstract

According to recent experiments, a playback speech signal or a text-to-speech synthesis based on data collected from a speaker, among other possibilities, suffices to be authenticated as that subject. Consequently, current systems for biometric speaker verification (BSV) remain vulnerable. Because this is an open problem in speech processing, one that requires much work before a credible implementation can be chosen, this research project intends to improve BSV by reducing the chances of successful voice spoofing. The investigation begins with a broad literature review, with particular attention to the speakers' prosodic mechanisms. For feature extraction, the intention is to compare the potential of learned features with that of classical handcrafted features, aided by paraconsistent feature engineering, which treats conflicting data as potentially informative. Then, in order to correctly authenticate hundreds of speakers enrolled in the system, the accuracy and performance of recent strategies, such as Convolutional Neural Networks (CNNs) and Deep Belief Networks (DBNs), and of classical approaches, such as pattern-matching algorithms, will be evaluated and compared in two modalities: text-dependent and text-independent. Lastly, the intention is to publish the results in renowned scientific journals.

Scientific publications (15)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
CAOBIANCO, LUIZ GUSTAVO; GUIDO, RODRIGO CAPOBIANCO; DA SILVA, IVAN NUNES. Wavelet-based features selected with Paraconsistent Feature Engineering successfully classify events in low-voltage grids. MEASUREMENT, v. 170, . (19/04475-0)
SCALVENZI, RAFAEL RUBIATI; GUIDO, RODRIGO CAPOBIANCO; MARRANGHELLO, NORIAN. Wavelet-packets Associated with Support Vector Machine Are Effective for Monophone Sorting in Music Signals. INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, v. 13, n. 3, p. 415-425, . (19/04475-0)
ALMEIDA, GILDASIO CASTELLO, JR.; GUIDO, RODRIGO CAPOBIANCO; BALARIN SILVA, HENRIQUE MONTEIRO; BRANDAO, CINARA CASSIA; DE MATTOS, LUIZ CARLOS; LOPES, BERNARDO T.; MACHADO, AYDANO PAMPONET; AMBROSIO, RENATO, JR.. New artificial intelligence index based on Scheimpflug corneal tomography to distinguish subclinical keratoconus from healthy corneas. JOURNAL OF CATARACT AND REFRACTIVE SURGERY, v. 48, n. 10, p. 7-pg., . (15/17226-7, 19/04475-0)
GUIDO, RODRIGO CAPOBIANCO; PEDROSO, FERNANDO; FURLAN, ANDRE; CONTRERAS, RODRIGO COLNAGO; CAOBIANCO, LUIZ GUSTAVO; NETO, JOGI SUDA. CWT x DWT x DTWT x SDTWT: Clarifying terminologies and roles of different types of wavelet transforms. INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, v. 18, n. 6, . (19/04475-0, 15/14358-0)
LEMOS ESCOLA, JOAO PAULO; DE SOUZA, UENDER BARBOSA; GUIDO, RODRIGO CAPOBIANCO; DA SILVA, IVAN NUNES; FREITAS, JOVANDER DA SILVA; OLIVEIRA, LUCAS DE ARAUJO. mesh network case study for digital audio signal processing in Smart Far. INTERNET OF THINGS, v. 17, . (19/04475-0)
PATIL, ANKUR T.; ACHARYA, RAJUL; PATIL, HEMANT A.; GUIDO, RODRIGO CAPOBIANCO. Improving the potential of Enhanced Teager Energy Cepstral Coefficients (ETECC) for replay attack detection. COMPUTER SPEECH AND LANGUAGE, v. 72, . (19/04475-0)
GUIDO, RODRIGO CAPOBIANCO; PEDROSO, FERNANDO; CONTRERAS, RODRIGO COLNAGO; RODRIGUES, LUCIENE CAVALCANTI; GUARIGLIA, EMANUEL; NETO, JOGI SUDA. Introducing the Discrete Path Transform (DPT) and its applications in signal analysis, artefact removal, and spoken word recognition. DIGITAL SIGNAL PROCESSING, v. 117, . (19/04475-0)
LEMOS ESCOLA, JOAO PAULO; DE SOUZA, UENDER BARBOSA; GUIDO, RODRIGO CAPOBIANCO; DA SILVA, IVAN NUNES. The Haar Wavelet Transform in IoT Digital Audio Signal Processing. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, v. 41, n. 7, p. 11-pg., . (19/04475-0)
FONSECA, EVERTHON SILVA; GUIDO, RODRIGO CAPOBIANCO; BARBON JUNIOR, SYLVIO; DEZANI, HENRIQUE; GATI, RODRIGO ROSSETO; MOSCONI PEREIRA, DENIS CESAR. Acoustic investigation of speech pathologies based on the discriminative paraconsistent machine (DPM). Biomedical Signal Processing and Control, v. 55, . (19/04475-0)
DE SOUZA, UENDER BARBOSA; LEMOS ESCOLA, JOAO PAULO; BOTTURA MACCAGNAN, DOUGLAS HENRIQUE; BRITO, LEONARDO DA CUNHA; GUIDO, RODRIGO CAPOBIANCO. Empirical mode decomposition applied to acoustic detection of a cicadid pest. COMPUTERS AND ELECTRONICS IN AGRICULTURE, v. 199, p. 14-pg., . (19/04475-0)
GUPTA, SIDDHANT; PATIL, ANKUR T.; PUROHIT, MIRALI; PARMAR, MIHIR; PATEL, MAITREYA; PATIL, HEMANT A.; GUIDO, RODRIGO CAPOBIANCO. Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments. NEURAL NETWORKS, v. 139, p. 105-117, . (19/04475-0)
DE ALMEIDA JR, GILDASIO CASTELLO; GUIDO, RODRIGO CAPOBIANCO; NETO, JOGI SUDA; ROSA, JOAO MARCOS; CASTIGLIONI, LILIAN; DE MATTOS, LUIZ CARLOS; BRANDAO, CINARA CASSIA. Corneal Tomography Multivariate Index (CTMVI) effectively distinguishes healthy corneas from those susceptible to ectasia. Biomedical Signal Processing and Control, v. 70, . (15/17226-7, 19/04475-0)
LEMOS ESCOLA, JOAO PAULO; GUIDO, RODRIGO CAPOBIANCO; DA SILVA, IVAN NUNES; CARDOSO, ALEXANDRE MORAES; BOTTURA MACCAGNAN, DOUGLAS HENRIQUE; DEZOTTI, ARTUR KENZO. Automated acoustic detection of a cicadid pest in coffee plantations. COMPUTERS AND ELECTRONICS IN AGRICULTURE, v. 169, . (19/04475-0)