
Audio-Visual Speech Processing by Machine Learning


This research plan addresses a common basis for several areas of signal processing: speech analysis, speech and audio coding, speech recognition and audio feature recognition, and source separation, with regularization used to adapt the methods to the desired application. Traditionally, speech analysis, besides being important in its own right, provides the signal representations and model parameters that the other areas require. Deep learning is reducing its appeal in this role, and parallels between the two approaches will be established to provide some interpretation. Beyond the usual types of time-frequency decomposition and modification and autoregressive analysis, new algorithms based on machine learning and deep learning will be explored and proposed for the enhancement, separation and synthesis of speech and audio signals, partially or totally replacing traditional analysis. Research will focus on generative machines that can also handle video signals and time series. Additionally, the parameters and representations of the speech signal will be used to model and develop non-intrusive speech quality metrics; for this purpose, the speech signal is degraded using different communication system parameters. (AU)


Scientific publications (10)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
RIBEIRO, DAVID AUGUSTO; SILVA, JUAN CASAVILCA; LOPES ROSA, RENATA; SAADI, MUHAMMAD; MUMTAZ, SHAHID; WUTTISITTIKULKIJ, LUNCHAKORN; ZEGARRA RODRIGUEZ, DEMOSTENES; AL OTAIBI, SATTAM. Light Field Image Quality Enhancement by a Lightweight Deformable Deep Learning Framework for Intelligent Transportation Systems. ELECTRONICS, v. 10, n. 10, MAY 2021. Web of Science Citations: 0.
TERRA VIEIRA, SAMUEL; LOPES ROSA, RENATA; ZEGARRA RODRIGUEZ, DEMOSTENES; ARJONA RAMIREZ, MIGUEL; SAADI, MUHAMMAD; WUTTISITTIKULKIJ, LUNCHAKORN. Q-Meter: Quality Monitoring System for Telecommunication Services Based on Sentiment Analysis Using Deep Learning. SENSORS, v. 21, n. 5, MAR 2021. Web of Science Citations: 0.
MILITANI, DAVI RIBEIRO; DE MORAES, HERMES PIMENTA; ROSA, RENATA LOPES; WUTTISITTIKULKIJ, LUNCHAKORN; RAMIREZ, MIGUEL ARJONA; RODRIGUEZ, DEMOSTENES ZEGARRA. Enhanced Routing Algorithm Based on Reinforcement Machine Learning-A Case of VoIP Service. SENSORS, v. 21, n. 2, JAN 2021. Web of Science Citations: 0.
RODRIGUEZ, DEMOSTENES Z.; CARRILLO, DICK; RAMIREZ, MIGUEL A.; NARDELLI, PEDRO H. J.; MOELLER, SEBASTIAN. Incorporating Wireless Communication Parameters Into the E-Model Algorithm. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, v. 29, p. 956-968, 2021. Web of Science Citations: 0.
BARBOSA, RODRIGO CARVALHO; AYUB, MUHAMMAD SHOAIB; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA; WUTTISITTIKULKIJ, LUNCHAKORN. Lightweight PVIDNet: A Priority Vehicles Detection Network Model Based on Deep Learning for Intelligent Traffic Lights. SENSORS, v. 20, n. 21, NOV 2020. Web of Science Citations: 0.
VIEIRA, SAMUEL TERRA; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations. JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, v. 16, n. 2, p. 180-187, JUN 2020. Web of Science Citations: 0.
DA SILVA, MARIELLE JORDANE; MELGAREJO, DICK CARRILLO; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. Speech Quality Classifier Model based on DBN that Considers Atmospheric Phenomena. JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, v. 16, n. 1, p. 75-84, MAR 2020. Web of Science Citations: 0.
ROSA, RENATA LOPES; DE SILVA, MARIELLE JORDANE; SILVA, DOUGLAS HENRIQUE; AYUB, MUHAMMAD SHOAIB; CARRILLO, DICK; NARDELLI, PEDRO H. J.; RODRIGUEZ, DEMOSTENES ZEGARRA. Event Detection System Based on User Behavior Changes in Online Social Networks: Case of the COVID-19 Pandemic. IEEE ACCESS, v. 8, p. 158806-158825, 2020. Web of Science Citations: 0.
HAJAROLASVADI, NOUSHIN; RAMIREZ, MIGUEL ARJONA; BECCARO, WESLEY; DEMIREL, HASAN. Generative Adversarial Networks in Human Emotion Synthesis: A Review. IEEE ACCESS, v. 8, p. 218499-218529, 2020. Web of Science Citations: 0.
NUNES, RODRIGO DANTAS; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. Performance improvement of a non-intrusive voice quality metric in lossy networks. IET COMMUNICATIONS, v. 13, n. 20, p. 3401-3408, DEC 19 2019. Web of Science Citations: 0.
