
Development of recurrent Convolutional Neural Network architectures for facial expression recognition

Abstract

The number of smart devices and personal electronic equipment that we own and interact with increases every day. Given that facial expressions and body language are the most emotion-laden signals humans emit, it is natural to incorporate expression recognition capabilities into smart devices to improve their functionality and to enable efficient interaction with people. In this context, facial expression recognition is an attractive means of addressing inclusion and universal access issues, such as ease of use of electronic equipment, removing barriers that prevent people from participating in and benefiting from the opportunities offered by modern societies. A first step in that direction is the recognition and classification of human facial expressions as basic constituents from which more complex emotional states can be inferred. This requires robust automatic face analysis algorithms in our smart devices, which in turn depend on a robust description of each expression or, more generally, a robust face or image descriptor. That description must be general enough to accommodate the different ways in which each expression can be performed, while preserving discrimination among the different types of expressions. A robust descriptor must also overcome other challenges: for daily-use devices such as smartphones, the conditions under which pictures are captured vary immensely, e.g., non-constant illumination, noise, rotation, and background changes. Thus, the challenge in creating an image descriptor is to cope with changing environmental conditions as well as with inter- and intra-class variations.
This project therefore aims to develop and analyze several deep neural network architectures to create facial descriptors that are discriminative between classes yet accommodate the intra-class variations due to imaging conditions, appearance changes, noise, and other factors, while simultaneously exploiting temporal information from videos to further reduce error and enhance recognition. Our general objective is to develop new network architectures, based on Convolutional Neural Networks and Recurrent Neural Networks, to describe and recognize temporal facial expressions robustly. This goal includes the design and analysis of architectures that employ different techniques for learning, description, and extraction of information, the use of internal layers that provide memory to the network, and their implementation and evaluation on standard benchmark databases of human expressions accepted by the community. (AU)
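The pipeline described above — a per-frame convolutional descriptor followed by a recurrent layer whose internal state carries memory across video frames — can be sketched minimally as follows. This is an illustrative stand-in, not the project's actual architecture: the CNN is reduced to a linear projection with ReLU, the recurrent layer is a plain Elman RNN, and all dimensions and weights are arbitrary toy values.

```python
# Hypothetical minimal sketch of the CNN -> RNN pipeline for temporal
# facial expression recognition: extract a feature vector per frame,
# accumulate temporal context in a recurrent hidden state, then apply
# a softmax over expression classes. All names and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def extract_frame_features(frame, W_f):
    """Stand-in for a CNN descriptor: linear projection + ReLU per frame."""
    return np.maximum(0.0, frame @ W_f)

def rnn_step(h, x, W_h, W_x):
    """Elman-style recurrence: the hidden state h is the network's memory."""
    return np.tanh(h @ W_h + x @ W_x)

def classify_sequence(frames, params):
    """Run the recurrence over all frames, classify from the final state."""
    W_f, W_h, W_x, W_o = params
    h = np.zeros(W_h.shape[0])
    for frame in frames:                  # iterate over time
        x = extract_frame_features(frame, W_f)
        h = rnn_step(h, x, W_h, W_x)
    logits = h @ W_o
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

# Toy dimensions: 10 frames of 64-dim face crops, 7 basic expressions.
D, F, H, C = 64, 16, 32, 7
params = (rng.standard_normal((D, F)) * 0.1,
          rng.standard_normal((H, H)) * 0.1,
          rng.standard_normal((F, H)) * 0.1,
          rng.standard_normal((H, C)) * 0.1)
video = rng.standard_normal((10, D))      # one synthetic video clip
probs = classify_sequence(video, params)  # probability per expression class
```

In a real system the linear projection would be replaced by a trained convolutional stack and the Elman step by a gated unit (e.g., an LSTM), but the data flow — frame descriptor, recurrent memory, classification — is the same.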

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
QUISPE, RODOLFO; TTITO, DARWIN; RIVERA, ADIN; PEDRINI, HELIO. Multi-Stream Networks and Ground Truth Generation for Crowd Counting. INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, v. 11, n. 1, p. 33-41, 2020. Web of Science Citations: 0.
