
Semantic Segmentation Using Hourglass Learning Model

Grant number: 19/18678-0
Support type: Scholarships abroad - Research Internship - Doctorate
Effective date (Start): November 01, 2019
Effective date (End): October 31, 2020
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal researcher: Gerberth Adín Ramírez Rivera
Grantee: Darwin Danilo Saire Pilco
Supervisor abroad: Tabbone Salvatore Antoine
Home Institution: Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil
Research place: Université de Lorraine (UL), France
Associated to the scholarship: 17/16597-7 - Semantic Segmentation on Videos, BP.DR

Abstract

The semantic segmentation task aims to produce a dense classification by labeling every pixel of an image or video with the class of the object it belongs to. Convolutional neural network (CNN) approaches have proved useful, exhibiting the best results on this task. Some challenges remain, however, such as the low resolution of feature maps and the loss of spatial precision, both produced in CNNs by limited local neighborhoods, i.e., filters with small size and regular shape. How to solve these problems and obtain consistent results remains an open question, making semantic segmentation a difficult problem even for deep learning models. On the other hand, Graph Neural Network (GNN) approaches have shown the ability to capture both local and global properties of unstructured data, while taking irregular connections into account. In this research project, we propose to address these problems by creating a deep learning architecture that combines the local feature extraction of CNNs with the global feature extraction of GNNs and their irregular connections between pixels for semantic segmentation of images. Our proposed architecture is end-to-end trainable and extracts local and global information to discriminate between the object classes present in an image. The final output of the architecture is thus a densely labeled image.
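The CNN-plus-GNN combination described in the abstract can be sketched in a toy forward pass. This is a hedged illustration only, not the project's actual model: all names (`grid_adjacency`, `gcn_layer`) and sizes are hypothetical. Per-pixel features (standing in for a CNN backbone's output) are treated as nodes of a 4-connected grid graph, one standard graph-convolution layer (symmetrically normalized aggregation, as in Kipf & Welling's GCN) propagates information along the pixel connections, and a linear classifier then assigns a class to every pixel, yielding the dense labeling.

```python
import numpy as np

def grid_adjacency(h, w):
    """Build a 4-neighborhood adjacency matrix for an h x w pixel grid.

    Each pixel is a graph node; edges connect horizontal/vertical neighbors.
    """
    n = h * w
    A = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if r + 1 < h:                      # edge to the pixel below
                j = (r + 1) * w + c
                A[i, j] = A[j, i] = 1.0
            if c + 1 < w:                      # edge to the pixel on the right
                j = r * w + (c + 1)
                A[i, j] = A[j, i] = 1.0
    return A

def gcn_layer(H, A, W):
    """One graph-convolution step with symmetric normalization and ReLU.

    H: (n, d_in) node features, A: (n, n) adjacency, W: (d_in, d_out) weights.
    """
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)

# Toy forward pass on an 8x8 "image": random stand-ins for CNN features.
rng = np.random.default_rng(0)
h, w, d, num_classes = 8, 8, 3, 5
pixels = rng.normal(size=(h * w, d))           # local (CNN-like) features per pixel
A = grid_adjacency(h, w)
W1 = rng.normal(size=(d, 16))
W2 = rng.normal(size=(16, num_classes))
H1 = gcn_layer(pixels, A, W1)                  # mix information between neighbors
logits = H1 @ W2                               # per-pixel class scores
labels = logits.argmax(axis=1)                 # dense labeling: one class per pixel
print(labels.shape)                            # (64,)
```

In the actual architecture, the random `pixels` features would come from a trained CNN, the graph connections need not follow the regular grid (irregular, content-dependent edges are precisely what GNNs add over CNN filters), and the whole pipeline would be trained end-to-end with a per-pixel loss.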