Semantic segmentation aims to produce a dense classification by labeling each pixel of an image or video with the class of the object it belongs to. Convolutional neural network (CNN) approaches have proved effective, achieving the best results on this task. Challenges remain, however, such as the low resolution of feature maps and the loss of spatial precision, both caused by the CNNs' limited local neighborhoods, i.e., filters of small size and regular shape. How to overcome these limitations and obtain consistent results remains an open problem, making semantic segmentation difficult even for deep learning models. Graph neural networks (GNNs), on the other hand, have shown the ability to capture both local and global properties of unstructured data while taking irregular connections into account. In this research project, we propose a deep learning architecture that combines the local feature extraction of CNNs with the global feature extraction of GNNs, exploiting irregular connections between pixels, to address semantic segmentation of images. The proposed architecture is end-to-end trainable and extracts local and global information to discriminate among the object classes present in an image, producing densely labeled images as its final output.
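The general idea of combining local (CNN-style) and global (GNN-style) per-pixel features can be illustrated with a minimal sketch. All names, sizes, and operations below are illustrative assumptions for a toy single-channel image, not the project's actual architecture: a 3x3 convolution extracts local features, one message-passing step over a 4-connected pixel graph adds neighborhood context, and a linear classifier maps the concatenated features to a dense label map.

```python
import numpy as np

def conv3x3(img, kernel):
    """Local feature extraction: 3x3 convolution with zero padding."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def grid_adjacency(h, w):
    """4-connected pixel graph with self-loops, row-normalized so that
    multiplying by it performs mean aggregation over neighbors."""
    n = h * w
    adj = np.eye(n)
    for i in range(h):
        for j in range(w):
            u = i * w + j
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    adj[u, ni * w + nj] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
img = rng.random((8, 8))                                 # toy 8x8 image
local = conv3x3(img, rng.standard_normal((3, 3))).reshape(-1, 1)
global_feats = grid_adjacency(8, 8) @ local              # one GNN-style step
feats = np.concatenate([local, global_feats], axis=1)    # local + global
num_classes = 3
logits = feats @ rng.standard_normal((2, num_classes))   # per-pixel classifier
labels = logits.argmax(axis=1).reshape(8, 8)             # dense label map
```

In a real model the convolution, graph aggregation, and classifier weights would all be learned jointly, and the pixel graph could use learned, irregular connections rather than a fixed 4-connected grid.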
News about the scholarship published in the Agência FAPESP newsletter: