
Multimodal Models for Images and 3D Representations in a Unified Vision and Language Approach

Grant number: 24/09462-1
Support Opportunities: Scholarships in Brazil - Doctorate
Start date: October 01, 2024
End date: March 31, 2028
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Moacir Antonelli Ponti
Grantee: Márcus Vinícius Lobo Costa
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated research grant: 19/07316-0 - Singularity theory and its applications to differential geometry, differential equations and computer vision, AP.TEM

Abstract

Visual recognition tasks such as image classification, object detection, and semantic segmentation remain a significant challenge in computer vision, particularly when learning representations of 3D data. Advances in deep learning for vision, inspired by breakthroughs in natural language processing, have led to a new multimodal paradigm: Vision-Language Models (VLMs). These models integrate visual and textual representations and offer a promising direction for future research. However, current VLMs are not well suited to the complexities of 3D data representation. Understanding the semantics and feature representation of each point in a 3D projection space is crucial for advancing this domain. This project aims to unify natively learned representations so that these shared multimodal representations can be accessed and used effectively. We also plan to employ text retrieval and generation techniques to elucidate the semantic relationships between textual descriptions and visual-spatial content. Despite progress in the literature, no existing technique fully resolves this problem. Our research will focus on bridging this gap, providing a foundation for more robust and integrated 3D data representation solutions.
