Multimodal Representation Space for Text-Guided Data Generation
Noisy scene graph with self-supervised learning on graph neural network for visual...
Multimodal Models for Images and 3D Representations in a Unified Vision and Langua...
Brazil in focus: a multimodal large language model aware of the brazilian context ...
Multilingual and multimodal learning for Brazilian Portuguese