Noisy scene graph with self-supervised learning on graph neural network for visual...
Visual question answering task with graph convolution networks
Evaluating different representations for extracting standard meta-features from un...
Multilingual Vision Language Model with In-Context Learning Ability