
Learning context rich representations for computer vision

Grant number: 22/15304-4
Support Opportunities: Research Projects - Thematic Grants
Start date: November 01, 2023
End date: October 31, 2028
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Nina Sumiko Tomita Hirata
Grantee: Nina Sumiko Tomita Hirata
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Principal researchers:
Luciano da Fontoura Costa ; Roberto Hirata Junior ; Roberto Marcondes Cesar Junior
Associated researchers: Claudia Lucia Mendes de Oliveira ; Fabio Miranda ; Henrique Morimitsu ; Hugo Neves de Oliveira ; Isabelle Bloch ; Jose Claudio Teixeira e Silva Junior ; Márcia de Almeida Rizzutto ; Rafael Jeferson Pezzuto Damaceno ; Xiaoyi Jiang ; Zhangyang Wang
Associated scholarship(s): 25/00043-9 - Analysis of Meaning Attribution During the Embedding Process in Computer Vision and Natural Language Models Applied to the Urban Environment, BP.IC
24/23406-7 - Integration of manually extracted image features into convolutional neural networks, BP.IC
24/17415-3 - Multisensor analysis for the assessment of infrastructure and urban environment, BP.PD
24/19866-2 - Detection of astronomical objects in multi-band images, BP.IC
24/17777-2 - Machine Learning Engineering and Techniques - 1, BP.TT
24/16942-0 - Retinal image segmentation, BP.IC
24/10882-5 - Sidewalk segmentation and analysis, BP.PD
24/04381-3 - Machine Learning Engineering in situations with optimization and hardware constraints - 1, BP.TT
23/17610-8 - Machine Learning Techniques Applied to Waveform Inversion for Porosity Estimation in Hydrocarbon Reservoirs, BP.DR
23/11498-1 - Deep learning applied to facial recognition, BP.IC

Abstract

Computer vision methods are used to extract information from images and videos, but the contextual elements they contain are not always sufficient for extracting correct and accurate information. In such cases, content from other sources and data types, such as audio and text, or information external to the data, such as a priori knowledge, can be used to complement and enrich the context of the information of interest. Additionally, the application context can impose various restrictions, such as hardware limitations and the need to guarantee privacy. Modern computer vision methods therefore need to automatically integrate both the contextual elements of the information of interest and those related to the application at hand. The objective of this project is to develop computer vision models and methods capable of generating context-rich representations. The project will be organized around three main integrated research lines: (i) optimal use of unsupervised data; (ii) alignment of multi-modal domains; (iii) properties of representations. Of special interest are computer vision applications involving edge devices (edge computing) and mobile devices (such as smartphones and mini-computers). To develop, test, and validate the methods, we intend to build an experimental setup consisting of multiple cameras and sensors that will allow the construction of supervised datasets to be explored by the group. (AU)


Scientific publications (9)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
BENATTI, ALEXANDRE; COSTA, LUCIANO DA F. On the transient and equilibrium features of growing fractal complex networks. CHAOS SOLITONS & FRACTALS, v. 183, p. 7-pg. (15/22308-2, 22/15304-4)
OLIVEIRA, HUGO; GAMA, PEDRO H. T.; BLOCH, ISABELLE; CESAR JR, ROBERTO MARCONDES. Meta-learners for few-shot weakly-supervised medical image segmentation. PATTERN RECOGNITION, v. 153, p. 13-pg. (20/06744-5, 15/22308-2, 22/15304-4, 17/50236-1)
PERIN, GABRIEL J.; CHEN, XUXI; LIU, SHUSEN; KAILKHURA, BHAVYA; WANG, ZHANGYANG; GALLAGHER, BRIAN. RankMean: Module-Level Importance Score for Merging Fine-tuned Large Language Models. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, p. 7-pg. (23/15047-4, 22/11645-1, 22/15304-4)
BENATTI, ALEXANDRE; DE ARRUDA, HENRIQUE FERRAZ; COSTA, LUCIANO DA FONTOURA. Interrelating neuronal morphology by coincidence similarity networks. Journal of Theoretical Biology, v. 606, p. 11-pg. (15/22308-2, 18/10489-0, 22/15304-4)
NAKAZONO, L.; VALENCA, R. R.; SOARES, G.; IZBICKI, R.; IVEZIC, Z.; R LIMA, E., V; HIRATA, N. S. T.; SODRE JR, L.; OVERZIER, R.; ALMEIDA-FERNANDES, F.; et al. The Quasar Catalogue for S-PLUS DR4 (QuCatS) and the estimation of photometric redshifts. Monthly Notices of the Royal Astronomical Society, v. 531, n. 1, p. 13-pg. (19/26492-3, 23/07068-1, 21/12744-0, 21/08983-0, 18/20977-2, 19/11321-9, 11/51680-6, 21/09468-1, 23/05003-0, 19/01312-2, 15/22308-2, 22/15304-4)
CASAGRANDE, LUAN; HIRATA, R., JR. Riparian zones classification using satellite/UAV synergy and deep learning. IGARSS 2024 - 2024 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, p. 5-pg. (22/15304-4)
MORIMITSU, HENRIQUE; ZHU, XIAOBIN; CESAR-JR, ROBERTO M.; JI, XIANGYANG; YIN, XU-CHENG. RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation. 2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, p. 7-pg. (15/22308-2, 22/15304-4)
PERIN, GABRIEL J.; HIRATA, NINA S. T. Few-shot Retinal Disease Classification on the Brazilian Multilabel Ophtalmological Dataset. 2024 37TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES, SIBGRAPI 2024, p. 6-pg. (23/15047-4, 22/11645-1, 22/15304-4)
PEZZUTO DAMACENO, RAFAEL J.; CESAR, ROBERTO M., JR. An End-to-End Deep Learning Approach for Video Captioning Through Mobile Devices. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I, v. 14469, p. 15-pg. (15/22308-2, 22/15304-4, 22/12204-9)