Scholarship 25/10215-1 - Machine learning

Grant number:	25/10215-1
Support Opportunities:	Scholarships in Brazil - Scientific Initiation
Start date:	July 01, 2025
Status:	Discontinued
Field of knowledge:	Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques

Principal Investigator:	Ana Carolina Lorena
Grantee:	Douglas Bergamim Fernandes

Host Institution:	Divisão de Ciência da Computação (IEC). Instituto Tecnológico de Aeronáutica (ITA). São José dos Campos, SP, Brazil

Associated research grant:	21/06870-3 - Beyond algorithm selection: meta-learning for data and algorithm analysis and understanding, AP.JP2

Associated scholarship(s):	25/19111-4 - Evaluating different data representations for extracting standard meta-features from unstructured datasets, BE.EP.IC

Abstract:	The growing use of Machine Learning (ML) techniques in areas such as computer vision and natural language processing has intensified the demand for methods capable of handling unstructured data, such as images and text. These types of data often exhibit high dimensionality and carry large amounts of information, making the task of identifying the most suitable ML algorithms for each scenario both complex and costly. In this context, Meta-learning (MtL) emerges as a promising approach to support the selection process by investigating which intrinsic characteristics of datasets are related to algorithm performance. However, most of the meta-features available in the literature were developed for structured, tabular data, which limits their applicability in more modern settings. To overcome this limitation, previous studies have shown that data such as images and text can be represented through embeddings (numerical vectors obtained from pre-trained deep neural networks), making them compatible with meta-feature extraction tools. Each neural network architecture generates a distinct representation, capturing different aspects of the original data. This project proposes to investigate how useful different embedded representations are for extracting standard meta-features from unstructured datasets. The PyMFE (Python Meta-Feature Extractor) library already provides a Python implementation for extracting meta-features from datasets, but its application is restricted to attribute-value formatted data. Public datasets such as CIFAR-10 and CIFAR-100 will be used, and the experiments will aim to assess the impact of the embedding choice on the quality of the extracted meta-features. The goal is to contribute to expanding the applicability of Meta-learning in response to the current demands of Machine Learning. (AU)
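
The pipeline described in the abstract (embed the unstructured data, then extract meta-features from the resulting attribute-value table) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the project's method: `toy_embed` is a hypothetical random linear projection standing in for a pre-trained network, `simple_meta_features` is a hypothetical hand-written stand-in for a meta-feature extractor such as PyMFE, the feature names are chosen loosely after common meta-feature naming, and the data are synthetic rather than CIFAR images.

```python
import math
import random
from collections import Counter
from statistics import mean, pstdev


def toy_embed(images, dim, seed=0):
    """Hypothetical stand-in for a pre-trained network: a fixed random projection."""
    rng = random.Random(seed)
    n_pixels = len(images[0])
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)] for _ in range(dim)]
    # Each image becomes a vector of `dim` weighted pixel sums.
    return [[sum(w * p for w, p in zip(row, img)) for row in weights] for img in images]


def simple_meta_features(X, y):
    """Compute a few illustrative meta-features of an attribute-value table."""
    n = len(X)
    counts = Counter(y)
    # Shannon entropy (in bits) of the class distribution.
    class_entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    columns = list(zip(*X))
    return {
        "nr_inst": n,                 # number of instances
        "nr_attr": len(columns),      # number of attributes (embedding size)
        "nr_class": len(counts),      # number of classes
        "class_ent": class_entropy,   # class entropy
        "sd_mean": mean(pstdev(col) for col in columns),  # mean attribute std. dev.
    }


# Synthetic data (an assumption): 40 flattened 8x8 "images" in 4 balanced classes.
rng = random.Random(42)
labels = [i % 4 for i in range(40)]
images = [[rng.random() + label for _ in range(64)] for label in labels]

# Two embedding choices, differing here only in dimensionality.
for dim in (8, 16):
    features = simple_meta_features(toy_embed(images, dim), labels)
    print(dim, features)
```

In this sketch the label-based features (`nr_inst`, `nr_class`, `class_ent`) are the same for both embeddings, while the attribute-based ones (`nr_attr`, `sd_mean`) change with the representation, which is the kind of dependence on the embedding choice that the project sets out to assess.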
