Meta-Learning Applied to Imbalanced Datasets Using Data Complexity Measures

Grant number: 19/13015-2
Support Opportunities: Scholarships abroad - Research Internship - Doctorate
Start date: September 01, 2019
End date: May 31, 2020
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: André Carlos Ponce de Leon Ferreira de Carvalho
Grantee: Victor Hugo Barella
Supervisor: Nathalie Japkowicz
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Institution abroad: American University (AU), United States
Associated to the scholarship: 15/01382-0 - The influence of pre-processing data techniques on classification algorithms, BP.DR

Abstract

Several works have investigated the effect of data imbalance on the performance of predictive models. Class imbalance alone does not necessarily make a classification task challenging. When the classes are linearly separable, a regular classification algorithm usually induces predictive models able to distinguish the classes properly. Imbalanced data becomes a difficulty for the minority class when the training set has overlapping classes or complex decision boundaries. Assessing these characteristics is fundamental to understanding the difficulty of the classification task and to choosing adequate pre-processing techniques for imbalanced data. Measures that quantify the complexity of a classification task for a given dataset have been proposed; they are called data complexity measures. These measures use different criteria to estimate how difficult it is to induce a classifier from a dataset. This project proposes using data complexity measures in a meta-learning approach to characterize the nature of the imbalance problem in classification tasks.
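
As an illustration of the kind of dataset characteristics discussed above, the following minimal sketch computes two simple descriptors for a two-class dataset: the imbalance ratio and the maximum Fisher's discriminant ratio, a classic feature-overlap measure from the data complexity literature. The function names, the toy data, and the choice of these two measures are assumptions made for illustration; the abstract does not state which measures the project uses.

```python
from statistics import mean, pvariance


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return max(counts.values()) / min(counts.values())


def max_fisher_ratio(rows, labels):
    """Maximum over features of (mean_a - mean_b)^2 / (var_a + var_b).

    Larger values suggest that at least one feature separates the two
    classes well, i.e. less class overlap along that feature.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles two classes only")
    a, b = classes
    best = 0.0
    for j in range(len(rows[0])):
        xa = [r[j] for r, y in zip(rows, labels) if y == a]
        xb = [r[j] for r, y in zip(rows, labels) if y == b]
        gap = (mean(xa) - mean(xb)) ** 2
        spread = pvariance(xa) + pvariance(xb)
        if spread == 0:
            # Constant within both classes: perfectly separating or useless.
            ratio = float("inf") if gap > 0 else 0.0
        else:
            ratio = gap / spread
        best = max(best, ratio)
    return best


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is constant.
rows = [[0.0, 5.0], [2.0, 5.0], [10.0, 5.0], [12.0, 5.0]]
labels = [0, 0, 1, 1]
print(imbalance_ratio([0, 0, 0, 1]))
print(max_fisher_ratio(rows, labels))
```

In a meta-learning setting, values such as these would be computed for many datasets and used as meta-features describing each one; how the project combines them is not specified in the abstract.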