
Ensemble of Classifiers for Unbalanced Data Sets

Grant number: 16/20465-6
Support Opportunities: Scholarships abroad - Research Internship - Doctorate
Start date: May 01, 2017
End date: October 31, 2017
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: André Carlos Ponce de Leon Ferreira de Carvalho
Grantee: Everlandio Rebouças Queiroz Fernandes
Supervisor: Joost Nico Kok
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Institution abroad: Universiteit Leiden, Netherlands
Associated to the scholarship: 13/11615-6 - Ensemble of Classifiers with Dynamic Update for Credit Risk Analysis, BP.DR

Abstract

In many practical classification problems, the data set used to induce a classifier is significantly unbalanced. This occurs when the number of instances of a particular class is much lower than the number of instances of the other class(es). Unbalanced data sets can compromise the performance of most classical classification algorithms, since these algorithms assume a balanced distribution of instances among the classes. A commonly adopted strategy to deal with this problem is to train the classifier on a balanced sample of the original data set. However, this procedure may discard instances that could be important for better discrimination between the classes, affecting the performance of the resulting classifier. Moreover, in different application scenarios, the strategy of combining several classifiers in a structure known as an ensemble has been shown to be efficient, yielding a stable predictive accuracy that is often higher than the accuracy obtained by a single classifier. In this context, this research project aims to investigate and propose methods for constructing ensembles of classifiers in which each base classifier is induced from a different sample of the original data set and different base classification algorithms can be used. Another purpose of this study is to investigate methods for ensemble pruning in order to reduce memory and processing costs.
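
As a rough illustration of the idea described in the abstract, and not the project's actual method, the sketch below trains each base classifier on a different balanced undersample of the data, alternates between two simple base algorithms, combines the members by majority vote, and prunes the ensemble by keeping the members with the best balanced accuracy on a validation set. All names (balanced_sample, train_ensemble, prune, the two toy classifiers) and the pruning criterion are assumptions made for this example.

```python
import random
from collections import Counter


def balanced_sample(X, y, rng):
    """Undersample every class to the size of the rarest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    Xs, ys = [], []
    for label, rows in by_class.items():
        for features in rng.sample(rows, n_min):
            Xs.append(features)
            ys.append(label)
    return Xs, ys


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


class NearestCentroid:
    """Toy base classifier: predicts the class with the closest mean vector."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for features, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(features))
            for i, value in enumerate(features):
                acc[i] += value
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))


class OneNearestNeighbour:
    """Toy base classifier: predicts the label of the closest training row."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_one(self, x):
        dists = [sq_dist(x, row) for row in self.X]
        return self.y[dists.index(min(dists))]


def train_ensemble(X, y, n_members=10, seed=0):
    """Each member gets its own balanced sample and one of the base algorithms."""
    rng = random.Random(seed)
    algorithms = [NearestCentroid, OneNearestNeighbour]
    members = []
    for i in range(n_members):
        Xs, ys = balanced_sample(X, y, rng)
        members.append(algorithms[i % len(algorithms)]().fit(Xs, ys))
    return members


def predict(members, x):
    """Majority vote over the ensemble members."""
    votes = Counter(m.predict_one(x) for m in members)
    return votes.most_common(1)[0][0]


def balanced_accuracy(member, X, y):
    """Mean per-class recall, so the minority class counts as much as the majority."""
    hits, totals = Counter(), Counter(y)
    for features, label in zip(X, y):
        if member.predict_one(features) == label:
            hits[label] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def prune(members, X_val, y_val, keep):
    """Assumed pruning rule: keep the `keep` members with the best validation score."""
    ranked = sorted(members, key=lambda m: balanced_accuracy(m, X_val, y_val), reverse=True)
    return ranked[:keep]
```

Because every member sees a different undersample, majority-class instances discarded by one member can still be used by another, and pruning reduces the number of members that must be stored and evaluated at prediction time.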
