Noise detection in classification problems

Author(s):
Luís Paulo Faina Garcia
Total Authors: 1
Document type: Doctoral Thesis
Press: São Carlos.
Institution: Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB)
Defense date:
Examining board members:
André Carlos Ponce de Leon Ferreira de Carvalho; Heloisa de Arruda Camargo; Alexandre Plastino de Carvalho; Ana Carolina Lorena; Ronaldo Cristiano Prati
Advisor: André Carlos Ponce de Leon Ferreira de Carvalho
Abstract

In many areas of knowledge, considerable time has been spent understanding and treating noisy data, one of the most common problems in information collection, transmission and storage. When used to train Machine Learning techniques, noisy data lead to increased complexity in the induced classification models, longer processing times and reduced predictive power. Treating them in a preprocessing step may improve data quality and the understanding of the problem. This thesis aims to investigate the use of data complexity measures capable of characterizing the presence of noise in datasets, to develop new noise filtering techniques that are efficient on such subsamples of noise identification problems compared to the state of the art, and to recommend, by meta-learning, the most suitable techniques or ensembles for a specific dataset. Both artificial and real-problem datasets were used in the experimental part of this work. They were obtained from public data repositories and a cooperation project. The evaluation was based on the analysis of the effect of artificially generated noise and on feedback from a domain expert. The reported experimental results show that the investigated proposals are promising. (AU)

FAPESP's process: 11/14602-7 - Noise detection and elimination for classification problems
Grantee: Luís Paulo Faina Garcia
Support Opportunities: Scholarships in Brazil - Doctorate (Direct)