Biblioteca Virtual - Centro de Documentação e Informação da FAPESP

Author(s):	Okimoto, Lucas C. ; Lorena, Ana C. ; IEEE Total number of authors: 3
Document type:	Scientific article
Source:	2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN); v. N/A, p. 8-pg., 2019-01-01.
Abstract
Feature selection (FS) is a pre-processing step that is often mandatory in data analysis with Machine Learning techniques. Its objective is to reduce data dimensionality by identifying and retaining only the relevant features of a dataset. In this work we evaluate the use of complexity measures of classification problems in FS. These descriptors estimate the intrinsic difficulty of a classification problem from characteristics of the dataset available for learning. We propose a combined univariate-multivariate FS technique that employs two complexity measures: Fisher's maximum discriminant ratio and the sum of intra-extra class distances. The results reveal that the complexity measures are indeed suitable for estimating feature importance in classification datasets. Large reductions in the number of features were obtained while preserving, in general, the predictive accuracy of two strong classification techniques: Support Vector Machines and Random Forests. (AU)
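
As an illustration of the univariate side of the idea, the sketch below scores each feature with a common multi-class form of Fisher's discriminant ratio (between-class scatter divided by within-class scatter) and ranks features by that score. It is a minimal sketch under stated assumptions, not the procedure from the paper: the function names `fisher_ratio_per_feature` and `rank_features` are hypothetical, the exact formulation the authors use may differ, and the multivariate step based on intra-extra class distances is not shown.

```python
from collections import defaultdict


def fisher_ratio_per_feature(X, y):
    """Return one Fisher-style discriminant score per feature.

    Assumption: score = between-class scatter / within-class scatter,
    computed independently for each column of X.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            values = [row[j] for row in rows]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_features(X, y):
    """Return feature indices sorted from most to least discriminative."""
    scores = fisher_ratio_per_feature(X, y)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores
```

A feature whose class means are far apart relative to its within-class spread receives a high score, so keeping only the top-ranked indices returned by `rank_features` is one simple way to reduce dimensionality before training a classifier.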

FAPESP grant:	12/22608-8 - Use of data complexity measures in the support of supervised machine learning
Grantee:	Ana Carolina Lorena
Support type:	Research Grants - Young Investigators