Statistical versus Distance-Based Meta-Features for Clustering Algorithm Recommendation Using Meta-Learning

Author(s):
Pimentel, Bruno Almeida ; de Carvalho, Andre C. P. L. E. ; IEEE
Total Authors: 3
Document type: Journal article
Source: 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN); v. N/A, p. 8-pg., 2018-01-01.
Abstract

When a Machine Learning algorithm is applied to a dataset, its predictive performance depends on how well its bias suits the data distribution in the dataset, which has led researchers to create a large number of algorithms. The most suitable algorithm for a new dataset can be found by trial and error, trying a large number of algorithms with distinct biases. However, this approach usually has a high computational cost. This cost could be reduced if the most suitable algorithm(s) could be recommended directly. Meta-learning has been successfully used to recommend the best Machine Learning algorithm in several Machine Learning tasks. Meta-learning can rank algorithms according to their adequacy for a new dataset and use this ranking to recommend the algorithms to be used. As the recommended ranking is based on dataset features, dataset characterization (using meta-features) is of crucial importance for the successful use of meta-learning. Clustering is one of the main applications of Machine Learning algorithms; however, few works investigate the use of meta-learning for the recommendation of clustering algorithms. Moreover, the existing works use a poor methodology for the evaluation of the algorithm recommendation method and a small number of datasets. This paper proposes a comparison between two types of meta-features for clustering algorithm recommendation using meta-learning. Experimental results show in which situations the use of each type of meta-feature is more suitable. (AU)
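
To make the idea in the abstract concrete, the following is a minimal Python sketch of ranking-based recommendation from meta-features. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not list the specific meta-features or the meta-learner, so the function names (`statistical_meta_features`, `distance_meta_features`, `recommend_ranking`), the chosen summaries (means and standard deviations), and the nearest-neighbour ranking step are all hypothetical choices made for this example.

```python
import math
import statistics
from itertools import combinations


def statistical_meta_features(data):
    """Describe a dataset by summaries of its attributes.

    Assumption: the mean of the attribute means and the mean of the
    attribute standard deviations stand in for "statistical" meta-features.
    """
    columns = list(zip(*data))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [statistics.fmean(means), statistics.fmean(stdevs)]


def distance_meta_features(data):
    """Describe a dataset by summaries of its pairwise distances.

    Assumption: Euclidean distances between all pairs of instances,
    scaled to [0, 1], summarised by their mean and standard deviation.
    """
    dists = [math.dist(a, b) for a, b in combinations(data, 2)]
    largest = max(dists) or 1.0  # guard against all-identical instances
    scaled = [d / largest for d in dists]
    return [statistics.fmean(scaled), statistics.pstdev(scaled)]


def recommend_ranking(meta_base, query_features, k=2):
    """Rank algorithms for a new dataset from its k nearest meta-examples.

    meta_base is a list of (meta_features, {algorithm: rank}) pairs, where
    rank 1 is best. The ranks of the k meta-examples closest to the query
    are averaged, and algorithms are returned from best to worst.
    """
    nearest = sorted(
        meta_base, key=lambda example: math.dist(example[0], query_features)
    )[:k]
    algorithms = list(nearest[0][1])
    average_rank = {
        name: statistics.fmean(example[1][name] for example in nearest)
        for name in algorithms
    }
    return sorted(algorithms, key=average_rank.get)
```

In this sketch, either meta-feature function can produce the vectors stored in `meta_base` and the query vector for a new dataset; comparing the two choices on the same recommendation task mirrors, in a very simplified form, the kind of comparison the abstract describes.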

FAPESP's process: 12/22608-8 - Use of data complexity measures in the support of supervised machine learning
Grantee: Ana Carolina Lorena
Support Opportunities: Research Grants - Young Investigators Grants
FAPESP's process: 17/20265-0 - Use of meta-learning for clustering algorithm selection problems
Grantee: Bruno Almeida Pimentel
Support Opportunities: Scholarships in Brazil - Post-Doctoral
FAPESP's process: 16/18615-0 - Advanced machine learning
Grantee: André Carlos Ponce de Leon Ferreira de Carvalho
Support Opportunities: Research Grants - Research Partnership for Technological Innovation - PITE