(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Multiclass from Binary: Expanding One-Versus-All, One-Versus-One and ECOC-Based Approaches

Author(s):
Rocha, Anderson [1] ; Goldenstein, Siome Klein [1]
Total number of authors: 2
Author affiliation(s):
[1] Univ Estadual Campinas, Inst Comp, BR-13083970 Sao Paulo - Brazil
Total number of affiliations: 1
Document type: Journal article
Source: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS; v. 25, n. 2, p. 289-302, FEB 2014.
Web of Science citations: 51
Abstract

Recently, there has been a lot of success in the development of effective binary classifiers. Although many statistical classification techniques have natural multiclass extensions, some, such as support vector machines, do not. The existing techniques for mapping multiclass problems onto a set of simpler binary classification problems run into serious efficiency problems when there are hundreds or even thousands of classes, and these are the scenarios where this paper's contributions shine. We introduce the concept of correlation and joint probability of base binary learners. We learn these properties during the training stage, group the binary learners based on their independence and, with a Bayesian approach, combine the results to predict the class of a new instance. Finally, we also discuss two additional strategies: one to reduce the number of required base learners in the multiclass classification, and another to find new base learners that might best complement the existing set. We use these two new procedures iteratively to complement the initial solution and improve the overall performance. This paper has two goals: finding the most discriminative binary classifiers to solve a multiclass problem and maintaining efficiency, i.e., a small number of base learners. We validate and compare the method with a diverse set of methods from the literature on several publicly available datasets that range from small (10 to 26 classes) to large multiclass problems (1000 classes), always using simple reproducible scenarios. (AU)
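
To make the idea of combining binary learners with a Bayesian rule concrete, here is a minimal Python sketch. It is not the authors' algorithm: it covers only the simplest special case, in which every base learner is treated as independent of the others given the class, so it omits the grouping of dependent learners and the joint-probability estimation that the abstract describes. The function names (`fit_bayes_decoder`, `predict`), the smoothing parameter `alpha`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_bayes_decoder(outputs, labels, alpha=1.0):
    """Estimate P(class) and P(learner j outputs 1 | class).

    outputs: list of 0/1 tuples, one bit per base binary learner.
    labels:  true class of each training instance.
    alpha:   Laplace smoothing constant (hypothetical default).
    """
    classes = sorted(set(labels))
    n_learners = len(outputs[0])
    class_counts = Counter(labels)
    ones = {c: [0] * n_learners for c in classes}
    for bits, y in zip(outputs, labels):
        for j, b in enumerate(bits):
            ones[y][j] += b
    priors = {c: class_counts[c] / len(labels) for c in classes}
    p_one = {
        c: [(ones[c][j] + alpha) / (class_counts[c] + 2 * alpha)
            for j in range(n_learners)]
        for c in classes
    }
    return priors, p_one


def predict(bits, priors, p_one):
    """Pick the class maximizing log P(c) + sum_j log P(bit_j | c).

    Assumes all learners are conditionally independent given the class.
    """
    def log_posterior(c):
        score = math.log(priors[c])
        for b, p in zip(bits, p_one[c]):
            score += math.log(p if b == 1 else 1.0 - p)
        return score
    return max(priors, key=log_posterior)


# Toy data: three one-versus-all learners, three classes (all made up).
train_outputs = [
    (1, 0, 0), (1, 0, 0), (1, 1, 0),   # class "a"
    (0, 1, 0), (0, 1, 0), (0, 1, 1),   # class "b"
    (0, 0, 1), (0, 0, 1), (1, 0, 1),   # class "c"
]
train_labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

priors, p_one = fit_bayes_decoder(train_outputs, train_labels)
print(predict((1, 1, 0), priors, p_one))  # ambiguous vote resolved by the learned probabilities
```

In this sketch the learned per-class probabilities act as a soft decoding table, so an ambiguous output such as `(1, 1, 0)` is resolved by how often each learner fired for each class during training rather than by a fixed code word.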

FAPESP grant: 10/05647-4 - Forensic computing and document criminalistics: collection, organization, classification, and analysis of evidence
Grantee: Anderson de Rezende Rocha
Support type: Research Grants - Young Investigators Grants