The Importance of the Test Set Size in Quantification Assessment

Author(s):
Maletzke, Andre; Hassan, Waqar; dos Reis, Denis; Batista, Gustavo; Bessiere, C.
Total number of authors: 5
Document type: Scientific article
Source: PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE; v. N/A, 7 pp., 2020-01-01.
Abstract

Quantification is a task similar to classification in the sense that it learns from a labeled training set. However, quantification is not interested in predicting the class of each observation, but rather in measuring the class distribution in the test set. The community has developed performance measures and experimental setups tailored to quantification tasks. Nonetheless, we argue that a critical variable, the size of the test sets, remains ignored. Such disregard has three main detrimental effects. First, it implicitly assumes that quantifiers will perform equally well for different test set sizes. Second, it increases the risk of cherry-picking by selecting a test set size for which a particular proposal performs best. Finally, it disregards the importance of designing methods that are suitable for different test set sizes. We discuss these issues with the support of one of the broadest experimental evaluations ever performed, with three main outcomes. (i) We empirically demonstrate the importance of the test set size for assessing quantifiers. (ii) We show that current quantifiers generally deliver mediocre performance on the smallest test sets. (iii) We propose a meta-learning scheme that selects the best quantifier based on the test size and can outperform the best single quantification method. (AU)
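To make the distinction between classification and quantification concrete, the following is a minimal sketch of the classic "Classify and Count" baseline quantifier. It is purely illustrative and is not the meta-learning scheme proposed in the paper; the 1-D data, the threshold classifier, and all names are hypothetical assumptions for this example.

```python
# Illustrative sketch: quantification vs. classification.
# A quantifier does not report a label per item; it estimates the
# fraction of the test set belonging to each class. "Classify and
# Count" (CC) is the simplest baseline: classify every item with an
# ordinary classifier, then count the predicted labels.
# All data and the threshold rule here are hypothetical.

def train_threshold(train):
    """Fit a trivial 1-D classifier: midpoint between class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def classify_and_count(threshold, test_xs):
    """CC quantifier: estimated prevalence of the positive class."""
    preds = [1 if x > threshold else 0 for x in test_xs]
    return sum(preds) / len(preds)

# Hypothetical labeled training set: (feature, class) pairs.
train = [(0.1, 0), (0.3, 0), (0.4, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
thr = train_threshold(train)

# Two unlabeled test sets of different sizes; as the paper argues,
# a quantifier's error can behave very differently across such sizes.
small_test = [0.2, 0.85, 0.9]
large_test = [0.1, 0.2, 0.3, 0.35, 0.4, 0.75, 0.8, 0.85, 0.9, 0.95]

print(classify_and_count(thr, small_test))
print(classify_and_count(thr, large_test))
```

On the small three-item test set a single misclassification would shift the estimated prevalence by a third, while on the larger set the same mistake shifts it by only a tenth; this is one intuition for why test set size matters when evaluating quantifiers.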

FAPESP Grant: 17/22896-7 - Unsupervised Context Detection in Data Streams for Classification
Grantee: Denis Moreira dos Reis
Support type: Scholarships in Brazil - Doctorate