The Importance of the Test Set Size in Quantification Assessment

Author(s):
Maletzke, Andre ; Hassan, Waqar ; dos Reis, Denis ; Batista, Gustavo ; Bessiere, C
Total Authors: 5
Document type: Journal article
Source: PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 7 pp., 2020.
Abstract

Quantification is a task similar to classification in the sense that it learns from a labeled training set. However, quantification is not interested in predicting the class of each observation, but rather in estimating the class distribution of the test set. The community has developed performance measures and experimental setups tailored to quantification tasks. Nonetheless, we argue that a critical variable, the size of the test sets, remains ignored. Such disregard has three main detrimental effects. First, it implicitly assumes that quantifiers will perform equally well for different test set sizes. Second, it increases the risk of cherry-picking by selecting a test set size for which a particular proposal performs best. Finally, it disregards the importance of designing methods that are suitable for different test set sizes. We discuss these issues with the support of one of the broadest experimental evaluations ever performed, with three main outcomes. (i) We empirically demonstrate the importance of the test set size in assessing quantifiers. (ii) We show that current quantifiers generally deliver mediocre performance on the smallest test sets. (iii) We propose a meta-learning scheme that selects the best quantifier based on the test set size and can outperform the best single quantification method. (AU)
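
To make the task concrete: standard quantifiers such as Classify & Count (CC) and its adjusted variant (ACC) estimate the positive-class prevalence of a test sample rather than per-instance labels. The sketch below is illustrative only and is not the paper's code; it assumes a synthetic scikit-learn dataset, a logistic-regression base classifier, and hypothetical helper names (classify_and_count, adjusted_count), and simply probes the absolute prevalence error at a few test set sizes.

    # Illustrative sketch (not the authors' implementation): CC and ACC
    # quantifiers, evaluated at several test set sizes.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic binary data stands in for a real labeled training set.
    X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
    X_train, X_pool, y_train, y_pool = train_test_split(
        X, y, test_size=0.5, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # True/false positive rates estimated on the training data
    # (ideally via cross-validation).
    train_pred = clf.predict(X_train)
    tpr = np.mean(train_pred[y_train == 1] == 1)
    fpr = np.mean(train_pred[y_train == 0] == 1)

    def classify_and_count(X_test):
        # CC: prevalence = fraction of observations predicted positive.
        return float(np.mean(clf.predict(X_test)))

    def adjusted_count(X_test):
        # ACC: correct CC using tpr/fpr, clipped to [0, 1].
        cc = classify_and_count(X_test)
        if tpr == fpr:
            return cc
        return float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))

    # Probe how the absolute prevalence error behaves as the test set shrinks.
    for size in (1000, 100, 10):
        idx = rng.choice(len(X_pool), size=size, replace=False)
        true_prev = np.mean(y_pool[idx])
        print(size,
              abs(classify_and_count(X_pool[idx]) - true_prev),
              abs(adjusted_count(X_pool[idx]) - true_prev))

Smaller test samples give noisier prevalence estimates (the true prevalence of a 10-item sample already varies widely), which is consistent with the abstract's point that quantifiers cannot be assumed to perform equally well across test set sizes.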

FAPESP's process: 17/22896-7 - Unsupervised Context Detection of Streaming Data For Classification
Grantee: Denis Moreira dos Reis
Support Opportunities: Scholarships in Brazil - Doctorate