Quantification in Data Streams: Initial Results

Author(s):
Maletzke, Andre G. ; dos Reis, Denis M. ; Batista, Gustavo E. A. P. A. ; IEEE
Total Authors: 4
Document type: Journal article
Source: 2017 6th Brazilian Conference on Intelligent Systems (BRACIS); 6 pp., 2017-01-01.
Abstract

In recent decades, learning from data streams has attracted the attention of researchers and practitioners due to its large number of applications. These applications have motivated the research community to propose a significant number of methods for diverse tasks, most prominently classification, prediction, and clustering. However, a relevant task known as quantification has remained largely unexplored. The goal of quantification is to estimate the class prevalence, that is, the proportion of instances belonging to each class, in an unlabeled set. In this paper, we discuss the relevance and challenges of quantification for data streams and how it differs from the batch setting, in which quantification has attracted more attention from the research community. We propose an algorithm to estimate the class distribution in a data stream and frame our algorithm in the active learning framework. In addition, we define two other approaches as baseline and topline strategies for this problem. The experimental results demonstrate that our algorithm achieves significantly higher quantification accuracy than the baseline and nearly matches the topline, while requesting only a fraction of the true labels required by the latter approach. (AU)
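
To make the quantification task concrete, the following minimal sketch illustrates two standard batch quantifiers from the general literature, Classify and Count (CC) and Adjusted Classify and Count (ACC). It is not the stream algorithm proposed in the paper, whose details the abstract does not give. The simulated classifier, its true and false positive rates, the sample size, and all function names are assumptions chosen purely for illustration.

```python
import random


def classify_and_count(predictions):
    """CC: estimate positive-class prevalence as the fraction predicted positive."""
    return sum(predictions) / len(predictions)


def adjusted_count(cc_estimate, tpr, fpr):
    """ACC: correct a CC estimate using the classifier's known error rates.

    Solves cc = p * tpr + (1 - p) * fpr for p and clips the result to [0, 1].
    """
    if tpr == fpr:
        # The classifier carries no information; no correction is possible.
        return cc_estimate
    p = (cc_estimate - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))


def simulate_predictions(labels, tpr, fpr, rng):
    """Hypothetical imperfect binary classifier with fixed tpr and fpr."""
    return [int(rng.random() < (tpr if y == 1 else fpr)) for y in labels]


# Assumed setup: 20% positives, classifier with tpr = 0.8 and fpr = 0.1.
rng = random.Random(0)
labels = [int(rng.random() < 0.2) for _ in range(20000)]
predictions = simulate_predictions(labels, tpr=0.8, fpr=0.1, rng=rng)

actual = sum(labels) / len(labels)
cc_est = classify_and_count(predictions)
acc_est = adjusted_count(cc_est, tpr=0.8, fpr=0.1)

print(f"actual prevalence: {actual:.3f}")
print(f"CC estimate:       {cc_est:.3f}")
print(f"ACC estimate:      {acc_est:.3f}")
```

Under these assumptions, CC is biased toward 0.2 × 0.8 + 0.8 × 0.1 = 0.24, whereas ACC removes that bias as long as the error rates are known. In a data stream the class distribution and the classifier's error rates may drift over time, which is one reason the batch setting does not carry over directly.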

FAPESP's process: 16/04986-6 - Intelligent traps and sensors: an innovative approach to control insect pests and disease vectors
Grantee: Gustavo Enrique de Almeida Prado Alves Batista
Support Opportunities: Research Grants - eScience and Data Science Program - Regular Program Grants