Data stream classification with concept drift and verification latency

Author(s):
Denis Moreira dos Reis
Total Authors: 1
Document type: Master's Dissertation
Press: São Carlos.
Institution: Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB)
Defense date:
Examining board members:
Gustavo Enrique de Almeida Prado Alves Batista; Robson Leonardo Ferreira Cordeiro; Ricardo Bastos Cavalcante Prudêncio; Marcela Xavier Ribeiro
Advisor: Gustavo Enrique de Almeida Prado Alves Batista
Abstract

Despite the relative maturity of batch-mode supervised learning research, in which the data typifies stationary problems, many real-world applications deal with data streams whose statistical distribution changes over time, causing what is known as concept drift. A large body of research has been done in recent years with the objective of creating new models that are accurate even in the presence of concept drifts. However, most of them assume that, once the classification algorithm labels an event, its actual label becomes readily available. This work explores the complementary situations, with a review of the most important published works and an analysis of the impact of delayed true labeling, including no true label availability at all. Furthermore, this work proposes a new algorithm that heavily reduces the complexity of applying the Kolmogorov-Smirnov non-parametric hypothesis test, turning it into a useful tool for analysis on data streams. As an instantiation of its usefulness, we present an unsupervised drift-detection method that, along with Active Learning and Transfer Learning approaches, decreases the number of true labels that are required to keep good classification performance over time, even in the presence of concept drifts. (AU)
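
To make the idea of unsupervised drift detection with the Kolmogorov-Smirnov test more concrete, the following is a minimal sketch and not the algorithm proposed in the dissertation. It naively recomputes the two-sample KS statistic from scratch for every new observation, which is exactly the cost the dissertation's algorithm is said to reduce. The function names (`ks_statistic`, `ks_drift`, `first_drift`), the window size, the significance level, and the use of the standard asymptotic critical value are all assumptions made for illustration.

```python
import math
import random
from collections import deque


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_drift(reference, window, alpha=0.001):
    """Reject 'same distribution' using the asymptotic KS critical value."""
    n, m = len(reference), len(window)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, window) > critical


def first_drift(stream, window_size=100, alpha=0.001):
    """Return the index of the first detected drift, or None.

    The first `window_size` values form a fixed reference sample; later
    values fill a sliding window that is tested against the reference.
    No class labels are used at any point.
    """
    reference = []
    window = deque(maxlen=window_size)
    for t, x in enumerate(stream):
        if len(reference) < window_size:
            reference.append(x)
            continue
        window.append(x)
        if len(window) == window_size and ks_drift(reference, window, alpha):
            return t
    return None


if __name__ == "__main__":
    # Hypothetical univariate stream: the mean shifts from 0 to 3 at index 500.
    rng = random.Random(0)
    stream = [rng.gauss(0, 1) for _ in range(500)]
    stream += [rng.gauss(3, 1) for _ in range(500)]
    print("first detection at index:", first_drift(stream))
```

Because the test is repeated at every step, the significance level here is only nominal and false alarms before the true change point are possible. Each test in this sketch also costs a full sort of both samples, which illustrates why a cheaper way of applying the test matters on data streams.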

FAPESP's process: 14/12333-7 - Data stream classification with concept drift and extreme verification latency
Grantee: Denis Moreira dos Reis
Support Opportunities: Scholarships in Brazil - Master