An approach based on the stability of clustering algorithms to ensure concept drift detection on data streams

Grant number: 14/13323-5
Support type: Regular Research Grants
Duration: October 01, 2014 - September 30, 2016
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Rodrigo Fernandes de Mello
Grantee: Rodrigo Fernandes de Mello
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Assoc. researchers: Ricardo Araújo Rios

Abstract

Several industrial, scientific and commercial processes produce data continuously over time, in large volumes and at high frequencies; such data are called data streams. In the Machine Learning area, researchers have been modeling and analyzing the behaviour of these streams in an attempt to understand the phenomena that generate them. In many scenarios, the behaviour of a data stream changes over time, which is called concept drift. Detecting such changes is an important task, as it adds information about the phenomenon under study. Several works accomplish this task by assuming that (i) the data are previously labeled and/or (ii) there are no temporal relations among the observations produced. Both assumptions are hard to guarantee for data streams. This project proposes a method for detecting concept drift on unlabeled data streams assuming temporal dependencies among observations. Initially, consecutive windows of data are extracted from the same stream. Then every window is decomposed into its deterministic and stochastic components using the Empirical Mode Decomposition method and Recurrence Quantification Analysis. Next, these components are modeled with the purpose of removing the dependencies among observations. In the next step, a stable clustering algorithm is applied to the data, producing dendrograms. Finally, these dendrograms are compared using the Gromov-Hausdorff distance, and the resulting value is used to indicate concept drift. (AU)
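
The last steps of the method described above (windowing, clustering into dendrograms, and comparing dendrograms) can be illustrated with a minimal Python sketch. It is not the project's implementation and rests on several assumptions that are not in the abstract: the decomposition and modeling steps (Empirical Mode Decomposition, Recurrence Quantification Analysis) are skipped entirely, single-linkage hierarchical clustering stands in for the "stable clustering algorithm", windows do not overlap and have equal size, and the Gromov-Hausdorff distance is not computed exactly. Instead, observations of two windows are matched by rank, and half the distortion of that one correspondence is used, which is only an upper bound on the Gromov-Hausdorff distance between the two dendrograms. All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet


def consecutive_windows(stream, size):
    """Split a 1-D stream into non-overlapping consecutive windows (assumption)."""
    count = len(stream) // size
    return [np.asarray(stream[i * size:(i + 1) * size], dtype=float)
            for i in range(count)]


def dendrogram_ultrametric(window):
    """Build a single-linkage dendrogram and return its cophenetic distances.

    Values are sorted first so that position k in every window refers to the
    k-th smallest observation; this fixes the rank-based correspondence used
    when two windows are compared.
    """
    points = np.sort(window).reshape(-1, 1)
    tree = linkage(points, method="single")
    return cophenet(tree)  # condensed matrix of merge heights (an ultrametric)


def gh_upper_bound(ultra_a, ultra_b):
    """Half the distortion of the rank correspondence.

    This is an upper bound on the Gromov-Hausdorff distance, not its exact value.
    """
    return 0.5 * float(np.max(np.abs(ultra_a - ultra_b)))


def drift_scores(stream, size):
    """Score each pair of consecutive windows; larger values suggest a change."""
    ultras = [dendrogram_ultrametric(w) for w in consecutive_windows(stream, size)]
    return [gh_upper_bound(a, b) for a, b in zip(ultras, ultras[1:])]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stream whose spread changes halfway through (illustrative only).
    stream = np.concatenate([rng.normal(0, 1, 400), rng.normal(0, 5, 400)])
    for index, score in enumerate(drift_scores(stream, 100)):
        print(f"windows {index} -> {index + 1}: score = {score:.3f}")
```

Because the comparison uses only pairwise distances inside each window, this sketch is insensitive to a pure shift in the mean and reacts to changes in the spread or grouping of values. Deciding when a score is large enough to signal concept drift would need a threshold or statistical test, which the abstract does not specify.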

Scientific publications (10)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
PAGLIOSA, LUCAS DE CARVALHO; DE MELLO, RODRIGO FERNANDES. Semi-supervised time series classification on positive and unlabeled problems using cross-recurrence quantification analysis. PATTERN RECOGNITION, v. 80, p. 53-63, AUG 2018. Web of Science Citations: 1.
FERREIRA, MARTHA DAIS; CORREA, DEBORA CRISTINA; NONATO, LUIS GUSTAVO; DE MELLO, RODRIGO FERNANDES. Designing architectures of convolutional neural networks to solve practical problems. EXPERT SYSTEMS WITH APPLICATIONS, v. 94, p. 205-217, MAR 15 2018. Web of Science Citations: 11.
DA COSTA, FAUSTO G.; DUARTE, FELIPE S. L. G.; VALLIM, ROSANE M. M.; DE MELLO, RODRIGO F. Multidimensional surrogate stability to detect data stream concept drift. EXPERT SYSTEMS WITH APPLICATIONS, v. 87, p. 15-29, NOV 30 2017. Web of Science Citations: 6.
PAGLIOSA, LUCAS DE CARVALHO; DE MELLO, RODRIGO FERNANDES. Applying a kernel function on time-dependent data to provide supervised-learning guarantees. EXPERT SYSTEMS WITH APPLICATIONS, v. 71, p. 216-229, APR 1 2017. Web of Science Citations: 7.
FERREIRA, MARTHA DAIS; CORREA, DEBORA CRISTINA; GRIVET, MARCOS ANTONIO; DOS SANTOS, GEOVAN TAVARES; DE MELLO, RODRIGO FERNANDES; NONATO, LUIS GUSTAVO. On Accuracy and Time Processing Evaluation of Cover Song Identification Systems. JOURNAL OF NEW MUSIC RESEARCH, v. 45, n. 4, p. 333-342, DEC 2016. Web of Science Citations: 1.
DA COSTA, F. G.; RIOS, R. A.; DE MELLO, R. F. Using dynamical systems tools to detect concept drift in data streams. EXPERT SYSTEMS WITH APPLICATIONS, v. 60, p. 39-50, OCT 30 2016. Web of Science Citations: 6.
PEREIRA, CASSIO M. M.; DE MELLO, RODRIGO F. PTS: Projected Topological Stream clustering algorithm. Neurocomputing, v. 180, n. SI, p. 16-26, MAR 5 2016. Web of Science Citations: 1.
RIOS, RICARDO ARAUJO; DE MELLO, RODRIGO FERNANDES. Applying Empirical Mode Decomposition and mutual information to separate stochastic and deterministic influences embedded in signals. Signal Processing, v. 118, p. 159-176, JAN 2016. Web of Science Citations: 12.
VALLIM, ROSANE M. M.; DE MELLO, RODRIGO F. Unsupervised change detection in data streams: an application in music analysis. PROGRESS IN ARTIFICIAL INTELLIGENCE, v. 4, n. 1-2, p. 1-10, DEC 2015. Web of Science Citations: 1.
PEREIRA, CASSIO M. M.; DE MELLO, RODRIGO F. Persistent homology for time series and spatial data clustering. EXPERT SYSTEMS WITH APPLICATIONS, v. 42, n. 15-16, p. 6026-6038, SEP 2015. Web of Science Citations: 11.
