An approach based on the stability of clustering algorithms to ensure concept drift detection on data streams

Grant number: 14/13323-5
Support Opportunities: Regular Research Grants
Start date: October 01, 2014
End date: September 30, 2016
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Rodrigo Fernandes de Mello
Grantee: Rodrigo Fernandes de Mello
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated researchers: Ricardo Araújo Rios

Abstract

Several industrial, scientific and commercial processes produce data continuously over time, in large volumes and at high frequencies; such data are called data streams. In the Machine Learning area, researchers have been modeling and analyzing the behaviour of these streams in an attempt to understand the phenomena that generate them. In many scenarios, the behaviour of a data stream changes over time, which is called concept drift. Detecting such changes is an important task, as it adds information about the phenomenon under study. Several works accomplish this task by assuming that (i) data are previously labeled and/or (ii) there are no temporal relations among the observations produced; these assumptions are hard to guarantee for data streams. This project proposes a method for detecting concept drift on unlabeled data streams assuming temporal dependencies among observations. Initially, consecutive windows of data will be extracted from the same stream. Then every window will be decomposed into its deterministic and stochastic components using the Empirical Mode Decomposition method and Recurrence Quantification Analysis. Next, these components will be modeled in order to remove the dependencies among observations. In the next step, a stable clustering algorithm will be applied to the data, producing dendrograms. Finally, these models will be compared using the Gromov-Hausdorff distance, and the resulting distance will be used to indicate concept drift. (AU)
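
The last two steps of the proposed pipeline (clustering each window into a dendrogram, then comparing dendrograms) can be illustrated with a minimal sketch. It is not the project's implementation and rests on several assumptions: the decomposition and modeling steps (Empirical Mode Decomposition, Recurrence Quantification Analysis) are skipped; single linkage stands in for the unnamed "stable clustering algorithm"; and the exact Gromov-Hausdorff distance is replaced by an upper bound, namely half the largest difference between the two dendrograms' merge heights when the i-th observation of one window is matched with the i-th observation of the next. The function names, the toy stream, and the threshold are all hypothetical.

```python
def sliding_windows(series, size):
    """Split a series into consecutive, non-overlapping windows of `size`."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


def single_linkage_ultrametric(window):
    """Merge heights of a single-linkage dendrogram, as an n x n matrix.

    u[i][j] is the height at which observations i and j first fall into the
    same cluster. For single linkage this equals the minimax path distance,
    computed here with a Floyd-Warshall style relaxation.
    """
    n = len(window)
    u = [[abs(window[i] - window[j]) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def dendrogram_distortion(u_a, u_b):
    """Upper bound on the Gromov-Hausdorff distance between two dendrograms.

    Assumption: observation i of one window is matched with observation i of
    the other. The true distance minimises over all correspondences, so this
    value can only overestimate it.
    """
    n = len(u_a)
    return 0.5 * max(abs(u_a[i][j] - u_b[i][j])
                     for i in range(n) for j in range(n))


def drift_scores(series, size):
    """Score each pair of consecutive windows by dendrogram distortion."""
    models = [single_linkage_ultrametric(w) for w in sliding_windows(series, size)]
    return [dendrogram_distortion(a, b) for a, b in zip(models, models[1:])]


# Toy stream: the spacing between observed values grows tenfold halfway through.
stream = [0.1 * (i % 5) for i in range(20)] + [1.0 * (i % 5) for i in range(20)]

THRESHOLD = 0.2  # hypothetical cut-off, chosen only for this toy stream
scores = drift_scores(stream, size=10)
flags = [score > THRESHOLD for score in scores]
print(scores)
print(flags)  # [False, True, False]
```

Only the pair of windows that straddles the change in spacing exceeds the threshold. Because single-linkage merge heights depend only on distances between observations, a pure shift in the mean would leave every dendrogram unchanged in this simplified version; the sketch reacts to changes in the structure of the data, not in its location.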

Scientific publications (11)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
PEREIRA, CASSIO M. M.; DE MELLO, RODRIGO F.. PTS: Projected Topological Stream clustering algorithm. Neurocomputing, v. 180, n. SI, p. 16-26, . (14/13323-5, 13/04453-0)
PAGLIOSA, LUCAS DE CARVALHO; DE MELLO, RODRIGO FERNANDES. Semi-supervised time series classification on positive and unlabeled problems using cross-recurrence quantification analysis. PATTERN RECOGNITION, v. 80, p. 53-63, . (15/22406-4, 14/13323-5)
DA COSTA, FAUSTO G.; DUARTE, FELIPE S. L. G.; VALLIM, ROSANE M. M.; DE MELLO, RODRIGO F.. Multidimensional surrogate stability to detect data stream concept drift. EXPERT SYSTEMS WITH APPLICATIONS, v. 87, p. 15-29, . (14/21636-3, 14/13323-5)
PEREIRA, CASSIO M. M.; DE MELLO, RODRIGO F.. Persistent homology for time series and spatial data clustering. EXPERT SYSTEMS WITH APPLICATIONS, v. 42, n. 15-16, p. 6026-6038, . (14/13323-5, 13/04453-0)
DA COSTA, F. G.; RIOS, R. A.; DE MELLO, R. F.. Using dynamical systems tools to detect concept drift in data streams. EXPERT SYSTEMS WITH APPLICATIONS, v. 60, p. 39-50, . (14/13323-5)
RIOS, RICARDO ARAUJO; DE MELLO, RODRIGO FERNANDES. Applying Empirical Mode Decomposition and mutual information to separate stochastic and deterministic influences embedded in signals. Signal Processing, v. 118, p. 159-176, . (14/13323-5, 09/18293-9)
PAGLIOSA, LUCAS DE CARVALHO; DE MELLO, RODRIGO FERNANDES. Applying a kernel function on time-dependent data to provide supervised-learning guarantees. EXPERT SYSTEMS WITH APPLICATIONS, v. 71, p. 216-229, . (15/22406-4, 14/13323-5)
FERREIRA, MARTHA DAIS; CORREA, DEBORA CRISTINA; NONATO, LUIS GUSTAVO; DE MELLO, RODRIGO FERNANDES. Designing architectures of convolutional neural networks to solve practical problems. EXPERT SYSTEMS WITH APPLICATIONS, v. 94, p. 205-217, . (11/22749-8, 14/13323-5, 12/17961-0)
VALLIM, ROSANE M. M.; DE MELLO, RODRIGO F.. Unsupervised change detection in data streams: an application in music analysis. PROGRESS IN ARTIFICIAL INTELLIGENCE, v. 4, n. 1-2, p. 1-10, . (14/13323-5, 13/16480-1)
FERREIRA, MARTHA DAIS; CORREA, DEBORA CRISTINA; GRIVET, MARCOS ANTONIO; DOS SANTOS, GEOVAN TAVARES; DE MELLO, RODRIGO FERNANDES; NONATO, LUIS GUSTAVO. On Accuracy and Time Processing Evaluation of Cover Song Identification Systems. JOURNAL OF NEW MUSIC RESEARCH, v. 45, n. 4, p. 333-342, . (11/22749-8, 14/13323-5, 12/17961-0)
PEREIRA, CASSIO M. M.; DE MELLO, RODRIGO F.. PTS: Projected Topological Stream clustering algorithm. Neurocomputing, v. 180, p. 11-pg., . (13/04453-0, 14/13323-5)