(Reference obtained automatically from the Web of Science, through the information on FAPESP funding and the corresponding grant number included in the publication by the authors.)

PTS: Projected Topological Stream clustering algorithm

Author(s):
Pereira, Cassio M. M. [1] ; de Mello, Rodrigo F. [1]
Total number of authors: 2
Author affiliation(s):
[1] Inst Math & Comp Sci, BR-13566590 Sao Carlos, SP - Brazil
Total number of affiliations: 1
Document type: Scientific article
Source: Neurocomputing; v. 180, n. SI, p. 16-26, MAR 5 2016.
Web of Science citations: 1
Abstract

Clustering high-dimensional data streams is an attractive research topic, as several applications generate a high number of attributes, bringing new challenges in terms of partitioning due to the curse of dimensionality. In addition, those applications produce unbounded sequences of data which cannot be stored for later analysis. Despite the importance of this scenario, very few algorithms in the literature address this task. Despite the theoretical foundation of mathematical topology for dealing with high-dimensional spaces, none of those approaches has investigated the problem of finding topologically similar projected clusters in high-dimensional data streams. Among the advantages of topology is the possibility of analyzing data in a coordinate-free and noise-robust manner. In previous research, we have shown that topologically similar clusters can be meaningful in real-world data sets. In this paper, we extend those ideas and propose PTS, an algorithm for finding topologically similar clusters in high-dimensional data streams. The algorithm is capable of finding traditional projected clusters and then merging them according to topological features computed using persistent homology. Experiments with synthetic data streams of dimensions d = 8, 16, 32, 64 and 128 confirm the ability of PTS to find topologically similar projected clusters. (C) 2015 Elsevier B.V. All rights reserved.
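
The idea of comparing clusters by topological features rather than by coordinates can be illustrated with a minimal sketch. This is not the PTS algorithm from the paper: it only computes 0-dimensional persistent homology (connected components) for small, static point sets, and the function names `h0_death_times` and `betti0`, as well as the scale value, are assumptions made for this illustration. Each cluster is measured in its own subspace, so clusters of different dimensionality can still be compared.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    These equal the edge lengths of a minimum spanning tree, found here
    with Kruskal's algorithm and a union-find structure.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this scale, so one class dies.
            parent[ri] = rj
            deaths.append(length)
    return deaths


def betti0(points, scale):
    """Number of connected components that survive up to `scale`."""
    deaths = h0_death_times(points)
    return len(points) - sum(1 for d in deaths if d <= scale)


# Hypothetical clusters living in different subspaces (2-D, 3-D, 2-D).
a = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
b = [(0, 0, 0), (0.2, 0, 0), (9, 9, 9), (9, 9.2, 9)]
c = [(0, 0), (0.1, 0.1), (0.2, 0)]

# a and b both consist of two well-separated groups; c is a single group.
similar_ab = betti0(a, 1.0) == betti0(b, 1.0)
similar_ac = betti0(a, 1.0) == betti0(c, 1.0)
```

Under these assumptions, `a` and `b` would be candidates for merging because they share the same component count at the chosen scale, while `c` would not. A fuller treatment would compare whole persistence diagrams, including higher-dimensional features, and would update them incrementally as the stream evolves.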

FAPESP grant: 14/13323-5 - An approach based on the stability of data clustering algorithms to ensure concept drift detection in data streams
Grantee: Rodrigo Fernandes de Mello
Support type: Regular Research Grants
FAPESP grant: 13/04453-0 - Clustering of high-dimensional continuous data streams
Grantee: Cássio Martini Martins Pereira
Support type: Scholarships in Brazil - Post-Doctoral