
Clustering Data Streams with Automatic Estimation of Number of Clusters

Grant number: 10/15049-7
Support Opportunities: Scholarships in Brazil - Doctorate
Effective date (Start): January 01, 2011
Effective date (End): January 31, 2014
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Eduardo Raul Hruschka
Grantee: Jonathan de Andrade Silva
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated scholarship(s): 12/10396-6 - Evolutionary algorithms for data stream clustering, BE.EP.DR


Data clustering techniques are commonly used to find clusters in data, and the number of clusters is usually unknown a priori. Such techniques typically assume that the data set has a fixed size and can be stored entirely in main memory. However, a current and important challenge is applying these techniques to sources in which data flows continuously in dynamic environments. These data sources are known as data streams. In many data stream applications, data access is restricted to one pass (or a small number of passes) over the data, under time and memory constraints. To address this, several data stream clustering algorithms have been proposed in the literature, many of them based on the k-means algorithm. However, k-means suffers from major drawbacks, particularly its susceptibility to local minima and the need to specify the number of clusters in advance. In this context, the main goal of this project is the development and evaluation of data stream clustering algorithms that automatically estimate the number of clusters from the data.
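
As a rough illustration of the single-pass constraint and of the fixed-k limitation mentioned in the abstract, the sketch below shows a sequential (MacQueen-style) k-means update in Python. It is not one of the algorithms developed in this project. The function name `sequential_kmeans` and the choice to seed the centroids with the first k points are assumptions made only for this example.

```python
from typing import Iterable, List, Sequence


def sequential_kmeans(stream: Iterable[Sequence[float]], k: int) -> List[List[float]]:
    """One-pass k-means sketch: each point is read once and then discarded."""
    centroids: List[List[float]] = []
    counts: List[int] = []
    for point in stream:
        if len(centroids) < k:
            # Assumption for this sketch: the first k points seed the centroids.
            centroids.append(list(point))
            counts.append(1)
            continue
        # Assign the point to the nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((c - x) ** 2 for c, x in zip(centroids[j], point)),
        )
        # Move that centroid toward the point so it stays the running mean
        # of all points assigned to it so far.
        counts[nearest] += 1
        rate = 1.0 / counts[nearest]
        centroids[nearest] = [
            c + rate * (x - c) for c, x in zip(centroids[nearest], point)
        ]
    return centroids


# The stream is an iterator, so the data can only be traversed once.
points = iter([(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)])
result = sequential_kmeans(points, k=2)
```

Note that `k` must be supplied by the caller and the result depends on the order in which points arrive, which are the kinds of drawbacks the project aims to address.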


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
SILVA, JONATHAN DE ANDRADE; HRUSCHKA, EDUARDO RAUL. A Support System for Clustering Data Streams with a Variable Number of Clusters. ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, v. 11, n. 2, . (10/15049-7)
SILVA, JONATHAN DE ANDRADE; HRUSCHKA, EDUARDO RAUL; GAMA, JOAO. An evolutionary algorithm for clustering data streams with a variable number of clusters. EXPERT SYSTEMS WITH APPLICATIONS, v. 67, p. 228-238, . (10/15049-7)
Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
SILVA, Jonathan de Andrade. Clustering data streams with automatic estimation of the number of cluster. 2015. Doctoral Thesis - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB) São Carlos.
