Advanced search
Start date
Betweenand


Evolutionary algorithms for gausian mixture models with and without constraints

Full text
Author(s):
Thiago Ferreira Covões
Total Authors: 1
Document type: Doctoral Thesis
Press: São Carlos.
Institution: Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB)
Defense date:
Examining board members:
Eduardo Raul Hruschka; Nelson Francisco Favilla Ebecken; Ana Carolina Lorena; Solange Oliveira Rezende; Fernando José von Zuben
Advisor: Eduardo Raul Hruschka
Abstract

In the last decade, researchers have been giving considerable attention to the field of Constrained Clustering. Algorithms in this field assume that along with the objects to be clustered, the user also provides some constraints about which kind of clustering (s)he prefers. In this thesis, two scenarios are studied: clustering with and without constraints. The developments are based on finite mixture models, namely, models with Gaussian components, which are usually called Gaussian Mixture Models (GMMs). In this context the main problems addressed are: (i) parameter estimation of GMMs; (ii) efficiently integrating constraints in the learning process allowing both constraints and the data to be added in the modeling in an online fashion; (iii) estimating, by using constraints derived from pre-determined concepts (usually named classes), the number of clusters per concept. Evolutionary algorithms were adopted to develop solutions for such problems. These algorithms analyze more than one solution simultaneously and use information provided by previous solutions to guide the search process. Specifically, an evolutionary algorithm based on procedures that perform splitting and merging of components to estimate the parameters of a GMM was developed. This algorithm was compared to an algorithm considered as the state-of-the-art in the literature, obtaining competitive results while requiring less parameters and being more computationally efficient. Besides the aforementioned contributions, two algorithms for online constrained clustering were developed. Both algorithms are based on well known algorithms from the literature and get better results than their predecessors. Finally, two algorithms to estimate the number of clusters per class were also developed. Both algorithms were compared to well established algorithms from the literature of constrained clustering, and obtained equal or better results than the ones obtained by the contenders. The successful estimation of the number of clusters per class is helpful to a variety of data mining tasks, such as data summarization and problem decomposition of challenging classification problems. (AU)

FAPESP's process: 09/17795-0 - Evolutionary Algorithms for Constrained Clustering
Grantee:Thiago Ferreira Covões
Support Opportunities: Scholarships in Brazil - Doctorate