Scholarship 23/14070-2 - Machine learning, Algorithms - BV FAPESP

Gaussian Kernel fuzzy c-means algorithms with automatic learning of bandwidth parameters

Grant number: 23/14070-2
Support Opportunities: Scholarships in Brazil - Program to Stimulate Scientific Vocations
Start date: January 05, 2024
End date: February 24, 2024
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Francisco de Assis Tenorio de Carvalho
Grantee: Débora van Putten Chaves
Host Institution: Centro de Informática (CIn). Universidade Federal de Pernambuco (UFPE). Ministério da Educação (Brasil). Recife, PE, Brazil

Abstract

Conventional Gaussian kernel fuzzy c-means clustering algorithms require selecting the width hyper-parameter. This hyper-parameter is tuned once and for all, and it is the same for all variables. Thus, the conventional Gaussian kernel fuzzy c-means implicitly assumes that the variables are equally rescaled and therefore equally important to the clustering task. Moreover, the performance of a Gaussian kernel-based clustering algorithm depends on the choice of the width hyper-parameter, which therefore needs to be optimized. Traditionally, empirical and cross-validation approaches have been used for that optimization, and few approaches have been proposed to learn the width hyper-parameter automatically. In this research we aim to study and compare two existing approaches that learn the width hyper-parameter automatically. The first is a kernel-based fuzzy clustering algorithm whose width parameter is updated by the gradient method in each iteration of the algorithm, but is shared by all variables. The second is a fuzzy c-means clustering algorithm that learns the width parameters using an adaptive Gaussian kernel: each variable has its own width parameter, which is also updated in each iteration of the algorithm. The activities to be developed by the student will contribute to the evaluation of the two approaches through descriptive statistics (mean, standard deviation) and hypothesis tests, based on indices that measure the quality of the fuzzy partition produced by these algorithms. This will help produce evidence about whether the observed difference between the two variants, in terms of the quality of the partition produced, is due to chance or whether, on average, one of them is superior to the other. (AU)
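
The contrast between a single shared width and one width per variable can be illustrated with a minimal sketch. It is not the implementation of either algorithm studied in the project: the function names (`gaussian_kernel`, `memberships`), the example data, and the kernel-induced distance 2(1 - K) are assumptions chosen for illustration, and the rules that update the widths in each iteration are deliberately left out.

```python
import math

def gaussian_kernel(x, v, widths):
    """Gaussian kernel with one width per variable (illustrative form).

    K(x, v) = exp(-0.5 * sum_j (x_j - v_j)^2 / widths_j^2)
    Passing the same value for every variable gives the shared-width case.
    """
    s = sum((xj - vj) ** 2 / (w ** 2) for xj, vj, w in zip(x, v, widths))
    return math.exp(-0.5 * s)

def memberships(x, prototypes, widths, m=2.0, eps=1e-12):
    """Fuzzy membership degrees of x in each cluster.

    Assumes the kernel-induced squared distance 2 * (1 - K(x, v)) and the
    usual fuzzy c-means membership formula with fuzzifier m > 1.
    """
    d2 = [max(2.0 * (1.0 - gaussian_kernel(x, v, widths)), eps)
          for v in prototypes]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [i / total for i in inv]

# Hypothetical data: two variables on very different scales.
x = [0.0, 100.0]
prototypes = [[0.0, 0.0], [5.0, 100.0]]

# Shared width: the large-scale second variable dominates the distance.
u_shared = memberships(x, prototypes, widths=[10.0, 10.0])

# Per-variable widths: each variable is rescaled by its own width.
u_adaptive = memberships(x, prototypes, widths=[1.0, 100.0])

print("shared width:       ", u_shared)
print("per-variable widths:", u_adaptive)
```

In this toy example the point is assigned mainly to the second prototype under the shared width and mainly to the first under the per-variable widths, which shows why the choice of widths acts as an implicit weighting of the variables.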
