Graph-based clustering with minimum spanning trees and Jensen-Shannon distance

Grant number: 24/05031-6
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: August 01, 2024
End date: July 31, 2025
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Alexandre Luís Magalhães Levada
Grantee: Matheus dos Santos Sousa
Host Institution: Centro de Ciências Exatas e de Tecnologia (CCET). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil

Abstract

Data clustering is a fundamental task in many areas of computer science and engineering, with applications in pattern recognition, data mining, and image processing, among others. In recent years, graph-based approaches have gained prominence because they can model complex relationships between data points. In addition, distance measures from information theory have been widely used to quantify the divergence between probability distributions, especially in data analysis problems. This project aims to develop a new data clustering method that combines concepts from graph theory with the Jensen-Shannon distance. The main objective is to explore the effectiveness of minimum spanning trees (MSTs) as a framework for representing relationships between data points, using the Jensen-Shannon distance to measure the similarity between candidate clusters. The proposed method is expected to identify meaningful clusters in complex data sets in a computationally efficient manner and to overcome limitations of classical clustering algorithms. We also intend to demonstrate the advantage of combining graphs with the Jensen-Shannon distance in terms of the quality of the detected clusters. Ultimately, the project may contribute to the field of data clustering by introducing an approach that combines established graph theory techniques with probability-based distance measures. The results may have practical applications in several areas, including complex network analysis, bioinformatics, and natural language processing.
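
The abstract does not specify the algorithm, so the following is only a minimal sketch of one way the two ingredients could fit together, not the project's method. It assumes Euclidean distances between distinct points for the MST, assumes clusters are obtained by cutting the heaviest MST edges, and assumes each cluster is summarized by a histogram over shared bins before the Jensen-Shannon distance is computed. The function names `mst_clusters` and `cluster_js_distance` and the parameter `bins` are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import jensenshannon, pdist, squareform


def mst_clusters(points, n_clusters):
    """Label points by cutting the heaviest MST edges (assumed strategy)."""
    dist = squareform(pdist(points))             # dense Euclidean distances
    mst = minimum_spanning_tree(dist).toarray()  # n - 1 nonzero edge weights
    if n_clusters > 1:
        rows, cols = np.nonzero(mst)
        # Indices of the (n_clusters - 1) heaviest edges in the tree.
        heaviest = np.argsort(mst[rows, cols])[-(n_clusters - 1):]
        mst[rows[heaviest], cols[heaviest]] = 0.0
    # Each remaining connected component of the cut tree is one cluster.
    _, labels = connected_components(mst, directed=False)
    return labels


def cluster_js_distance(points, labels, a, b, bins=8):
    """Jensen-Shannon distance between histograms of clusters a and b."""
    # Shared bin ranges so both histograms live on the same support.
    ranges = list(zip(points.min(axis=0), points.max(axis=0)))
    hist_a, _ = np.histogramdd(points[labels == a], bins=bins, range=ranges)
    hist_b, _ = np.histogramdd(points[labels == b], bins=bins, range=ranges)
    # jensenshannon normalizes its inputs; base=2 bounds the result by 1.
    return jensenshannon(hist_a.ravel(), hist_b.ravel(), base=2)


# Synthetic data for illustration only: two well-separated 2-D groups.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(30, 2)),
    rng.normal(5.0, 0.3, size=(30, 2)),
])
labels = mst_clusters(points, n_clusters=2)
distance = cluster_js_distance(points, labels, 0, 1)
```

In this sketch the Jensen-Shannon distance serves as a check on the cut: a value near 1 suggests the two groups occupy different regions of the histogram grid, while a value near 0 suggests they could be merged. A histogram-based estimate is a simplification that scales poorly with dimension, so a real method would likely need a different density estimate.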
