Advanced search
Start date
Betweenand

Scalable distributed algorithms for approximating the kNNG

Grant number: 22/04934-7
Support Opportunities:Scholarships in Brazil - Scientific Initiation
Start date: August 01, 2022
End date: November 30, 2023
Field of knowledge:Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator:Murilo Coelho Naldi
Grantee:Gabriel Meirelles Carvalho Orlando
Host Institution: Centro de Ciências Exatas e de Tecnologia (CCET). Universidade Federal de São Carlos (UFSCAR). São Carlos , SP, Brazil
Associated scholarship(s):23/00993-1 - Investigation of fast parallel KNNG for multiple density estimation, BE.EP.IC

Abstract

Finding the K Nearest Neighbors Graph (kNNG) of a dataset is an essential estimation used in several algorithms, such as clustering and anomaly detection. However, this operation is costly since when calculating all k neighbors of all points in the set, there is a quadratic complexity concerning the size of the data set. Thus, there is a physical limitation in the execution of these algorithms since if large amounts of data are used, the algorithm tends to become slower, given the complexity of memory and the complexity of time. Due to this limitation, the existence of kNNG algorithms that build the approximate graph but, at the same time, maintain quality and are scalable is fundamental. Therefore, an idea that allows the fulfillment of these requirements is the distribution and parallelization of the algorithms that find the approximate kNNG. Thus, to find out which algorithms that build the approximate kNNG have the best results in terms of computational cost and quality of the result and can be parallel and distributed, it is necessary to carry out a comparative study between these methods. This study encompasses the theoretical comparison, based on the computational complexity of the algorithms, and the practical comparison, based on the algorithm's precision, accuracy, and the study of the advantages and disadvantages of each algorithm that builds the approximate kNNG. The result of this study will optimize clustering and outlier detection algorithms, such as HDBSCAN, which uses the process of building the kNNG in its execution and thus can be improved if integrated with these algorithms.(AU)

News published in Agência FAPESP Newsletter about the scholarship:
More itemsLess items
Articles published in other media outlets ( ):
More itemsLess items
VEICULO: TITULO (DATA)
VEICULO: TITULO (DATA)