(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Revisiting agglomerative clustering

Author(s):
Tokuda, Eric K. [1] ; Comin, Cesar H. [2] ; Costa, Luciano da F. [1]
Total Authors: 3
Affiliation:
[1] Univ Sao Paulo, Inst Phys, Av Trabalhador Sao Carlense 400, Sao Paulo, SP - Brazil
[2] Univ Fed Sao Carlos, Comp Sci Dept, Rod Washington Luis, Km 235, Sao Carlos, SP - Brazil
Total Affiliations: 2
Document type: Journal article
Source: PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS; v. 585, JAN 1 2022.
Web of Science Citations: 0
Abstract

Hierarchical agglomerative methods stand out as particularly effective and popular approaches for clustering data. Yet, these methods have not been systematically compared regarding the important issue of false positives while searching for clusters. A model of clusters involving a higher density nucleus surrounded by a transition, followed by outliers, is adopted as a means to quantify the relevance of the obtained clusters and address the problem of false positives. Six traditional methodologies, namely the single, average, median, complete, centroid and Ward's linkage criteria, are compared with respect to the adopted model. Unimodal and bimodal datasets obeying uniform, Gaussian, exponential and power-law distributions are considered for this comparison. The obtained results include the verification that many methods detect two clusters in unimodal data. The single-linkage method was found to be more resilient to false positives. Also, several methods detected clusters not corresponding directly to the nucleus. (C) 2021 Elsevier B.V. All rights reserved. (AU)
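
To make the comparison setup concrete, the following is a minimal sketch, not the authors' code, of naive agglomerative clustering applied to a unimodal Gaussian sample. It covers only three of the six criteria named in the abstract (single, complete and average linkage); the median, centroid and Ward criteria are omitted. The function names, the sample size, the random seed and the choice to cut the hierarchy at two clusters are all assumptions made for illustration. The sketch does not implement the nucleus, transition and outlier model that the paper uses to quantify cluster relevance, so it shows only how each criterion partitions the data, not whether the partition counts as a false positive.

```python
import math
import random


def cluster_distance(a, b, points, method):
    """Distance between clusters a and b (lists of point indices)."""
    dists = [math.dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(dists)
    if method == "complete":    # farthest pair of members
        return max(dists)
    if method == "average":     # mean over all member pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown method: {method}")


def agglomerate(points, method, n_clusters=2):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q], points, method)
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters


# Hypothetical unimodal dataset: 40 points from one 2D Gaussian.
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]

sizes = {}
for method in ("single", "complete", "average"):
    clusters = agglomerate(points, method)
    sizes[method] = sorted(len(c) for c in clusters)
    print(method, sizes[method])
```

The printed cluster sizes give a rough view of how each criterion splits data that contain a single mode when the hierarchy is cut at two clusters. The naive search over all cluster pairs is adequate only for small samples.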

FAPESP's process: 19/01077-3 - Integrating computer vision and complex networks for urban analysis
Grantee: Eric Keiji Tokuda
Support Opportunities: Scholarships in Brazil - Post-Doctoral
FAPESP's process: 15/22308-2 - Intermediate representations in Computational Science for knowledge discovery
Grantee: Roberto Marcondes Cesar Junior
Support Opportunities: Research Projects - Thematic Grants
FAPESP's process: 18/09125-4 - Representation, characterization and modeling of biological images using complex networks
Grantee: Cesar Henrique Comin
Support Opportunities: Regular Research Grants