

Hierarchical clustering: Visualization, feature importance and model selection

Author(s):
Cabezas, Luben M. C. ; Izbicki, Rafael ; Stern, Rafael B.
Total number of authors: 3
Document type: Scientific article
Source: APPLIED SOFT COMPUTING; v. 141, 12 pp., 2023-07-01.
Abstract

We propose methods for the analysis of hierarchical clustering that fully use the multi-resolution structure provided by a dendrogram. Specifically, we propose a loss for choosing between clustering methods, a feature importance score and a graphical tool for visualizing the segmentation of features in a dendrogram. Current approaches to these tasks lead to loss of information since they require the user to generate a single partition of the instances by cutting the dendrogram at a specified level. Our proposed methods, instead, use the full structure of the dendrogram. The key insight behind the proposed methods is to view a dendrogram as a phylogeny. This analogy permits the assignment of a feature value to each internal node of a tree through an evolutionary model. Real and simulated datasets provide evidence that our proposed framework has desirable outcomes and gives more insights than state-of-the-art approaches. We provide an R package that implements our methods. © 2023 Elsevier B.V. All rights reserved. (AU)

FAPESP grant: 20/10861-7 - A data-splitting-based approach for comparing hierarchical clustering algorithms
Grantee: Luben Miguel Cruz Cabezas
Support type: Scholarships in Brazil - Scientific Initiation
FAPESP grant: 13/07699-0 - Research, Innovation and Dissemination Center for Neuromathematics - NeuroMat
Grantee: Oswaldo Baffa Filho
Support type: Research Grants - Research, Innovation and Dissemination Centers - RIDCs
FAPESP grant: 19/11321-9 - Neural networks in statistical inference problems
Grantee: Rafael Izbicki
Support type: Research Grants - Regular