Selection of the number of clusters in functional data analysis

Author(s):
Zambom, Adriano Zanin ; Alfonso Collazos, Julian ; Dias, Ronaldo
Total number of authors: 3
Document type: Scientific article
Source: JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION; v. 92, n. 14, p. 19-pg., 2022-03-23.
Abstract

Identifying the number K of clusters in a dataset is one of the most difficult problems in clustering analysis. A choice of K that correctly characterizes the features of the data is essential for building meaningful clusters. In this paper we tackle the problem of estimating the number of clusters in functional data analysis by introducing a new measure that can be used with different procedures in selecting the optimal K. The main idea is to use a combination of two test statistics, which measure the lack of parallelism and the mean distance between curves, to compute criteria such as the within and between cluster sum of squares. Simulations in challenging scenarios suggest that procedures using this measure can detect the correct number of clusters more frequently than existing methods in the literature. The application of the proposed method is illustrated on several real datasets. (AU)
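
The abstract does not give the two test statistics, so the following is only a minimal sketch of the general idea and not the authors' measure. It assumes curves observed on a common grid and uses two stand-in quantities: the squared mean of the pointwise difference (mean distance between curves) and the variance of that difference (lack of parallelism, which is zero when two curves differ by a constant). The names `components`, `pair_dissimilarity`, `within_cluster_ss` and the weight `w` are hypothetical.

```python
import numpy as np

def components(x, y):
    """Return (level, shape) for two curves sampled on the same grid.

    level: squared mean pointwise difference (stand-in for mean distance).
    shape: variance of the pointwise difference (stand-in for lack of
           parallelism; zero when x - y is constant).
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return d.mean() ** 2, d.var()

def pair_dissimilarity(x, y, w=0.5):
    """Hypothetical combination of the two components; w is an assumed weight."""
    level, shape = components(x, y)
    return w * shape + (1.0 - w) * level

def within_cluster_ss(curves, labels, w=0.5):
    """Sum of dissimilarities between each curve and its cluster mean curve."""
    curves = np.asarray(curves, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        members = curves[labels == k]
        center = members.mean(axis=0)
        total += sum(pair_dissimilarity(c, center, w) for c in members)
    return total

# Toy data: two groups of three curves, each group a vertically shifted base shape.
t = np.linspace(0.0, 1.0, 50)
group_a = [np.sin(2 * np.pi * t) + 0.1 * j for j in range(3)]
group_b = [5.0 * t + 0.1 * j for j in range(3)]
curves = np.vstack(group_a + group_b)

wss_one_cluster = within_cluster_ss(curves, [0, 0, 0, 0, 0, 0])
wss_two_clusters = within_cluster_ss(curves, [0, 0, 0, 1, 1, 1])
print(wss_one_cluster, wss_two_clusters)
```

A criterion of this kind would be evaluated over a range of candidate K values, with the clustering itself produced by whatever procedure is in use; the toy labels above are fixed by hand only to show how the within-cluster quantity drops when the grouping matches the structure of the curves.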

FAPESP grant: 19/04535-2 - Variable-length Markov chains: incorporating exogenous variables
Grantee: Nancy Lopes Garcia
Support type: Research Grants - Visiting Researcher Grant - International
FAPESP grant: 17/15306-9 - Incorporating functional covariates in nonparametric regression models
Grantee: Nancy Lopes Garcia
Support type: Regular Research Grants
FAPESP grant: 18/04654-9 - Time series, wavelets and high-dimensional data
Grantee: Pedro Alberto Morettin
Support type: Research Projects - Thematic Grants