(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number that the authors included in the publication.)

Internal Evaluation of Unsupervised Outlier Detection

Author(s):
Marques, Henrique O. [1] ; Campello, Ricardo J. G. B. [2] ; Sander, Jorg [3] ; Zimek, Arthur [4]
Total number of authors: 4
Author affiliation(s):
[1] Univ Sao Paulo, Inst Math & Comp Sci ICMC, BR-13566590 Sao Carlos, SP - Brazil
[2] Univ Newcastle, Sch Math & Phys Sci MAPS, Univ Dr, Callaghan, NSW 2308 - Australia
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8 - Canada
[4] Univ Southern Denmark, Dept Math & Comp Sci IMADA, Campusvej 55, DK-5230 Odense - Denmark
Total number of affiliations: 4
Document type: Journal article
Source: ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA; v. 14, n. 4, JUL 2020.
Web of Science citations: 0
Abstract

Although there is a large and growing literature that tackles the unsupervised outlier detection problem, the unsupervised evaluation of outlier detection results is still virtually untouched in the literature. The so-called internal evaluation, based solely on the data and the assessed solutions themselves, is required if one wants to statistically validate (in absolute terms) or just compare (in relative terms) the solutions provided by different algorithms or by different parameterizations of a given algorithm in the absence of labeled data. However, in contrast to unsupervised cluster analysis, where indexes for internal evaluation and validation of clustering solutions have been conceived and shown to be very useful, in the outlier detection domain, this problem has been notably overlooked. Here we discuss this problem and provide a solution for the internal evaluation of outlier detection results. Specifically, we describe an index called Internal, Relative Evaluation of Outlier Solutions (IREOS) that can evaluate and compare different candidate outlier detection solutions. Initially, the index is designed to evaluate binary solutions only, referred to as top-n outlier detection results. We then extend IREOS to the general case of non-binary solutions, consisting of outlier detection scorings. We also statistically adjust IREOS for chance and extensively evaluate it in several experiments involving different collections of synthetic and real datasets. (AU)
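
The abstract distinguishes binary top-n solutions from non-binary outlier scorings and mentions adjusting the index for chance. The Python sketch below illustrates only those two generic ideas. It is not the IREOS index, whose definition is not given in this record. The function names `top_n_labels` and `adjust_for_chance` are hypothetical, and the adjustment shown is the common generic form (observed minus expected, divided by maximum minus expected), assumed here for illustration rather than taken from the paper.

```python
from typing import List, Sequence


def top_n_labels(scores: Sequence[float], n: int) -> List[int]:
    """Turn an outlier scoring into a binary top-n solution (1 = outlier).

    Hypothetical helper: the n objects with the highest scores are flagged.
    """
    if not 0 <= n <= len(scores):
        raise ValueError("n must be between 0 and len(scores)")
    # Rank indices by decreasing score; sorted() is stable, so ties keep
    # their original order.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    flagged = set(ranked[:n])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def adjust_for_chance(index: float, expected: float, maximum: float = 1.0) -> float:
    """Generic chance adjustment (an assumption, not the paper's formula).

    Returns 0 when the index equals its expected value under chance
    and 1 when it reaches its maximum.
    """
    if maximum == expected:
        raise ValueError("maximum and expected value must differ")
    return (index - expected) / (maximum - expected)


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.3, 0.8]
    print(top_n_labels(scores, n=2))
    print(adjust_for_chance(0.75, expected=0.5))
```

In this sketch, a scoring produced by any detector can be reduced to a binary solution for a chosen n, and any raw index value can be rescaled so that values near 0 indicate a result no better than chance.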

FAPESP grant: 17/04161-0 - Evaluation, model selection, and unsupervised outlier detection in data subspaces
Grantee: Henrique Oliveira Marques
Support type: Scholarships abroad - Research Internship - Doctorate
FAPESP grant: 15/06019-0 - Evaluation, model selection, and unsupervised outlier detection in data spaces and subspaces
Grantee: Henrique Oliveira Marques
Support type: Scholarships in Brazil - Doctorate