(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

A tourist walk approach for internal and external outlier detection

Author(s):
Rodrigues, Rafael D. [1] ; Zhao, Liang [1] ; Zheng, Qiusheng [2] ; Zhang, Junbao [3]
Total number of authors: 4
Author affiliation(s):
[1] Univ Sao Paulo, Fac Philosophy Sci & Letters Ribeirao Preto FFCLR, Ribeirao Preto, SP - Brazil
[2] Zhongyuan Univ Technol, Henan Key Lab Publ Opin Intelligent Anal, Zhengzhou - Peoples R China
[3] Zhongyuan Univ Technol, Sch Comp Sci, Zhengzhou - Peoples R China
Total number of affiliations: 3
Document type: Journal article
Source: Neurocomputing; v. 393, p. 203-213, JUN 14 2020.
Web of Science citations: 0
Abstract

Outlier detection is a fundamental task for knowledge discovery in data mining, especially in the Big Data era. It aims to detect data items that deviate from the general pattern of a given data set. In this paper, we present a new outlier detection technique that uses tourist walks starting from each data sample with varying memory sizes. Specifically, a data sample receives a high outlier score if it participates in few tourist walk attractors and a low score if it participates in many. Experimental results on artificial and real data sets show good performance of the proposed method. Compared with classical outlier detection methods, the proposed method has the following salient features: (1) It finds outliers by identifying the structure of the input data set instead of considering only physical features such as distance, similarity, or density. (2) It can detect not only external outliers, as classical methods do, but also internal outliers lying among various normal data groups. (3) By varying the memory size, the tourist walks can characterize both local and global structures of the data set. (4) A parallel implementation is straightforward because the algorithm consists of a large number of independent walks. (5) The proposed method is deterministic, so a single run is sufficient, in contrast to stochastic techniques, which require many runs. Moreover, in this work we find, for the first time, that tourist walks can generate complex attractors in various crossing shapes. Such complex attractors reveal data structures in more detail and can consequently improve outlier detection performance. (C) 2019 Elsevier B.V. All rights reserved. (AU)
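
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `attractor` and `outlier_scores`, the choice of memory sizes, the Euclidean distance, and the scoring formula `1 / (1 + count)` are all assumptions made for illustration. The sketch assumes the usual deterministic tourist walk rule, in which the walker moves to the nearest point not among the last `mu` points it visited (the current point included), and it counts how many walks end in an attractor containing each sample.

```python
import numpy as np

def attractor(dist, start, mu):
    """Run one deterministic tourist walk and return the points of its attractor.

    dist  : (n, n) pairwise distance matrix
    start : index of the starting sample
    mu    : memory size (assumed convention: the last `mu` visited points,
            including the current one, may not be revisited)
    """
    if len(dist) <= mu:
        raise ValueError("need more samples than the memory size")
    path = [start]
    seen = {}  # walker state (last mu points) -> step at which it first occurred
    while True:
        state = tuple(path[-mu:])
        if state in seen:
            # The state repeats, so everything visited since its first
            # occurrence forms the cycle (attractor).
            return frozenset(path[seen[state] + 1:])
        seen[state] = len(path) - 1
        # Move to the nearest point that is not in the memory window.
        for j in np.argsort(dist[path[-1]], kind="stable"):
            if int(j) not in state:
                path.append(int(j))
                break

def outlier_scores(points, memories=(1, 2, 3)):
    """Hypothetical score: samples found in few attractors score close to 1."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n <= max(memories):
        raise ValueError("need more samples than the largest memory size")
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    counts = np.zeros(n)
    for mu in memories:
        for start in range(n):  # every walk is independent of the others
            for i in attractor(dist, start, mu):
                counts[i] += 1
    return 1.0 / (1.0 + counts)  # illustrative formula, not taken from the paper
```

Calling `outlier_scores(points)` on an array of shape `(n, d)` returns one score per sample. Because each walk depends only on the distance matrix, the loop over `start` and `mu` could be distributed across workers, which is the property that feature (4) refers to. The walks involve no randomness, so repeated runs on the same data give the same scores.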

FAPESP grant: 15/50122-0 - Dynamic phenomena in complex networks: fundamentals and applications
Grantee: Elbert Einstein Nehrer Macau
Support type: Research Grant - Thematic