| Full text | |
| Author(s): |
Total number of authors: 3
|
| Author affiliation(s): | [1] Univ Sao Paulo, Sao Paulo - Brazil
[2] Univ Fed Sao Carlos, Sao Carlos - Brazil
[3] Budapest Univ Technol & Econ, Budapest - Hungary
Total number of affiliations: 3
|
| Document type: | Scientific Article |
| Source: | INFORMATION SCIENCES; v. 557, p. 407-420, MAY 2021. |
| Web of Science citations: | 0 |
| Abstract | |
An important question in many machine learning applications is whether two samples arise from the same generating distribution. Although an old topic in Statistics, simple accept/reject decisions given by most hypothesis tests are often not enough: it is well known that the rejection of the null hypothesis does not imply that differences between the two groups are meaningful from a practical perspective. In this work, we present a novel nonparametric approach to visually assess the dissimilarity between the datasets that goes beyond two-sample testing. The key idea of our approach is to measure the distance between two (possibly) high-dimensional datasets using variational autoencoders. We also show how this framework can be used to create a formal statistical test to test the hypothesis that both samples arise from the same distribution. We evaluate both the distance measurement and hypothesis testing approaches on simulated and real-world datasets. The results show that our approach is useful for data exploration (as it, for instance, allows for quantification of the discrepancy/separability between categories of images), which can be particularly helpful in early phases of a machine learning pipeline. (C) 2020 The Author(s). Published by Elsevier Inc. (AU) | |
| FAPESP grant: | 19/11321-9 - Neural networks in statistical inference problems |
| Grantee: | Rafael Izbicki |
| Support type: | Research Grant - Regular |
| FAPESP grant: | 17/03363-8 - Interpretability and efficiency in hypothesis tests |
| Grantee: | Rafael Izbicki |
| Support type: | Research Grant - Regular |