| Full text | |
| Author(s): | Dalmasso, N. [1]; Pospisil, T. [1, 2]; Lee, A. B. [1]; Izbicki, R. [3]; Freeman, P. E. [1]; Malz, A. I. [4]
Total number of authors: 6
|
| Author affiliation(s): | [1] Carnegie Mellon Univ, Dept Stat & Data Sci, 5000 Forbes Ave, Pittsburgh, PA 15213 - USA
[2] Google LLC, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 - USA
[3] Univ Fed Sao Carlos, Dept Stat, Sao Paulo - Brazil
[4] NYU, Ctr Cosmol & Particle Phys, New York, NY 10003 - USA
Total number of affiliations: 4
|
| Document type: | Scientific article |
| Source: | ASTRONOMY AND COMPUTING; v. 30, JAN 2020. |
| Web of Science citations: | 0 |
| Abstract | |
It is well known in astronomy that propagating non-Gaussian prediction uncertainty in photometric redshift estimates is key to reducing bias in downstream cosmological analyses. Similarly, likelihood-free inference approaches, which are beginning to emerge as a tool for cosmological analysis, require a characterization of the full uncertainty landscape of the parameters of interest given observed data. However, most machine learning (ML) or training-based methods with open-source software target point prediction or classification, and hence fall short in quantifying uncertainty in complex regression and parameter inference settings such as the applications mentioned above. As an alternative to methods that focus on predicting the response (or parameters) y from features x, we provide nonparametric conditional density estimation (CDE) tools for approximating and validating the entire probability density function (PDF) $p(y \mid x)$ of y given (i.e., conditional on) x. This density approach offers a more nuanced accounting of uncertainty in situations with, e.g., nonstandard error distributions and multimodal or heteroskedastic response variables that are often present in astronomical data sets. As there is no one-size-fits-all CDE method, and the ultimate choice of model depends on the application and the training sample size, the goal of this work is to provide a comprehensive range of statistical tools and open-source software for nonparametric CDE and method assessment which can accommodate different types of settings - involving, e.g., mixed-type input from multiple sources, functional data, and images - and which in addition can easily be fit to the problem at hand. Specifically, we introduce four CDE software packages in Python and R based on ML prediction methods adapted and optimized for CDE: NNKCDE, RFCDE, FlexCode, and DeepCDE. Furthermore, we present the cdetools package with evaluation metrics. This package includes functions for computing a CDE loss function for tuning and assessing the quality of individual PDFs, together with diagnostic functions that probe the population-level performance of the PDFs. We provide sample code in Python and R as well as examples of applications to photometric redshift estimation and likelihood-free cosmological inference via CDE. (C) 2020 Elsevier B.V. All rights reserved. (AU) | |
| FAPESP grant: | 19/11321-9 - Neural networks in statistical inference problems |
| Grantee: | Rafael Izbicki |
| Support type: | Regular Research Grant |
| FAPESP grant: | 17/03363-8 - Interpretability and efficiency in hypothesis tests |
| Grantee: | Rafael Izbicki |
| Support type: | Regular Research Grant |
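
The abstract mentions a CDE loss function for tuning and assessing estimated PDFs but does not state it. One form commonly used in the CDE literature is the integrated squared error between the estimated and true conditional densities, which can be estimated up to an additive constant from held-out pairs $(x_i, y_i)$ as $\frac{1}{n}\sum_i \int \hat{p}(y \mid x_i)^2\,dy - \frac{2}{n}\sum_i \hat{p}(y_i \mid x_i)$. The sketch below assumes that form and assumes each estimate is tabulated on a shared grid of y values. The names `cde_loss_sketch` and `normal_pdf` are hypothetical and are not the cdetools API.

```python
import numpy as np


def cde_loss_sketch(cde_estimates, y_grid, y_true):
    """Estimate the CDE loss up to an additive constant (illustrative only).

    cde_estimates: shape (n_obs, n_grid), estimated density of y on y_grid
                   for each held-out observation x_i
    y_grid:        shape (n_grid,), increasing grid of y values
    y_true:        shape (n_obs,), observed responses y_i
    """
    cde_estimates = np.asarray(cde_estimates, dtype=float)
    y_grid = np.asarray(y_grid, dtype=float)
    y_true = np.asarray(y_true, dtype=float)

    # Term 1: average over observations of the integral of the squared
    # density, computed with the trapezoid rule on the grid.
    squared = cde_estimates ** 2
    widths = np.diff(y_grid)
    integrals = np.sum(0.5 * (squared[:, 1:] + squared[:, :-1]) * widths, axis=1)
    term1 = integrals.mean()

    # Term 2: average estimated density at the observed response,
    # obtained by linear interpolation along each row.
    at_truth = np.array(
        [np.interp(y, y_grid, row) for y, row in zip(y_true, cde_estimates)]
    )
    term2 = at_truth.mean()

    return term1 - 2.0 * term2


def normal_pdf(y, mean, sd):
    """Gaussian density, used only to build the toy example."""
    z = (y - mean) / sd
    return np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))


# Toy data (assumed for illustration): y | x ~ Normal(x, 1).
rng = np.random.default_rng(0)
n_obs = 2000
x = rng.uniform(-2.0, 2.0, size=n_obs)
y = x + rng.normal(size=n_obs)
y_grid = np.linspace(-8.0, 8.0, 801)

# Candidate 1: the true conditional density, centered on each x_i.
cde_true = normal_pdf(y_grid[None, :], x[:, None], 1.0)
# Candidate 2: a density that ignores x entirely.
cde_ignores_x = np.tile(normal_pdf(y_grid, 0.0, 1.0), (n_obs, 1))

loss_true = cde_loss_sketch(cde_true, y_grid, y)
loss_ignores_x = cde_loss_sketch(cde_ignores_x, y_grid, y)
print(loss_true, loss_ignores_x)
```

Lower values indicate a better fit, so under this toy setup the density that conditions on x should score below the one that ignores it. Because the estimate omits a constant that depends only on the true density, values are meaningful for comparing models on the same data rather than as absolute quantities.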