GUBS criterion: Arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on Expected Utility Theory

Crispino, Gabriel Nunes; Freire, Valdinei; Delgado, Karina Valdivia

Texto completo
Autor(es):	Crispino, Gabriel Nunes ; Freire, Valdinei ; Delgado, Karina Valdivia Número total de Autores: 3
Tipo de documento:	Artigo Científico
Fonte:	ARTIFICIAL INTELLIGENCE; v. 316, p. 45-pg., 2023-03-01.
Resumo
Stochastic Shortest Path MDPs (SSP-MDPs) are used to model probabilistic sequential decision problems where the objective is to minimize the expected accumulated cost to goal. However, in the presence of dead-ends, the conventional criterion for SSP-MDPs, which minimizes the expected accumulated cost, can become ill-defined. Lexicographic criteria can solve this by preferring policies that reach the goal with the highest possible probability. Other criteria can instead make a trade-off between some cost measure and probability-to-goal. However, both of these approaches can lead to policies that might not represent the choice of a real decision-maker. In this work, we propose the GUBS criterion to address these problems. GUBS combines goal prioritization over histories with Expected Utility Theory and is the only criterion between all criteria analyzed that not only allows for a trade-off between a large accumulated cost and a small loss in probability-to-goal, but also guarantees arbitrary trade-offs that can be tuned from its parameters without previous knowledge of the problem being solved. We also propose eGUBS, which is a particular case of GUBS when the exponential utility function is used, and two algorithms for optimally solving these problems: eGUBS-VI, a VI-based algorithm; and eGUBS-AO, a heuristic search algorithm. Results indicate that, when there is a good heuristic function available or when the state space is too large, eGUBS-AO can perform better than eGUBS-VI by doing an efficient search. In other cases, eGUBS-VI's simpler approach might have better results. (c) 2022 Elsevier B.V. All rights reserved. (AU)

Processo FAPESP:	18/11236-9 - Processos de decisão Markovianos e risco
Beneficiário:	Karina Valdivia Delgado
Modalidade de apoio:	Auxílio à Pesquisa - Regular


Processo FAPESP:	19/07665-4 - Centro de Inteligência Artificial
Beneficiário:	Fabio Gagliardi Cozman
Modalidade de apoio:	Auxílio à Pesquisa - Programa eScience e Data Science - Centros de Pesquisa em Engenharia

URL curto