Risk-Sensitive Markov Decision Process with Limited Budget

Author(s):
de Melo Moreira, Daniel Augusto; Delgado, Karina Valdivia; de Barros, Leliane Nunes; IEEE
Total number of authors: 4
Document type: Scientific article
Source: 2017 6TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS); v. N/A, p. 6-pg., 2017-01-01.
Abstract

Markov Decision Processes (MDPs) commonly have the objective of finding a policy that minimizes the expected cumulative cost. Although this optimization criterion is useful, some policy executions may result in an excessively high cost, which is unacceptable for some applications (e.g., policies for military operations). A better optimization problem for those applications is to maximize the probability that the cumulative cost stays within a threshold, which defines the Risk-Sensitive MDP (RS-MDP). Within the RS-MDP framework, we propose a new and challenging problem: finding the minimum budget for which the maximum probability of keeping the cumulative cost within budget converges to its maximum. To solve this problem we propose a modified algorithm based on TVI-DP (a previous solution for RS-MDPs) and demonstrate its correctness. We also propose two major efficiency improvements, for memory saving and early termination. Finally, empirical results show that the proposed algorithm can solve large instances of RS-MDPs. (AU)
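To make the RS-MDP criterion in the abstract concrete, the sketch below is a minimal dynamic program over the augmented space of (state, remaining budget), assuming strictly positive integer action costs: it computes the maximum probability of reaching a goal state without exceeding the budget. The function rs_mdp_max_prob and the toy two-state MDP are illustrative assumptions of this note, not the authors' TVI-DP-based algorithm.

def rs_mdp_max_prob(states, actions, cost, transition, goal, budget):
    # P[s][b] = maximum probability of reaching `goal` from state s while
    # spending at most b units of budget. Costs are assumed to be positive
    # integers, so the recursion over b is acyclic and a single sweep in
    # increasing b is exact.
    P = {s: [1.0 if s == goal else 0.0 for _ in range(budget + 1)]
         for s in states}
    for b in range(1, budget + 1):
        for s in states:
            if s == goal:
                continue
            best = 0.0
            for a in actions(s):
                c = cost(s, a)
                if c > b:
                    continue  # action not affordable with remaining budget b
                # expected success probability after paying c and transitioning
                val = sum(p * P[s2][b - c] for s2, p in transition(s, a))
                best = max(best, val)
            P[s][b] = best
    return P

# Toy usage: state s0 with a 'safe' action (cost 2, always reaches the goal)
# and a 'risky' action (cost 1, reaches the goal with probability 0.5).
P = rs_mdp_max_prob(
    states=['s0', 'G'],
    actions=lambda s: ['safe', 'risky'],
    cost=lambda s, a: 2 if a == 'safe' else 1,
    transition=lambda s, a: [('G', 1.0)] if a == 'safe' else [('G', 0.5), ('s0', 0.5)],
    goal='G',
    budget=4,
)
print(P['s0'])  # success probability for budgets 0..4

The toy run prints [0.0, 0.5, 1.0, 1.0, 1.0]; the smallest budget at which this probability stops increasing (here 2) corresponds, in this simplified setting, to the minimum-budget quantity the paper studies, which the authors compute with a modified TVI-DP rather than this naive sweep.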

FAPESP Process: 15/01587-0 - Storage, modeling and analysis of dynamic systems for e-Science applications
Grantee: João Eduardo Ferreira
Support type: Research Grants - eScience and Data Science Program - Thematic