

A Unified Framework for Average Reward Criterion and Risk

Author(s):
Silva Reis, Willy Arthur ; Delgado, Karina Valdivia ; Freire, Valdinei
Total number of authors: 3
Document type: Scientific article
Source: INTELLIGENT SYSTEMS, BRACIS 2024, PT I; v. 15412, p. 15-pg., 2025-01-01.
Abstract

The average reward criterion is used to solve infinite-horizon Markov decision processes (MDPs). This risk-neutral criterion depends on the limiting behavior of the stochastic process and can be evaluated using either (i) the accumulated reward at infinity, which considers state sequences of length h = infinity, or (ii) the steady-state distribution of the MDP (i.e., the long-run probability that the system is in each state), which considers state sequences of length h = 1. In many situations it is desirable to account for risk at each stage of the process, which can be done by combining the average reward criterion with a utility function or a risk measure such as Value at Risk (VaR) or Conditional Value at Risk (CVaR). This work proposes a mathematical framework that gives a unified treatment of the existing literature on average reward and risk, including works that use exponential utility functions and CVaR, and that adds interpretations with 1 <= h <= infinity not present in the literature. These new interpretations make it possible to differentiate policies that existing criteria may fail to distinguish. A numerical example shows how the criteria behave under the new framework. (AU)
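
To make the h = 1 interpretation concrete, the following is a minimal sketch and not the paper's framework or its numerical example. It assumes a fixed policy has already been applied, so the MDP reduces to a Markov chain with a hypothetical transition matrix `P` and per-state rewards. It also assumes the chain is ergodic (irreducible and aperiodic) so that power iteration converges, and it takes CVaR to be the mean of the worst `alpha` fraction of the one-step reward under the steady-state distribution. All names and numbers are illustrative assumptions.

```python
def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the steady-state distribution of an ergodic chain by power iteration."""
    n = len(P)
    mu = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(mu[s] * P[s][t] for s in range(n)) for t in range(n)]
        if max(abs(a - b) for a, b in zip(mu, nxt)) < tol:
            return nxt
        mu = nxt
    return mu


def average_reward(mu, r):
    """Risk-neutral criterion at h = 1: expected reward under the steady state."""
    return sum(m * x for m, x in zip(mu, r))


def cvar_lower_tail(mu, r, alpha):
    """Mean of the worst alpha-fraction of the steady-state reward distribution."""
    remaining = alpha
    total = 0.0
    for reward, prob in sorted(zip(r, mu)):  # lowest rewards first
        take = min(prob, remaining)
        total += take * reward
        remaining -= take
        if remaining <= 0:
            break
    return total / alpha


# Hypothetical two-state chain induced by a fixed policy (assumed values).
P = [[0.9, 0.1],
     [0.5, 0.5]]
mu = stationary_distribution(P)

# Two hypothetical reward profiles with the same long-run average.
r_safe = [1.0, 1.0]
r_risky = [1.2, 0.0]

avg_safe, avg_risky = average_reward(mu, r_safe), average_reward(mu, r_risky)
cvar_safe = cvar_lower_tail(mu, r_safe, alpha=0.1)
cvar_risky = cvar_lower_tail(mu, r_risky, alpha=0.1)

print("steady state:", mu)
print("average reward:", avg_safe, avg_risky)
print("CVaR(0.1):", cvar_safe, cvar_risky)
```

In this toy setting both reward profiles have an average reward of about 1, so the risk-neutral criterion cannot separate them, while the lower-tail CVaR at level 0.1 is about 1 for the first profile and 0 for the second. Extending the idea to sequences of length h > 1 would require the distribution over h-step trajectories, which this sketch does not attempt.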

FAPESP grant: 18/11236-9 - Markov decision processes and risk
Grantee: Karina Valdivia Delgado
Support type: Research Grants - Regular
FAPESP grant: 19/07665-4 - Center for Artificial Intelligence
Grantee: Fabio Gagliardi Cozman
Support type: Research Grants - eScience and Data Science Program - Engineering Research Centers