Likelihood Estimator for Multi Model-Based Reinforcement Learning

Albarrans, Guilherme; Freire, Valdinei

Author(s):	Albarrans, Guilherme ; Freire, Valdinei Total number of authors: 2
Document type:	Scientific article
Source:	INTELLIGENT SYSTEMS, BRACIS 2024, PT II; v. 15413, p. 15-pg., 2025-01-01.
Abstract
Efficiently learning and generalizing from interactions with the environment is a concern in reinforcement learning. Model-based approaches offer promising potential to address these challenges, leveraging the underlying structure of the world to speed up learning and improve generalization. By assuming a structured environment, such approaches aim to capture key patterns from limited interactions and synthesize experiences for enhanced exploitation and generalization to unexplored territories. In this paper, we test a model-based reinforcement learning framework aimed at further advancing the frontiers of efficiency and generalization. Motivated by the insight that environments often exhibit distinct regions with varying dynamics, we introduce additional assumptions about the structure of the world to facilitate faster generalization. Specifically, we construct a model comprising a classifier that, given a state input, selects parameters characterizing the operating regime of the environment. Both the model and the regional parameters are learned from past experiences using a likelihood function. Leveraging this model, we employ a rapidly-exploring random tree (RRT) based planner to generate new real experiences, capitalizing on the identified structural nuances within the environment. To evaluate the efficacy of learning a segmented model, we conduct a comparative analysis against a traditional method that employs a single general model to learn the dynamics of the entire environment. Our results demonstrate the superiority of the segmented approach in terms of both efficiency and generalization, underscoring the benefits of incorporating additional assumptions about the structure of the world into model-based reinforcement learning paradigms. (AU)
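
The abstract does not specify the form of the classifier, the regional parameters, or the likelihood. As a rough illustration only, the sketch below assumes a softmax gate over regions and a linear-Gaussian transition model per region with a fixed noise scale; the function name `segmented_log_likelihood` and all of its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def segmented_log_likelihood(states, actions, next_states,
                             gate_w, region_params, sigma=0.1):
    """Log-likelihood of observed transitions under a gated, region-wise model.

    Assumptions (not from the paper): a linear softmax gate over K regions,
    linear dynamics per region, isotropic Gaussian noise with std `sigma`.
    states: (N, ds), actions: (N, da), next_states: (N, ds)
    gate_w: (ds, K), region_params: list of K arrays of shape (ds + da, ds)
    """
    # Soft classifier: probability of each region given the state.
    logits = states @ gate_w
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)

    # Per-region Gaussian log density of the observed next state.
    x = np.concatenate([states, actions], axis=1)
    d = next_states.shape[1]
    log_norm = -0.5 * d * np.log(2.0 * np.pi * sigma ** 2)
    comp = np.stack(
        [log_norm - 0.5 * np.sum((next_states - x @ A) ** 2, axis=1) / sigma ** 2
         for A in region_params],
        axis=1,
    )

    # Mixture log-likelihood, computed with a stable log-sum-exp.
    joint = np.log(probs + 1e-12) + comp
    m = joint.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(joint - m).sum(axis=1))))
```

Under these assumptions, maximizing this quantity over `gate_w` and `region_params` would fit the gate and the regional parameters jointly. Passing a single region (`K = 1`) reduces it to the likelihood of one general model, which corresponds to the kind of baseline the abstract compares against.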

FAPESP grant:	19/07665-4 - Centro de Inteligência Artificial
Grantee:	Fabio Gagliardi Cozman
Support type:	Research Grants - eScience and Data Science Program - Engineering Research Centers
