

A Method for the Online Construction of the Set of States of a Markov Decision Process Using Answer Set Programming

Author(s):
Ferreira, Leonardo Anjoletto ; Bianchi, Reinaldo A. C. ; Santos, Paulo E. ; Lopez de Mantaras, Ramon ; Mouhoub, M ; Sadaoui, S ; Mohamed, OA ; Ali, M
Total number of authors: 8
Document type: Scientific article
Source: RECENT TRENDS AND FUTURE TECHNOLOGY IN APPLIED INTELLIGENCE, IEA/AIE 2018; v. 10868, p. 13-pg., 2018-01-01.
Abstract

Non-stationary domains, which change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process. (AU)
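
The idea of building the state set while the agent acts, and pruning actions with observed constraints, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the ASP solver is replaced here by a plain set of forbidden state-action pairs, the learner is ordinary tabular Q-learning, and the corridor environment and every name (`OnlineStateQLearner`, `forbid`, `step`, `train`) are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict


class OnlineStateQLearner:
    """Tabular Q-learning whose state set grows as states are first seen.

    Assumption: a Python set of forbidden (state, action) pairs stands in
    for the ASP rules described in the abstract; no ASP solver is used.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.states = set()          # states discovered so far
        self.forbidden = set()       # (state, action) pairs ruled out
        self.q = defaultdict(float)  # (state, action) -> value estimate

    def forbid(self, state, action):
        """Record an observed constraint: this action is useless here."""
        self.forbidden.add((state, action))

    def allowed_actions(self, state):
        return [a for a in self.actions if (state, a) not in self.forbidden]

    def greedy(self, state):
        """Best allowed action, breaking ties at random."""
        allowed = self.allowed_actions(state)
        best = max(self.q[(state, a)] for a in allowed)
        return self.rng.choice([a for a in allowed if self.q[(state, a)] == best])

    def choose(self, state):
        """Epsilon-greedy choice restricted to the allowed actions."""
        self.states.add(state)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.allowed_actions(state))
        return self.greedy(state)

    def update(self, state, action, reward, next_state):
        """One Q-learning backup; the next state joins the state set."""
        self.states.add(next_state)
        allowed = self.allowed_actions(next_state)
        best_next = max((self.q[(next_state, a)] for a in allowed), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def step(pos, action, goal=4):
    """Hypothetical corridor: positions 0..goal, reward 1.0 on reaching goal."""
    nxt = min(max(pos + (1 if action == "right" else -1), 0), goal)
    return nxt, (1.0 if nxt == goal else 0.0)


def train(episodes=200, max_steps=1000, goal=4):
    agent = OnlineStateQLearner(["left", "right"])
    for _ in range(episodes):
        pos = 0
        for _ in range(max_steps):
            action = agent.choose(pos)
            nxt, reward = step(pos, action, goal)
            if nxt == pos:
                # The move had no effect: store it as a constraint.
                agent.forbid(pos, action)
            agent.update(pos, action, reward, nxt)
            pos = nxt
            if pos == goal:
                break
    return agent
```

Calling `train()` returns an agent whose `states` attribute holds only the positions it actually visited and whose `forbidden` set holds the moves it found to be blocked. Keeping the constraints separate from the value table means a change in the environment can be handled by editing the constraint set without discarding the learnt values, which is the separation the abstract attributes to oASP(MDP).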

FAPESP grant: 16/18792-9 - Description, representation, and solution of spatial games
Grantee: Paulo Eduardo Santos
Support type: Research Grants - Research Partnership for Technological Innovation - PITE