(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

MOO-MDP: An Object-Oriented Representation for Cooperative Multiagent Reinforcement Learning

Author(s):
Da Silva, Felipe Leno [1] ; Glatt, Ruben [1] ; Reali Costa, Anna Helena [1]
Total Authors: 3
Affiliation:
[1] Univ Sao Paulo, Intelligent Techn Lab, BR-05508970 Sao Paulo - Brazil
Total Affiliations: 1
Document type: Journal article
Source: IEEE TRANSACTIONS ON CYBERNETICS; v. 49, n. 2, p. 567-579, FEB 2019.
Web of Science Citations: 2
Abstract

Reinforcement learning (RL) is a widely known technique to enable autonomous learning. Even though RL methods achieved successes in increasingly large and complex problems, scaling solutions remains a challenge. One way to simplify (and consequently accelerate) learning is to exploit regularities in a domain, which allows generalization and reduction of the learning space. While object-oriented Markov decision processes (OO-MDPs) provide such generalization opportunities, we argue that the learning process may be further simplified by dividing the workload of tasks amongst multiple agents, solving problems as multiagent systems (MAS). In this paper, we propose a novel combination of OO-MDP and MAS, called multiagent OO-MDP (MOO-MDP). Our proposal accrues the benefits of both OO-MDP and MAS, better addressing scalability issues. We formalize the general model MOO-MDP and present an algorithm to solve deterministic cooperative MOO-MDPs. We show that our algorithm learns optimal policies while reducing the learning space by exploiting state abstractions. We experimentally compare our results with earlier approaches in three domains and evaluate the advantages of our approach in sample efficiency and memory requirements. (AU)

FAPESP's process: 16/21047-3 - ALIS: Autonomous Learning in Intelligent System
Grantee: Anna Helena Reali Costa
Support Opportunities: Regular Research Grants
FAPESP's process: 15/16310-4 - Transfer Learning in Reinforcement Learning Multi-Agent Systems
Grantee: Felipe Leno da Silva
Support Opportunities: Scholarships in Brazil - Doctorate