DECAF: Deep Case-based Policy Inference for knowledge transfer in Reinforcement Learning

Glatt, Ruben; Da Silva, Felipe Leno; da Costa Bianchi, Reinaldo Augusto; Reali Costa, Anna Helena

Author(s):	Glatt, Ruben ^[1] ; Da Silva, Felipe Leno ^[1] ; da Costa Bianchi, Reinaldo Augusto ^[2] ; Reali Costa, Anna Helena ^[1]
Total Authors:	4
Affiliation:	^[1] Univ Sao Paulo, Av Prof Luciano Gualberto 158, BR-05508010 Sao Paulo - Brazil ; ^[2] FEIs Univ Ctr, Av Humberto Alencar Castelo Branco 3972, BR-09850901 Sao Bernardo Do Campo, SP - Brazil
Total Affiliations:	2
Document type:	Journal article
Source:	EXPERT SYSTEMS WITH APPLICATIONS; v. 156, OCT 15 2020.
Web of Science Citations:	0
Abstract
The ability to solve increasingly complex problems with Reinforcement Learning (RL) has increased researchers' interest in systematic approaches to retaining and reusing knowledge across a variety of tasks. Case-based Reasoning (CBR) is a general methodology that provides a framework for knowledge transfer, yet it has been underrepresented in the RL literature so far. We formulate a terminology for the CBR framework targeted at RL researchers, with the goal of facilitating communication between the two research communities. Based on this framework, we propose the Deep Case-based Policy Inference (DECAF) algorithm, which accelerates learning by building a library of cases and reusing those that are similar to a new task when training a new policy. DECAF guides training by dynamically selecting and blending policies according to their usefulness for the current target task, reusing previously learned policies for more effective exploration while still allowing adaptation to the particularities of the new task. We present an empirical evaluation in the Atari game-playing domain that shows the benefits of our algorithm with regard to sample efficiency, robustness against negative transfer, and performance when compared to state-of-the-art methods. (C) 2020 Elsevier Ltd. All rights reserved.
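
The idea of blending policies according to their usefulness can be illustrated with a minimal sketch. This is not the published DECAF algorithm: the abstract does not say how usefulness is measured or how policies are combined, so the softmax weighting, the running-average update, and every name below (`blend_policies`, `update_usefulness`, `library`) are assumptions chosen only for illustration.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn usefulness scores into mixing weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_policies(policy_probs, usefulness, temperature=1.0):
    """Mix several action distributions, weighting each by its usefulness."""
    weights = softmax(usefulness, temperature)
    n_actions = len(policy_probs[0])
    return [
        sum(w * probs[a] for w, probs in zip(weights, policy_probs))
        for a in range(n_actions)
    ]


def update_usefulness(usefulness, index, episode_return, step_size=0.1):
    """Move one policy's usefulness estimate toward the return it produced."""
    usefulness[index] += step_size * (episode_return - usefulness[index])


# Hypothetical library: two stored policies plus the new policy being trained.
# Each entry is an action distribution for one fixed state (assumed values).
library = [
    [0.7, 0.2, 0.1],        # stored case A
    [0.1, 0.1, 0.8],        # stored case B
    [1 / 3, 1 / 3, 1 / 3],  # new target policy, still untrained
]
usefulness = [0.0, 0.0, 0.0]

# Assume case A was followed for one episode and earned a return of 10.
update_usefulness(usefulness, 0, episode_return=10.0)

# The blended distribution now leans toward case A's preferred action.
mixed = blend_policies(library, usefulness)
action = random.choices(range(len(mixed)), weights=mixed)[0]
```

Because the untrained target policy stays in the mixture, it can gain weight as its own returns improve. That is one simple way to let reused policies guide exploration without preventing adaptation to the new task.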

FAPESP's process:	16/21047-3 - ALIS: Autonomous Learning in Intelligent System
Grantee:	Anna Helena Reali Costa
Support Opportunities:	Regular Research Grants


FAPESP's process:	15/16310-4 - Transfer Learning in Reinforcement Learning Multi-Agent Systems
Grantee:	Felipe Leno da Silva
Support Opportunities:	Scholarships in Brazil - Doctorate


FAPESP's process:	18/00344-5 - Reusing previous task solutions in multiagent reinforcement learning
Grantee:	Felipe Leno da Silva
Support Opportunities:	Scholarships abroad - Research Internship - Doctorate


FAPESP's process:	16/18792-9 - Describing, representing and solving spatial puzzles
Grantee:	Paulo Eduardo Santos
Support Opportunities:	Research Grants - Research Partnership for Technological Innovation - PITE
