DECAF: Deep Case-based Policy Inference for knowledge transfer in Reinforcement Learning

Glatt, Ruben; Da Silva, Felipe Leno; da Costa Bianchi, Reinaldo Augusto; Reali Costa, Anna Helena

Author(s):	Glatt, Ruben ^[1] ; Da Silva, Felipe Leno ^[1] ; da Costa Bianchi, Reinaldo Augusto ^[2] ; Reali Costa, Anna Helena ^[1] Total number of authors: 4
Author affiliation(s):	^[1] Univ Sao Paulo, Av Prof Luciano Gualberto 158, BR-05508010 Sao Paulo - Brazil ^[2] FEIs Univ Ctr, Av Humberto Alencar Castelo Branco 3972, BR-09850901 Sao Bernardo Do Campo, SP - Brazil Total number of affiliations: 2
Document type:	Scientific article
Source:	EXPERT SYSTEMS WITH APPLICATIONS; v. 156, OCT 15 2020.
Web of Science citations:	0
Abstract
Having the ability to solve increasingly complex problems using Reinforcement Learning (RL) has prompted researchers to start developing a greater interest in systematic approaches to retain and reuse knowledge over a variety of tasks. With Case-based Reasoning (CBR) there exists a general methodology that provides a framework for knowledge transfer which has been underrepresented in the RL literature so far. We formulate a terminology for the CBR framework targeted towards RL researchers with the goal of facilitating communication between the respective research communities. Based on this framework, we propose the Deep Case-based Policy Inference (DECAF) algorithm to accelerate learning by building a library of cases and reusing them if they are similar to a new task when training a new policy. DECAF guides the training by dynamically selecting and blending policies according to their usefulness for the current target task, reusing previously learned policies for a more effective exploration but still enabling the adaptation to particularities of the new task. We show an empirical evaluation in the Atari game playing domain depicting the benefits of our algorithm with regards to sample efficiency, robustness against negative transfer, and performance increase when compared to state-of-the-art methods. (C) 2020 Elsevier Ltd. All rights reserved. (AU)

FAPESP grant:	16/21047-3 - ALIS: Autonomous Learning in Intelligent Systems
Grantee:	Anna Helena Reali Costa
Support type:	Research Grants - Regular


FAPESP grant:	15/16310-4 - Knowledge Transfer in Reinforcement Learning for Multiagent Systems
Grantee:	Felipe Leno da Silva
Support type:	Scholarships in Brazil - Doctorate


FAPESP grant:	18/00344-5 - Reusing solutions from previous tasks in multiagent reinforcement learning
Grantee:	Felipe Leno da Silva
Support type:	Scholarships Abroad - Research Internship - Doctorate


FAPESP grant:	16/18792-9 - Description, representation, and solution of spatial games
Grantee:	Paulo Eduardo Santos
Support type:	Research Grants - Partnership for Technological Innovation - PITE
