Research Grant 22/08847-1 - Machine learning, Software-defined networks - BV FAPESP

Explainable reinforcement learning for routing in software-defined networking

Abstract

This visiting-researcher proposal addresses the following question: how can we provide an efficient, intelligent, and explainable routing scheme for SDN? We argue that eXplainable Reinforcement Learning (XRL) and eXplainable Deep Reinforcement Learning (XDRL), fields of eXplainable Artificial Intelligence (XAI) that have recently attracted considerable attention, allow networking stakeholders to make RL and DRL models interpretable, manageable, and trustworthy. XRL and XDRL make it possible to understand the reasoning behind the actions that RL and DRL agents take when making decisions. However, designing XAI models, and XRL/XDRL models in particular, that are both fast and accurate remains an open research challenge. Interpretability and explainability are especially needed for closed-box methods whose agents act autonomously in the real world, as RL and DRL agents do.

Specifically, we will explore explainability in our SDN routing solutions, RSIR and DRSIR. RSIR adds a Knowledge Plane and defines a routing algorithm based on Q-learning that uses link-state information to explore, learn, and exploit candidate paths for intelligent routing, even under dynamic traffic changes. The algorithm capitalizes on interaction with the environment, the intelligence provided by RL, and the global view and control of the network furnished by SDN; it computes optimal routes and installs them, in advance, in the routing tables of the Data Plane switches. DRSIR enhances RSIR by using a Deep Q-Network (DQN) with path-state metrics and target and online neural networks (NNs). Path-state metrics reduce the knowledge abstraction required by the routing agent, since the agent directly explores different path options rather than link-state information. The target and online NNs allow DRSIR to reduce the error of its path-based estimates, and the DQN agent uses an experience replay memory to accelerate learning.
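To make the RSIR description concrete, the following is a minimal Q-learning routing sketch in Python. Everything in it is illustrative rather than the authors' implementation: the toy topology, the link metrics (delay, loss, available bandwidth), the reward weights, and the hyperparameters are assumptions; only the general scheme, epsilon-greedy exploration over next hops with a link-state reward and the standard Q-learning update, follows the abstract.

```python
# Minimal Q-learning routing sketch (illustrative; not the authors' RSIR code).
import random

# Hypothetical topology: node -> {neighbor: (delay_ms, loss_ratio, free_bw_mbps)}
TOPOLOGY = {
    "s1": {"s2": (2.0, 0.01, 800), "s3": (5.0, 0.00, 400)},
    "s2": {"s4": (3.0, 0.02, 600)},
    "s3": {"s4": (2.0, 0.00, 300)},
    "s4": {},
}

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

def reward(delay, loss, free_bw):
    # Lower delay/loss and higher available bandwidth yield a larger reward.
    return -delay - 100.0 * loss + 0.01 * free_bw

# One Q-value per directed link (state = current switch, action = next hop).
q = {(node, nbr): 0.0 for node, nbrs in TOPOLOGY.items() for nbr in nbrs}

def next_hop(node):
    nbrs = list(TOPOLOGY[node])
    if not nbrs:
        return None
    if random.random() < EPSILON:                      # explore
        return random.choice(nbrs)
    return max(nbrs, key=lambda n: q[(node, n)])       # exploit

def episode(src, dst, max_hops=10):
    node = src
    for _ in range(max_hops):
        if node == dst:
            return
        nbr = next_hop(node)
        if nbr is None:
            return
        r = reward(*TOPOLOGY[node][nbr])
        best_next = max((q[(nbr, n)] for n in TOPOLOGY[nbr]), default=0.0)
        # Standard Q-learning update toward the bootstrapped target.
        q[(node, nbr)] += ALPHA * (r + GAMMA * best_next - q[(node, nbr)])
        node = nbr

for _ in range(2000):
    episode("s1", "s4")
print(sorted(q.items(), key=lambda kv: -kv[1]))
```

In RSIR-style proactive routing, Q-values learned this way would feed the computation of paths that the controller installs in the switches' routing tables ahead of traffic shifts.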
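The DRSIR description maps onto standard DQN machinery: an online network that is trained, a periodically synchronized target network that stabilizes the bootstrapped estimates, and an experience replay memory sampled in mini-batches. The sketch below (PyTorch, assumed available) is a generic illustration under assumed shapes, namely a state vector of per-path metrics for a handful of candidate paths and synthetic rewards; it is not the authors' DRSIR code.

```python
# Minimal DQN sketch for path selection (illustrative; not the authors' DRSIR code).
import random
from collections import deque

import torch
import torch.nn as nn

K_PATHS, METRICS = 4, 3               # assumption: 4 candidate paths, 3 metrics each
STATE_DIM = K_PATHS * METRICS         # state = concatenated per-path metrics

def make_net():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, K_PATHS))

online, target = make_net(), make_net()
target.load_state_dict(online.state_dict())    # target starts as a copy of online
opt = torch.optim.Adam(online.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                  # experience replay memory
GAMMA, BATCH = 0.9, 32

def act(state, eps=0.1):
    if random.random() < eps:
        return random.randrange(K_PATHS)
    with torch.no_grad():
        return int(online(state).argmax())     # action = index of chosen path

def train_step():
    if len(replay) < BATCH:
        return
    s, a, r, s2 = zip(*random.sample(replay, BATCH))  # uniform sampling
    s, s2 = torch.stack(s), torch.stack(s2)
    a, r = torch.tensor(a), torch.tensor(r)
    q_sa = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target(s2).max(dim=1).values  # frozen target net for the bootstrap
    td_loss = nn.functional.mse_loss(q_sa, r + GAMMA * q_next)
    opt.zero_grad()
    td_loss.backward()
    opt.step()

# Toy interaction loop with synthetic states and rewards (assumption: reward
# grows with the chosen path's bandwidth metric and shrinks with delay/loss).
for step in range(500):
    state = torch.rand(STATE_DIM)
    action = act(state)
    delay, loss_ratio, bw = state[action * METRICS:(action + 1) * METRICS]
    replay.append((state, action, float(bw - delay - loss_ratio), torch.rand(STATE_DIM)))
    train_step()
    if step % 50 == 0:
        target.load_state_dict(online.state_dict())  # periodic target sync
```

Sampling uniformly from the replay memory breaks the temporal correlation between consecutive routing decisions, and computing the bootstrap target with a frozen copy of the network is what lets the agent reduce the error of its estimates, as the abstract notes.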
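The explainability side of the proposal is an open research question rather than a fixed method. As one example of what a post-hoc XRL explanation can look like, here is a simple occlusion-based importance probe over the DQN sketch above: zero out one input metric at a time and measure the drop in the Q-value of the action the online network chose. This technique is generic and is not claimed by the proposal.

```python
# Post-hoc occlusion importance sketch (a generic XRL probe, not the proposal's method).
import torch

def occlusion_importance(net, state):
    with torch.no_grad():
        base_q = net(state)
        action = int(base_q.argmax())          # the path the agent would pick
        scores = []
        for i in range(state.numel()):
            occluded = state.clone()
            occluded[i] = 0.0                  # knock out one input metric
            scores.append(float(base_q[action] - net(occluded)[action]))
    return action, scores                      # larger score = more influential metric

# Usage with the DQN sketch above:
# action, scores = occlusion_importance(online, torch.rand(STATE_DIM))
```

A network operator could use such scores to see, per decision, which path metrics drove the agent's choice, which is the kind of interpretability the abstract argues RL/DRL routing needs.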
