
Asynchronous dynamic programming for discrete and continuous Markov decision processes

Grant number: 12/10861-0
Support type: Scholarships abroad - Research Internship - Master's degree
Effective date (Start): August 13, 2012
Effective date (End): February 12, 2013
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Theory of Computation
Principal researcher: Leliane Nunes de Barros
Grantee: Luis Gustavo Rocha Vianna
Supervisor abroad: Scott P. Sanner
Home Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Research place: National ICT Australia (NICTA), Australia  
Associated to the scholarship: 11/16962-0 - Real Time Dynamic Programming and Monte-Carlo Simulation for Probabilistic Planning, BP.MS

Abstract

Many probabilistic planning problems can be modelled as a discrete and continuous Markov decision process (DC-MDP), a generalization of the Markov decision process (MDP). Since the solutions for DC-MDPs in the literature are few and not general, we intend to propose a new solution based on efficient solutions for MDPs. Two efficient sample-based algorithms are: (i) Real-Time Dynamic Programming (RTDP), which updates the value function of the visited states using the Bellman equation and is guaranteed to converge to an optimal solution; and (ii) Monte Carlo planning techniques, in particular Upper Confidence Bounds Applied to Trees (UCT), which has recently been applied to probabilistic planning and achieved success in the last international probabilistic planning competition. Given the efficiency of both algorithms for MDPs, we propose an adaptation of RTDP for continuous state spaces. Moreover, we intend to improve its efficiency by using sampling techniques, such as those in UCT, to speed up the value update operations performed by RTDP. The algorithms developed will be tested on benchmark probabilistic planning problems and compared with existing solutions. This project will be carried out at NICTA (National ICT Australia), and the developed algorithms will be applied to a problem of practical interest. (AU)
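
To make the RTDP idea in the abstract concrete, the following is a minimal sketch of RTDP on a small discrete MDP. It is not the project's implementation and does not cover the continuous-state adaptation the project proposes. The chain problem, the unit step cost, the 0.8 success probability, and all names (`transitions`, `q_value`, `rtdp`) are assumptions made only for illustration.

```python
import random

# Hypothetical toy problem: a chain of states 0..4 where state 4 is the goal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = ("left", "right")
P_SUCCESS = 0.8  # assumed: a move succeeds with probability 0.8, else the agent stays


def transitions(state, action):
    """Return (probability, next_state) pairs for taking `action` in `state`."""
    if state == GOAL:
        return [(1.0, GOAL)]
    target = max(state - 1, 0) if action == "left" else state + 1
    return [(P_SUCCESS, target), (1.0 - P_SUCCESS, state)]


def q_value(values, state, action):
    """Unit step cost plus the expected cost-to-go of the successor states."""
    return 1.0 + sum(p * values[nxt] for p, nxt in transitions(state, action))


def rtdp(n_trials=200, max_steps=100, seed=0):
    """Run RTDP trials from state 0 and return the estimated cost-to-go per state."""
    rng = random.Random(seed)
    values = [0.0] * N_STATES  # optimistic (admissible) initial estimate
    for _ in range(n_trials):
        state = 0
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Bellman backup, applied only to the state currently being visited.
            q = {a: q_value(values, state, a) for a in ACTIONS}
            best = min(q, key=q.get)
            values[state] = q[best]
            # Sample the successor of the greedy action to continue the trial.
            probs, nexts = zip(*transitions(state, best))
            state = rng.choices(nexts, weights=probs)[0]
    return values


if __name__ == "__main__":
    print(rtdp())
```

In this toy chain each remaining step towards the goal costs 1 / 0.8 = 1.25 in expectation under the optimal policy, so the estimates should approach 5.0, 3.75, 2.5, 1.25 and 0.0 for states 0 to 4. Only states reached during the simulated trials are updated, which is what distinguishes RTDP from a synchronous sweep over the whole state space.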
