Asynchronous dynamic programming for discrete and continuous Markov decision processes

Grant number: 12/10861-0
Support Opportunities: Scholarships abroad - Research Internship - Master's degree
Start date: August 13, 2012
End date: February 12, 2013
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Theory of Computation
Principal Investigator: Leliane Nunes de Barros
Grantee: Luis Gustavo Rocha Vianna
Supervisor: Scott P. Sanner
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Institution abroad: National ICT Australia (NICTA), Australia
Associated to the scholarship: 11/16962-0 - Real Time Dynamic Programming and Monte-Carlo Simulation for Probabilistic Planning, BP.MS

Abstract

Many probabilistic planning problems can be modelled as a discrete and continuous Markov decision process (DC-MDP), a generalization of the Markov decision process (MDP). Since solutions for DC-MDPs are scarce in the literature and not general, we intend to propose a new solution based on efficient solutions for MDPs. Two efficient sample-based algorithms are: (i) Real-Time Dynamic Programming (RTDP), which updates the value function of the visited states using the Bellman equation and is guaranteed to converge to an optimal solution; and (ii) Monte Carlo planning techniques, especially Upper Confidence Bounds Applied to Trees (UCT), which has recently been applied to probabilistic planning and achieved success in the last international probabilistic planning competition. Considering the efficiency of both algorithms for MDPs, we propose an adaptation of RTDP for continuous state spaces. Moreover, we intend to improve efficiency by using sampling techniques, such as those in UCT, to speed up the value update operations performed by RTDP. The algorithms developed will be tested on benchmark probabilistic planning problems and compared with existing solutions. This project will be carried out at NICTA (National ICT Australia), and the developed algorithms will be applied to a problem of practical interest. (AU)
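
To illustrate the kind of asynchronous Bellman update that RTDP performs on visited states, here is a minimal sketch for an ordinary discrete MDP. It is not the project's algorithm and does not address continuous state spaces. The five-state chain, the two actions "safe" and "risky", their probabilities, the unit cost, and all function names are hypothetical, chosen only to keep the example small.

```python
import random

GOAL = 4                      # hypothetical goal state of a 5-state chain (0..4)
ACTIONS = ("safe", "risky")   # hypothetical action names


def transitions(s, a):
    """Return [(probability, next_state)] for the assumed toy dynamics."""
    if a == "safe":
        # Advance one state with probability 0.9, otherwise stay put.
        return [(0.9, min(s + 1, GOAL)), (0.1, s)]
    # "risky": advance two states with probability 0.5, otherwise stay put.
    return [(0.5, min(s + 2, GOAL)), (0.5, s)]


def q_value(V, s, a, cost=1.0):
    """One-step lookahead: immediate cost plus expected cost-to-go."""
    return cost + sum(p * V[s2] for p, s2 in transitions(s, a))


def rtdp(n_trials=2000, max_steps=100, seed=0):
    """Run RTDP trials from state 0 and return the value estimates."""
    rng = random.Random(seed)
    V = [0.0] * (GOAL + 1)    # zero is an admissible (optimistic) initial estimate
    for _ in range(n_trials):
        s = 0
        for _ in range(max_steps):
            if s == GOAL:
                break
            # Greedy action under the current value estimates.
            a = min(ACTIONS, key=lambda act: q_value(V, s, act))
            # Bellman update applied only to the state being visited.
            V[s] = q_value(V, s, a)
            # Sample the successor state from the chosen action's outcomes.
            probs, nexts = zip(*transitions(s, a))
            s = rng.choices(nexts, weights=probs)[0]
    return V


if __name__ == "__main__":
    print(rtdp())
```

Under these assumed dynamics the optimal expected cost from state 0 is 4 (take "risky" twice), and the estimate for state 0 approaches that value as trials accumulate. States that the greedy policy stops visiting, such as state 1 here, may keep an underestimate, which is the expected behaviour of trial-based updates.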
