Advanced search
Start date
Betweenand


Occupation Measure Heuristics to solve Stochastic Shortest Path with Dead Ends

Full text
Author(s):
Fernandez, Milton Condori ; de Barros, Leliane Nunes ; Delgado, Karina Valdivia ; IEEE
Total Authors: 4
Document type: Journal article
Source: 2018 7TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS); v. N/A, p. 6-pg., 2018-01-01.
Abstract

The most efficient approach to solve probabilistic planning problems is based on stochastic shortest path (SSP) and uses heuristic search to find a policy that minimizes the expected accumulated cost to the goal (MINCOST criterion). However, this approach can solve problems with dead ends (states from which it is not possible to reach the goal) only if an efficient dead end detection heuristic is used. Another solution would be to plan in two phases: maximizing the probability to reach the goal (MAXPROB) and then minimizing the expected cost (MINCOST). While there exist several heuristics to solve MINCOST, there are not known efficient heuristics to solve MAXPROB. A recent work proposes the first heuristic that takes into account the probabilities, called h(pom), which solves a relaxed version of an SSP as a linear program in the dual space. However, to solve large problems with dead ends, h(pom) must be augmented with a dead end detection heuristic (e.g., h(pom) & h(max)). In this work, we propose two new heuristics based on hpom. The first, h(pom)(pe)(s), estimates the minimal cost to reach the goal from state s, including new variables and constraints for dead ends states and adding an expected penalty for reaching them. The second, h(ppom)(s), estimates the maximum probability to reach the goal from s, and is used to solve MAXPROB problems by ignoring action costs. We claim that h(ppom)(s) is the first heuristic for MAXPROB. Empirical results show that h(pom)(pe) can solve larger planning instances when compared to h(pom)&h(max). (AU)

FAPESP's process: 15/01587-0 - Storage, modeling and analysis of dynamical systems for e-Science applications
Grantee:João Eduardo Ferreira
Support Opportunities: Research Grants - eScience and Data Science Program - Thematic Grants