OATAO - Open Archive Toulouse Archive Ouverte Open Access Week

Temporal Markov Decision Problems : Formalization and Resolution

Rachelson, Emmanuel. Temporal Markov Decision Problems : Formalization and Resolution. PhD, Systèmes embarqués, Toulouse, France, 2009, 392 p.

[img]
Preview
(Document in English)

PDF (Author's version) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
6MB

Official URL: https://ori-oai.isae-supaero.fr/ori-oai-search/notice/view/isae-875

Abstract

This thesis addresses the question of planning under uncertainty within a time-dependent changing environment. Original motivation for this work came from the problem of building an autonomous agent able to coordinate with its uncertain environment; this environment being composed of other agents communicating their intentions or non-controllable processes for which some discrete-event model is available. We investigate several approaches for modeling continuous time-dependency in the framework of Markov Decision Processes (MDPs), leading us to a definition of Temporal Markov Decision Problems. Then our approach focuses on two separate paradigms. First, we investigate time-dependent problems as \emph{implicit-event} processes and describe them through the formalism of Time-dependent MDPs (TMDPs). We extend the existing results concerning optimality equations and present a new Value Iteration algorithm based on piecewise polynomial function representations in order to solve a more general class of TMDPs. This paves the way to a more general discussion on parametric actions in hybrid state and action spaces MDPs with continuous time. In a second time, we investigate the option of separately modeling the concurrent contributions of exogenous events. This approach of \emph{explicit-event} modeling leads to the use of Generalized Semi-Markov Decision Processes (GSMDP). We establish a link between the general framework of Discrete Events Systems Specification (DEVS) and the formalism of GSMDP, allowing us to build sound discrete-event compatible simulators. Then we introduce a simulation-based Policy Iteration approach for explicit-event Temporal Markov Decision Problems. This algorithmic contribution brings together results from simulation theory, forward search in MDPs, and statistical learning theory. The implicit-event approach was tested on a specific version of the Mars rover planning problem and on a drone patrol mission planning problem while the explicit-event approach was evaluated on a subway network control problem.

Item Type:PhD Thesis
Uncontrolled Keywords:
Institution:Université de Toulouse > Institut Supérieur de l'Aéronautique et de l'Espace - ISAE-SUPAERO (FRANCE)
Laboratory name:
Research Director:
Garcia, Frédérick and Fabiani, Patrick
Statistics:download
Deposited On:28 Nov 2017 16:15

Repository Staff Only: item control page