OATAO - Open Archive Toulouse Archive Ouverte

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Lecarpentier, Erwan and Rachelson, Emmanuel. Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning. (2019) In: NeurIPS, 9 December 2019 - 14 December 2019 (Vancouver, Canada).

(Document in English)

PDF (Author's version), 545kB

Official URL: https://papers.nips.cc/paper/8942-non-stationary-markov-decision-processes-a-worst-case-approach-using-model-based-reinforcement-learning

Abstract

This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) the current model is known at each decision epoch, but its evolution is not. Our contribution can be presented in four points: 1) we define a specific class of MDPs that we call Non-Stationary MDPs (NSMDPs) and introduce the notion of regular evolution by making a hypothesis of Lipschitz continuity on the transition and reward functions w.r.t. time; 2) we consider a planning agent that uses the current model of the environment but is unaware of its future evolution, which leads us to a worst-case method where the environment is seen as an adversarial agent; 3) following this approach, we propose the Risk-Averse Tree-Search (RATS) algorithm, a zero-shot Model-Based method similar to Minimax search; 4) we illustrate the benefits brought by RATS empirically and compare its performance with reference Model-Based algorithms.
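The worst-case idea in the abstract can be illustrated with a minimal sketch. It is not the RATS algorithm from the paper: as a simplifying assumption, the set of models reachable under the bounded evolution rate is replaced here by a small, explicitly listed set of candidate models, and the adversarial environment picks among them. The names `worst_case_value`, `nominal`, and `drifted` are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence, Tuple

State = int
Action = int
# A model maps (state, action) to a list of (probability, next_state, reward).
Model = Callable[[State, Action], List[Tuple[float, State, float]]]


def worst_case_value(
    state: State,
    depth: int,
    actions: Sequence[Action],
    candidate_models: Sequence[Model],
    gamma: float = 0.9,
) -> float:
    """Max over actions of the min over candidate models of the expected return.

    Assumption: the finite list `candidate_models` stands in for the set of
    models the environment could evolve into; the paper bounds that set
    through Lipschitz continuity in time instead.
    """
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in actions:
        # Adversarial environment: keep the model that hurts the agent most.
        worst = float("inf")
        for model in candidate_models:
            q = sum(
                p * (r + gamma * worst_case_value(
                    s2, depth - 1, actions, candidate_models, gamma))
                for p, s2, r in model(state, action)
            )
            worst = min(worst, q)
        best = max(best, worst)
    return best


def nominal(state: State, action: Action) -> List[Tuple[float, State, float]]:
    # Hypothetical current model: action 1 pays more than action 0.
    return [(1.0, 0, 2.0 if action == 1 else 1.0)]


def drifted(state: State, action: Action) -> List[Tuple[float, State, float]]:
    # Hypothetical evolved model: action 1 has become harmful.
    return [(1.0, 0, -1.0 if action == 1 else 1.0)]


risk_neutral = worst_case_value(0, 1, [0, 1], [nominal])
risk_averse = worst_case_value(0, 1, [0, 1], [nominal, drifted])
```

In this toy setting, a planner that trusts only the current model values the state through the higher-paying action, while the worst-case planner falls back on the action whose payoff does not depend on how the model evolves.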

Item Type: Conference or Workshop Item (Paper)
HAL Id: hal-02882205
Audience (conference): International conference proceedings
Institution: Université de Toulouse > Institut Supérieur de l'Aéronautique et de l'Espace - ISAE-SUPAERO (FRANCE)
French research institutions > Office National d'Etudes et Recherches Aérospatiales - ONERA (FRANCE)
Deposited On: 08 Nov 2019 16:16
