OATAO - Open Archive Toulouse Archive Ouverte

POMDP solving: what rewards do you really expect at execution?

Ponzoni Carvalho Chanel, Caroline; Farges, Jean-Loup; Teichteil-Königsbuch, Florent; Infantes, Guillaume. POMDP solving: what rewards do you really expect at execution? (2010) In: The 5th Starting Artificial Intelligence Researchers' Symposium, 16-20 August 2010, Lisbon, Portugal.

(Document in English)

PDF (Author's version)


Partially Observable Markov Decision Processes (POMDPs) have gained increasing interest in many research communities, owing to notable improvements in their optimization algorithms and in computing power. Yet most research focuses on optimizing either the average accumulated reward (AI planning) or the entropy of the belief state (active perception), and neither criterion matches the rewards actually gathered at execution. Indeed, the first criterion linearly averages over all belief states, so it does not extract the best information from different observations, while the second discards rewards entirely. Motivated by simple demonstrative examples, we therefore study an additive combination of these two criteria to get the best of both reward gathering and information acquisition at execution. We then compare our criterion with the classical ones, and highlight the need to consider new hybrid non-linear criteria, on a realistic multi-target recognition and tracking mission.
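To make the additive combination in the abstract concrete, here is a minimal sketch of such a criterion on a single belief state. The function name `hybrid_value`, the weight `lam`, and the toy beliefs are illustrative assumptions, not the paper's actual formulation or tuning:

```python
import math

def hybrid_value(belief, rewards, lam=0.5):
    """Additive mix of expected reward and (negative) belief entropy.

    belief  : dict mapping state -> probability
    rewards : dict mapping state -> immediate reward (fixed action assumed)
    lam     : hypothetical trade-off weight between reward and information
    """
    # Linear average of rewards over the belief (the "AI planning" term).
    expected_reward = sum(belief[s] * rewards[s] for s in belief)
    # Shannon entropy of the belief (the "active perception" term).
    entropy = -sum(p * math.log(p) for p in belief.values() if p > 0)
    # Higher is better: gather reward while reducing uncertainty.
    return expected_reward - lam * entropy

# With equal expected reward, a peaked (informative) belief scores higher
# than a uniform (uncertain) one, which a purely linear criterion misses.
uniform = {"s1": 0.5, "s2": 0.5}
peaked = {"s1": 0.9, "s2": 0.1}
rewards = {"s1": 1.0, "s2": 1.0}
```

The entropy term is what distinguishes beliefs that a reward-only criterion treats as identical; the paper's point is that neither term alone reflects the rewards collected at execution.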

Item Type: Conference or Workshop Item (Paper)
Additional Information: ISBN: 978-1-60750-675-1. Publisher: IOS Press
Audience (conference): International conference proceedings
Institution:Université de Toulouse > Institut Supérieur de l'Aéronautique et de l'Espace - ISAE-SUPAERO (FRANCE)
French research institutions > Office National d'Etudes et Recherches Aérospatiales - ONERA (FRANCE)
Deposited On: 15 Jan 2015 15:22
