## Publications

## Point-Based Planning for Multi-Objective POMDPs

Diederik Roijers, Shimon Whiteson, and Frans A. Oliehoek. Point-Based Planning for Multi-Objective POMDPs. In IJCAI, pp. 1666-1672, July 2015.

## Abstract

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.

## BibTeX Entry

@inproceedings{Roijers15IJCAI,
  author = {Diederik Roijers and Shimon Whiteson and Frans A. Oliehoek},
  title = {Point-Based Planning for Multi-Objective {POMDPs}},
  booktitle = IJCAI15,
  year = 2015,
  month = jul,
  pages = {1666--1672},
  note = {},
  abstract = {
Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives.
The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.
  }
}

Generated by bib2html.pl (written by Patrick Riley) on Tue Oct 03, 2017 15:15:26 UTC