AEMS
Implements anytime error minimization search (AEMS) solver for POMDPs. This algorithm was originally described in
Ross, Stéphane, and Brahim Chaib-Draa. "AEMS: An Anytime Online Search Algorithm for Approximate Policy Refinement in Large POMDPs." IJCAI. 2007.
Specifically, this solver is AEMS2, which outperforms AEMS1 in nearly all published experiments.
Installation
Pkg.clone("https://github.com/JuliaPOMDP/AEMS.jl")
Quick Use
using POMDPs, POMDPToolbox, AEMS, POMDPModels
pomdp = BabyPOMDP()
solver = AEMSSolver()
planner = solve(solver, pomdp) # planner is of type AEMSPlanner
# once you have a belief b
a = action(planner, b)
Solver Options
The following keyword options are available; for examle, AEMSSolver(max_iterations = 100, max_time = 0.1)
.
max_iterations
Maximum number of fringe expansions during one action. Defaults to 1000.max_time
Maximum time (in seconds) to spend on one action. Defaults to 1 second.updater
The updater used to propagate beliefs in the tree. Defaults to a discrete updater.lower_bound
Defaults to a fixed-action policy.upper_bound
Defaults to a policy generated by FIB.root_manager
Determines how the root changes once an action is taken and an observation is received. This process is further described below. Allowed values are:clear
,:belief
,:user
. Defaults to:clear
.action_selector
Determines if action with best upper or lower bound should be selected after tree expansion. Defaults to:U
to select action with best upper bound; use:L
to select action with best lower bound.
Bounds
The upper and lower bounds must be subtypes of the Policy
type and have the value
function implemented.
This function is then used to estimate the bound values at new beliefs.
Root Management
Once an action is taken and an observation is received, the root of the search tree becomes the updated belief. However, due to the structure of POMDPs.jl
, the planner is not made aware of the resulting observation. Therefore we provide three different ways to manage the root.
1. Clearing the tree after each action
This solves the root issue by making a new tree with the current belief as the node, but it's probably less efficient to throw away the tree and start from scratch for each call to action
.
Tree clearing is the default behavior, but if you want to be explicit you can call AEMSSolver
with the option root_manager = :clear
.
2. Searching through child beliefs of previous root
The updated belief should be equal to one of the child beliefs of the previous root.
We can set the root to whichever of the children match the updated belief.
However, this requires ==
to be defined for the belief type you are using.
Further, the beliefs must exactly match (no tolerance for slight numerical differences).
If none of the child nodes match the belief provided to action
, an error is thrown (perhaps it should just clear the tree and start over).
Note that multiple child beliefs can be identical.
In this case, the root is set to the first matching belief (perhaps it should select the one which has the tightest bounds).
To use this method, call AEMSSolver
with the option root_manager = :belief
.
3. The user can provide the planner with the action taken and observation received.
This option is probably the most efficient, because the tree can be reused and no searching over beliefs must be performed. The planner can then follow the action and observation down the tree to the next belief node.
To provide this information, the user can call update_root(planner, a, o)
, where a
and o
are the action and observtion.
To use this method, call AEMSSolver
with the option root_manager = :user
.
Note that update_root
will throw an error if the root_manager
is not set to :user
; don't mess with the root unless you specify that it's the user's job.
Unfortunately, the user can't personally call update_root
after each action/observation pair during simulation.
Nor do existing simulators take care to update the planner (only the belief).
Therefore, AEMS.jl
has re-implemented some of the simulators from POMDPToolbox.jl
for the AEMSPlanner
type (so far only TestSimulator
).
The only changes are the following three lines of code after generating a new observation:
if planner.root_manager == :user
update_root(planner, a, o)
end
Obviously, this solution is horrifying.
Any simulation that wants to take advantage of the update_root
function must be specifically made for the AEMSPlanner
type.
While the other root management methods, like belief-searching, don't require any changes to simulations, I'm not sure they are the optimal solution.
It can be expensive if there are many beliefs and each belief is very large.
There is also the issue of beliefs that should be the same but have small numerical differences; these will not be recognized as the same belief and the root will go un-updated.
In the long-term, I think POMDPs.jl
should implement a update(policy, a, o)
function that defaults to doing nothing for all policies.
Visualization
Once you have a planner and have called action
, you can use the following code to bring up an interactive tree in a Chrome browser window. Click a node to expand/unexpand it.
using D3Trees
tree = D3Tree(planner) # creates a visualization tree
inchrome(tree) # opens chrome tab to show visualization tree