Setup
Same as before; grid looks as follows:
********L*
********L*
********L*
********L*
*****A**L*
********L*
********L*
********L*
********L*
G*******L*
Results
Seems easier to solve than with the goal in the middle of the grid
Update: Similar results for all other corners
Goal at upper left
Setup
10x10 Light Dark domain (wrap-around) with continuous observations:
********L*
********L*
********L*
********L*
**G*****L*
********L*
********L*
*****A**L*
********L*
********L*
Actions: up, down, left, right
Observations: location (x,y) corrupted by Gaussian noise with STD based on distance to light
Rewards: -1 per move
Initial belief: uniform over all states except the goal
# runs: 1000
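The observation model and initial belief from this setup could be sketched as follows. This is only an illustration: the linear growth of the STD with distance to the light column, the constants base_std and std_per_cell, the wrap-around distance, and all function names are assumptions, not taken from the actual implementation.

```python
import random

WIDTH, HEIGHT = 10, 10
LIGHT_X = 8  # column of the 'L' cells in the grid above (0-indexed)


def wrapped_distance(a, b, size):
    """Shortest distance between two indices on a ring of length `size`."""
    d = abs(a - b) % size
    return min(d, size - d)


def observation_std(x, base_std=0.5, std_per_cell=0.5):
    """Assumed noise model: STD grows linearly with distance to the light."""
    return base_std + std_per_cell * wrapped_distance(x, LIGHT_X, WIDTH)


def sample_observation(x, y, rng=random):
    """Return the true (x, y) corrupted by Gaussian noise on both coordinates."""
    std = observation_std(x)
    return rng.gauss(x, std), rng.gauss(y, std)


def initial_belief(goal):
    """Uniform belief over all cells except the goal."""
    states = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)
              if (x, y) != goal]
    p = 1.0 / len(states)
    return {s: p for s in states}
```

With the goal at (2, 4) as in the grid above, initial_belief((2, 4)) spreads the probability mass evenly over the remaining 99 cells.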
Results: Automatic vs Fixed Discretization (1 cut point per dimension)
Results: No Discretization
Setup
10x10 Light Dark domain with continuous observations
Starts from initial uniform belief
# runs: 1000
Results
Hallway environment
Changed from MOMDP to POMDP
Orientation included in belief state
Forward action is stochastic, turn actions are deterministic
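A minimal sketch of such a transition model, with the orientation as part of the state. The success probability p_forward = 0.8, the "stay in place" failure mode, the action names, and the treatment of only '1' cells as walls are assumptions for illustration.

```python
import random

# Headings in clockwise order as (dx, dy), with y growing downwards.
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # N, E, S, W
TURNS = {"turn_left": -1, "turn_right": 1, "turn_around": 2}


def step(state, action, grid, p_forward=0.8, rng=random):
    """Sample the next (x, y, heading) state for one action."""
    x, y, h = state
    if action in TURNS:
        # Turn actions are deterministic: only the heading changes.
        return x, y, (h + TURNS[action]) % 4
    if action != "forward":
        raise ValueError("unknown action: " + action)
    if rng.random() >= p_forward:
        return state  # assumed failure mode: the agent stays in place
    dx, dy = HEADINGS[h]
    nx, ny = x + dx, y + dy
    inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
    if not inside or grid[ny][nx] == "1":
        return state  # blocked by a wall or the map border
    return nx, ny, h
```

For example, step((0, 0, 1), "turn_around", ["0000G"]) always returns (0, 0, 3), while a forward action from the same state only moves the agent to (1, 0, 1) with probability p_forward.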
Parser for POMDP files
Started developing a parser for .POMDP files
Using / adapting parser from libpomdp
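To give an idea of the input, the preamble of a Cassandra-style .POMDP file (discount, values, states, actions, observations) could be read roughly like this. The sketch is not the libpomdp parser and does not cover the T:, O: and R: entries; the function name parse_preamble is hypothetical, and it assumes each preamble field fits on one line.

```python
def parse_preamble(text):
    """Parse the header fields of a Cassandra-style .POMDP file."""
    header = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "discount":
            header[key] = float(value)
        elif key == "values":
            header[key] = value  # "reward" or "cost"
        elif key in ("states", "actions", "observations"):
            # Either a count ("states: 3") or a list of names.
            if value.isdigit():
                header[key] = [str(i) for i in range(int(value))]
            else:
                header[key] = value.split()
        # All other keys (T, O, R, start, ...) are ignored in this sketch.
    return header
```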
Discretization for POMCP
Equal width binning
Predefined number of cut points per dimension (m)
Number of dimensions (n)
For each sequence node, there are (m+1)^n history nodes => lots of nodes!
Used for the tree and to update the particle filter
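Equal-width binning as described above can be sketched in a few lines. The observation bounds, the clamping of out-of-range values to the outermost bins, and the function names are assumptions made for this example.

```python
def cut_points(low, high, m):
    """Return the m equally spaced cut points inside [low, high]."""
    width = (high - low) / (m + 1)
    return [low + width * (i + 1) for i in range(m)]


def discretize(obs, bounds, m):
    """Map a continuous observation to a tuple of bin indices.

    obs    -- one value per dimension
    bounds -- one (low, high) pair per dimension
    m      -- number of cut points per dimension, giving m + 1 bins each
    """
    bins = []
    for value, (low, high) in zip(obs, bounds):
        width = (high - low) / (m + 1)
        idx = int((value - low) // width)
        # Gaussian noise is unbounded, so clamp to the outermost bins.
        bins.append(min(max(idx, 0), m))
    return tuple(bins)


def num_bins(m, n):
    """Number of distinct discretized observations: (m + 1) ** n."""
    return (m + 1) ** n
```

With 1 cut point per dimension on a 10x10 grid, discretize((2.0, 7.5), [(0, 10), (0, 10)], 1) falls into bin (0, 1), and there are num_bins(1, 2) = 4 possible bins; with 3 cut points the count already grows to 16.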
Hallway environment
Based on [1]
Actions: move forward, turn left, turn right, turn around
Reward: -1 per action
Observations: 4 wall detection sensors (with Gaussian noise), 1 landmark / goal detection sensor
Belief: location of agent (agent's orientation is known)
Initial belief distribution: uniform over all free cells
Currently available maps (0 = free, 1 = wall, G = goal, L = landmark):
0000G

00000000000
11L1L1L1G11
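The maps, the initial belief, and the wall sensors could be represented as in the sketch below. It assumes that the second map consists of two rows of 11 cells, that only '0' cells count as free, that cells outside the map are sensed as walls, and that a wall sensor reads 1.0 for a wall and 0.0 otherwise before noise is added. The map names and the noise level std are made up for the example.

```python
import random

MAPS = {
    "corridor": ["0000G"],
    "hallway": ["00000000000",
                "11L1L1L1G11"],
}


def free_cells(grid):
    """All (x, y) positions marked '0' in the map."""
    return [(x, y) for y, row in enumerate(grid)
            for x, c in enumerate(row) if c == "0"]


def uniform_belief(grid):
    """Initial belief: uniform over all free cells."""
    cells = free_cells(grid)
    return {c: 1.0 / len(cells) for c in cells}


def is_wall(grid, x, y):
    """Walls are '1' cells; everything outside the map is treated as wall."""
    inside = 0 <= y < len(grid) and 0 <= x < len(grid[y])
    return (not inside) or grid[y][x] == "1"


def wall_readings(grid, x, y, std=0.1, rng=random):
    """Four noisy wall sensors in the order N, E, S, W."""
    neighbours = ((0, -1), (1, 0), (0, 1), (-1, 0))
    return [rng.gauss(1.0 if is_wall(grid, x + dx, y + dy) else 0.0, std)
            for dx, dy in neighbours]
```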
[1] Michael Littman, Anthony Cassandra, and Leslie Kaelbling. Learning
policies for partially observable environments: Scaling up. In Armand
Prieditis and Stuart Russell, editors, Proceedings of the Twelfth
International Conference on Machine Learning, pages 362--370, San
Francisco, CA, 1995. Morgan Kaufmann.
Tree Visualization