Setup
- Same as before; grid looks as follows:
********L*
********L*
********L*
********L*
*****A**L*
********L*
********L*
********L*
********L*
G*******L*
Results
- Seems simpler to solve than with the goal in the middle of the grid
- Update: similar results with the goal in each of the other corners
Goal at upper left
Setup
- 10x10 Light Dark domain (wrap-around) with continuous observations:
********L*
********L*
********L*
********L*
**G*****L*
********L*
********L*
*****A**L*
********L*
********L*
- Actions: up, down, left, right
- Observations: location (x,y) corrupted by Gaussian noise whose standard deviation depends on the distance to the light
- Rewards: -1 per move
- Initial belief: uniform over all states except the goal
- # runs: 1000
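A minimal sketch of the setup above, to make the model concrete. The grid size, goal cell and light column are read off the map. Everything else is an assumption for illustration and not necessarily what the experiments used: moves are deterministic, the noise depends only on the wrapped horizontal distance to the light column, and the standard deviation grows linearly (`base_std` and `gain` are hypothetical parameters).

```python
import math
import random

SIZE = 10        # 10x10 grid, as in the setup above
LIGHT_X = 8      # column of the 'L' cells in the map (0-indexed)
GOAL = (2, 4)    # (x, y) of 'G' in the map, with row 0 at the top

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def step(state, action):
    """Move one cell; coordinates wrap around at the grid edges.
    ASSUMPTION: moves are deterministic."""
    dx, dy = MOVES[action]
    x, y = state
    return ((x + dx) % SIZE, (y + dy) % SIZE)

def wrapped_distance(x, light_x=LIGHT_X):
    """Horizontal distance to the light column on the wrap-around grid."""
    d = abs(x - light_x)
    return min(d, SIZE - d)

def observation_std(x, base_std=0.5, gain=0.5):
    """ASSUMPTION: the std grows linearly with distance to the light.
    base_std and gain are hypothetical values."""
    return base_std + gain * wrapped_distance(x)

def observe(state, rng):
    """Continuous observation: true (x, y) plus Gaussian noise."""
    x, y = state
    std = observation_std(x)
    return (rng.gauss(x, std), rng.gauss(y, std))

def initial_belief():
    """Uniform over all states except the goal."""
    states = [(x, y) for x in range(SIZE) for y in range(SIZE) if (x, y) != GOAL]
    p = 1.0 / len(states)
    return {s: p for s in states}
```

Usage: `observe((5, 7), random.Random(0))` returns a noisy (x, y) pair for the agent's start cell 'A' in the map.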
Results: Automatic vs Fixed Discretization (1 cut point per dimension)
Results: No Discretization
Setup
- 10x10 Light Dark domain with continuous observations
- Starts from initial uniform belief
- # runs: 1000
Results
Hallway environment
- Changed from MOMDP to POMDP
- Orientation included in belief state
- Forward action is stochastic, turn actions are deterministic
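A rough sketch of such a transition model, with the orientation as part of the state. The notes do not specify the failure behaviour, so everything here is an assumption for illustration: a failed or blocked forward move leaves the agent in place, `p_forward` is a hypothetical success probability, and the action names are made up.

```python
import random

# Orientation index -> (dx, dy); row 0 is the top row of the map.
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # north, east, south, west
TURNS = {"turn_left": -1, "turn_right": 1, "turn_around": 2}

def transition(grid, state, action, rng, p_forward=0.8):
    """Sample the next (x, y, heading) state.

    Turn actions are deterministic. The forward action succeeds with
    probability p_forward (hypothetical value); otherwise, or if the
    target cell is a wall or outside the map, the agent stays put
    (ASSUMPTION).
    """
    x, y, heading = state
    if action in TURNS:
        return (x, y, (heading + TURNS[action]) % 4)
    if action != "forward":
        raise ValueError("unknown action: " + action)
    dx, dy = HEADINGS[heading]
    nx, ny = x + dx, y + dy
    inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
    blocked = (not inside) or grid[ny][nx] == "1"
    if blocked or rng.random() >= p_forward:
        return state
    return (nx, ny, heading)
```

Because the heading is part of the sampled state, the same function can be used to propagate particles of the form (x, y, heading) in the belief update.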
Parser for POMDP files
- Started developing a parser for .POMDP files
- Using / adapting parser from libpomdp
Discretization for POMCP
- Equal width binning
- Predefined number of cut points per dimension (m)
- Number of dimensions (n)
- For each sequence node, there are (m+1)^n history nodes => lots of nodes!
- Used for the tree and to update the particle filter
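The binning above can be sketched as follows. This is an illustration, not the actual implementation: the observation bounds `lows` / `highs` and the clamping of out-of-range values are assumptions. With m cut points there are m + 1 equal-width bins per dimension, hence (m + 1)^n possible discretized observations over n dimensions.

```python
def make_discretizer(lows, highs, m):
    """Return a function mapping a continuous observation to a tuple of bin indices.

    m cut points per dimension give m + 1 equal-width bins per dimension.
    lows / highs are the assumed observation bounds for each dimension.
    """
    n_bins = m + 1
    widths = [(hi - lo) / n_bins for lo, hi in zip(lows, highs)]

    def discretize(obs):
        cell = []
        for value, lo, width in zip(obs, lows, widths):
            idx = int((value - lo) // width)
            # ASSUMPTION: noisy values outside the bounds go to the nearest edge bin.
            cell.append(min(max(idx, 0), n_bins - 1))
        return tuple(cell)

    return discretize

def num_observation_branches(m, n):
    """Upper bound on the number of child history nodes per sequence node."""
    return (m + 1) ** n
```

For example, with 1 cut point per dimension on a 10x10 grid, `make_discretizer([0, 0], [10, 10], 1)((2.3, 7.9))` gives `(0, 1)`, and each sequence node has at most `num_observation_branches(1, 2) == 4` children. The resulting tuple can serve both as the key for the tree's child nodes and as the observation used in the particle filter update.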
Hallway environment
- Based on [1]
- Actions: move forward, turn left, turn right, turn around
- Reward: -1 per action
- Observations: 4 wall detection sensors (with Gaussian noise), 1 landmark / goal detection sensor
- Belief: location of agent (agent's orientation is known)
- Initial belief distribution: uniform over all free cells
- Currently available maps (0 = free, 1 = wall, G = goal, L = landmark):
- 0000G
- 00000000000
  11L1L1L1G11
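A small sketch of the initial belief and the wall sensors on the second map. Several details are assumptions for illustration: "free cells" is taken to mean the cells marked 0, cells outside the map count as walls, the sensors report front / right / back / left relative to the heading, and `noise_std` is a hypothetical value. The landmark / goal sensor is left out.

```python
import random

# Second map from the list above; 0 = free, 1 = wall, G = goal, L = landmark.
HALLWAY = ["00000000000",
           "11L1L1L1G11"]

# Orientation index -> (dx, dy); row 0 is the top row of the map.
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # north, east, south, west

def is_wall(grid, x, y):
    """ASSUMPTION: cells outside the map are treated as walls."""
    if y < 0 or y >= len(grid) or x < 0 or x >= len(grid[0]):
        return True
    return grid[y][x] == "1"

def uniform_belief(grid):
    """Uniform initial belief over the cells marked '0'
    (ASSUMPTION: goal and landmark cells are excluded)."""
    cells = [(x, y) for y, row in enumerate(grid)
             for x, c in enumerate(row) if c == "0"]
    return {cell: 1.0 / len(cells) for cell in cells}

def wall_readings(grid, x, y, heading, rng, noise_std=0.1):
    """Four wall sensor readings (front, right, back, left):
    1.0 for a wall, 0.0 otherwise, plus Gaussian noise."""
    readings = []
    for turn in range(4):
        dx, dy = HEADINGS[(heading + turn) % 4]
        truth = 1.0 if is_wall(grid, x + dx, y + dy) else 0.0
        readings.append(truth + rng.gauss(0.0, noise_std))
    return readings
```

Usage: `wall_readings(HALLWAY, 0, 0, 0, random.Random(0))` returns four noisy values for an agent in the top-left cell facing north.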
[1] Michael Littman, Anthony Cassandra, and Leslie Kaelbling. Learning
policies for partially observable environments: Scaling up. In Armand
Prieditis and Stuart Russell, editors, Proceedings of the Twelfth
International Conference on Machine Learning, pages 362--370, San
Francisco, CA, 1995. Morgan Kaufmann.
Tree Visualization