- 10x10 Light Dark domain (wrap-around) with continuous observations:
- Actions: up, down, left, right
- Observations: location (x,y) corrupted by Gaussian noise whose standard deviation depends on the distance to the light
- Rewards: -1 per move
- Initial belief: uniform over all states except the goal
- Number of runs: 1000
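
For reference, the setup above can be sketched roughly as follows. This is only a minimal illustration, not the code used for the runs. The goal cell (`GOAL`), the light position (`LIGHT_X`), the noise scale (`BASE_STD`), the linear form of the noise, and the convention that "up" means +y are all assumptions, since none of them are specified above.

```python
import random

SIZE = 10        # 10x10 grid, as in the setup above
GOAL = (0, 0)    # assumption: the goal cell is not specified above
LIGHT_X = 5      # assumption: the light is modelled as the column x = 5
BASE_STD = 0.5   # assumption: the noise scale is not specified above

# Assumption: "up" increases y and "right" increases x.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Move one cell with wrap-around; every move has reward -1."""
    dx, dy = ACTIONS[action]
    x, y = state
    next_state = ((x + dx) % SIZE, (y + dy) % SIZE)
    return next_state, -1.0


def light_distance(state):
    """Wrap-around distance from the state's column to the light column."""
    d = abs(state[0] - LIGHT_X)
    return min(d, SIZE - d)


def observe(state, rng):
    """Return (x, y) corrupted by Gaussian noise.

    Assumption: the standard deviation grows linearly with the
    distance to the light; the exact form is not given above.
    """
    std = BASE_STD * (1 + light_distance(state))
    return (rng.gauss(state[0], std), rng.gauss(state[1], std))


def initial_belief():
    """Uniform belief over all states except the goal."""
    states = [(x, y) for x in range(SIZE) for y in range(SIZE) if (x, y) != GOAL]
    p = 1.0 / len(states)
    return {s: p for s in states}
```

A run would then start by sampling a state from `initial_belief()`, and alternate `step` and `observe` (with, say, `rng = random.Random(0)`) until the goal is reached.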
Results: No Discretization