Saturday, July 14, 2012

Feedback 14-07-12

Progress
  • Changed the light-dark domain so that the agent is placed at a randomly selected starting cell at the beginning of each episode
  • Implemented a heuristic for the light-dark domain:
    • Assumes the agent is in the grid cell with the highest belief
    • Takes the action that moves the agent from that cell to the neighbouring cell with the smallest Euclidean distance to the goal cell
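
A minimal sketch of the heuristic described above, not the actual implementation. The four-direction action set, the deterministic moves clamped at the grid border, and the names `ACTIONS`, `most_likely_cell` and `heuristic_action` are all assumptions made for illustration; the belief is assumed to be a dict mapping (row, col) cells to probabilities.

```python
import math

# Hypothetical action set as (row delta, column delta); the post does not list the actions.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def most_likely_cell(belief):
    """Return the (row, col) cell with the highest belief."""
    return max(belief, key=belief.get)


def heuristic_action(belief, goal, n_rows, n_cols):
    """Pick the action whose resulting cell is closest to the goal.

    The agent is assumed to be in the most likely cell. Moves are assumed
    deterministic and clamped at the grid border (an illustrative choice).
    """
    row, col = most_likely_cell(belief)
    best_action, best_dist = None, math.inf
    for name, (d_row, d_col) in ACTIONS.items():
        next_row = min(max(row + d_row, 0), n_rows - 1)
        next_col = min(max(col + d_col, 0), n_cols - 1)
        dist = math.hypot(next_row - goal[0], next_col - goal[1])
        # Strict comparison: on ties, the action listed first in ACTIONS wins.
        if dist < best_dist:
            best_action, best_dist = name, dist
    return best_action
```

With the goal at row 5, column 3 (0-indexed) and most of the belief mass on cell (5, 6), this sketch picks `"left"`. Ties are broken by the order of `ACTIONS`, which is an arbitrary choice.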
Experimental Set-up
  • Light-dark domain with the following layout:
********L*
********L*
********L*
********L*
********L*
***G****L*
********L*
********L*
********L*
********L* 
  • At the beginning of each episode, the agent is randomly placed at any cell except the goal G
  • Initial belief: uniform over all possible states, except the goal state
  • Number of episodes: 1000
  • Discount factor: 0.95
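
The set-up above could be encoded roughly as follows. This is only a sketch under stated assumptions: states are taken to be grid cells, `L` is read as a light cell and `*` as a dark cell (the post does not spell this out), and the helper names `non_goal_cells`, `uniform_initial_belief` and `sample_start` are hypothetical.

```python
import random

# Layout copied from the post; G is the goal. Reading L as "light" and
# * as "dark" is an assumption, and both are treated as ordinary states here.
LAYOUT = [
    "********L*",
    "********L*",
    "********L*",
    "********L*",
    "********L*",
    "***G****L*",
    "********L*",
    "********L*",
    "********L*",
    "********L*",
]

NUM_EPISODES = 1000
DISCOUNT_FACTOR = 0.95


def non_goal_cells(layout):
    """List every (row, col) cell except the goal."""
    return [
        (r, c)
        for r, row in enumerate(layout)
        for c, ch in enumerate(row)
        if ch != "G"
    ]


def uniform_initial_belief(layout):
    """Uniform belief over all cells except the goal."""
    cells = non_goal_cells(layout)
    p = 1.0 / len(cells)
    return {cell: p for cell in cells}


def sample_start(layout, rng):
    """Draw a random starting cell that is not the goal."""
    return rng.choice(non_goal_cells(layout))


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
belief = uniform_initial_belief(LAYOUT)
start = sample_start(LAYOUT, rng)
```

On the 10x10 grid this spreads the belief over 99 cells, each with probability 1/99, and the starting cell is drawn from the same 99 cells.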
Results

