Master Thesis Artificial Intelligence: Feedback 11-07-10

Wednesday, July 11, 2012

Experimental Set-up

Environment: Infinite Horizon Tiger Problem
Discount factor: 0.5
Discount horizon: 0.00001
Number of episodes: 5000
Number of steps: 2
Algorithms:

Both Keep 1st Node variants use perfect recall as the splitting strategy
Keep 1st Node + Update also updates the "first node" during backpropagation

Regret plots: the mean value achieved by the magic guesser is taken as the optimum
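
As a rough illustration of two of the settings above, here is a minimal Python sketch. It rests on assumptions that are not stated in this post: that the "discount horizon" is a threshold on the accumulated discount (the search stops once discount^depth falls below it), and that regret is the optimum minus the running mean return over episodes. The function names `discount_horizon_depth` and `regret_curve` are hypothetical and are not taken from the thesis code.

```python
from typing import List, Sequence


def discount_horizon_depth(discount: float, threshold: float) -> int:
    """Smallest depth t such that discount**t < threshold.

    Assumption: the "discount horizon" is a cutoff on the accumulated
    discount, not a number of steps.
    """
    if not (0.0 < discount < 1.0):
        raise ValueError("discount must lie strictly between 0 and 1")
    if not (0.0 < threshold < 1.0):
        raise ValueError("threshold must lie strictly between 0 and 1")
    depth = 0
    accumulated = 1.0
    # Multiply step by step instead of using logarithms, so the result
    # does not depend on floating-point rounding of log().
    while accumulated >= threshold:
        accumulated *= discount
        depth += 1
    return depth


def regret_curve(optimum: float, returns: Sequence[float]) -> List[float]:
    """Regret after each episode: optimum minus the running mean return.

    Assumption: `optimum` is the mean value of the reference policy
    (here, the magic guesser) and `returns` holds one return per episode.
    """
    curve = []
    total = 0.0
    for episode, value in enumerate(returns, start=1):
        total += value
        curve.append(optimum - total / episode)
    return curve


if __name__ == "__main__":
    # Settings from the set-up above: discount 0.5, discount horizon 0.00001.
    print(discount_horizon_depth(0.5, 0.00001))
    # Made-up returns, purely to show the shape of the output.
    print(regret_curve(10.0, [4.0, 8.0]))
```

Under that reading, a discount factor of 0.5 and a discount horizon of 0.00001 cut the search off at depth 17, since 0.5^16 is still above the threshold and 0.5^17 is below it.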

Results

Regret: Variations of COMCTS


Regret: Variations of TT

Mean: Variations of COMCTS

Mean: Variations of TT
