Saturday, May 26, 2012

Results: Tiger Problem

Experimental Set-up
  • Regret = optimal value - mean value
  • Optimal value = sum of the maximum positive reward achievable at each step (20 for horizon 2, 100 for horizon 10)
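
As a minimal sketch, the two definitions above can be written in Python as follows. The per-step maximum of 10 is an assumption inferred from the stated totals (20 / 2 and 100 / 10). The function names and the sample returns in the usage note are hypothetical and are not taken from the experiments.

```python
from statistics import mean

# Assumption: maximum positive reward per step, inferred from the totals
# stated in the post (20 for horizon 2, 100 for horizon 10).
MAX_STEP_REWARD = 10.0


def optimal_value(horizon, max_step_reward=MAX_STEP_REWARD):
    """Sum of the maximum positive reward achievable at each step."""
    return horizon * max_step_reward


def regret(returns, horizon, max_step_reward=MAX_STEP_REWARD):
    """Optimal value minus the mean of the observed values.

    `returns` is a hypothetical list with one total return per run.
    """
    return optimal_value(horizon, max_step_reward) - mean(returns)
```

For example, with made-up returns `[20.0, 0.0, -80.0]` at horizon 2, the mean value is -20, so `regret(...)` gives 20 - (-20) = 40.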

Results: Regret Plots
Horizon 2

Horizon 2 (zoomed in)

Horizon 10

Horizon 10 (zoomed in)
