Thursday, July 26, 2012

Feedback 26-07-12

Progress
  • Thesis: 
    • Wrote a section about related work
    • Started writing the conclusion chapter
    • Added performance experiments on the infinite-horizon Tiger problem
Overview of experiments
  • Performance
    • Sample based (# roll-outs vs mean)
      1. Variations of COMCTS
      2. Variations of TT
      3. Best COMCTS variation against best TT variation
      4. Variations with similar performance
    • Time based: same four experiments (time vs mean)
  • Practical time and space usage (time vs roll-outs)
  • Varying Noise (Standard deviation of noise vs mean)

Experiments: Time vs roll-outs, time vs mean


Set-up
  • Environment: infinite horizon Tiger problem
  • Discount factor: 0.5
  • Discount horizon: 0.00001
  • Number of episodes: 2,000
  • Algorithms:
    • BFAIRMC (pronounced: "be fair mc") = Belief based Function Approximation using Incremental Regression trees and Monte Carlo = transposition tree
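
A note on how the discount factor and discount horizon might interact. The sketch below assumes the discount horizon is a cut-off on the accumulated discount: a roll-out stops once discount^depth drops below 0.00001. That reading is an assumption, and the function names are made up for illustration; they are not taken from the thesis code.

```python
import math


def effective_depth(discount: float, threshold: float) -> int:
    """Smallest depth d such that discount**d < threshold.

    Assumption: the "discount horizon" acts as this threshold.
    """
    if not (0.0 < discount < 1.0):
        raise ValueError("discount must lie strictly between 0 and 1")
    if not (0.0 < threshold < 1.0):
        raise ValueError("threshold must lie strictly between 0 and 1")
    depth = math.ceil(math.log(threshold) / math.log(discount))
    # Guard against rounding in the logarithms: step forward until the
    # accumulated discount is really below the threshold.
    while discount ** depth >= threshold:
        depth += 1
    return depth


def discounted_return(rewards, discount: float) -> float:
    """Sum of rewards, with the reward at step t weighted by discount**t."""
    return sum(r * discount ** t for t, r in enumerate(rewards))


# Values from the set-up above.
depth = effective_depth(0.5, 0.00001)
```

Under this assumption a roll-out would be cut off after 17 steps, since 0.5^16 is still above 0.00001 and 0.5^17 is below it.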
Results
time vs roll-outs

time vs mean

time vs mean (zoomed in)

time vs regret (optimum - mean)
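
For reference, the regret in the last plot is just the optimum minus the mean, where the mean is taken over the returns of all episodes. A minimal sketch, with hypothetical names and made-up numbers (not results from these experiments):

```python
from statistics import fmean


def mean_return(episode_returns) -> float:
    """Average return over all episodes of one run."""
    return fmean(episode_returns)


def regret(optimum: float, episode_returns) -> float:
    """Regret as used in the plot: optimum - mean."""
    return optimum - mean_return(episode_returns)


# Made-up example: four episode returns and an assumed optimum of 2.0.
example_returns = [1.0, 2.0, 1.5, 1.5]
example_regret = regret(2.0, example_returns)
```

A regret of 0 means the mean return has reached the optimum; larger values mean the algorithm is further away from it.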
