Tuesday, July 10, 2012

Meeting 10-07-2012

Discussion
  • Thesis:
    • Structure stays as it is (some sections may still be moved later on)
    • Add research question about computational complexity of both algorithms
  • Transposition tree: recreate split tests from stored samples if perfect recall is used (but otherwise do not copy them)
  • Plots: 
    • Include data of up to 10^3 roll-outs
    • Regret: change y-axis to log scale
    • Thesis: 
      • Separate plots for the variations of each algorithm
      • Plots to compare best variants
      • Plots for variations with similar curves
  • Transposition tree: related work is function approximation and generalization with decision trees and regression trees in reinforcement learning
  • Transposition tree for light-dark domain:
    • Changes to light-dark domain:
      • Continuous state space with a square region for the goal
      • Discrete actions, corrupted by Gaussian noise
    • Beliefs represented as a bivariate Gaussian centered on the agent's actual location (x, y), with three parameters: mean(x), mean(y), and one standard deviation shared by both axes
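
A minimal sketch of the modified light-dark domain described above, to make the pieces concrete: a continuous 2D state, a square goal region, discrete actions with additive Gaussian noise, and a three-parameter Gaussian belief. All names (`Belief`, `step`, `in_goal`, `sample_state`) and all numeric values (goal position and size, noise level, unit step length) are assumptions chosen for illustration; they are not taken from the thesis code.

```python
import random
from dataclasses import dataclass

# Illustrative constants (assumed values, not from the meeting notes).
ACTIONS = {"up": (0.0, 1.0), "down": (0.0, -1.0),
           "left": (-1.0, 0.0), "right": (1.0, 0.0)}
GOAL_CENTER = (0.0, 0.0)   # center of the square goal region
GOAL_HALF_WIDTH = 0.5      # half the side length of the goal square
ACTION_NOISE_STD = 0.1     # std of the Gaussian noise added to each move


@dataclass(frozen=True)
class Belief:
    """Gaussian belief with three parameters: mean(x), mean(y), shared std."""
    mean_x: float
    mean_y: float
    std: float


def step(state, action, rng):
    """Apply a discrete action to a continuous state, corrupted by Gaussian noise."""
    dx, dy = ACTIONS[action]
    x, y = state
    return (x + dx + rng.gauss(0.0, ACTION_NOISE_STD),
            y + dy + rng.gauss(0.0, ACTION_NOISE_STD))


def in_goal(state):
    """Return True if the state lies inside the square goal region."""
    x, y = state
    return (abs(x - GOAL_CENTER[0]) <= GOAL_HALF_WIDTH
            and abs(y - GOAL_CENTER[1]) <= GOAL_HALF_WIDTH)


def sample_state(belief, rng):
    """Draw a concrete (x, y) state from the Gaussian belief."""
    return (rng.gauss(belief.mean_x, belief.std),
            rng.gauss(belief.mean_y, belief.std))
```

A planner could call `sample_state` at the root to draw a start state for each roll-out and then simulate with `step` until `in_goal` holds or a horizon is reached. The observation model and the belief update are left out here because the notes do not specify them.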
Planning
  • Hand-in of chapters 2 and 3 this Friday July 13
  • Next meeting: Monday July 16
