Monday, March 5, 2012

Feedback 05-03-2012

Parser
  • Since the benchmark problems do not contain continuous actions or observations, I think it is unnecessary to implement an RL-Glue parser for such POMDP problems
Paper by D. Silver ("Monte-Carlo Planning in Large POMDPs", Silver & Veness, NIPS 2010)
  • Combination of: 
    • MCTS (UCT) for optimal action selection
    • Particle filter for belief state approximation
  • The same simulations are used for both techniques (see the sketch after this list)
  • Search tree is based on: 
    • Histories in the nodes
    • Actions and observations in the edges (i.e. "action" edges followed by "observation" edges)
  • Each node is coupled to a set of particles that approximates the belief state (see the belief-update sketch below)
  • The algorithm (called POMCP) can make use of domain knowledge
  • POMCP performs well in discrete state spaces with up to 10^56 states and outperforms other online and offline planning algorithms
  • Possibility 1: Extend to continuous observations
  • Possibility 2: Build the tree on action-observation pairs
  • Possibility 3: Use a Bayes filter or a Kalman filter for the belief state approximation (although I think a particle filter is already the best choice)
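
To make the combination concrete, here is a minimal Python sketch of one POMCP simulation. It assumes a black-box generative model step(state, action) -> (next_state, observation, reward) with discrete (hashable) observations; the names Node, simulate, rollout, C_UCB, GAMMA and EPS are mine, not taken from the paper.

import math
import random

C_UCB = 1.0    # UCT exploration constant (tuned per domain in the paper)
GAMMA = 0.95   # discount factor (assumed here)
EPS = 1e-2     # stop a simulation once GAMMA**depth < EPS

class Node:
    """Tree node for one history h; edges alternate actions and observations."""
    def __init__(self):
        self.N = 0           # visit count N(h)
        self.V = 0.0         # value estimate V(h)
        self.children = {}   # action -> Node on one level, observation -> Node on the next
        self.particles = []  # unweighted states approximating the belief at h

def ucb_action(node, actions):
    # UCT: maximize V(ha) + c * sqrt(log N(h) / N(ha)); untried actions first
    def score(a):
        child = node.children.get(a)
        if child is None or child.N == 0:
            return float("inf")
        return child.V + C_UCB * math.sqrt(math.log(node.N) / child.N)
    return max(actions, key=score)

def rollout(state, step, actions, depth):
    # Value estimate for a newly added leaf, using a uniformly random
    # rollout policy (the paper also allows a domain-informed one)
    if GAMMA ** depth < EPS:
        return 0.0
    a = random.choice(actions)
    next_state, _, reward = step(state, a)
    return reward + GAMMA * rollout(next_state, step, actions, depth + 1)

def simulate(state, node, step, actions, depth):
    # One simulation grows the UCT tree AND refills the particle sets,
    # which is exactly the "same simulations for both techniques" point
    if GAMMA ** depth < EPS:
        return 0.0
    a = ucb_action(node, actions)
    next_state, obs, reward = step(state, a)
    action_node = node.children.setdefault(a, Node())
    is_new = obs not in action_node.children
    obs_node = action_node.children.setdefault(obs, Node())
    obs_node.particles.append(next_state)  # simulation doubles as filtering
    if is_new:
        future = rollout(next_state, step, actions, depth + 1)
    else:
        future = simulate(next_state, obs_node, step, actions, depth + 1)
    R = reward + GAMMA * future
    node.N += 1
    action_node.N += 1
    action_node.V += (R - action_node.V) / action_node.N  # incremental mean
    return R

A planning step runs many such simulations from states sampled from the root's particle set and then executes the action whose child node has the highest value estimate.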
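For the particle sets themselves, the belief update after the real action and observation can be sketched as rejection sampling: push particles through the same simulator and keep the successors whose simulated observation matches the real one. The paper additionally reinvigorates the particle set when too few survive; the max_tries cutoff below is my own simplification (reusing the imports above).

def update_belief(particles, real_action, real_obs, step, n_target, max_tries=100000):
    # Rejection-style belief update B(h) -> B(hao), a minimal sketch
    survivors = []
    tries = 0
    while len(survivors) < n_target and tries < max_tries:
        s = random.choice(particles)       # sample from the current belief
        s2, obs, _ = step(s, real_action)  # propagate through the simulator
        if obs == real_obs:
            survivors.append(s2)           # keep only matching successors
        tries += 1
    return survivors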
Planning
  • Preparation for meeting on Wednesday (March 7, 2012, 11:00)
