Master Thesis Artificial Intelligence: Feedback 05-07-2012

Progress

Starts from a single leaf node that represents the complete belief space ([0,1] in the Tiger problem)
Uses an F-test for splitting in the belief space
Collects test statistics for each action for the two "sides" of each test
Splits if there is a significant difference between the two sides for any action (e.g., between action 0 on the "left" side and action 0 on the "right" side)
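The split criterion above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the function names, the (belief, action, return) sample format, and the caller-supplied critical value `critical_f` are all assumptions, and the F statistic shown is the standard one-way ANOVA form for two groups.

```python
from statistics import mean


def f_statistic(left, right):
    """One-way ANOVA F statistic for two samples of observed returns."""
    n1, n2 = len(left), len(right)
    m1, m2 = mean(left), mean(right)
    grand = (sum(left) + sum(right)) / (n1 + n2)
    # Variation explained by which side of the test a sample fell on.
    ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
    # Variation left over inside each side.
    ss_within = (sum((x - m1) ** 2 for x in left)
                 + sum((x - m2) ** 2 for x in right))
    df_within = n1 + n2 - 2
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return ss_between / (ss_within / df_within)


def should_split(samples, threshold, critical_f):
    """Return True if any action differs significantly across the threshold.

    samples: iterable of (belief, action, ret) tuples seen in one leaf
             (hypothetical format).
    critical_f: critical value chosen by the caller (assumption); in
                practice it depends on the significance level and the
                degrees of freedom.
    """
    per_action = {}
    for belief, action, ret in samples:
        side = "left" if belief < threshold else "right"
        stats = per_action.setdefault(action, {"left": [], "right": []})
        stats[side].append(ret)
    for stats in per_action.values():
        # Need at least two samples per side to estimate a variance.
        if len(stats["left"]) >= 2 and len(stats["right"]) >= 2:
            if f_statistic(stats["left"], stats["right"]) > critical_f:
                return True
    return False
```

Keeping the statistics per action and per side mirrors the notes above: a single action whose returns differ between "left" and "right" is enough to trigger the split.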

Deletion (implemented)
Insertion of information from winning test into new action nodes (implemented)
Insert split tests into new leaves (not implemented)
Other strategies (like perfect recall)?

All of the following problems relate to the agent's true belief in the real environment at the current time step, and occur for both deletion and insertion
If there is a split near the end of the episode at a leaf whose range includes the agent's true belief, the algorithm cannot gather enough new data about the expected value of each action in that range to give reliable estimates.
Splitting sometimes creates a leaf with a large range, e.g., [0.5000, 0.9170]. If the agent's true belief is then 0.91, the estimated expected values of the actions are again not correct.
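A small numeric sketch may help show why a wide leaf is a problem. It rests on assumptions that are not in these notes: the commonly used Tiger rewards (+10 for opening the safe door, -100 for opening the tiger's door), beliefs spread evenly over the leaf, and only the immediate expected reward of an "open" action rather than the full action value. The function names are hypothetical.

```python
def open_reward(b, r_safe=10.0, r_tiger=-100.0):
    """Immediate expected reward of opening a door that is safe with probability b.

    r_safe and r_tiger are assumed reward values, not taken from the notes.
    """
    return b * r_safe + (1.0 - b) * r_tiger


def leaf_estimate(lo, hi, n=1000):
    """Average of open_reward over n evenly spaced beliefs in [lo, hi].

    Stands in for a leaf that pools all of its data into one estimate.
    """
    step = (hi - lo) / (n - 1)
    return sum(open_reward(lo + i * step) for i in range(n)) / n


at_true_belief = open_reward(0.91)      # roughly 0.1
pooled = leaf_estimate(0.5, 0.917)      # roughly -22.07
print(at_true_belief, pooled)
```

Under these assumptions the pooled estimate for the leaf [0.5000, 0.9170] is strongly negative, while the value at the true belief 0.91 is slightly positive, so the leaf would make opening the door look much worse than it is.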

Planning
