Master Thesis Artificial Intelligence: Feedback 09-07-2012

Monday, July 9, 2012

Feedback 09-07-2012

Progress

Transposition Tree: Implemented a variant that does not split the first node

The agent's initial belief is represented by an additional node
This "first node" is updated separately from all other nodes
A parameter allows this first node to also be updated during backpropagation whenever the belief falls within its range
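The following is a minimal sketch of how such a separately updated first node could look; it is not the thesis implementation. It assumes a one-dimensional belief (as in the Tiger problem), nodes that cover belief intervals, and a running-mean value estimate. The names `Node`, `BeliefTree`, and `update_first_in_backprop` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A node covering the belief interval [low, high] (assumed 1-D belief)."""
    low: float
    high: float
    visits: int = 0
    value: float = 0.0  # running mean of the returns seen so far

    def contains(self, belief: float) -> bool:
        return self.low <= belief <= self.high

    def update(self, ret: float) -> None:
        # Incremental mean update.
        self.visits += 1
        self.value += (ret - self.value) / self.visits


class BeliefTree:
    """Hypothetical tree with an extra, never-split node for the initial belief."""

    def __init__(self, first_low: float, first_high: float,
                 update_first_in_backprop: bool = False):
        self.first_node = Node(first_low, first_high)  # kept apart, never split
        self.nodes = [Node(0.0, 1.0)]                  # regular nodes (may be split)
        self.update_first_in_backprop = update_first_in_backprop

    def update_first(self, ret: float) -> None:
        # Separate update for the first step of a simulation.
        self.first_node.update(ret)

    def backpropagate(self, visited: list[tuple[float, float]]) -> None:
        # visited: (belief, return) pairs for the steps after the first one.
        for belief, ret in visited:
            for node in self.nodes:
                if node.contains(belief):
                    node.update(ret)
                    break
            # Optional extra update, controlled by the parameter.
            if self.update_first_in_backprop and self.first_node.contains(belief):
                self.first_node.update(ret)
```

With the flag switched off, the first node only changes through `update_first`; with it switched on, any later belief that lands in its interval also contributes to its estimate.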

Transposition Tree: Corrected the way an action is selected at the end of a simulation

Experimental Set-up

Environment: Infinite Horizon Tiger Problem
Discount factor: 0.5
Discount horizon: 0.00001

(with this setting, one roll-out means 25 updates to the tree)

Number of episodes: 5000
Number of steps: 2
Algorithms:

Both Keep 1st Node variants use Perfect Recall as the splitting strategy
Keep 1st Node + Update also updates the "first node" during backpropagation

Results

2 comments:

Michael Kaisers, July 9, 2012 at 8:26 AM
Dear Andreas,

Can you provide a regret-like plot? We don't know whether the magic guesser reward is achievable, but take it as the optimum for now.
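A regret-like curve of this kind could be computed as in the sketch below. It assumes that one mean reward per measurement point is available and that the magic guesser reward is passed in as the reference optimum; the function name `cumulative_regret` is hypothetical.

```python
from itertools import accumulate


def cumulative_regret(mean_rewards, optimum):
    """Running sum of (optimum - reward) over the measurement points.

    `optimum` is an assumed reference value (here: the magic guesser
    reward), not a proven upper bound on the achievable reward.
    """
    return list(accumulate(optimum - r for r in mean_rewards))
```

If the reference optimum turns out to be unreachable, the curve keeps rising even for an optimal agent, so its slope matters more than its absolute level.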

Also, a comparison to your non-TT version would be great, i.e., if you can compare TT vs. OT. I like both variants of keep 1st, but perfect recall appears to be dominating it (better in all cases), so it does not have the problem we expected (dip if splitting just before the end)?

If you perform a split, do you hand down the action tree of that split to the newly generated node?

Looking forward to the comparison and regret-like plots. I think you are on to something here, considering that the performance after one playout is already 'close' to optimal. Best regards, Michael
Andreas, July 9, 2012 at 9:39 AM
'If you perform a split, do you hand down the action tree of that split to the newly generated node?'

That is exactly what the 'insertion' strategy does.

'I like both variants of keep 1st, but perfect recall appears to be dominating it (better in all cases), so it does not have the problem we expected (dip if splitting just before the end)?'

I think that with 'perfect recall', this problem does not occur - only with 'deletion' and 'insertion'.