Master Thesis Artificial Intelligence
Saturday, April 28, 2012
Results: Runtime
Experimental Set-up
Mean value is taken over 1,000 episodes
Each data point is indicated by 'o'
Results
First 1,000 milliseconds
First 1,000 milliseconds (logarithmic scale)
Friday, April 27, 2012
Results: Horizon 100 Tiger Problem
Experimental Set-up
Mean value is taken over 1,000 episodes
Error bars represent standard error
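For reference, the standard error of the mean over n episodes (n = 1,000 here) is the sample standard deviation s of the per-episode returns divided by the square root of n:

\[
\mathrm{SE} = \frac{s}{\sqrt{n}}
\]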
Results
With random baseline
First 5,000 samples
Without random baseline
First 5,000 samples
Thursday, April 26, 2012
Results: Horizon 10 Tiger Problem
Experimental Set-up
Mean value is taken over 10,000 episodes
Error bars represent standard error
Results
With random baseline
First 10,000 samples
First 1,000 samples
Without random baseline
First 10,000 samples
First 1,000 samples
Monday, April 23, 2012
Feedback 23-04-12
Progress
Computed the maximum average reward that the random agent could achieve
Updated performance graphs (see below)
Implemented leaf visualization (see below)
Results
First 1,000 samples
First 100 samples
Visualization of Leaves
Tuesday, April 17, 2012
Feedback 17-04-12
Progress
Implemented flat Monte Carlo (uniform sampling for the first action choice); a minimal sketch of this scheme is given after this list
Experiments can now be run in two ways:
With graphical output (using RL-Viz, see below)
Or with plain text output (using .properties files for the settings)
Added RL-Viz to support:
Simple resetting of parameters
Simple selecting of agent and environment
Visualization of agent and environment
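As referenced above, a minimal sketch of flat Monte Carlo action selection: the first action is sampled uniformly, the rest of the rollout is left to the black-box simulator, and the action with the highest mean return is executed. The Simulator interface and method names are illustrative assumptions, not the actual classes used in the thesis code.

```java
import java.util.Random;

/** Sketch of flat Monte Carlo action selection. */
public class FlatMonteCarlo {

    /** Illustrative black-box simulator interface (hypothetical names). */
    public interface Simulator {
        int numActions();
        /** Simulates one rollout that starts with firstAction and returns its (discounted) return. */
        double rollout(int firstAction, Random rng);
    }

    public static int selectAction(Simulator sim, int numSamples, Random rng) {
        double[] sumReturn = new double[sim.numActions()];
        int[] count = new int[sim.numActions()];

        for (int i = 0; i < numSamples; i++) {
            int a = rng.nextInt(sim.numActions()); // uniform sampling of the first action choice
            sumReturn[a] += sim.rollout(a, rng);
            count[a]++;
        }

        // pick the action with the highest mean return (untried actions are skipped)
        int best = 0;
        double bestMean = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < sim.numActions(); a++) {
            if (count[a] > 0 && sumReturn[a] / count[a] > bestMean) {
                bestMean = sumReturn[a] / count[a];
                best = a;
            }
        }
        return best;
    }
}
```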
Friday, April 13, 2012
Feedback 13-04-12
Progress
Removed unnecessary information and functions from the observation tree (most of them were not used anyway)
The computation of the belief state update is correct (I had simply misread a number); the update rule is given below for reference
Changed from separate runs to simply running more episodes (10,000) per experiment
Belief and action trajectories can now be shown for each episode
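For reference, the belief update that was verified here is the standard Bayes filter for POMDPs: after taking action a in belief b and receiving observation o,

\[
b'(s') = \eta \, O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s),
\]

where T is the transition model, O the observation model, and \eta a normalizing constant.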
Thesis Structure
Wednesday, April 11, 2012
Preliminary Results
Horizon 2 Tiger Problem
Horizon 10 Tiger Problem
Thesis Structure
Monday, April 9, 2012
Feedback 09-04-12
Experiment Pipeline
Settings for experiments can now be loaded from files
Results of experiments ...
Can be plotted directly,
Can be stored in files suitable for Matlab / Octave (a sketch of such an output format is given below), and
Plots can also be saved as image files.
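As referenced above, a minimal sketch of the kind of plain-text output that Matlab / Octave can read directly with load('results.dat'); the file name, column layout, and class name are illustrative and not necessarily what the experiment pipeline actually uses.

```java
import java.io.IOException;
import java.io.PrintWriter;

/** Sketch of writing experiment results as a whitespace-separated numeric matrix. */
public class ResultsWriter {

    /** Writes one row per data point: sample count, mean return, standard error.
     *  Octave/Matlab can then read the file with load('results.dat'). */
    public static void write(String fileName, int[] samples, double[] mean, double[] stdErr)
            throws IOException {
        try (PrintWriter out = new PrintWriter(fileName)) {
            for (int i = 0; i < samples.length; i++) {
                out.printf("%d %f %f%n", samples[i], mean[i], stdErr[i]);
            }
        }
    }
}
```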
Planning
Finish experiment with Tiger Problem
Next meeting: April 11, 2012, 11 a.m.
Tuesday, April 3, 2012
Feedback 03-04-12
Algorithm for Continuous Observations
Implemented the black-box simulator for the POMDP
Implemented the discounted return (see the definition below)
Improved the algorithm's code
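For reference, the discounted return accumulated over an episode of length T with rewards r_1, ..., r_T and discount factor \gamma is

\[
G = \sum_{t=0}^{T-1} \gamma^{t}\, r_{t+1}.
\]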
Tiger Problem
Can now be made episodic by setting a maximum number of steps the agent may take (necessary for correct stepping in RL-Glue); a sketch of such a simulator is given below
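As referenced above, a minimal sketch of a generative (black-box) Tiger Problem simulator with an episodic step cap. It uses the textbook parameters (listening is correct with probability 0.85; rewards -1 for listening, +10 for opening the tiger-free door, -100 for opening the tiger door); the class name, interface, and exact values in the thesis code may differ.

```java
import java.util.Random;

/** Sketch of a generative Tiger Problem simulator with a maximum episode length. */
public class TigerSimulator {
    public static final int LISTEN = 0, OPEN_LEFT = 1, OPEN_RIGHT = 2;

    private final Random rng;
    private final int maxSteps;  // episode ends after this many steps (for correct stepping in RL-Glue)
    private int tigerPosition;   // 0 = tiger behind the left door, 1 = behind the right door
    private int lastObservation; // 0 = heard tiger left, 1 = heard tiger right
    private int stepsTaken;

    public TigerSimulator(int maxSteps, long seed) {
        this.maxSteps = maxSteps;
        this.rng = new Random(seed);
        reset();
    }

    /** Starts a new episode with the tiger behind a uniformly random door. */
    public final void reset() {
        tigerPosition = rng.nextInt(2);
        stepsTaken = 0;
    }

    /** Applies an action, updates the hidden state and observation, and returns the immediate reward. */
    public double step(int action) {
        stepsTaken++;
        if (action == LISTEN) {
            // the listening observation is correct with probability 0.85
            lastObservation = rng.nextDouble() < 0.85 ? tigerPosition : 1 - tigerPosition;
            return -1.0;
        }
        boolean openedTigerDoor = (action == OPEN_LEFT && tigerPosition == 0)
                               || (action == OPEN_RIGHT && tigerPosition == 1);
        double reward = openedTigerDoor ? -100.0 : 10.0;
        // after a door is opened the tiger is placed behind a random door again;
        // the accompanying observation carries no information
        tigerPosition = rng.nextInt(2);
        lastObservation = rng.nextInt(2);
        return reward;
    }

    public int lastObservation() { return lastObservation; }

    /** The problem is made episodic by terminating after maxSteps steps. */
    public boolean isTerminal() { return stepsTaken >= maxSteps; }
}
```

Setting maxSteps to 10 or 100 would correspond to the horizon-10 and horizon-100 experiments reported above.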
Output of Results from Experiments
Started working on it
Thesis
Gave it some structure
Wrote some sections for the background chapter
Planning
Finish the Output of Results from Experiments
After that: performance experiment for the Tiger Problem (#samples vs total reward)
Continue writing the background chapter