Model-free Q-learning in an MDP-style environment

Utilized code from Berkeley's CS188 Reinforcement Learning project

Introduced epsilon decay to shift gradually from early exploration to late exploitation
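
As a rough illustration of the epsilon decay idea, here is a minimal sketch. It assumes the decay is multiplicative and applied once per episode, which this README does not confirm; the function names `decayed_epsilon` and `epsilon_greedy` are hypothetical and are not taken from the project code.

```python
import random


def decayed_epsilon(epsilon0, decay, episodes):
    """Epsilon after `episodes` multiplicative decay steps (assumed schedule)."""
    return epsilon0 * decay ** episodes


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        # Explore: uniform random action index.
        return rng.randrange(len(q_values))
    # Exploit: index of the first action with the highest Q-value.
    return q_values.index(max(q_values))


# With epsilon = 1.0 and a decay of 0.9995, exploration fades slowly:
# the agent starts fully random and becomes almost fully greedy
# after several thousand decay steps.
for episode in (0, 1000, 5000, 10000):
    print(episode, round(decayed_epsilon(1.0, 0.9995, episode), 4))
```

Under this assumed schedule, a decay factor close to 1 keeps the agent exploring for a long time, which can help in sparse-reward environments where a reward is rarely seen early on.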

Q-learning parameters:

alpha = 0.1
epsilon = 1.0
gamma = 0.99
epsilon_decay = 0.9995
learning_decay = 1.0
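
To show how alpha and gamma enter the learning rule, here is a minimal sketch of the standard tabular Q-learning update. It is not the project's actual code: the `q_update` helper, the `(state, action)` dictionary layout, and the example state numbers are assumptions made for illustration. A learning_decay of 1.0 is read here as "alpha stays constant", which is also an assumption.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate (alpha above)
GAMMA = 0.99  # discount factor (gamma above)


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=ALPHA, gamma=GAMMA, done=False):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action in the next state; zero if the episode ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q = defaultdict(float)

# Hypothetical transition: reaching a terminal goal state with reward 1.
goal_value = q_update(q, state=14, action=2, reward=1.0,
                      next_state=15, n_actions=4, done=True)

# Hypothetical earlier transition: its value is bootstrapped from state 14.
prev_value = q_update(q, state=13, action=2, reward=0.0,
                      next_state=14, n_actions=4)
```

With alpha = 0.1, each update moves the estimate only a tenth of the way toward the target, so values propagate backward from the goal over many episodes.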


Program parameters:
python openai.py -v FrozenLake-v0 -a 0.1 -e 1.0 -g .99 --learningDecay 1.0 --explorationDecay .9995 -x 10000