- Model-free Q-learning in an MDP-style environment
- Uses code from Berkeley's CS188 Reinforcement Learning project
- Introduces an epsilon decay so the agent shifts from early exploration to late exploitation
- Q-learning parameters:
  - alpha = 0.1
  - epsilon = 1.0
  - gamma = .99
  - epsilon_decay = .9995
  - learning_decay = 1.0
- Program parameters:
  - python openai.py -v FrozenLake-v0
  - -a 0.1
  - -e 1.0
  - -g .99
  - --learningDecay 1.0
  - --explorationDecay .9995
  - -x 10000
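
The parameters above can be illustrated with a minimal sketch of tabular Q-learning with a multiplicative epsilon decay. This is not the code from `openai.py` or from the CS188 project; the class name `DecayingQAgent` and its methods are hypothetical, and it assumes the decay factors are applied once per episode and that `-x 10000` is the number of training episodes, neither of which the notes confirm.

```python
import random
from collections import defaultdict


class DecayingQAgent:
    """Tabular Q-learning with epsilon-greedy exploration (illustrative sketch)."""

    def __init__(self, actions, alpha=0.1, epsilon=1.0, gamma=0.99,
                 epsilon_decay=0.9995, learning_decay=1.0, seed=0):
        self.actions = list(actions)
        self.alpha = alpha                    # learning rate
        self.epsilon = epsilon                # probability of a random action
        self.gamma = gamma                    # discount factor
        self.epsilon_decay = epsilon_decay    # multiplicative decay for epsilon
        self.learning_decay = learning_decay  # multiplicative decay for alpha
        self.q = defaultdict(float)           # (state, action) -> value, default 0.0
        self.rng = random.Random(seed)

    def best_value(self, state):
        # Highest Q-value available from this state.
        return max(self.q[(state, a)] for a in self.actions)

    def choose_action(self, state):
        # Explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = self.best_value(state)
        # Break ties between equally good actions at random.
        return self.rng.choice(
            [a for a in self.actions if self.q[(state, a)] == best])

    def update(self, state, action, reward, next_state, done):
        # Standard Q-learning target; no bootstrapping from terminal states.
        target = reward
        if not done:
            target += self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def end_episode(self):
        # Assumption: both decays are applied once per episode.
        self.epsilon *= self.epsilon_decay
        self.alpha *= self.learning_decay
```

Under these assumptions, epsilon after 10000 episodes is 1.0 * 0.9995^10000, roughly 0.0067, so the agent acts almost entirely greedily by the end of training, while a learning decay of 1.0 leaves alpha fixed at 0.1.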