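import numpy as np
import matplotlib.pyplot as plt

# Sweep epsilon and estimate the return of each epsilon-greedy policy.
# T, behavioral_policy, make_epsilon_greedy_policy and evaluate_policy_return
# are not part of this paste and must be defined elsewhere.
policy_values = []
eps = np.linspace(0, 2/3, num=10)
for e in eps:
    policy = make_epsilon_greedy_policy(e)
    policy_values.append(evaluate_policy_return(T, behavioral_policy, policy))
plt.plot(eps, policy_values)
plt.show()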
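
The sweep above depends on helpers that the paste does not include. The following is only a minimal sketch of what they might look like, so the loop can be tried end to end. Everything in it is an assumption rather than the original author's code: the three-action setting, the `greedy_action` parameter, the `(state, action, reward)` trajectory format, and the use of ordinary importance sampling as the off-policy estimator.

```python
import numpy as np

N_ACTIONS = 3  # assumption: the paste does not say how many actions exist


def make_epsilon_greedy_policy(eps, greedy_action=0, n_actions=N_ACTIONS):
    """Hypothetical helper: returns policy(state) -> action probabilities.

    With probability eps an action is drawn uniformly; otherwise the
    greedy action is taken. The state is ignored in this toy version.
    """
    probs = np.full(n_actions, eps / n_actions)
    probs[greedy_action] += 1.0 - eps

    def policy(state):
        return probs

    return policy


def evaluate_policy_return(T, behavioral_policy, policy):
    """Hypothetical helper: ordinary importance-sampling estimate of the
    target policy's undiscounted return from trajectories T that were
    collected under behavioral_policy."""
    estimates = []
    for trajectory in T:
        weight, ret = 1.0, 0.0
        for state, action, reward in trajectory:
            # Reweight by how much more (or less) likely the target policy
            # is to take the logged action than the behavior policy was.
            weight *= policy(state)[action] / behavioral_policy(state)[action]
            ret += reward
        estimates.append(weight * ret)
    return float(np.mean(estimates))


# Toy data (assumption): one-step episodes, reward 1 only for action 0,
# logged under a uniform behavior policy (eps = 1).
behavioral_policy = make_epsilon_greedy_policy(1.0)
T = [[(0, 0, 1.0)], [(0, 1, 0.0)], [(0, 2, 0.0)]]
```

With these stand-ins defined before the sweep, the loop runs unchanged. Note that the importance weights require the behavior policy to give nonzero probability to every logged action.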