Untitled

a guest
Mar 30th, 2020
    def attempt(self):
        """Run one episode and return the sum of the rewards collected."""
        # Assumes the classic Gym API: reset() returns the observation and
        # step() returns a 4-tuple (observation, reward, done, info).
        observation = self.discretise(self.environment.reset())
        done = False
        reward_sum = 0.0
        while not done:
            action = self.pick_best_action(observation)
            new_observation, reward, done, info = self.environment.step(action)
            if done:
                # The reward of the terminating step is discarded.
                reward = 0.0
            new_observation = self.discretise(new_observation)
            self.update_knowledge(action, observation, new_observation, reward)
            observation = new_observation
            reward_sum += reward
        self.attempt_no += 1
        return reward_sum
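
The paste shows only attempt(); the class around it and the helpers it calls are not included. The block below is a minimal sketch of how the pieces might fit together, not the author's code. It assumes the helpers implement tabular Q-learning, which the paste does not confirm. LineWorld, QLearner, the constructor parameters (n_actions, alpha, gamma, bucket_width), the q table, and the bodies of discretise, pick_best_action and update_knowledge are all hypothetical. LineWorld is a toy stand-in that follows the same 4-tuple step() convention, so the sketch runs without Gym.

```python
from collections import defaultdict


class LineWorld:
    """Hypothetical toy environment: walk from 0.0 to 1.0 in steps of 0.25."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self):
        self.position = 0.0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, action 0 moves left; position stays in [0, 1].
        delta = 0.25 if action == 1 else -0.25
        self.position = min(1.0, max(0.0, self.position + delta))
        self.steps += 1
        done = self.position >= 1.0 or self.steps >= self.max_steps
        return self.position, -1.0, done, {}


class QLearner:
    """Assumed host class for attempt(); every helper body is a guess."""

    def __init__(self, environment, n_actions=2, alpha=1.0, gamma=0.9,
                 bucket_width=0.25):
        self.environment = environment
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate; 1.0 suits a deterministic toy
        self.gamma = gamma        # discount factor
        self.bucket_width = bucket_width
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.attempt_no = 0

    def discretise(self, observation):
        # Map the continuous position onto an integer bucket index.
        return int(round(observation / self.bucket_width))

    def pick_best_action(self, observation):
        # Purely greedy choice; ties go to the lowest action index.
        return max(range(self.n_actions),
                   key=lambda a: self.q[(observation, a)])

    def update_knowledge(self, action, observation, new_observation, reward):
        # One-step Q-learning update toward reward + gamma * max Q(next, .).
        best_next = max(self.q[(new_observation, a)]
                        for a in range(self.n_actions))
        target = reward + self.gamma * best_next
        old = self.q[(observation, action)]
        self.q[(observation, action)] = (1 - self.alpha) * old + self.alpha * target

    def attempt(self):
        # Same control flow as the pasted method.
        observation = self.discretise(self.environment.reset())
        done = False
        reward_sum = 0.0
        while not done:
            action = self.pick_best_action(observation)
            new_observation, reward, done, info = self.environment.step(action)
            if done:
                reward = 0.0
            new_observation = self.discretise(new_observation)
            self.update_knowledge(action, observation, new_observation, reward)
            observation = new_observation
            reward_sum += reward
        self.attempt_no += 1
        return reward_sum


agent = QLearner(LineWorld())
returns = [agent.attempt() for _ in range(10)]
```

Because the table starts at zero and every non-terminal step costs -1.0, the greedy policy is pushed to try untested actions without an explicit exploration term. A stochastic environment would normally call for alpha below 1.0 and some form of random exploration.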