Untitled

a guest
Mar 17th, 2021
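behaviors:
  Agent Controller:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048                  # experiences per gradient descent step
      buffer_size: 20480                # experiences collected before each policy update (10 x batch_size)
      learning_rate: 0.0003
      beta: 0.005                       # entropy regularization strength
      epsilon: 0.2                      # PPO clipping threshold
      lambd: 0.95                       # GAE lambda
      num_epoch: 3                      # passes over the buffer per update
      learning_rate_schedule: constant  # no learning-rate decay
    network_settings:
      normalize: true                   # normalize vector observations
      hidden_units: 512
      num_layers: 4
      vis_encode_type: simple           # only applies to visual observations
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:                        # intrinsic curiosity reward
        gamma: 0.99
        strength: 0.02
        encoding_size: 256
        learning_rate: 0.0003
    keep_checkpoints: 5
    max_steps: 50000000
    time_horizon: 1000                  # steps per agent before experiences go to the buffer
    summary_freq: 10000
    threaded: false
    self_play:
      save_steps: 50000                 # trainer steps between policy snapshots
      team_change: 100000               # trainer steps between switching the learning team
      swap_steps: 2000                  # ghost steps between opponent snapshot swaps
      window: 10                        # number of past snapshots to sample opponents from
      play_against_latest_model_ratio: 0.5  # probability of facing the latest policy
      initial_elo: 1200.0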