behaviors:
  MonsterAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 0.0003
      beta: 0.001
      epsilon: 0.2
      lambd: 0.99
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        strength: 0.02
        gamma: 0.99
        encoding_size: 256
        learning_rate: 3.0e-4
    keep_checkpoints: 5
    max_steps: 50000000
    time_horizon: 1000
    summary_freq: 12000
    threaded: true
    self_play:
      window: 10
      play_against_latest_model_ratio: 0.5
      save_steps: 50000
      swap_steps: 2000
      team_change: 100000