% trans(s,a): successor state reached from state s under action a
% (16 states, 4 actions).
trans = [  2,  4,  5, 13;
           1,  3,  6, 14;
           4,  2,  7, 15;
           3,  1,  8, 16;
           6,  8,  1,  9;
           5,  7,  2, 10;
           8,  6,  3, 11;
           7,  5,  4, 12;
          10, 12, 13,  5;
           9, 11, 14,  6;
          12, 10, 15,  7;
          11,  9, 16,  8;
          14, 16,  9,  1;
          13, 15, 10,  2;
          16, 14, 11,  3;
          15, 13, 12,  4 ];
% rew(s,a): immediate reward for taking action a in state s.
rew = [  0, -1,  0, -1;
         0,  0,  0, -1;
         0,  0,  0, -1;
         0, -1,  0, -1;
        -1, -1,  0,  0;
         0,  0,  0,  0;
         0,  0,  0,  0;
         0,  1,  0,  0;
        -1, -1,  0,  0;
         0,  0,  0,  0;
         0,  0,  0,  0;
         0,  1,  0,  0;
         0, -1,  0, -1;
         0,  0,  0,  1;
         0,  0,  0,  1;
        -1,  0, -1,  0 ];
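policy = ones(1,16);    % start with action 1 in every state
value  = zeros(1,16);   % start with zero value estimates
g = 0.2;                % discount factor

% Alternate greedy policy improvement with one in-place evaluation sweep.
for p = 1:100
    % Improvement: pick the action maximizing reward plus discounted value.
    for s = 1:16
        [dummy, policy(s)] = max( rew(s,:) + g * value(trans(s,:)) );
    end
    % Evaluation: one backup of each state's value under the current policy.
    for s = 1:16
        a = policy(s);
        value(s) = rew(s,a) + g * value(trans(s,a));
    end
end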
% Follow the learned policy for 9 steps, starting in state 1.
states = 1;
for r = 1:9
    s = states(r);
    states = [states, trans(s, policy(s))];
end

% walkshow is not defined in this paste; it has to be available on the path.
walkshow(states.')