- jacek 12:41PM
- ohai
- MSmits 12:41PM
- hey jacek
- KiwiTae 12:41PM
- i realised there are many tricks to stl when i tried to use c instead of cpp
- MSmits 12:42PM
- so jacek, i was meaning to ask you some questions about ML in oware specifically, but not sure how much you are sharing about this
- jacek 12:43PM
- its alright
- MSmits 12:43PM
- ok, so do you use selfplay or supervised in oware?
- Doju 12:43PM
- I just wish there was an option to select the interpreter for python here
- jacek 12:43PM
- selfplay
- MSmits 12:43PM
- when you train, do you play real games, or games with more depth or less depth?
- jacek 12:43PM
- or maybe 'self-supervised' would be more accurate
- MSmits 12:43PM
- when you generate data i mean
- jacek 12:44PM
- yes, real games with fast mcts
- MSmits 12:45PM
- what do you mean fast? Lower calculation time?
- jacek 12:45PM
- then i have position -> value of the mcts' root
- MSmits 12:45PM
- dont you use the endgame result as value to train on?
- jacek 12:45PM
- lower time, or even just fixed iterations count
- i used that in the past, but using value from shallow search is faster and gave better results
- even if in the first generations it comes from a random network
- at the very least, near endgames have accurate values
- MSmits 12:47PM
- well it makes sense that the endgame result is a poor value to use for early game states
- because it is too far away
- and in later states, the mcts root value should coincide with the endgame result anyway
- ok, so you don't have 6 outputs, one for each of the moves, like some other implementations
- jacek 12:48PM
- my training pipeline is similar to a0's now, just i train only value, and my target value is mcts result, not game final outcome
- yeah, only value
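The scheme jacek describes above — labeling each self-play position with the root value of a fast, fixed-budget MCTS rather than the final game outcome — could be sketched like this in Python; the game interface and `fast_mcts` are hypothetical stand-ins, not jacek's actual code:

```python
def generate_training_data(initial_state, play_move, legal_moves,
                           fast_mcts, num_games=10):
    """Self-play data generation: each visited position is labeled with
    the root value of a shallow MCTS search, not the final game result."""
    data = []
    for _ in range(num_games):
        state = initial_state()
        while legal_moves(state):
            # fast_mcts runs with lower time / fixed iteration count
            value, best_move = fast_mcts(state)
            data.append((state, value))   # target = search root value
            state = play_move(state, best_move)
    return data
```

The (state, value) pairs would then feed the usual batch-train / validate loop the pipeline discussion below refers to.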
- MSmits 12:49PM
- ok, so do you use convolution layers?
- jacek 12:49PM
- still simple MLP
- MSmits 12:49PM
- seems less useful in oware because it is not a 2D game
- in what way is it close to azero then?
- azero uses convolution and resnet
- jacek 12:49PM
- i mean the training pipeline
- MSmits 12:50PM
- oh ok
- generating data in batches, training, validation etc.
- jacek 12:50PM
- yes
- MSmits 12:50PM
- well you do exactly what i would prefer to try, it seems like a good baseline to experiment with
- jacek 12:50PM
- it is evaltype-agnostic
- could be nn, n-tuple or handcrafted features
- MSmits 12:51PM
- yeah it's a nn, it could be whatever :)
- oh
- you mean the pipeline
- yeah
- jacek 12:51PM
- well anything that has adjustable parameters
- MSmits 12:51PM
- you could use it to train handcrafted features
- parameters
- yea
- thats something i need also
- so could learn 2 things at once here
- if i stick with it
- kovi 12:52PM
- bookmark
- MSmits 12:52PM
- do you use anything like tensorflow?
- or is it fully homemade
- jacek 12:52PM
- no, i made things in c++ from scratch
- MSmits 12:52PM
- thats great, also what i want to try
- that way you can more easily get it into CG
- kovi 12:53PM
- not sure if c++ tensor fits into cg
- yeah
- MSmits 12:53PM
- he can just make a long string and then convert it into tensors
- thats not the hard part
- kovi 12:53PM
- yeah but runtime lib
- MSmits 12:53PM
- the hard part is writing efficient matrix calculations
- jacek 12:53PM
- it all started when i finally rewrote the xor example from python to c++
- kovi 12:53PM
- it was not written with the 100k limit in mind
- MSmits 12:53PM
- right, thats good
- jacek 12:53PM
- i think i have somewhere the python example without using np
- MSmits 12:54PM
- when you say "the python example" which one do you mean?
- Doju 12:54PM
- Hmm, I must be doing something wrong because I want to make pretty much everything protected instead of private
- kovi 12:54PM
- there are a0 examples
- MSmits 12:54PM
- yeah, but we are doing this far below a0 level :)
- i dont want to touch that anymore
- jacek 12:54PM
- http://chat.codingame.com/pastebin/429ca1e6-890a-4b94-9773-49404526b36a
- MSmits 12:54PM
- a0 is so complicated
- kovi 12:55PM
- true, jacek value based thing is pretty wise
- jacek 12:55PM
- XOR mlp, 1 hidden layer
- no fancy numpy
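The paste jacek links is described as an XOR MLP with one hidden layer and no numpy; a minimal pure-Python version of that idea (the hidden size, learning rate, and epoch count are my own guesses, not the paste's contents) could look like:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class XorMLP:
    """2-input MLP with one sigmoid hidden layer, plain-list weights,
    trained by per-sample backprop on squared error."""
    def __init__(self, hidden=4, seed=0):
        rnd = random.Random(seed)
        self.w1 = [[rnd.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rnd.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x):
        self.h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        self.o = sigmoid(sum(w * h for w, h in zip(self.w2, self.h)) + self.b2)
        return self.o

    def train_step(self, x, target, lr=0.5):
        o = self.forward(x)
        d_o = (o - target) * o * (1 - o)          # output delta (MSE + sigmoid)
        for j, h in enumerate(self.h):
            d_h = d_o * self.w2[j] * h * (1 - h)  # hidden delta, pre-update w2
            self.w2[j] -= lr * d_o * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_h * xi
            self.b1[j] -= lr * d_h
        self.b2 -= lr * d_o

data = [([0, 0], 0.0), ([0, 1], 1.0), ([1, 0], 1.0), ([1, 1], 0.0)]
net = XorMLP()
initial_loss = sum((net.forward(x) - t) ** 2 for x, t in data)
for _ in range(5000):
    for x, t in data:
        net.train_step(x, t)
final_loss = sum((net.forward(x) - t) ** 2 for x, t in data)
```

Slow compared to numpy, but every multiply-add is visible, which is the point made below about learning.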
- Doju 12:55PM
- Oh jeez
- are you doing neural nets without numpy?
- MSmits 12:55PM
- thats great jacek, finally something without numpy
- jacek 12:55PM
- just for learning for myself
- Doju 12:55PM
- thats nuts :o
- jacek 12:56PM
- no numpy in c++ standard libs
- MSmits 12:56PM
- numpy makes things faster, but it doesnt make it more clear to learn
- Doju 12:56PM
- great if you want to learn
- yeah thats true
- if the objective is to learn instead of making a fast thing then that makes sense
- kovi 12:56PM
- but without numpy usually 1/10 speed
- MSmits 12:56PM
- jacek for the matrix calcs, did you use any c++ library, or did you just figure out what intrinsics and other tricks to use, yourself?
- kovi 12:57PM
- and without tf/gpu 1/100
- or worse
- MSmits 12:57PM
- kovi if i take weeks to code something and the training takes 24 hrs instead of 1 hr, it's fine :)
- jacek 12:57PM
- i use good old for loops, not even intrinsics
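The "good old for loops" layer evaluation jacek means — no intrinsics, no BLAS — amounts to a plain matrix-vector product plus bias; an illustrative version:

```python
def dense_forward(weights, bias, x):
    """One dense layer evaluated with plain for loops:
    out[j] = bias[j] + sum_i weights[j][i] * x[i]."""
    out = []
    for row, b in zip(weights, bias):
        s = b
        for w, xi in zip(row, x):
            s += w * xi
        out.append(s)
    return out
```

Compilers auto-vectorize loops like this reasonably well in C++, which is partly why skipping intrinsics can still be workable.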
- kovi 12:57PM
- oh, you do c++ calc, sorry
- then its ok
- MSmits 12:58PM
- I see, doesn't it bother you that it might be much faster jacek, with some tricks?
- i mean obviously you dont need more speed atm
- jacek 12:58PM
- yeah, NN eval is most consuming part of my code
- and i tried several times. im just too dumb
- MSmits 12:59PM
- well if I ever get to the point where I can write this stuff and it's better than what you have, I will share it with you
- might be a couple months. I want to get something before the end of my summer vacation. Trying to be generous with my time estimate here, apparently it's hard to learn
- jacek 01:00PM
- and i finally got this to work for Y. its 5th without an opening book
- MSmits 01:01PM
- nice one, Robo did as well
- but yavalath has huge problems with determinism because of the early game endings
- jacek 01:01PM
- its N-tuple with small hidden layer, MLP-tuple :v
- MSmits 01:01PM
- ohh ok
- I am going to be solving connect4 i think
- Doju 01:02PM
- welp now i've got circular dependencies
- jacek 01:03PM
- determinism... in training i choose final moves according to softmax
- it was another thing that i lacked before
- allows for exploration but not too dumb exploration
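Choosing training moves "according to softmax" as jacek describes could be sketched like this: sample a root move with probability proportional to exp(score / T). The scoring input and temperature value are assumptions on my part:

```python
import math
import random

def softmax_pick(moves, scores, temperature=1.0, rnd=random):
    """Sample a move with probability proportional to exp(score / T).
    High T -> near-uniform exploration; low T -> near-greedy play."""
    m = max(scores)
    weights = [math.exp((s - m) / temperature) for s in scores]  # shift max for stability
    total = sum(weights)
    r = rnd.random() * total
    acc = 0.0
    for move, w in zip(moves, weights):
        acc += w
        if acc >= r:
            return move
    return moves[-1]  # guard against float rounding
```

This is the "not too dumb exploration" property: bad moves are still tried occasionally, but in proportion to how bad the search thinks they are.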
- MSmits 01:03PM
- I see
- hey, you train on cpu right?
- jacek 01:04PM
- yeah
- MSmits 01:04PM
- i read about gpu being 20-100 times faster
- but i feel that's probably also because when people do that they have 4 really expensive ones running at once
- doubt i'd achieve that factor with mine
- DomiKo 01:05PM
- not really
- jacek 01:05PM
- i have rather small nn
- WOLFRAHH 01:05PM
- hii guys what is going on
- jacek 01:05PM
- not quite parallelizable
- MSmits 01:05PM
- ahh ok
- jacek 01:05PM
- well maybe for training batch itself, the gpu would come in handy
- MSmits 01:05PM
- seems so difficult to write that yourself
- i would prefer to do it with tensorflow then and just convert their models somehow
- jacek 01:06PM
- thats why i havent written convnets yet. i could write gazillion layers etc. in python but at first i want to make something small myself
- MSmits 01:06PM
- and resnets?
- jacek 01:07PM
- too
- WOLFRAHH 01:07PM
- can anybody tell what is going on
- MSmits 01:07PM
- convnets supposedly help for games like othello/yavalath, where the surroundings of a hex/square are important
- doubt it would help much for oware
- jacek 01:08PM
- my NNs so far have at most 2 hidden layers, so resnets are pointless.
- MSmits 01:08PM
- yeah
- 2 is not much at all
- jacek 01:08PM
- also i mostly exploit the fact that there is little change between game states, i.e. only few squares are affected
- MSmits 01:08PM
- did you experiment with trading layer size for depth?
- how do you exploit this?
- jacek 01:09PM
- yeah, and this is what i came up with for my framework and cg constraints
- for input/first hidden layer you need to only update the values instead of summing everything all over again
- partial updates, the main idea behind nnue
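The partial-update idea (the core of NNUE) can be illustrated as follows: keep the first hidden layer's pre-activation accumulator, and when a move toggles a few input features, subtract and add only the affected weight columns instead of re-summing everything. The feature encoding here is a made-up illustration, not a real game's:

```python
def full_accumulator(weights, active_features):
    """Recompute first-hidden-layer pre-activations from scratch."""
    hidden = len(weights[0])
    acc = [0.0] * hidden
    for f in active_features:
        for j in range(hidden):
            acc[j] += weights[f][j]
    return acc

def apply_move(acc, weights, removed, added):
    """NNUE-style partial update: only features toggled by the move
    are subtracted or added; cost is O(changed features), not O(all)."""
    for f in removed:
        for j in range(len(acc)):
            acc[j] -= weights[f][j]
    for f in added:
        for j in range(len(acc)):
            acc[j] += weights[f][j]
    return acc
```

Since a move usually touches only a few squares, the incremental path is much cheaper than a full recompute, during both search and training-data generation.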
- MSmits 01:11PM
- oh you mean it's a performance improvement
- do you mean during training or running a game?
- jacek 01:11PM
- yes
- well both
- MSmits 01:12PM
- well that seems useful and alleviates the problem with you just using for loops
- jacek 01:12PM
- though i do not use that in oware
- MSmits 01:12PM
- it's easier to implement improvements like this when your code is not stuck in weird intrinsic and avx stuff
- proace21 01:13PM
- hi
- MSmits 01:13PM
- once you've gotten into that, you generally dont touch the code anymore. At least I dont