
part 0.1
creating a VM instance


in google cloud, open the console :
https://console.cloud.google.com/home/

then click on the hamburger menu (top left) -> compute engine -> VM instances
then create a new VM instance, and choose a custom machine type with these settings :

- scroll through the regions and zones until you find one that has a V100 available, then choose one of the two configurations below :

note : the number of vcpu/ram can be changed anytime after instance creation

-> (recommended) if you want to run 2 games simultaneously with ./autogtp -g 2 (20% faster, but a higher probability of being stopped by the preemptible low priority rules, because one machine uses more cloud hardware) :
4 vcpu
5.5 gb ram
for more details on how these hardware needs are calculated, please see part 5e

-> (not recommended) or, if you want to run one game at a time with ./autogtp (20% slower, but a lower chance of being stopped by the preemptible rules) :
2 vcpu
3.5 gb ram


then, the rest is common for both cases :
- 10 GB HDD
- 1 GPU : Tesla V100 (maximum for free trial)
- system : ubuntu 18.04 LTS
- allow HTTP and HTTPS requests
- preemptible settings (60% discount on the free credit consumption, in exchange for occasional power-offs of the instance; it takes 2 minutes to start again, totally worth it)
to activate the preemptible option :
       Click Management, security, disks, networking, sole tenancy.
       Under Availability policy, set the Preemptibility option to On. This setting disables automatic restart for the instance, and sets the host maintenance action to Terminate.
       Click Create to create the instance.
more details about preemptible here : https://cloud.google.com/compute/docs/instances/create-start-preemptible-instance
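
for information, the same settings can also be expressed with the gcloud command line tool. The sketch below only prints a command, it does not run it. The instance name and the zone are placeholders that are not part of this tutorial, the image family ubuntu-1804-lts in project ubuntu-os-cloud is assumed to still be offered, and the http-server/https-server tags are assumed to match the firewall rules that the console checkboxes create :

```shell
#!/bin/sh
# Placeholders (assumptions, not from the tutorial): pick your own name,
# and a zone where a V100 is actually available to you.
INSTANCE_NAME="leela-v100"
ZONE="us-central1-a"

# Print (do not run) a gcloud command mirroring the console settings:
# 4 vcpu, 5.5 GB RAM (5632 MB), 10 GB disk, 1 V100, ubuntu 18.04, preemptible.
print_create_command() {
    printf '%s\n' "gcloud compute instances create $INSTANCE_NAME \\
  --zone=$ZONE \\
  --custom-cpu=4 --custom-memory=5632MB \\
  --accelerator=type=nvidia-tesla-v100,count=1 \\
  --preemptible --maintenance-policy=TERMINATE \\
  --image-family=ubuntu-1804-lts --image-project=ubuntu-os-cloud \\
  --boot-disk-size=10GB \\
  --tags=http-server,https-server"
}

print_create_command
```

review the printed command before pasting it anywhere; the console steps above remain the reference for this tutorial.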


for information :
preemptible instances are much cheaper (if i remember well, around 650 dollars vs 1400 dollars per month of free credit consumed on a V100 VM instance), so choosing a preemptible instance when you create it is a no-brainer.
From my experience, the VM instance will be stopped by Google at most 1 or 2, or very rarely 3 times per 24 hours, which leaves on average at least 5-10 hours of use in a row before the first stop.
And then, restarting the instance only takes 1 click and 1 minute as we will see later, and you are good to go again for many hours.
Note that when a preemptible instance is stopped by Google, the credit stops being consumed too, so you don't have to worry about wasting credit.
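
to give an idea of what this price difference means, here is a rough calculation of how many hours of V100 time the free credit buys in each case. It is only a sketch : it uses the 300 dollars free credit mentioned in part 5b, the approximate monthly prices quoted above (which are from memory), and it assumes a 30 day month of 720 hours :

```shell
#!/bin/sh
CREDIT_USD=300        # free trial credit (see part 5b)
HOURS_PER_MONTH=720   # assumption: 30 days x 24 hours

# Hours of run time the credit buys at a given monthly price (integer division).
credit_hours() {
    monthly_price_usd="$1"
    echo $(( CREDIT_USD * HOURS_PER_MONTH / monthly_price_usd ))
}

echo "preemptible (about 650 dollars/month) : $(credit_hours 650) hours"
echo "regular (about 1400 dollars/month) : $(credit_hours 1400) hours"
```

with these approximate prices, the preemptible instance runs for about 332 hours on the credit, against about 154 hours for a regular one.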


part 0.2 :
preparations

go to the google console (compute engine) -> VM instances -> click on the SSH button to connect to the instance via SSH (embedded in chrome)

To read before starting :
The instance will be opened in a new chrome window that uses SSH protocol to connect to your VM instance.
To make copying commands easier, in the SSH chrome new window, go to : settings → copy paste with ctrl+shift+c/v : click ok
From now on, we will use ctrl+c to copy a command from this text, but ctrl+shift+v to paste it on the SSH window (because it is ubuntu terminal)


finally, for information, before starting next parts :
About preemptible instances again : if, unfortunately, the instance is stopped by google while system packages are being installed, the probability of system corruption is high. If this very rare case happens, i advise you to delete the instance and create a new one, and hopefully (unless you are very unlucky, but then try again !) the new instance will not be stopped during the install.


parts 1 and 2 and 3 :
update, upgrade, then install all pre leela-zero packages, then compile leela-zero+autogtp in one run !


(before you start to run it, you may want to read the detail of what every part of the command does)

for information : after the first reboot, you lose the connection to the VM instance : wait 1-2 minutes and retry 1 or 2 times, and it should reconnect as long as the instance is ON (check on this page that the instance shows a green circle and that the SSH button is clickable) :
https://console.cloud.google.com/compute/instances
if "retry" does not work and the instance is still ON (green circle), then keep retrying until you manage to connect via SSH again


select all the big all in one command below then copy paste it
note : this command includes NEXT BRANCH as it is much faster and includes all new improvements


sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl glances zip && git clone https://github.com/gcp/leela-zero && cd leela-zero && git checkout next && git pull && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && sudo reboot


if you don't want to read the details of what this big all-in-one command does (not needed), you can skip all the text below until part 4 (how to run autogtp)
(note : instance templates and managed instance groups for automatic restart of preemptible instances will be added later)


DETAILS OF THE ALL IN ONE BIG COMMAND :

part 1o) upgrading system

sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade


part 1a (nvidia easiest alternative)
installing nvidia long lived branch drivers (old stable)

sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev


official website : https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
the "Current long-lived branch release" at the time of this tutorial, as listed on the ppa website linked above, is `nvidia-410` (410.66).
if the current long-lived branch gets an update in the future, replace 410 (in nvidia-driver-410 in the command above) with the driver version number mentioned there.
For example, if the next nvidia "Current long-lived branch release" was nvidia-413, you would replace nvidia-driver-410 with nvidia-driver-413 without changing the rest of the command.
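
the version substitution described above can be sketched as a small helper that only prints the part 1a command for a given version number. The function name is made up for this illustration, and 413 is the hypothetical version used in the example above, not a real release recommendation :

```shell
#!/bin/sh
# Print (do not run) the part 1a install command for a given driver version.
driver_install_cmd() {
    version="$1"
    echo "sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-${version} linux-headers-generic nvidia-opencl-dev"
}

# 413 is the hypothetical future version from the example above.
driver_install_cmd 413
```

only the number after nvidia-driver- changes, the rest of the command stays identical.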

for information : before running the part 1a command, note that you may prefer to install the latest/fresh nvidia driver instead of the long-lived branch, for potential extra performance. If so, skip the part 1a command and go directly to part 1b


part 1b (optional nvidia 2nd alternative)
installing nvidia driver latest (short lived branch)


part 1b requires that you have read the instructions of part 1a, but that you did not run any command from part 1a !
it is optional, so you can skip it if you are fine with part 1a, but run either part 1a (old stable branch) or part 1b (latest fresh branch), not both !
in this part, we'll install the latest nvidia driver available instead, whether it is on the short-lived or the long-lived branch :

-> if you're lazy :
this all-in-one command works directly, but possibly not with the latest version number
at the time i'm writing this tutorial, 396 is the latest version number, but in the future, replace 396 with the latest driver version available

sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && apt-cache search nvidia && sudo apt-get -y install nvidia-driver-396 linux-headers-generic nvidia-opencl-dev


-> if you want to know which nvidia packages are available for installation and choose the latest option available :
then don't run the command above but run this one instead :


sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && apt-cache search nvidia
(or apt search nvidia instead of apt-cache search nvidia)

as explained above, at the time i'm writing this tutorial, nvidia 396 is the latest short-lived version; we find it at the bottom of the output of the command we just ran, something like this :
nvidia-compute-utils-396 - NVIDIA compute utilities
nvidia-dkms-396 - NVIDIA DKMS package
nvidia-driver-396 - NVIDIA driver metapackage
nvidia-kernel-source-396 - NVIDIA kernel source package
nvidia-utils-396 - NVIDIA driver support binaries
xserver-xorg-video-nvidia-396 - NVIDIA binary Xorg driver
nvidia-kernel-common-396 - Shared files used with the kernel module
nvidia-headless-396 - NVIDIA headless metapackage
nvidia-headless-no-dkms-396 - NVIDIA headless metapackage - no DKMS

then run the following command :

sudo apt-get -y install nvidia-driver-396 linux-headers-generic nvidia-opencl-dev

note : the ppa will be updated as new driver versions get released, so you may want to rerun these steps from time to time to upgrade the nvidia driver to the latest version
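
instead of reading the package list by eye, the highest nvidia-driver-NNN number can be extracted with a small filter. This is only a sketch : the function name is made up, it assumes the "package - description" output format shown above, and the sample lines below are illustrative input, not real output :

```shell
#!/bin/sh
# Read "apt-cache search nvidia" style output on stdin and print the
# highest version number found among the nvidia-driver-NNN metapackages.
latest_nvidia_driver() {
    grep -oE '^nvidia-driver-[0-9]+' | sed 's/^nvidia-driver-//' | sort -n | tail -n 1
}

# Illustrative sample input (on the VM, pipe the real output instead).
printf '%s\n' \
    'nvidia-driver-390 - NVIDIA driver metapackage' \
    'nvidia-driver-396 - NVIDIA driver metapackage' \
    'nvidia-utils-396 - NVIDIA driver support binaries' | latest_nvidia_driver
```

on the VM, the equivalent would be : apt-cache search nvidia | latest_nvidia_driver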


part 2 :
installing all other prerequisite packages


sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl glances zip


part 3 :
compiling leela zero, autogtp, and running autogtp (NEXT branch)


if you want to use MASTER BRANCH (not recommended, slower, possibly more stable), see optional part 6d; since it is slower i will not display the instructions here


at the moment i'm writing this tutorial, NEXT BRANCH is around 30% faster at producing games than master (leelav15/autogtpv16) in my personal tests, but it can be less stable and may have bugs, depending on which commits are approved, etc. It's running very well at the moment though.
i recommend using NEXT BRANCH as long as it doesn't have any major bug (thanks @seopsx and @alreadydone for helping me with the instructions)

for manual instructions on how to compile and run leela-zero with autogtp, you can see parts 6e and 6f if you're interested, but these are not needed as they are included in the all-in-one big command


after the ALL IN ONE COMMAND, instructions now continue in part 4 :


part 4 :
how to run autogtp again (if exited)


this will run next branch (recommended) if you did part 3, or master branch (not recommended) if you skipped to part 6d


cd leela-zero/autogtp
./autogtp


note : this command :
./autogtp -g 2
instead of ./autogtp produces games significantly faster, because when a 0% resign game is generated, the extra gpu power of the v100 can be used to produce another game simultaneously
(265 games/24 hours VS 208 games/24hours)
see this comment for details :
https://github.com/gcp/leela-zero/issues/1905#issuecomment-428612310
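
to put the two figures above in perspective, the relative gain of -g 2 can be computed from them. This is a simple sketch using integer arithmetic (the result is rounded down), and the function name is made up for the illustration :

```shell
#!/bin/sh
# Percentage gain of a "new" figure over an "old" one (rounded down).
percent_gain() {
    new="$1"
    old="$2"
    echo $(( (new - old) * 100 / old ))
}

# 265 games/24 hours with -g 2 versus 208 games/24 hours with plain ./autogtp
echo "-g 2 produces about $(percent_gain 265 208)% more games per day"
```

with the figures above, this gives about 27% more games per 24 hours.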

note 2 : this part 4 will be replaced by the managed instance groups and instance templates


FINISHED !
you can now contribute to leela-zero !
parts 5 and 6 below are optional details if you're interested :


part 5 (optional) :
monitoring of the VM


a) check if instance is ON or OFF, and manage your instance :
https://console.cloud.google.com/compute/instances
for information : your free credit does NOT get charged as long as the machine is off; the only point of staying always ON is to compute as many selfplay games as possible.


b) check how much of the free 300 dollars credit you have left :
https://console.cloud.google.com/billing/

(this credit is totally free, you don't get charged at the end of the free trial)


c) with a startup-script (will be added soon when i rework part 4)
games can run automatically at startup of the instance without needing to open any SSH window,
and keep running even if your own computer is powered off (independent), as long as the instance shows the green circle (started)


d) for whatever reason, if you want to stop the instance :

go in the console VM instances page :
https://console.cloud.google.com/compute/instances

then, click on the settings menu next to your instance (the 3 dots at the right of the SSH button)
and choose STOP


A grey square shows that the instance is stopped (not consuming free credit anymore), and a green circle shows that the instance is running (thus consuming the free credit)


e) optimal hardware needs for an instance, how to calculate :

to calculate the optimal hardware needs, this is what you need to know (the goal is to use as little hardware as possible, so that the preemptible low priority rules are triggered as rarely as possible)
we always have to assume the worst case scenario (the most resource-consuming situation) to avoid bottlenecks


-> what is the optimal number of simultaneous games
after many tests, i found that -g 2 (run it with : ./autogtp -g 2) is the fastest (16 +/- 1 games in 60 minutes with 5% resign)
"-g 1" (one game after another) gives 13 +/- 1 games in 60 minutes with 5% resign, and uses only 90% of the gpu power (checked with nvidia-smi); during a no resign game, the spare gpu power is left unused
-g 3 and more are not optimal either with a v100, as the gpu load is already at 100% with -g 2 : my stats show that -g 3 is 5% slower than "-g 1" (12 +/- 1 games in 60 minutes with 5% resign), and -g 7 produces 8 +/- 1 games in 60 minutes with 5% resign (8 vcpu instance)
-g 3 and more is also less efficient as you'll need at least 6 vcpu and 8 GB RAM or more, which will make the preemptible rules stop your instance often
if you want to produce more games in parallel, i suggest you use a 2nd gpu instead


-> how leelaz threads work
1 selfplay game will use 1 leelaz thread, maxing 1 vcpu at 100%
1 match game (worst case scenario) will use 2 leelaz threads, but the cpu load will be shared between both vcpus :
for example 65% on vcpu (core) 1 and 35% on vcpu 2, both of these vcpus being used for the match

-> for VCPU number needed :
for the vcpu number, the worst case scenario is to assume every game will be a match, needing 2 leelaz threads totalling 100% cpu load together
the 2 leelaz threads of a match can either be on the same vcpu, or on two different ones
so you'll need at least as many vcpus as the number of games (selfplay or match, no difference), plus 1 vcpu for the startup-script (includes ./autogtp, which will be loaded at 100%), plus 1 free vcpu for the system just in case
for example, -g 2 will need : 2 vcpu for 2 games + 1 vcpu for startup-script + 1 free vcpu for the system = minimum 4 vcpu in total
for example 2, -g 4 (4 simultaneous games, not recommended, slower, but to show the calculation) will need : 4 vcpu for 4 games + 1 vcpu for startup-script + 1 free vcpu for the system = minimum 6 vcpu in total
for example 3, -g 6 (6 simultaneous games, not recommended, slower, but to show the calculation) will need : 6 vcpu for 6 games + 1 vcpu for startup-script + 1 free vcpu for the system = minimum 8 vcpu in total
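
the vcpu rule above (1 vcpu per game + 1 for the startup-script + 1 free for the system) can be written as a tiny helper. It is just a sketch of the arithmetic, and the function name is made up for this illustration :

```shell
#!/bin/sh
# Minimum vcpu count for N simultaneous games (the value given to -g):
# N vcpu for the games + 1 for the startup-script + 1 free for the system.
vcpus_needed() {
    games="$1"
    echo $(( games + 1 + 1 ))
}

for g in 2 4 6; do
    echo "-g $g needs at least $(vcpus_needed "$g") vcpu"
done
```

this prints 4, 6 and 8 vcpu for -g 2, -g 4 and -g 6, matching the examples above.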

-> for RAM quantity needed :
- the ubuntu system uses 450 MB RAM
- 1 leelaz thread uses 1000 MB RAM for a 5% resign game, 1150 MB RAM for a 0% resign match game, or 1300 MB RAM for a 0% resign selfplay game
- 1 selfplay game needs 1 leelaz thread
- 1 match game (worst case scenario) needs 2 leelaz threads (1 for each network)
- with 0% resign, the extra RAM needed is 4x150 MB for 2 matches (4 leelaz threads), which equals 2x300 MB for 2 selfplay games = 600 MB extra RAM
a few examples of calculation :
1 leelaz thread (1 selfplay) uses 1000 MB RAM with 5% resign + 450 MB system RAM => 1.45 GB total RAM
2 leelaz threads (2 selfplay or 1 match) use 2x1000 = 2 GB RAM all with 5% resign + 450 MB system RAM => 2.45 GB total RAM
3 leelaz threads (with -g 2 : 1 match + 1 selfplay) use 3x1300 = 3.9 GB RAM all with 0% resign + 450 MB system RAM => 4.35 GB total RAM
4 leelaz threads (4 selfplay, or 2 matches, or 2 selfplay + 1 match; with -g 2 this means 2 matches) use 4x1000 = 4.0 GB RAM with 5% resign + 450 MB system RAM => 4.45 GB total RAM
for the worst case, add 4x150 MB (4 -r 0 match leelaz threads) or 2x300 MB (2 -r 0 selfplay leelaz threads) = 600 MB extra, so the total is about 5.1 GB RAM, and i chose to keep a security margin of about 500 MB on top of that


for -g 2 (optimal), the worst case scenario (very rare) is 2 matches with 0% resign, i.e. 4 leelaz threads : 450 MB + 4 x 1150 MB (for each leelaz thread) = 5.05 GB, rounded up to 5.1 GB RAM
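
the RAM rule of this part (450 MB for the system + a fixed amount per leelaz thread) can be sketched the same way. The function name is made up for the illustration, and the per-thread values are the ones measured above :

```shell
#!/bin/sh
SYSTEM_MB=450   # RAM used by the ubuntu system

# Total RAM in MB for a number of leelaz threads that all use the same
# amount of RAM (1000, 1150 or 1300 MB depending on the game type).
ram_needed_mb() {
    threads="$1"
    per_thread_mb="$2"
    echo $(( SYSTEM_MB + threads * per_thread_mb ))
}

echo "1 thread, 5% resign selfplay : $(ram_needed_mb 1 1000) MB"
echo "4 threads, 5% resign : $(ram_needed_mb 4 1000) MB"
echo "4 threads, 0% resign match (worst case for -g 2) : $(ram_needed_mb 4 1150) MB"
```

this gives 1450 MB, 4450 MB and 5050 MB; the recommended 5.5 GB RAM leaves a margin of a bit less than 500 MB above the worst case.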


part 6 (optional) :
optional extra information

a) save all the sgf files you generated with autogtp and download them to your personal computer
the steps are described in this github comment, with pictures : https://github.com/gcp/leela-zero/issues/1943#issuecomment-430977929

i added this paragraph to answer @kwccoin on the github issue linked above


first, every time you want to contribute with autogtp, always run it like this :
./autogtp -k allsgf

secondly, click on the SSH button in the google cloud console again to open a 2nd command line window
in the 2nd SSH window, run these commands :

cd leela-zero/autogtp
ls
# replace v1b in all these commands with whatever name you like, using a different one for every new archive
zip -r -0 v1b.zip allsgf
curl --upload-file ./v1b.zip https://transfer.sh/v1b.zip

then you will get a download link as i did in my screenshot

download link for my example (ctrl+shift+c in ubuntu terminal) :
https://transfer.sh/6Lza1/v1b.zip
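
since every archive needs a different name, one option is to build the name from the current date and time instead of inventing one by hand. This is only a sketch : the function name and the allsgf- prefix are made up, and the script only prints the zip and curl commands shown above, it does not run them :

```shell
#!/bin/sh
# Build a unique archive name from the current date and time,
# for example allsgf-20181020-153000.zip
make_archive_name() {
    echo "allsgf-$(date +%Y%m%d-%H%M%S).zip"
}

name="$(make_archive_name)"

# Print (do not run) the same two commands as above with the generated name.
echo "zip -r -0 $name allsgf"
echo "curl --upload-file ./$name https://transfer.sh/$name"
```

run it in the leela-zero/autogtp folder, then copy and paste the two printed commands.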

optional :
view the sgf files in the allsgf folder, sorted by date :


the steps are explained here, with pictures : https://github.com/gcp/leela-zero/issues/1943#issuecomment-431047043
run these commands, in the autogtp folder :

cd allsgf
ls -t
#(to go back : cd ..)

reading order :
1st column top to bottom, then column 2 top to bottom, then column 3, etc

note that the sgf files are also sorted by time in the zip archive :
(2 more sgf were generated since i did this screenshot)

alternatively, you can also get a log by checking the journal (see the journalctl command in part 6d)


b) system monitoring :
RAM usage, cpu usage per core, etc

open a 2nd SSH command line window by clicking on SSH, and run this command :

glances

you will get something similar to this picture :
https://github.com/gcp/leela-zero/issues/1905#issuecomment-430529799


c) nvidia gpu usage stats

in another SSH command line window :
run this command :

nvidia-smi


d) (not recommended) how to compile and run MASTER branch with the ALL IN ONE first boot command :

if you want to install master branch (not recommended, slower, possibly more stable) :
select all commands below and copy/paste all the selection :


this command will :
update system,
upgrade system,
install all nvidia drivers,
install all pre leela-zero required packages
compile leela-zero (here MASTER BRANCH)
reboot


sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl glances zip && git clone https://github.com/gcp/leela-zero && cd leela-zero/src && make && cd ../autogtp && qmake -qt5 && make && cp ../src/leelaz . && sudo reboot


-> (will be added later) after reboot, you don't need to run autogtp, and you should not : it will run automatically in the background with the startup-script
you can check system usage in an SSH window with the command :

glances


-> (will be added later) or you can watch the game production that is happening in the background with the journal command :

sudo journalctl -u google-startup-scripts.service -b -e -f


part 6e : old manual NEXT BRANCH instructions :


(if you already installed MASTER branch, exit instance and relaunch it, then start directly at the # go to leela-zero folder command)

# show gpu details
clinfo && nvidia-smi
# Clone github repo
git clone https://github.com/gcp/leela-zero
# go to leela-zero folder
cd leela-zero
# switch to the next branch and pull it
git checkout next
git pull
# fetch the submodules
git submodule update --init --recursive
# create leela-zero/build folder and go inside it
mkdir build && cd build
# compile leelaz and autogtp binaries in leela-zero/build folder
cmake ..
cmake --build .
./tests
# go to leela-zero/autogtp folder
cd ../autogtp
# copy autogtp in leela-zero/build/autogtp folder to leela-zero/autogtp folder
cp ../build/autogtp/autogtp .
# copy leelaz binary in leela-zero/build folder to leela-zero/autogtp folder
cp ../build/leelaz .
# run autogtp
./autogtp


and autogtp NEXT is running !


AutoGTP v16
Using 1 thread(s) for GPU(s).
Starting tuning process, please wait...
Net filename: networks/68824bbc683a0eb482bcdc34ea7c3e4bc3e1dd152e3aa94f9a8bfc6d189f3091.gz
net: 68824bbc683a0eb482bcdc34ea7c3e4bc3e1dd152e3aa94f9a8bfc6d189f3091.
./leelaz --tune-only -w networks/68824bbc683a0eb482bcdc34ea7c3e4bc3e1dd152e3aa94f9a8bfc6d189f3091.gz
Leela Zero 0.15  Copyright (C) 2017-2018  Gian-Carlo Pascutto and contributors
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

Using 2 thread(s).
RNG seed: 9371137713330324515
BLAS Core: built-in Eigen 3.3.5 library.
Detecting residual layers...v1...256 channels...40 blocks.
Initializing OpenCL (autodetecting precision).
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.84
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla V100-SXM2-16GB
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 390.87
Device cores:  80 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla V100-SXM2-16GB
with OpenCL 1.2 capability.
Half precision compute support: No.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.84
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla V100-SXM2-16GB
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 390.87
Device speed:  1530 MHz
Device cores:  80 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla V100-SXM2-16GB
with OpenCL 1.2 capability.
Half precision compute support: No.

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
(1/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=16 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0525 ms (2247.8 GFLOPS)
(2/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0468 ms (2518.0 GFLOPS)
(5/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0458 ms (2574.3 GFLOPS)
(11/290) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0422 ms (2792.7 GFLOPS)
(21/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0379 ms (3113.5 GFLOPS)
(93/290) KWG=32 KWI=8 MDIMA=32 MDIMC=32 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0366 ms (3222.4 GFLOPS)
(136/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=4 VWN=2 0.0358 ms (3291.4 GFLOPS)
(169/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=4 VWN=2 0.0346 ms (3413.3 GFLOPS)
(172/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=4 VWN=2 0.0343 ms (3438.8 GFLOPS)
Found Leela Version : 0.15
Tuning process finished
Starting thread 1 on GPU 0
{
    "cmd": "selfplay",
    "hash": "68824bbc683a0eb482bcdc34ea7c3e4bc3e1dd152e3aa94f9a8bfc6d189f3091",
    "hash_gzip_hash": "2390e5e8fd8f34c494e9775137b282430348a5d450667357345430a9f2dd5c6c",
    "minimum_autogtp_version": "16",
    "minimum_leelaz_version": "0.15",
    "options": {
        "noise": "true",
        "playouts": "0",
        "randomcnt": "999",
        "resignation_percent": "5",
        "visits": "1601"
    },
    "options_hash": "b37dca",
    "random_seed": "5468405401357328662",
    "required_client_version": "16"
}

Got new job: selfplay
net: 68824bbc683a0eb482bcdc34ea7c3e4bc3e1dd152e3aa94f9a8bfc6d189f3091.
Engine has started.
time_settings 0 1 0
Thinking time set.
1 (B D16) 2 (W Q4) 3 (B D4) 4 (W Q16) 5 (B R14)


you can tell that it's not the master version because, at the time i'm writing this tutorial (leelav15/autogtpv16), you don't see these lines (among others) in the master version :
BLAS Core: built-in Eigen 3.3.5 library.
Half precision compute support: No.
time_settings 0 1 0


and next is ready to use !


part 6f : old manual MASTER BRANCH instructions :


# show gpu details
clinfo && nvidia-smi
# Clone github repo
git clone https://github.com/gcp/leela-zero
# go to leela-zero/src folder
cd leela-zero/src
# compile leelaz binary
make
# go to leela-zero/autogtp folder
cd ../autogtp
# compile autogtp binary
qmake -qt5
make
# copy leelaz binary into leela-zero/autogtp folder
cp ../src/leelaz .
# run autogtp
./autogtp


and autogtp finally runs !
AutoGTP v16
Using 1 thread(s) for GPU(s).
Starting tuning process, please wait...
Net filename: networks/25c2313d8c11b9320de4795cf593f237f32e8a61c4524a6305ff30073b760132
net: 25c2313d8c11b9320de4795cf593f237f32e8a61c4524a6305ff30073b760132.
./leelaz --tune-only -w networks/25c2313d8c11b9320de4795cf593f237f32e8a61c4524a6305ff30073b760132
./leelaz --tune-only -w networks/25c2313d8c11b9320de4795cf593f237f32e8a61c4524a6305ff30073b760132
Using 2 thread(s).
RNG seed: 17613292517859834344
Detecting residual layers...v1...256 channels...40 blocks.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.1.84
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla V100-SXM2-16GB
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 390.87
Device speed:  1530 MHz
Device cores:  80 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla V100-SXM2-16GB
with OpenCL 1.2 capability.

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
(1/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=16 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0689 ms (3045.4 GFLOPS)
(2/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0548 ms (3828.0 GFLOPS)
(5/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0538 ms (3901.0 GFLOPS)
(21/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0509 ms (4116.6 GFLOPS)
(27/290) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0489 ms (4289.0 GFLOPS)
(88/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=2 0.0479 ms (4380.7 GFLOPS)
(131/290) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=4 VWN=2 0.0471 ms (4452.2 GFLOPS)
(135/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=16 SA=1 SB=1 STRM=0 STRN=0 VWM=4 VWN=2 0.0456 ms (4602.2 GFLOPS)
(205/290) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=4 0.0445 ms (4708.0 GFLOPS)
(238/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 VWM=2 VWN=4 0.0438 ms (4790.6 GFLOPS)
Tuning process finished


part 6g (optional, for manual install only) :
how to run autogtp again (if exited)


this will run next or master branch depending on which branch you installed :

cd leela-zero/autogtp
./autogtp


note : this command :
./autogtp -g 2
instead of ./autogtp produces games significantly faster, because when a 0% resign game is generated, the extra gpu power of the v100 can be used to produce another game simultaneously
(265 games/24 hours VS 208 games/24hours)
see this comment for details :
https://github.com/gcp/leela-zero/issues/1905#issuecomment-428612310