KataGo v1.4.2 vs LZ272 #254
Games here: Also, I'm momentarily about to upload some nets that, at least when tested against older KataGo nets, appear to be much stronger than these nets, due to learning rate drops at the end of the g170 run. :)
@lightvector san, Great results! I have a question.
For recent nets, 40b and 20b were close to each other at time parity, but 20b had almost stopped progressing. For the coming nets, I guess 40b will be significantly stronger, since it improved a lot and 20b probably did not. But we'll know more in a few hours hopefully, patience 😜
@Friday9i san, I can't wait for the new release!😆
Done! https://github.com/lightvector/KataGo/releases Gained 200ish Elo for the 40 block net, and 100ish Elo for the 20 block net, based on matches against earlier networks. No idea how much this gain transfers to gains against opponents like LZ, but anyone is free of course to try them and compare. Enjoy!
Congrats!!
Just posting for the record some test results against LZ272 that I ran a while back using KataGo 1.4.2 and the last "semi-zero" nets (not the last nets in the run as a whole), which were g170-b40c256x2-s3708042240-d967973220 (40 blocks) and g170e-b20c256x2-s4384473088-d968438914 (20 blocks). And also against LZ-ELFv2, just to see how far we've come since ELF.
I posted these results about a month ago in the discord chat, this is just re-posting them here.
Summary: KataGo won around 80-90% of games given comparable amounts of compute time (though on a V100 machine, which might have a smaller gap in GPU performance between KG and LZ than would be the case on certain users' hardware) and won 70-80% of games when put at a modest visits handicap to LZ, without having to enable avoidMYTDaggerHack, although enabling it helped significantly further in some cases.
All tests used a single V100 cloud GPU (roughly comparable to an AWS "P3 2xlarge" instance, except on Google Cloud, not AWS).
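For reference, the win rates in the summary can be translated into approximate Elo differences. This is a minimal sketch assuming the standard logistic Elo model; the function names are hypothetical, and the numbers it prints are only a conversion of the percentages quoted above, not additional measurements.

```python
import math

def elo_diff_from_winrate(p: float) -> float:
    """Elo difference implied by win probability p (logistic Elo model)."""
    return 400.0 * math.log10(p / (1.0 - p))

def winrate_from_elo_diff(d: float) -> float:
    """Inverse mapping: expected win probability for an Elo difference d."""
    return 1.0 / (1.0 + 10.0 ** (-d / 400.0))

# Win rates taken from the summary ranges (70%, 80%, 90%).
for p in (0.70, 0.80, 0.90):
    print(f"{p:.0%} win rate -> about {elo_diff_from_winrate(p):.0f} Elo")
```

Under this model, 70%, 80% and 90% correspond to roughly 150, 240 and 380 Elo respectively. Match results from a limited number of games carry wide error bars, so these are ballpark figures only.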
KataGo was left at mostly default settings, but with a bit of tuning:
LZ272 and LZ-ELF used:
- `--threads 32 --batchsize 16`, since some testing indicated that this produced the best LZ performance given the GPU.
- `--randomcnt 20 --randomtemp 0.3`, to increase opening diversity on LZ's side a little in lieu of having an opening panel. Higher than LZ's default of no temperature at all, but still lower and briefer overall than KataGo's default.
- `--noponder --timemanage off`
Also, both sides set to resign immediately at 5% winrate.
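The LZ settings listed above could be assembled into a single command line roughly as follows. This is only a sketch: the binary path and weights filename are placeholders, and the use of `--gtp`, `--weights`, `--playouts` and `--resignpct 5` is an assumption about how the playout limit and the 5% resign threshold were passed, since the post does not give the full command.

```python
import shlex

def build_lz_command(binary: str, weights: str, playouts: int) -> list[str]:
    """Build a Leela Zero argument list from the settings described in the post."""
    return [
        binary, "--gtp",
        "--weights", weights,                      # placeholder weights file
        "--threads", "32", "--batchsize", "16",    # from the post
        "--randomcnt", "20", "--randomtemp", "0.3",  # from the post
        "--noponder", "--timemanage", "off",       # from the post
        "--resignpct", "5",                        # assumed flag for the 5% resign rule
        "--playouts", str(playouts),               # assumed flag for the playout cap
    ]

# Hypothetical paths; 18000 matches the playout count quoted for LZ in the first test.
cmd = build_lz_command("./leelaz", "lz272.gz", 18000)
print(shlex.join(cmd))
```

Keeping the arguments as a list rather than a single string makes it straightforward to hand them to `subprocess` without shell quoting issues.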
First test: KG was set to use a fixed 5 seconds per move; LZ used 18K playouts per move and LZ-ELFv2 used 36K playouts per move, aiming to make them take about 5 seconds per move because they have no command-line way to fix a time per move. In actuality, they took about 5.6 s/move and 6 s/move, so this calibration was a bit off, in LZ's favor.
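To give a sense of how far off the calibration was, the playout counts that would have hit the 5 s target can be estimated from the measured times. This sketch assumes time per move scales linearly with playouts, which ignores fixed overhead and tree reuse; it is not how the original settings were chosen, and the function name is hypothetical.

```python
def rescale_playouts(playouts: int, measured_s: float, target_s: float) -> int:
    """Scale a playout count so the measured time per move would match the target.

    Assumes time per move is proportional to the number of playouts.
    """
    return round(playouts * target_s / measured_s)

# Figures quoted in the post: 18K playouts took ~5.6 s, 36K playouts took ~6 s.
lz_playouts = rescale_playouts(18_000, measured_s=5.6, target_s=5.0)
elf_playouts = rescale_playouts(36_000, measured_s=6.0, target_s=5.0)
print(lz_playouts, elf_playouts)
```

Under that linear assumption, LZ would have needed roughly 16K playouts and LZ-ELFv2 roughly 30K to match 5 s/move, so the opponents received on the order of 10-20% more time than KataGo in this test.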
Win/loss results:
Second test: fixed playouts, KG set a bit lower than either LZ or ELF.