Suggestion for solving endgame problem #29
Comments
I think it's an open problem. Please try it and send a PR, or write a paper.
Some ideas.
In the early or middle stage of the game it's not important.
Ray could turn a multi-NN option on or off with some algorithm.
If simultaneous NN evaluation is impossible, then Ray could use sequential NN search.
The CNTK evaluation library can load multiple models (rough sketch below).
Well, I am not a developer; that was just my tip for the endgame problem.
Ray is strong enough, almost professional level, except for the endgame.
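Not part of Ray itself; just a hedged sketch of what loading two models at once and evaluating them sequentially could look like with the CNTK V2 C++ evaluation API. The model file names, the feature layout, and the assumption of a single input variable are made up for illustration, and Ray's nn branch may use a different CNTK interface.

```cpp
// Sketch only: load two CNTK models and evaluate them one after the other
// (sequential NN search) on the same feature vector.
// Model file names and feature sizes are hypothetical.
#include "CNTKLibrary.h"
#include <unordered_map>
#include <vector>

using namespace CNTK;

// Run one model on a flat feature vector and return its first output sequence.
static std::vector<float> Evaluate(const FunctionPtr& model,
                                   const std::vector<float>& features,
                                   const DeviceDescriptor& device) {
  Variable input = model->Arguments()[0];   // assumes a single input variable
  Variable output = model->Output();

  ValuePtr inputValue = Value::CreateBatch(input.Shape(), features, device);
  std::unordered_map<Variable, ValuePtr> outputs = {{output, nullptr}};
  model->Evaluate({{input, inputValue}}, outputs, device);

  std::vector<std::vector<float>> result;
  outputs[output]->CopyVariableValueTo(output, result);
  return result.front();
}

int main() {
  DeviceDescriptor device = DeviceDescriptor::UseDefaultDevice();

  // The CNTK evaluation library can keep several models in memory at once.
  FunctionPtr policyNet = Function::Load(L"policy.model", device);
  FunctionPtr valueNet  = Function::Load(L"value.model",  device);

  // Hypothetical board features (size depends on the network's input shape).
  std::vector<float> features(19 * 19 * 4, 0.0f);

  // Sequential evaluation: policy net first, then the value net.
  std::vector<float> policyOut = Evaluate(policyNet, features, device);
  std::vector<float> valueOut  = Evaluate(valueNet,  features, device);

  (void)policyOut;
  (void)valueOut;
  return 0;
}
```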
It may be possible, but roughly it would require a 10 TB SSD to store the data, 2 weeks to generate the training data, and 2 months to train. That is too large for me.
Currently Ray is trained only on games played by humans.
@j8580 CGI published "Multi-Labelled Value Networks for Computer Go": https://arxiv.org/abs/1705.10701
I will check the papers.
Lol, their dynamic komi concept is the same as my idea ^^
Sorry, I couldn't understand their exact methods. I just tried changing the dynamic komi manually during play. The komi change only affects MCTS, not the neural network, and even the MCTS value change alone is useful, but I would also like to see the neural network value react when the dynamic komi changes during play. Can you add some code so that if the komi changes during play, the NN value also changes? I feel the komi change is useful. Also, in the endgame the max-policy candidate is more accurate than Ray's best candidate in many cases. Really sorry for not helping you. Still, I hope you can improve and solve Ray's strength. Cheer up~
https://github.com/zakki/Ray/blob/nn/src/UctSearch.cpp#L2539 |
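I'm not sure what the right place in UctSearch.cpp would be, but here is a hedged sketch of the multi-labelled value idea from the CGI paper: if the value head produced a win rate for a range of komi values instead of a single number, the search could simply read the entry that matches the current dynamic komi, so that changing komi during play also changes the NN value. The komi grid and the names below are assumptions, not Ray's code.

```cpp
// Hypothetical sketch: a value head that outputs win rates for a grid of komi
// values, so that changing komi during play also changes the NN value.
#include <array>
#include <cmath>
#include <cstddef>

// Assumed komi grid covered by the multi-labelled value output: 0.5, 1.5, ..., 13.5.
constexpr std::size_t kKomiLabels = 14;
constexpr double kKomiMin = 0.5;
constexpr double kKomiStep = 1.0;

// Pick the win rate for the komi currently in effect (from Black's point of view).
double ValueForKomi(const std::array<double, kKomiLabels>& multiLabelValue,
                    double current_komi) {
  double idx = (current_komi - kKomiMin) / kKomiStep;
  std::size_t i = static_cast<std::size_t>(std::lround(idx));
  if (i >= kKomiLabels) i = kKomiLabels - 1;   // clamp to the covered range
  return multiLabelValue[i];
}
```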
I change komi during play with the setkomi command or GoGui's komi setting.
For example, when Ray plays White, it starts with komi 4.5; when the NN winning ratio goes over 55% I set White's komi to 3.5, when it goes over 60% I set it to 1.5, and so on. When the winning percentage is over 80%, Ray uses max-policy moves for the endgame. I don't know the optimal values; the paper gives optimal values, but I can't understand it exactly. I have played against professional Go players with Ray, and with this manual technique Ray's winning ratio is now 30-40% against average professional players. But this is only a manual technique, and I don't know whether Ray can do it automatically. Ray also still has some bugs in big life-and-death situations, especially throw-in bugs. These are just examples, and right now the komi change only affects the MCTS winning percentage, not the NN value, but I still feel it is useful.
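A minimal sketch of that manual schedule written as code, assuming the thresholds from the comment (55%, 60%, 80%) and the starting komi of 4.5; these are not tuned values, and the names are hypothetical rather than Ray's actual API.

```cpp
// Hypothetical dynamic-komi schedule for White, following the thresholds in the
// comment above: tighten komi as the NN winning ratio rises, and switch to
// max-policy move selection once the win rate passes 80%.
struct KomiDecision {
  double komi;          // komi to use for the next search
  bool use_max_policy;  // true: pick the move with the highest policy probability
};

KomiDecision DynamicKomiForWhite(double nn_win_ratio) {
  KomiDecision d{4.5, false};                         // starting komi in the example
  if (nn_win_ratio > 0.55) d.komi = 3.5;
  if (nn_win_ratio > 0.60) d.komi = 1.5;              // "and so on" for higher thresholds
  if (nn_win_ratio > 0.80) d.use_max_policy = true;   // max-policy endgame
  return d;
}
```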
In the endgame, I manually choose one of the top 3 max-policy candidates instead of Ray's best candidate.
Ray, like many Go AIs other than AlphaGo, has an endgame problem.
I suggest an idea:
Add a new command for the endgame.
In a particular part of the game, with the endgame command, Ray would use only counting-result search.
The best candidate would be the move that maximizes the counting result.
Alternatively, Ray's neural network plus a maximum-counting search would be a good choice for strong endgame performance.
Is it possible?
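I don't know how hard this would be in Ray's current tree, but as a rough sketch under assumed data structures: the endgame command could select the child whose average counting result (expected final score) over its playouts is largest, instead of the child with the best winning rate. The struct and field names below are hypothetical, not Ray's actual structures.

```cpp
// Hypothetical sketch of "counting result search": pick the move that maximizes
// the average final score (territory count) over playouts, rather than the
// winning rate. Names are made up for illustration.
#include <limits>
#include <vector>

struct EndgameChild {
  int move;          // coordinate of the candidate move
  int playouts;      // number of playouts through this child
  double score_sum;  // sum of final counting results, from Black's point of view
};

// Select the child whose expected counting result is largest for the side to move.
int SelectEndgameMove(const std::vector<EndgameChild>& children, bool black_to_move) {
  int best_move = -1;  // -1 stands in for "pass / no move found"
  double best_score = -std::numeric_limits<double>::infinity();
  for (const auto& c : children) {
    if (c.playouts == 0) continue;
    double expected = c.score_sum / c.playouts;
    if (!black_to_move) expected = -expected;  // White wants to minimize Black's score
    if (expected > best_score) {
      best_score = expected;
      best_move = c.move;
    }
  }
  return best_move;
}
```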