Suggestion for solving endgame problem #29

j8580 opened this issue Apr 21, 2017 · 17 comments

j8580 commented Apr 21, 2017

Ray, like many Go AIs other than AlphaGo, has an endgame problem.
I suggest an idea: add a new command for the endgame.
In that part of the game, with the endgame command, Ray would search only on the counted score.
The best candidate would be the move that maximizes the counted result.
Alternatively, combining Ray's neural network with a maximum-score search would be a good choice for strong endgame performance.
Is this possible?
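
A minimal sketch of what such a score-maximizing endgame reward could look like; none of this is Ray's actual code, and all names (endgame_mode, PlayoutReward) are hypothetical.

```cpp
// Minimal sketch (hypothetical, not Ray's code): switch the playout reward
// from a binary win/loss signal to the counted score margin once an
// "endgame" flag is set, e.g. by a new GTP command.
#include <algorithm>

static bool endgame_mode = false;  // toggled by the hypothetical endgame command

// score : counted result from Black's point of view (area or territory count)
// komi  : current komi
// color : side to evaluate for (0 = black, 1 = white)
double PlayoutReward(double score, double komi, int color) {
  const double margin = (color == 0) ? (score - komi) : (komi - score);
  if (!endgame_mode) {
    return margin > 0.0 ? 1.0 : 0.0;                  // ordinary win/loss reward
  }
  // Endgame mode: the reward grows with the score margin, so the search
  // prefers moves that maximize the counted result, not just the win rate.
  return 0.5 + std::clamp(margin / 20.0, -0.5, 0.5);  // squash into [0, 1]
}
```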

zakki commented Apr 21, 2017

I think it's an open problem. Please try it and send a PR, or write a paper.

zakki pushed a commit that referenced this issue Apr 22, 2017

j8580 commented Apr 25, 2017

Some ideas:

  1. Use multiple neural networks simultaneously (1, 2, 3, 4, 5, ...), each assuming a different komi.
    For example, if the player is white, set a different komi for each network:
  • NN 1: komi 6.5
  • NN 2: komi 5.5
  • NN 3: komi 4.5
  • NN 4: komi 3.5, and so on
    If the player is black, go in the other direction:
  • NN 1: komi 6.5
  • NN 2: komi 7.5
  • NN 3: komi 8.5, and so on

In the early or middle stage of the game this is not important, but in the endgame there are huge differences in win percentage between the networks. Ray could select the move with the higher win ratio among the multiple NNs (sketched below). The multi-NN option would only apply when the winning or losing ratio is roughly between 40% and 60%, and Ray could turn it on or off.
Is this possible?
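
A minimal sketch of the multi-komi selection described above; Position, KomiNet, and the 40-60% gate are illustrative placeholders, not Ray's actual code.

```cpp
// Hypothetical sketch: query several value networks, each assuming a different
// komi, and keep the most favorable win rate, but only when the ordinary
// evaluation sits in the close-game band (40-60%).
#include <algorithm>
#include <functional>
#include <vector>

struct Position {};  // stand-in for Ray's board/game state

struct KomiNet {
  double komi;                                      // komi this network assumes
  std::function<double(const Position&)> evaluate;  // returns win rate in [0, 1]
};

double MultiKomiValue(const Position& pos,
                      const std::vector<KomiNet>& nets,
                      double base_win_rate) {
  // Outside the close-game band, keep the normal single-network evaluation.
  if (base_win_rate < 0.40 || base_win_rate > 0.60) return base_win_rate;

  double best = base_win_rate;
  for (const KomiNet& net : nets)
    best = std::max(best, net.evaluate(pos));  // pick the highest reported win rate
  return best;
}
```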

j8580 commented Apr 25, 2017

Ray could turn the multi-NN option on or off with some algorithm.

j8580 commented Apr 25, 2017

If evaluating the NNs simultaneously is impossible, Ray could query them sequentially instead.

zakki commented Apr 25, 2017

The CNTK evaluation library can load multiple models.
You can also merge multiple models into a single model and switch between them using BrainScript:
https://github.com/Microsoft/CNTK/wiki/CloneFunction
https://github.com/Microsoft/CNTK/wiki/If-Operation
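
For illustration, a minimal sketch assuming the CNTK V2 C++ evaluation API (CNTKLibrary.h); the model file names are hypothetical value networks trained with different komi.

```cpp
// Sketch: load several komi-shifted value networks side by side with the
// CNTK V2 C++ API.  File names are hypothetical.
#include "CNTKLibrary.h"
#include <string>
#include <vector>

std::vector<CNTK::FunctionPtr> LoadKomiModels() {
  const auto device = CNTK::DeviceDescriptor::CPUDevice();
  const std::vector<std::wstring> paths = {
      L"value_komi65.model", L"value_komi55.model", L"value_komi45.model"};
  std::vector<CNTK::FunctionPtr> models;
  for (const auto& path : paths) {
    // Function::Load reads a saved model; each handle can be evaluated
    // independently, so the networks can be queried side by side.
    models.push_back(CNTK::Function::Load(path, device));
  }
  return models;
}
```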

j8580 commented Apr 25, 2017

Well, I am not a developer; that was just my suggestion for the endgame problem.
I think you can do it. You're a genius.
I am doing some tests of my idea, and if it turns out to be useful I will report more details.
Then you can adapt it for Ray.

j8580 commented Apr 25, 2017

Ray is strong enough to be at almost professional level, except for the endgame.
If you can solve Ray's endgame problem, Ray will be the first professional-level Go program that runs on a personal PC.

zakki commented Apr 25, 2017

It may be possible, but it would require roughly a 10 TB SSD to store the data, two weeks to generate the training data, and two months to train. That is too much for me.

zakki commented Apr 25, 2017

Currently Ray is trained only on games played by humans.
Generating data with an RL policy and random play, as AlphaGo did, may improve the endgame.
But that requires enormous resources too. 😱

zakki commented May 31, 2017

@j8580 CGI published "Multi-Labelled Value Networks for Computer Go": https://arxiv.org/abs/1705.10701
I haven't understood the details yet, but it looks similar to your idea in some respects.

j8580 commented May 31, 2017

I will check the paper.

j8580 commented May 31, 2017

Lol, their dynamic komi concept is the same as my idea ^^
I have tried it manually many times with Ray. It was really useful, and I am still testing it with Ray.
I will read the paper in detail. I'm not sure I can grasp their exact ideas, so give me some time.

j8580 commented Jun 5, 2017

Sorry, I couldn't understand their exact methods.

I just tried changing the komi dynamically by hand during play.

Changing the komi only affects MCTS, not the neural network; even changing only the MCTS value is useful.

But I would also like to see the neural network value respond when the dynamic komi changes during play.

Can you add some code so that if the komi changes during play, the NN value changes as well?

I feel the komi change is useful.

Also, in the endgame, the max-policy candidate is more accurate than Ray's best candidate in many cases.

I'm really sorry for not being more help.

I still hope you keep improving Ray and solving its weaknesses. Cheer up~

zakki commented Jun 5, 2017

https://github.com/zakki/Ray/blob/nn/src/UctSearch.cpp#L2539
Do you mean replacing komi with dynamic_komi here?
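
For context, a paraphrased sketch of the substitution being discussed; the actual code at that line may differ, and the names here are only illustrative.

```cpp
// Paraphrased sketch, not Ray's source: where a playout result is scored
// against the komi, use the dynamically adjusted komi instead of the fixed
// game komi.
double ScorePlayout(double counted_score, double komi, double dynamic_komi,
                    bool use_dynamic_komi) {
  const double threshold = use_dynamic_komi ? dynamic_komi : komi;
  // Black wins the playout if the counted score exceeds the (adjusted) komi.
  return counted_score > threshold ? 1.0 : 0.0;
}
```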

j8580 commented Jun 5, 2017

I change the komi during play with the set-komi command or GoGui's komi setting.

j8580 commented Jun 5, 2017

For example, if Ray is white: at the start the komi is 4.5; once the NN winning ratio goes over 55%, set white's komi to 3.5; once it goes over 60%, set it to 1.5, and so on. When the winning percentage is over 80%, Ray should use the max-policy move for the endgame (see the sketch below). I don't know the optimal values; the paper gives optimal values, but I can't understand it exactly.

I have played against professional Go players with Ray. With this manual technique Ray's winning ratio is now 30-40% against average professional players. But this is only a manual technique; I don't know whether Ray can do it automatically. Ray also still has some bugs in big life-and-death situations, especially throw-in bugs.

These are just examples, and right now the komi change only affects the MCTS winning percentage, not the NN value, but I still feel it is useful.
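
A hypothetical sketch of the manual schedule above written as an automatic rule; the thresholds and komi steps are taken from the comment, and none of this is Ray's code.

```cpp
// Hypothetical komi schedule for White, following the description above.
double AdjustedWhiteKomi(double win_rate, double base_komi /* e.g. 4.5 */) {
  if (win_rate > 0.60) return 1.5;  // clearly ahead: hand back even more komi
  if (win_rate > 0.55) return 3.5;  // slightly ahead: reduce komi one step
  return base_komi;                 // close game: keep the starting komi
}

// Over an 80% win rate the suggestion is to switch to the max-policy move
// for the endgame instead of only shifting komi.
bool UseMaxPolicyEndgame(double win_rate) { return win_rate > 0.80; }
```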

j8580 commented Jun 7, 2017

In the endgame I manually choose one of the top 3 max-policy candidates rather than Ray's best candidate.
If Ray's best candidate is the same as the max-policy candidate, that is ideal.
In the endgame, I think ruling out candidates with very low policy value is more stable.
I am Tygem 5d in real life, so I can roughly judge the endgame situation and choose a candidate for Ray by hand. But how can Ray recognize the endgame situation? Can you do that?
When Ray is winning in the endgame (for example, when Ray's winning ratio is over 80%), I suggest that, to reduce endgame mistakes, Ray choose the move from among the top 3 policy-value candidates (see the sketch below). In my experience that is definitely more stable; with this rule I have not recently experienced a reversal loss in the endgame when Ray is winning.
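
A hypothetical sketch of this top-3 policy restriction (not Ray's actual code): when the root win rate is over 80%, the final move is chosen only among the three candidates with the highest policy probability.

```cpp
#include <algorithm>
#include <vector>

struct Candidate {
  int move;             // board coordinate
  double policy;        // policy network probability
  double search_value;  // MCTS win rate (or visit count) for the move
};

// Hypothetical move selection: restrict to the top 3 policy candidates when
// clearly winning, then take the one the search rates highest among them.
int SelectEndgameMove(std::vector<Candidate> cands, double root_win_rate) {
  if (cands.empty()) return -1;  // no legal candidate
  if (root_win_rate > 0.80 && cands.size() > 3) {
    std::partial_sort(cands.begin(), cands.begin() + 3, cands.end(),
                      [](const Candidate& a, const Candidate& b) {
                        return a.policy > b.policy;
                      });
    cands.resize(3);  // keep only the top 3 policy candidates
  }
  return std::max_element(cands.begin(), cands.end(),
                          [](const Candidate& a, const Candidate& b) {
                            return a.search_value < b.search_value;
                          })->move;
}
```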
