Hi Adam (@adamlerer), after reading the great paper and the code, I have some questions concerning Single-Agent Search (SAS).
1. The last paragraph describing SAS says that 1-ply search is used, which means the search depth is 1. But in the code, it seems that when a bot is conducting search at a certain time step, it also searches iteratively into the future in `MCSearch`.
More specifically, in `SearchBot.cc`, `Move SearchBot::doSearch_` runs the search for a certain round; this eventually starts a copied `search_server` and calls `int score = search_server.runToCompletion();` to get the result of the MC rollout.
It seems that the two players inside that `search_server` are still `SearchBot` and `SmartBot` (the BP bot), which does not seem consistent with the claim in the paper that "all agents play according to the joint blueprint policy for the remainder of the game". I wonder how you deal with this, or whether I have missed some important facts.
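To make sure I am asking about the right thing, here is a minimal sketch of what I understand 1-ply single-agent search to be. Everything in it is made up for illustration: `State`, the toy game in `main`, and names like `onePlySearch`, `sampleDeterminization`, and `rolloutBlueprint` are not from this repo. The only point is that the searcher deviates from the blueprint at the root, and every future move inside a rollout comes from the blueprint.

```cpp
#include <functional>
#include <iostream>
#include <limits>
#include <random>
#include <vector>

using Move = int;

// 1-ply search as I understand it: evaluate each legal move at the root by
// the average score of full-game rollouts in which every player follows the
// blueprint, then play the argmax. Depth stays 1 because the searcher never
// deviates from the blueprint below the root.
template <class State>
Move onePlySearch(
    const State& publicState,
    const std::vector<Move>& legalMoves,
    // Sample a full hidden state consistent with the searcher's beliefs.
    const std::function<State(const State&, std::mt19937&)>& sampleDeterminization,
    // Apply the move, then let ALL players follow the blueprint to the end
    // of the game and return the final score (the MC rollout).
    const std::function<double(State, Move)>& rolloutBlueprint,
    int numRollouts,
    std::mt19937& rng) {
  Move best = legalMoves.front();
  double bestValue = -std::numeric_limits<double>::infinity();
  for (Move m : legalMoves) {
    double total = 0.0;
    for (int i = 0; i < numRollouts; ++i) {
      State s = sampleDeterminization(publicState, rng);
      total += rolloutBlueprint(s, m);
    }
    double value = total / numRollouts;
    if (value > bestValue) { bestValue = value; best = m; }
  }
  return best;
}

int main() {
  // Toy stand-in "game": the payoff of move m is m plus noise from the
  // sampled determinization, so move 2 should win on average.
  struct State { double noise = 0.0; };
  std::mt19937 rng(0);
  auto sample = [](const State&, std::mt19937& g) {
    std::normal_distribution<double> d(0.0, 1.0);
    return State{d(g)};
  };
  auto rollout = [](State s, Move m) { return s.noise + m; };
  Move best = onePlySearch<State>(State{}, {0, 1, 2}, sample, rollout, 1000, rng);
  std::cout << "best root move: " << best << "\n";  // expected: 2
}
```

My confusion above is exactly about the `rolloutBlueprint` part: if the copied `search_server` still contains a `SearchBot`, does that bot search again inside the rollout, or does it fall back to the blueprint there?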
2. I found that the MCSearch method mentioned in the paper is slightly different from conventional MCTS. I suppose the biggest distinction is that in MCTS (as used in AlphaGo), the action chosen in the selection step is based on a Q value (which depends on a state-value estimator) together with the reward (which is already present in SPARTA). May I ask why the conventional idea is not adopted here, or is this the idea presented in the Learned Belief Search paper?
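For concreteness, this is the selection rule I had in mind, assuming I understand UCT/AlphaGo-style MCTS correctly; the code and the numbers in `main` are only an illustration, not anything from SPARTA or this repo.

```cpp
#include <cmath>
#include <iostream>
#include <limits>
#include <vector>

// Per-action statistics kept by a tree-search node.
struct EdgeStats {
  double totalValue = 0.0;  // sum of backed-up rollout / value-net results
  int visits = 0;
};

// UCB1 selection: argmax over Q(s,a) + c * sqrt(ln N(s) / N(s,a)).
// In AlphaGo-style PUCT the exploration bonus is additionally weighted by a
// policy prior, and Q comes partly from a learned state-value estimator.
int selectUcb1(const std::vector<EdgeStats>& edges, double c) {
  int parentVisits = 0;
  for (const auto& e : edges) parentVisits += e.visits;

  int best = 0;
  double bestScore = -std::numeric_limits<double>::infinity();
  for (int a = 0; a < static_cast<int>(edges.size()); ++a) {
    const auto& e = edges[a];
    if (e.visits == 0) return a;  // try unvisited actions first
    double q = e.totalValue / e.visits;
    double bonus = c * std::sqrt(std::log(double(parentVisits)) / e.visits);
    if (q + bonus > bestScore) { bestScore = q + bonus; best = a; }
  }
  return best;
}

int main() {
  // Three actions after a few backups: action 1 has the best Q and also the
  // largest bonus because it has fewer visits, so UCB1 selects it.
  std::vector<EdgeStats> edges = {{2.0, 4}, {1.5, 2}, {0.2, 4}};
  std::cout << "selected action: " << selectUcb1(edges, 1.4) << "\n";  // 1
}
```

So my question is basically whether the rollouts in MCSearch are deliberately kept flat (the same rollout budget for every root action, no Q-plus-exploration selection, no tree below the root), and whether the value-estimator idea is what only appears later in Learned Belief Search.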
Thanks for sharing the codebase and reading the above questions!