the sampled efficient zero portion of the code #218
Comments
Hello! We have implemented the functionality related to the empirical distribution described in the original SampledMuZero paper. You can refer to our ptree and ctree code for the specific implementation details. Please note that since the K actions we sample are non-repetitive, the empirical distribution is essentially a re-normalization of the original probabilities (for discrete action spaces) or log probabilities (for continuous action spaces) over the sampled actions. As the original authors' code is not open source, this implementation is based solely on our understanding of the paper. Additionally, following recent discussions on this issue, we plan to optimize the performance of sampled_efficientzero soon. Thank you for your patience and support.
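To make the re-normalization point concrete for the discrete case, here is a minimal sketch. It is not the ptree/ctree implementation: the function names `sample_unique_actions` and `renormalized_prior` are hypothetical, and the sketch assumes the K actions are drawn without replacement from the policy itself and that every policy probability is positive.

```python
import random


def sample_unique_actions(probs, k, rng):
    """Draw k distinct action indices (hypothetical helper, not LightZero code).

    Each draw is weighted by the policy probability of the actions that
    have not been picked yet, so no action can appear twice.
    """
    remaining = list(range(len(probs)))
    chosen = []
    for _ in range(k):
        weights = [probs[a] for a in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def renormalized_prior(probs, sampled):
    """Re-normalize the original probabilities over the sampled actions only."""
    total = sum(probs[a] for a in sampled)
    return {a: probs[a] / total for a in sampled}


if __name__ == "__main__":
    policy = [0.5, 0.3, 0.15, 0.05]   # illustrative policy over 4 actions
    rng = random.Random(0)            # fixed seed for repeatability
    sampled = sample_unique_actions(policy, k=2, rng=rng)
    prior = renormalized_prior(policy, sampled)
    print("sampled actions:", sampled)
    print("child priors:", prior)
```

Because each sampled action appears exactly once, the child priors differ from the original probabilities only by the shared normalizing constant, which is the behaviour the reply above describes.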
Thank you very much for your detailed response and for sharing the specifics of the implementation. I appreciate the efforts you and your team are putting into the development of the SampledMuZero functionalities. Understanding that this implementation was crafted from the ground up, given the absence of open-source code from the original authors, I can certainly appreciate the complexity and innovation involved in your approach. I am also looking forward to the upcoming optimizations to the sampled_efficientzero algorithm. Please keep me updated on any new developments or insights that might arise as you continue to refine the implementation.
At 2024-04-18 12:03:32, "蒲源" ***@***.***> wrote:
Hello! We have implemented functionalities related to the empirical distribution as described in the original SampledMuZero paper. You can refer to our ptree and ctree codes for specific implementation details. Please note that since the K actions we sample are non-repetitive, the empirical distribution is essentially a re-normalization of the original probabilities (for discrete action spaces) or log probabilities (for continuous action spaces). As the original author's code is not open source, this implementation is based solely on our understanding. Additionally, following recent discussions on this issue, we plan to optimize the performance of sampled_efficientzero soon. Thank you for your patience and support.
Hello, I was wondering whether the sampled EfficientZero portion of the code does not use the empirical distribution from the original Sampled MuZero paper to generate the prior probabilities of the child nodes?