Custom rollout depth #91

Merged: 7 commits, May 19, 2022

jancervenka
Contributor

Draft resolving #88

@BoZenKhaa
Contributor

Hi Honza! Thanks for a quick attempt at this! I have a few comments.

First, I like the approach of putting the depth as a parameter of the rollout estimator. I think this is elegant and easy to follow. Though as a consequence, it will be difficult to use the current depth in the tree in the rollouts. At least for me, this is not a big issue and it's simpler this way, but maybe in some use cases, this is needed. What do you think @zsunberg? This could be addressed by still passing the depth to the estimate_value function, and implementing more rollout estimators:

  • default rollout estimator, that does not use depth passed to it but instead uses depth defined in SolvedRolloutEstimator
  • depth-dependent estimator, that does use depth passed to rollout and replicates the old behavior
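
A minimal sketch of the two estimator variants listed above, assuming a toy chain MDP rather than the real POMDPs.jl interface. The names `ChainMDP`, `FixedDepthRollout`, `TreeDepthRollout`, and `rollout` are hypothetical; they only illustrate how the depth passed to estimate_value could be either ignored or used:

```julia
# Toy MDP (hypothetical): every step yields reward 1 and moves one state to the right.
struct ChainMDP
    discount::Float64
end

# Hypothetical estimator that ignores the depth passed in and uses its own max_depth.
struct FixedDepthRollout
    max_depth::Int
end

# Hypothetical estimator that replicates the old behavior by using the depth passed in.
struct TreeDepthRollout end

# Discounted return of a rollout lasting `steps` steps from state `s`.
function rollout(mdp::ChainMDP, s::Int, steps::Int)
    total, disc = 0.0, 1.0
    for _ in 1:steps
        total += disc * 1.0   # reward of 1 per step
        disc *= mdp.discount
        s += 1
    end
    return total
end

# Both methods accept the depth argument `d`; only the second one uses it.
estimate_value(e::FixedDepthRollout, mdp, s, d) = rollout(mdp, s, e.max_depth)
estimate_value(::TreeDepthRollout, mdp, s, d) = rollout(mdp, s, d)
```

With this shape, the choice between the two behaviors is made through dispatch on the estimator type, so the tree search can always pass the depth along.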

Regarding the code, I tried to add commits to your branch that will let the tests run, but I haven't figured out how to add them here (do you know how to do that?).

    function SolvedRolloutEstimator(policy, rng, depth::Union{Int, Nothing}=nothing)
        new{typeof(policy), typeof(rng)}(policy, rng, depth)
    end
    RolloutSimulator(rng::AbstractRNG, d::Int=typemax(Int)) # not possible to use nothing here for depth
    RolloutSimulator(; rng=Random.GLOBAL_RNG, eps=nothing, max_steps=nothing) # you can pass nothing to this method using kwargs
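
One possible way to bridge the two signatures, sketched under the assumption that a depth of `nothing` should mean "no step limit". The helper `resolve_max_steps` is hypothetical and not part of MCTS.jl or POMDPSimulators.jl; it relies only on `something` from Julia Base:

```julia
# Hypothetical helper: turn an optional depth into a concrete Int step limit.
# `something` returns its first argument that is not `nothing`.
resolve_max_steps(depth::Union{Int, Nothing}) = something(depth, typemax(Int))
```

The result could then be passed to a positional argument that is typed as `Int`.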

Also, you can test your code quickly from REPL. Some tests are failing right now, and this is how you run them:

  1. launch REPL in the root of the MCTS package
  2. turn package mode on in REPL with ']'
  3. run activate . to activate the environment
  4. run test to run the tests
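
The steps above can also be scripted with the Pkg API instead of package mode. This is a small sketch; the wrapper name `run_package_tests` is hypothetical, while `Pkg.activate` and `Pkg.test` are the standard functions behind the `activate` and `test` commands:

```julia
using Pkg

# Hypothetical wrapper: activate the environment at `pkg_root` and run its test suite.
function run_package_tests(pkg_root::AbstractString=".")
    Pkg.activate(pkg_root)  # same as `activate .` in package mode when pkg_root is "."
    Pkg.test()              # same as `test` in package mode
end
```

Calling `run_package_tests()` from a REPL started in the root of the MCTS package should be equivalent to steps 2 to 4.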

Alternatively, you can run the runtests.jl file, but the tests may be using a different environment than the package. The above method should handle that.

I think as the next step, you can have a look at the test set, but let's wait for @zsunberg before fixing them.

@jancervenka
Contributor Author

jancervenka commented Feb 21, 2022

Thank you for the feedback! I will fix the tests once we agree that the proposed solution is OK.

Btw the best way to suggest changes in PR is to create a comment and use the "add a suggestion" feature.
[Screenshot, 2022-02-21]

Contributor

@BoZenKhaa BoZenKhaa left a comment

These changes allow runtests.jl to run.

@zsunberg zsunberg requested a review from WhiffleFish March 9, 2022 03:56
@zsunberg
Member

zsunberg commented Mar 9, 2022

@WhiffleFish can you please review these changes and let me know what you think.

Member

@WhiffleFish WhiffleFish left a comment

Overall, I think the general approach looks good. However, in my opinion, it still requires a few more changes before we merge.

As a final note, some tests failed with the changes. Line 26 of test/options.jl errored because it still included depth as an argument in the value estimation call, so that should be fixed, possibly along with other tests that may fail once this one no longer errors.

@jancervenka
Contributor Author

jancervenka commented Mar 16, 2022

Thank you for the review! @BoZenKhaa @WhiffleFish

  • I have moved the max_depth attribute and the default constructor to RolloutEstimator and propagated the value to SolvedRolloutEstimator
  • max_depth now has a finite default value of max_depth=50. Is it reasonable?
  • I have added the eps attribute to RolloutEstimator so that it exposes both max_steps and eps from the RolloutSimulator API. The default value is eps=nothing; is that okay, or should I change it to eps=0.01?
  • The tests are now passing
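
A rough sketch of how the two attributes could interact, assuming a rollout stops once either max_depth steps have been taken or the accumulated discount is no longer above eps. The type `SketchRolloutEstimator` and the function `rollout_steps` are hypothetical and do not reproduce the actual RolloutEstimator or RolloutSimulator code:

```julia
# Hypothetical stand-in for an estimator with the defaults discussed above.
Base.@kwdef struct SketchRolloutEstimator
    max_depth::Union{Int, Nothing} = 50
    eps::Union{Float64, Nothing} = nothing
end

# Count how many rollout steps would be taken for a given discount factor.
# Assumption: `nothing` disables the corresponding stopping criterion.
# Note: with both criteria disabled and discount == 1.0 this loop would not terminate.
function rollout_steps(e::SketchRolloutEstimator, discount::Float64)
    max_steps = something(e.max_depth, typemax(Int))
    eps = something(e.eps, 0.0)
    steps, disc = 0, 1.0
    while steps < max_steps && disc > eps
        steps += 1
        disc *= discount
    end
    return steps
end
```

Under these assumptions, `SketchRolloutEstimator()` is limited only by the 50-step default, while passing `eps=0.01` can end a rollout earlier when the discount factor is small.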

@jancervenka jancervenka requested a review from WhiffleFish March 16, 2022 01:35
@jancervenka jancervenka changed the title from "[DRAFT] custom rollout depth" to "Custom rollout depth" Mar 16, 2022
@jancervenka
Contributor Author

@BoZenKhaa I have added your suggestions. I have not found any inconsistencies in the documentation.

@BoZenKhaa
Contributor

> @BoZenKhaa I have added your suggestions. I have not found any inconsistencies in the documentation.

I meant this:
[image]
but it's most definitely a minor thing.

I very much like the state of the changes!

@zsunberg
Member

Hi all, sorry for the delayed response - I have been quite busy here. The changes are quite nice. Thank you for being so thorough.

One question to consider here: In this PR you have reduced the arguments for estimate_value from (mdp, s, d) to (mdp, s). I think that, instead, we should keep this argument (in case anyone wants to implement the old behavior), but just ignore it by default.

Any other thoughts?

Member

@zsunberg zsunberg left a comment

In the docstrings for the solvers, please change the documentation for the depth argument to reflect that it no longer applies to the rollouts.

@zsunberg
Member

@WhiffleFish are the changes you requested complete?

@BoZenKhaa
Contributor

BoZenKhaa commented May 13, 2022

@zsunberg

> One question to consider here: In this PR you have reduced the arguments for estimate_value from (mdp, s, d) to (mdp, s). I think that, instead, we should keep this argument (in case anyone wants to implement the old behavior), but just ignore it by default.
>
> Any other thoughts?

We discussed this with @WhiffleFish in the review. The outcome was that making the new behavior the default is a breaking change, and that keeping the legacy functionality would not be generally useful and would unnecessarily complicate things.

If you agree, I think this is ready :-D

@zsunberg
Member

Hi all, sorry that this fell stale - this semester was very busy for me, but it just ended. I think it is useful to allow the old behavior, so I will go through and add the depth argument back in and fix the other minor issues and merge this tomorrow.

@zsunberg
Member

Thanks a bunch @jancervenka @BoZenKhaa ! I am sorry that this took so long, but it is a useful addition!

@zsunberg zsunberg merged commit b706e58 into JuliaPOMDP:master May 19, 2022
zsunberg added a commit that referenced this pull request May 19, 2022