Custom action distributions #5164

Merged

ericl merged 22 commits into ray-project:master from mawright:action_dist

Aug 6, 2019

Contributor

mawright commented Jul 10, 2019

What do these changes do?

Adds support for custom action distributions. They are registered with, and looked up from, the same global "ModelCatalog" class that is currently used for custom models and preprocessors.
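To illustrate the register-then-look-up pattern described above, here is a minimal, self-contained sketch. It does not import Ray: `MiniCatalog`, `MyDist`, the method names, and the `"custom_action_dist"` config key are hypothetical stand-ins chosen for illustration and may not match the merged API.

```python
# Hypothetical stand-in for a global catalog; this is NOT RLlib's ModelCatalog.
class MiniCatalog:
    # Class-level dict, so registrations are visible process-wide.
    _custom_action_dists = {}

    @classmethod
    def register_custom_action_dist(cls, name, dist_class):
        """Store a distribution class under a string key."""
        cls._custom_action_dists[name] = dist_class

    @classmethod
    def get_action_dist(cls, model_config):
        """Return the class named in the config, or raise if it is unknown."""
        name = model_config.get("custom_action_dist")
        if name is None:
            raise ValueError("no custom_action_dist set in model config")
        if name not in cls._custom_action_dists:
            raise ValueError("unknown custom action distribution: " + name)
        return cls._custom_action_dists[name]


class MyDist:
    """Placeholder distribution that only stores its inputs and config."""

    def __init__(self, inputs, model_config):
        self.inputs = inputs
        self.model_config = model_config


# Register once, then refer to the class by name from a config dict.
MiniCatalog.register_custom_action_dist("my_dist", MyDist)
config = {"custom_action_dist": "my_dist"}
dist_class = MiniCatalog.get_action_dist(config)
dist = dist_class([0.1, 0.9], config)
```

The point of the string key is that a config dict can select the distribution without importing the class at the call site.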

Some issues remain with how preexisting code handles the action tensors; I will describe them in a new GitHub issue.

Related issue number

Linter

[x] I've run scripts/format.sh to lint the changes in this PR.

mawright added 11 commits

July 8, 2019 14:18


          custom action dist wip


          Test case for custom action dist

8c4d684


          ActionDistribution.get_parameter_shape_for_action_space pattern

508ed4a


          Edit exception message to also suggest using a custom action distribution

7d0ae68


          Clean up ModelCatalog.get_action_dist

289141c


          Pass model config to ActionDistribution constructors

33c3907


          Update custom action distribution test case

ce720a3


          Name fix

d0b8a64


          Autoformatter

c3ec408


          parameter shape static methods for torch distributions

f78f447


          Fix docstring

489c573

AmplabJenkins commented Jul 10, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/1592/
Test FAILed.

AmplabJenkins commented Jul 10, 2019

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15282/
Test PASSed.

ericl reviewed

Contributor

ericl left a comment

This is great! I think two more additions would make this feature more discoverable by users:

Add an end-to-end runnable example in rllib/examples/custom_action_dist.py
Update rllib-models.rst to include a section on custom action distributions, and also update rllib-examples.rst to link to the example script.

python/ray/rllib/models/action_dist.py Outdated

@@ -69,6 +72,25 @@ def sampled_action_prob(self):
                       """Returns the log probability of the sampled action."""
                       return tf.exp(self.logp(self.sample_op))
+                  @DeveloperAPI
+                  @staticmethod
+                  def parameter_shape_for_action_space(action_space, model_config=None):

Contributor

ericl Jul 10, 2019

How about required_model_output_size?

python/ray/rllib/models/action_dist.py Outdated

@@ -69,6 +72,25 @@ def sampled_action_prob(self):
                       """Returns the log probability of the sampled action."""
                       return tf.exp(self.logp(self.sample_op))
+                  @DeveloperAPI
+                  @staticmethod

Contributor

ericl Jul 10, 2019

Suggested change

-                  @staticmethod
+                  @classmethod

Contributor Author

mawright Jul 13, 2019

I don't really understand this suggestion. Why do you think this should be a class method?

Contributor

ericl Jul 13, 2019

Hm I guess staticmethod is fine, since you don't really need the class.
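For readers following the exchange above, a small sketch of the difference between the two decorators. The class and method names here are hypothetical and unrelated to the PR's code; the only claim is about standard Python behavior.

```python
class Base:
    scale = 1

    @staticmethod
    def size_static(action_shape):
        # No access to the class: the result depends only on the argument.
        return 2 * action_shape[0]

    @classmethod
    def size_class(cls, action_shape):
        # Receives the class it was called on, so a subclass attribute
        # can change the result without overriding the method.
        return cls.scale * action_shape[0]


class Wide(Base):
    scale = 4


static_result = Wide.size_static((3,))  # ignores Wide.scale
class_result = Wide.size_class((3,))    # uses Wide.scale
```

A static method is sufficient when the computation needs only its arguments, which is the case discussed in this thread.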

python/ray/rllib/models/action_dist.py Outdated

+                          model_config (dict): Model's config dict (as defined in catalog.py)
+                      Returns:
+                          dist_dim (int or np.ndarray of ints): size of the required

Contributor

ericl Jul 10, 2019

Suggested change

-                          dist_dim (int or np.ndarray of ints): size of the required
+                          model_output_size (int or np.ndarray of ints): size of the required

python/ray/rllib/models/action_dist.py Outdated

                   """
                   @DeveloperAPI
-                  def __init__(self, inputs):
+                  def __init__(self, inputs, model_config=None):

Contributor

ericl Jul 10, 2019

To avoid accidents where we forget to pass this, consider making it a required argument.

Contributor Author

mawright Jul 13, 2019

I lean towards making it required but this may break other users' code.

Contributor

ericl Jul 13, 2019

Since this is a DeveloperAPI, I would say it's OK to err on the side of avoiding bugs over backwards compatibility.
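The trade-off in this exchange can be sketched as follows. Both classes are hypothetical simplifications, not the PR's `ActionDistribution`; they only show how a default of `None` hides a forgotten argument while a required parameter fails immediately.

```python
class OptionalConfigDist:
    def __init__(self, inputs, model_config=None):
        self.inputs = inputs
        # Silently None if a call site forgets to pass the config.
        self.model_config = model_config


class RequiredConfigDist:
    def __init__(self, inputs, model_config):
        self.inputs = inputs
        self.model_config = model_config


# The optional version accepts the incomplete call without complaint.
forgot = OptionalConfigDist([0.0, 1.0])

# The required version raises TypeError at the call site instead.
try:
    RequiredConfigDist([0.0, 1.0])
    raised = False
except TypeError:
    raised = True
```

The cost is that existing subclasses or callers using the one-argument form would need updating, which is the compatibility concern raised above.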

python/ray/rllib/models/action_dist.py Outdated

+                  @staticmethod
+                  @override(ActionDistribution)
+                  def parameter_shape_for_action_space(action_space, model_config=None):
+                      return action_space.shape[0] * 2

Contributor

ericl Jul 10, 2019

np.product(action_space.shape)? here and elsewhere?
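A short sketch of why the product over all dimensions matters. It assumes the factor of 2 stands for two parameters per action component, and it uses the standard library's `math.prod` in place of NumPy only to stay self-contained; the function names are hypothetical.

```python
import math


def size_first_dim(shape):
    # Mirrors the `shape[0] * 2` form from the diff: only the first axis counts.
    return shape[0] * 2


def size_all_dims(shape):
    # Counts every component of the action array, whatever its rank.
    return math.prod(shape) * 2


flat = (3,)     # 1-D action space: both forms agree
grid = (2, 3)   # 2-D action space: the first form undercounts

flat_sizes = (size_first_dim(flat), size_all_dims(flat))
grid_sizes = (size_first_dim(grid), size_all_dims(grid))
```

For a 1-D space the two forms are identical, so the difference only shows up once an action space has more than one axis.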

python/ray/rllib/models/catalog.py Outdated

                           return partial(MultiCategorical, input_lens=action_space.nvec), \
                               int(sum(action_space.nvec))
+                      return dist, dist.parameter_shape_for_action_space(

Contributor

ericl Jul 10, 2019

Seems like now you could simplify this to return just dist (or this cleanup can be done later).
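A rough sketch of the simplification suggested here, under the assumption that every distribution is a plain class exposing a static `required_model_output_size`. All names and signatures below are hypothetical simplifications, not the catalog code from the PR.

```python
class Categorical:
    """Hypothetical distribution over `num_actions` discrete choices."""

    @staticmethod
    def required_model_output_size(num_actions, model_config):
        # One logit per discrete action.
        return num_actions


def get_action_dist_pair(num_actions, model_config):
    # Current shape of the API: return the class and its size together.
    dist = Categorical
    return dist, dist.required_model_output_size(num_actions, model_config)


def get_action_dist(num_actions, model_config):
    # Simplified shape: return only the class.
    return Categorical


# With the simplified form, the caller asks the class for the size itself.
dist = get_action_dist(4, {})
size = dist.required_model_output_size(4, {})

pair_dist, pair_size = get_action_dist_pair(4, {})
```

One caveat: a `functools.partial` wrapper, like the one in the diff context above, does not expose the wrapped class's static methods, so this form would need those cases handled separately.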

ericl self-assigned this

Contributor Author

mawright commented Jul 12, 2019

Will probably be another day or two before I get a chance to work on the stuff you mentioned.

Contributor

ericl commented Jul 24, 2019

@mawright any updates? Let us know if you need help.

Contributor Author

mawright commented Jul 25, 2019

> @mawright any updates? Let us know if you need help.

Sorry, I have been busy finishing my dissertation the last few weeks and have a busy week ahead of me with NeurIPS reviewer responses. This is still on my to-do list.

ericl mentioned this pull request

[rllib] Autoregressive action distributions #5304

Merged

5 tasks


          Generalize fake array for graph initialization

ff5076e

AmplabJenkins commented Jul 30, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15779/
Test FAILed.

mawright force-pushed the action_dist branch from f98ea6d to f9d592d Compare

August 1, 2019 18:43


          Merge branch 'master' into action_dist

bfdfa90

mawright force-pushed the action_dist branch from f9d592d to bfdfa90 Compare

August 1, 2019 19:11


          Fix action dist constructors

11243cd

AmplabJenkins commented Aug 1, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15892/
Test FAILed.

AmplabJenkins commented Aug 1, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15893/
Test FAILed.

AmplabJenkins commented Aug 1, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15898/
Test FAILed.

mawright added 2 commits

August 1, 2019 14:21


          Correct parameter shape static methods for multicategorical and gaussian

a9939d4


          Make suggested changes to custom action dist's

96bba6c

AmplabJenkins commented Aug 2, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15912/
Test FAILed.

mawright added 2 commits

August 1, 2019 19:47


          Correct instances of not passing model config to action dist

8158f24


          Autoformatter

f11fbca

AmplabJenkins commented Aug 2, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15923/
Test FAILed.

AmplabJenkins commented Aug 2, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15926/
Test FAILed.


          fix tuple distribution constructor

bd378b0

AmplabJenkins commented Aug 2, 2019

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15943/
Test FAILed.


          bugfix

2f39c88

Contributor Author

mawright commented Aug 4, 2019

The Travis check said this commit errored on a Python 2.7 build but I can't figure out why. From what I can tell the error is occurring in a C++ file: https://travis-ci.com/ray-project/ray/jobs/222169300#L1427


          Merge remote-tracking branch 'upstream/master' into action_dist

a1321a8

ericl approved these changes

Contributor

ericl left a comment

LGTM

Contributor

ericl commented Aug 5, 2019

The Travis tests look OK to me, but I'm not sure whether Jenkins is fully passing yet. Going to wait for the latest build before merging.

Contributor

ericl commented Aug 5, 2019

jenkins retest this please


          Merge remote-tracking branch 'upstream/master' into action_dist

1b2eb98

AmplabJenkins commented Aug 6, 2019

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16008/
Test PASSed.

ericl merged commit e3c9f7e into ray-project:master

AmplabJenkins commented Aug 6, 2019

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16042/
Test PASSed.

edoakes pushed a commit to edoakes/ray that referenced this pull request


          Custom action distributions (ray-project#5164)

d4b62e6

* custom action dist wip

* Test case for custom action dist

* ActionDistribution.get_parameter_shape_for_action_space pattern

* Edit exception message to also suggest using a custom action distribution

* Clean up ModelCatalog.get_action_dist

* Pass model config to ActionDistribution constructors

* Update custom action distribution test case

* Name fix

* Autoformatter

* parameter shape static methods for torch distributions

* Fix docstring

* Generalize fake array for graph initialization

* Fix action dist constructors

* Correct parameter shape static methods for multicategorical and gaussian

* Make suggested changes to custom action dist's

* Correct instances of not passing model config to action dist

* Autoformatter

* fix tuple distribution constructor

* bugfix

mawright deleted the action_dist branch

August 22, 2019 22:35
