[rllib] Envs for vectorized execution, async execution, and policy serving #2170
ericl merged 85 commits into ray-project:master
Conversation
What do these changes do?
* **Vectorized envs**: Users can either implement `VectorEnv`, or alternatively set `num_envs=N` to auto-vectorize gym envs (this vectorizes just the action computation part); see the usage sketch after this list.
* **Async envs**: The more general form of `VectorEnv` is `AsyncVectorEnv`, which allows agents to execute out of lockstep. We use this as an adapter to support `ServingEnv`. Since we can convert any other form of env to `AsyncVectorEnv`, `utils.sampler` has been rewritten to run against this interface.
* **Policy serving**: This provides an env which is not stepped. Rather, the env executes in its own thread, querying the policy for actions via `self.get_action(obs)` and reporting results via `self.log_returns(rewards)`. We also support logging of off-policy actions via `self.log_action(obs, action)`. This is a more convenient API for some use cases, and it also provides parallelizable support for policy serving (for example, if you start an HTTP server in the env) and for ingest of offline logs (if the env reads from serving logs); see the sketch after this list.

Any of these env types can be passed to RLlib agents. RLlib handles the conversions internally in `CommonPolicyEvaluator`, for example:
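The sketch below illustrates the idea: every supported env type is reduced to the most general `AsyncVectorEnv` form before sampling. `VectorEnv`, `AsyncVectorEnv`, and `ServingEnv` are the interfaces this PR describes; the import paths and the adapter helpers in the sketch are assumptions for illustration, not the actual API.

```python
import gym

# Import paths and adapter helpers are assumptions; only the class names
# VectorEnv / AsyncVectorEnv / ServingEnv come from the description above.
from ray.rllib.async_vector_env import AsyncVectorEnv  # assumed path
from ray.rllib.vector_env import VectorEnv              # assumed path
from ray.rllib.serving_env import ServingEnv            # assumed path


def to_async_vector_env(env, make_env=None, num_envs=1):
    """Normalize any supported env type to an AsyncVectorEnv (sketch)."""
    if isinstance(env, AsyncVectorEnv):
        return env  # already in the most general form
    if isinstance(env, ServingEnv):
        return AsyncVectorEnv.wrap_serving(env)  # assumed adapter
    if isinstance(env, VectorEnv):
        return AsyncVectorEnv.wrap_vector(env)   # assumed adapter
    if isinstance(env, gym.Env):
        # Plain gym env: auto-vectorize into num_envs copies first
        # (this vectorizes just the action computation part).
        vector_env = VectorEnv.wrap(make_env=make_env, num_envs=num_envs)
        return AsyncVectorEnv.wrap_vector(vector_env)
    raise ValueError("Unsupported env type: {}".format(type(env)))
```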
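For the auto-vectorization path, usage might look like the following. The `num_envs` option name comes from the description above; the agent class, import path, and config handling are assumptions for illustration.

```python
import ray
from ray.rllib.ppo import PPOAgent, DEFAULT_CONFIG  # assumed import path

ray.init()

# num_envs=N auto-vectorizes the gym env: N copies are stepped, and only
# the action computation is batched across them.
config = DEFAULT_CONFIG.copy()
config["num_envs"] = 4

agent = PPOAgent(config=config, env="CartPole-v0")
for _ in range(3):
    result = agent.train()
    print(result["episode_reward_mean"])
```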
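A policy-serving env, per the API described above, might be sketched as follows. `get_action`, `log_returns`, and `log_action` are the calls named in the description; the constructor arguments, the `run()` entry point, the import path, and any episode bookkeeping are assumptions and may differ from the actual interface.

```python
import gym
from ray.rllib.serving_env import ServingEnv  # assumed import path


class CartPoleServing(ServingEnv):
    """Serves a policy against a local gym env (sketch)."""

    def __init__(self):
        env = gym.make("CartPole-v0")
        # Constructor signature is an assumption for illustration.
        ServingEnv.__init__(self, env.action_space, env.observation_space)
        self._env = env

    def run(self):
        # Runs in its own thread: the env is never step()ed by RLlib.
        # Instead it queries the policy for actions and reports rewards.
        obs = self._env.reset()
        while True:
            action = self.get_action(obs)      # query the policy
            obs, reward, done, _ = self._env.step(action)
            self.log_returns(reward)           # report the reward
            # Off-policy actions chosen outside the policy could instead be
            # reported with self.log_action(obs, action).
            if done:
                obs = self._env.reset()
```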
TODO:
Related issue number
#2053