[wingman -> rllib] Remote and entangled environments #3968
Conversation
Test FAILed.
ericl left a comment
The remote env option makes sense to me. Isn't the entangled env the same as VectorEnv though? Is the only difference that the class subclasses gym.Env instead of VectorEnv?
python/ray/rllib/agents/agent.py
Outdated
    def _make_evaluator(self, cls, env_creator, policy_graph, worker_index,
-                       config):
+                       config, remote_worker_envs=False,
+                       entangled_worker_envs=False):
These args can be passed as part of config right?
Removed entangled_worker_envs.
remote_worker_envs is hardcoded to False for the local_evaluator, so it doesn't create ray workers when it doesn't need to. Also, you can see that the dummy env in each evaluator is created with remote=False for the same reason.
A lot of the problems boil down to the fact that an env needs to be created just to obtain dummy values. I encourage introducing the new environment registration interface mentioned in the pull request description and deprecating the current one (printing a deprecation warning when it is used and removing it after a few releases)...
I see. So one reason it makes sense to put this under config is that you might want to run with num_workers: 0 with remote envs.
I think you can instead have the remote env wrapper initialize the remote workers only on reset(), similar to the rest of the env adapters. I agree the registration interface could use re-thinking though for the future.
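As an illustration of that lazy-initialization idea, here is a minimal sketch (the class and helper names are hypothetical, not RLlib's actual implementation; it assumes ray.init() has been called and that make_env is picklable):

    import ray


    @ray.remote(num_cpus=0)
    class _RemoteEnvActor:
        """Hypothetical Ray actor wrapping a single gym-style env."""

        def __init__(self, make_env, index):
            self.env = make_env(index)

        def reset(self):
            return self.env.reset()

        def step(self, action):
            return self.env.step(action)


    class LazyRemoteVectorEnv:
        """Sketch: create the remote actors only on the first reset(), so a
        throwaway instance (e.g. one built just for space discovery) never
        spawns Ray workers."""

        def __init__(self, make_env, num_envs):
            self.make_env = make_env
            self.num_envs = num_envs
            self.actors = None  # deferred until first vector_reset()

        def vector_reset(self):
            if self.actors is None:
                self.actors = [
                    _RemoteEnvActor.remote(self.make_env, i)
                    for i in range(self.num_envs)
                ]
            return ray.get([a.reset.remote() for a in self.actors])

        def vector_step(self, actions):
            results = ray.get([
                a.step.remote(act) for a, act in zip(self.actors, actions)
            ])
            obs, rews, dones, infos = zip(*results)
            return list(obs), list(rews), list(dones), list(infos)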
I don't get how / why you would run with num_workers: 0. The local evaluator is used by the optimizer and the remote evaluators are used for sampling, so why would you want to mix those?
The local evaluator currently creates num_envs_per_worker environments too many (it should not create any). It would be too costly to request that many ray workers, and it could even hang if they couldn't be scheduled.
python/ray/rllib/env/base_env.py
Outdated
    if not isinstance(env, BaseEnv):
        if isinstance(env, MultiAgentEnv):
            # NOTE: Probably should handle remote / entangled envs in
            # _MultiAgentEnvToAsync as well
Can we raise NotImplementedError here if those options are enabled?
Done.
@PublicAPI
class EntangledEnv(gym.Env):
    """Interface for one physical environment that hosts
    several logical environments."""
How does this differ from VectorEnv?
You have a point - it does not. My mistake, I didn't realize there was an env class I could use out of the box.
python/ray/rllib/env/vector_env.py
Outdated
    for env in range(self.envs):
        assert isinstance(env, ray.actor.ActorClass), \
            "Your environment needs to be ray remote environment"
Instead of requiring the env to be decorated as remote, you could add the decorator here on the fly (i.e., ray.remote(cls)). That way, it becomes possible to toggle between remote / non-remote with just the use_remote flag.
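A quick sketch of that on-the-fly decoration (env_cls and make_env_instances are placeholder names for illustration, not actual RLlib code):

    import ray


    def make_env_instances(env_cls, num_envs, use_remote):
        """Sketch: toggle remote vs. in-process envs with a single flag by
        applying ray.remote() to the class at runtime, instead of requiring
        the user to decorate the class themselves."""
        if use_remote:
            remote_cls = ray.remote(env_cls)  # decorate the class on the fly
            return [remote_cls.remote() for _ in range(num_envs)]
        return [env_cls() for _ in range(num_envs)]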
I mentioned that problem in the description. We don't have the environment class, just a function that maps an env_context to an env object. Even if we created an env object and deduced the class from it (which is very ugly), we would still need to know how to map the env_context to the env constructor arguments, which is information we do not have.
The only solution I can think of is changing the register env interface as mentioned in the pull request description.
Got it, this seems fine then.
python/ray/rllib/env/env_context.py
Outdated
    self.entangled_envs_num = entangled_envs_num

-   def with_vector_index(self, vector_index):
+   def align(self, env_config=None, worker_index=None, vector_index=None,
What does this function do?
Bad naming, agree. Renamed to copy_with_overrides.
python/ray/rllib/env/vector_env.py
Outdated
        rew_batch.append(rew)
        done_batch.append(done)
        info_batch.append(info)
    return obs_batch, rew_batch, done_batch, info_batch
I think we need some way of destroying these remote actors when the agent is stopped, otherwise the actors may be leaked. (I haven't checked if they actually are).
The problem is, I don't know how we can do that if the framework doesn't start the ray workers itself. And then we come back to the issues we were talking about earlier...
But I'm not sure; it seems that those do get closed (they no longer exist after a keyboard interrupt). Other ugly things are happening though - everything hangs if you request more environments than ray can schedule, etc. (because this happens outside the framework, rllib does not handle it).
Makes sense for now, I think you only run into the issue if running multiple experiments on the same cluster.
Re: resources: I think that can be resolved with the right resource requests (at least when running in Tune):
Basically, if you have remote envs enabled, then the agent should request max(1, num_workers) * num_envs_per_worker * num_remote_envs extra_cpu in default_resource_request(). That way you won't run into this resource allocation deadlock. However, this only applies when running in Tune.
Alternatively, I think it would also work to create these remote envs with num_cpus=0.
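A rough sketch of that resource accounting (the Resources import path and the config keys reflect this era of the codebase and are assumptions; num_remote_envs is taken here as the number of remote envs each vectorized env spawns):

    from ray.tune.trial import Resources


    def default_resource_request_sketch(cf):
        """Sketch of the suggested default_resource_request() accounting:
        reserve one extra CPU per remote env actor so that, when running
        under Tune, the trial's resource request covers the env actors and
        scheduling cannot deadlock. `cf` is the merged agent config."""
        extra_env_cpus = 0
        if cf.get("remote_worker_envs"):
            extra_env_cpus = (max(1, cf["num_workers"])
                              * cf["num_envs_per_worker"]
                              * cf.get("num_remote_envs", 1))
        return Resources(
            cpu=1,
            gpu=cf.get("num_gpus", 0),
            extra_cpu=cf["num_workers"] + extra_env_cpus,
            extra_gpu=0)

The alternative mentioned above, creating the env actors with num_cpus=0, sidesteps this accounting entirely at the cost of over-subscribing CPUs.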
Yeah, I figured out those ways around the issue; I just wanted to point out that we don't control the remote environments and we don't even raise an exception if there are no resources when building them - it just hangs.
Yes. My blunder, I didn't realize I could just use that one.
Test FAILed.
ericl left a comment
Could you also add a quick regression test in test_policy_evaluator.py? You could also have a standalone test script if it doesn't fit into that file, and add the entry to run_multi_node_tests.sh
python/ray/rllib/agents/agent.py
Outdated
    },

    # Whether environments are in remote process or not
    "remote_worker_envs": False,
Can you move this to under the Environment config section?
Done.
python/ray/rllib/agents/agent.py
Outdated
-           output_creator=output_creator)
+           output_creator=output_creator,
+           remote_worker_envs=remote_worker_envs,
+           entangled_worker_envs=entangled_worker_envs)
Should use config (and remove entangled_)
I commented about config above, and entangled no longer exists (this comment is on an outdated snippet).
        return self.envs


class _RemoteVectorizedGymEnv(_VectorizedGymEnv):
Consider extending VectorEnv directly, since you don't seem to use much of the functionality of VectorizedGymEnv
Well, I do reuse the constructor and get_unwrapped, so there's no need to copy those. I would leave it like this.
Test FAILed.
I pushed a change to auto-wrap remote envs. I think this will make it a lot easier to use since you don't need to modify existing envs to turn on the flag, let me know if it works for you. Also added a test. Btw, you can run …
Test FAILed.
python/ray/rllib/env/vector_env.py
Outdated
        "Creating throwaway env to get action and obs space. To avoid "
        "resource overheads, your env should defer any expensive "
        "initialization to reset().")
    dummy = make_env(0)
You shouldn't need a dummy env here, as action_space / observation_space are already provided as arguments from the dummy env created in the policy evaluator constructor.
And if a dummy env is created, calling dummy.close() at the end might be a good idea?
Right, fixed.
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)
Is it possible to call close() on a remote env? SC2 environments start an SC2 server, which is a separate process, and I guess the correct way to stop it in these situations would be to call the close method (though I see them dying after a keyboard interrupt).
There's Python's atexit, which I think should work. If not, we can add close() hooks (but I don't know if those would be as reliable in case of errors).
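A minimal sketch of the atexit approach inside a user-defined env (placeholder class, not part of RLlib):

    import atexit
    import subprocess


    class ExternalServerEnv:
        """Placeholder env that spawns an external server process, similar
        to the SC2 case described above."""

        def __init__(self, server_cmd):
            self.server = subprocess.Popen(server_cmd)
            # Ensure the server is torn down even if close() is never called
            # explicitly, e.g. when the remote actor process exits.
            atexit.register(self.close)

        def close(self):
            if self.server.poll() is None:
                self.server.terminate()
                self.server.wait()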
Don't know why I didn't think of wrapping :( Looks and works great!
Test FAILed.
Fixed failing tests.
Test PASSed.
Tests look good, thanks for contributing this!
NOTE: This is an early version of the pull request, opened so we can align on the approach. No unit tests have been run, and the changes have not been tested more broadly than we needed.
What do these changes do?
Environment issues
These changes work, but they are not the prettiest - for example, even though the remote_worker_envs config is set, env_creator needs to create a remote environment only when the provided env_context's remote field is True (the env is sometimes built as remote and sometimes not). Aside from that, there is the already-mentioned issue that environments are sometimes created only to obtain the observation/action space, so all of the init work may be wasted. Proposal for fixing these issues: change the register env interface so it takes (env_name, env_class, a function mapping env_context -> env_constructor_params, a function mapping env_context -> (obs_space, action_space)). That way no unneeded environments would be created (since the space data is already provided), and creating remote environments could be handled by the rllib framework itself (the user would only set remote_worker_envs to true). A sketch of this proposed signature follows below.
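A hypothetical sketch of that proposed registration signature (the function name and registry are illustrative only, not the actual RLlib API):

    _ENV_REGISTRY = {}  # illustrative registry, for this sketch only


    def register_env_v2(env_name, env_class, params_fn, spaces_fn):
        """Hypothetical new interface: because the env class and its spaces
        are known up front, rllib could skip creating throwaway envs for
        space discovery and could wrap env_class with ray.remote() itself
        when remote_worker_envs is enabled."""
        _ENV_REGISTRY[env_name] = {
            "class": env_class,
            "params_fn": params_fn,    # env_context -> env constructor kwargs
            "spaces_fn": spaces_fn,    # env_context -> (obs_space, action_space)
        }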
Current remote env creation: