Description
There have been many user questions and discussions about how to specify dependencies for a given job, actor, or task. There are a few general use cases here:
- Users want to run tasks/actors that require different or conflicting Python dependencies as part of one Ray application.
- Users want to use Docker containers to manage dependencies (not just Python) for different tasks and actors as part of one Ray application.
- Users want to distribute local Python modules and files to their workers/actors for rapid iteration/development.
- Users want to easily install new Python packages as part of their development workflow (e.g., change library versions without restarting the cluster).
This proposal is to introduce a new `runtime_env` API that enables all of these use cases and can generalize to future worker environment-related demands.
The `runtime_env` will be a dictionary that can be passed as an option to actor/task creation:
f.options(runtime_env=env).remote()
Actor.options(runtime_env=env).remote()
This dictionary will include the following arguments:
- `container_image` (str): Require a given (Docker) container image. The image must have the same version of Ray installed.
- `conda_env` (str): Activates a named conda environment that the worker will run in. The environment must already exist on the node.
- `files` (Path): Project files and local modules to unpack in the working directory of the task/actor.
- (possible future extension) `python_requirements` (Union[File, List[str]]): List of Python requirements or a requirements.txt file to use to dynamically create a new conda environment.
These options should cover all known dependency management use cases listed above.
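To make the shape of the dictionary concrete, here is a minimal sketch of what such a `runtime_env` could look like, together with a small check on its keys. The helper `validate_runtime_env` and the `ALLOWED_KEYS` table are hypothetical and exist only for illustration; they are not part of Ray, and the accepted value types are an assumption based on the argument list above.

```python
from pathlib import Path
from typing import Any, Dict

# Option names and rough value types taken from the argument list in this RFC.
# python_requirements is only listed as a possible future extension.
ALLOWED_KEYS = {
    "container_image": str,
    "conda_env": str,
    "files": (str, Path),
    "python_requirements": (str, Path, list),
}


def validate_runtime_env(env: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: reject unknown options and wrong value types."""
    for key, value in env.items():
        if key not in ALLOWED_KEYS:
            raise ValueError(f"Unknown runtime_env option: {key!r}")
        if not isinstance(value, ALLOWED_KEYS[key]):
            raise TypeError(
                f"runtime_env[{key!r}] has unexpected type {type(value).__name__}"
            )
    # Return a copy so later changes by the caller do not affect the result.
    return dict(env)


# Example: run in an existing conda environment and ship local project files.
env = validate_runtime_env({"conda_env": "my-env", "files": Path("./my_project")})
```

A dictionary like `env` is what would then be passed as `f.options(runtime_env=env).remote()`.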
Misc semantics:
- Any downstream tasks/actors will by default inherit the `runtime_env` of their parent.
- The `runtime_env` needs to be able to be specified on an individual actor and task basis, but for convenience it should also be able to be set in the `JobConfig` as a default for all tasks/actors spawned by the driver.
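One possible reading of these semantics, sketched in plain Python: an explicitly passed `runtime_env` wins, otherwise the parent's environment is inherited, otherwise the job-level default applies. The function `resolve_runtime_env` is hypothetical, and the choice to replace the whole dictionary rather than merge per key is an assumption; the RFC does not specify either.

```python
from typing import Any, Dict, Optional

RuntimeEnv = Dict[str, Any]


def resolve_runtime_env(
    explicit: Optional[RuntimeEnv],
    parent: Optional[RuntimeEnv],
    job_default: Optional[RuntimeEnv],
) -> RuntimeEnv:
    """Hypothetical resolution order: explicit > parent > job default.

    Assumption: the chosen dictionary replaces the others wholesale;
    no per-key merging is attempted.
    """
    if explicit is not None:
        return dict(explicit)
    if parent is not None:
        return dict(parent)
    return dict(job_default) if job_default is not None else {}


# Job-wide default, as it might be set once for the driver.
job_default = {"conda_env": "base-env"}

# A task spawned by the driver with no explicit option uses the job default.
task_env = resolve_runtime_env(None, None, job_default)

# A child that passes its own runtime_env overrides what it would inherit.
child_env = resolve_runtime_env({"container_image": "my-image"}, task_env, job_default)

# A grandchild with no explicit option inherits from its parent (the child).
grandchild_env = resolve_runtime_env(None, child_env, job_default)
```

Per-key merging (for example, inheriting `files` while overriding `conda_env`) would be an alternative design, and which behavior is preferable is an open UX question.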
At this point, this RFC is primarily about the use cases and interface, not the implementation of each `runtime_env` option. Please comment if you believe there is a use case not covered, the UX could be improved, etc.