Add support for kernel launch parameters #22
Conversation
# Preserve system-owned substitutions by starting with launch params
ns = dict()
if isinstance(self.launch_params, dict):
    ns.update(self.launch_params)
Do we want to drop all launch_params into this namespace, or have a sub-dict for that and leave room in case there's a need for other kinds of parameters in the future?
I think I see your point relative to another area I was thinking about - environment variables - where the metadata also includes a list of supported environment variables. We could then maintain separate dicts. That said, having everything in the same namespace is only an issue if there are conflicts between env names and launch args, and I'd be okay with stating that those spaces are shared.
I think we could keep a single set of parameters by adding a meta-property indicating their expected usage. For example, a context meta-property could be included that takes one of a set of values like {'env', 'argv', 'config'}. The presentation layer could choose whether to use the context to split how it prompts for the values - it's up to them. The provider could decompose the list into multiple sets as it chooses.
I think this comes down to extensibility, and my feeling is it would be better to extend the schema with an additional context type vs. supporting a varying number of sub-dicts, where each sub-dict implies its context.
That said, my preference toward a metadata approach is just slightly stronger.
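To make that concrete, here is a rough sketch of what a single, shared parameter list with a context meta-property might look like. The parameter names and the decomposition helper are hypothetical; only the {'env', 'argv', 'config'} idea comes from the discussion above.

# Hypothetical metadata: one flat list of parameters, each tagged with a
# 'context' meta-property instead of separate sub-dicts per context.
launch_parameters = [
    {"name": "KERNEL_MEMORY", "context": "env", "type": "integer", "default": 8},
    {"name": "follow", "context": "argv", "type": "string", "default": "-f"},
    {"name": "debug", "context": "config", "type": "boolean", "default": False},
]

# A provider (or presentation layer) can decompose the single list by context
# if it wants to treat env, argv, and config parameters separately.
by_context = {}
for param in launch_parameters:
    by_context.setdefault(param["context"], []).append(param)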
@@ -44,13 +45,14 @@ class SubprocessKernelLauncher:
     """
     transport = 'tcp'

-    def __init__(self, kernel_cmd, cwd, extra_env=None, ip=None):
+    def __init__(self, kernel_cmd, cwd, extra_env=None, ip=None, launch_params=None):
I'd probably say this level of the API should have specific kwargs which the provider is responsible for extracting from the generic launch_params. E.g. this could be called argv_format_args or something.
When I first read this, I figured you were stating that these three keyword parameters could be collapsed into **kwargs. But now that I look closer, it seems like your comment really applies to the code that constructs the SubprocessKernelLauncher instance, since that would be the location in which argv-only parameters could be extracted into argv_format_args. Or are you saying that this method would have additional code that pulls the argv-only parameters into argv_format_args, for example?
I don't mind replacing this single parameter with **kwargs, but I'd still need to add code that pulls launch_params out of kwargs, making those values available to the format code, until the metadata definition is formalized. Or update the callers to use a different keyword.
Could you please clarify what you mean? Sorry.
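To make the question concrete, here is a rough sketch of what I think the caller-side extraction could look like. build_launcher and the 'context' metadata key are hypothetical; argv_format_args is just the name suggested above, and SubprocessKernelLauncher is the class from this PR's diff.

# Hypothetical caller-side extraction: the provider pulls argv-oriented entries
# out of the generic launch_params before constructing the launcher, so the
# launcher itself only sees the specific kwarg it cares about.
def build_launcher(kernel_cmd, cwd, launch_params=None, metadata=None):
    launch_params = launch_params or {}
    metadata = metadata or {}

    # Assume each parameter's metadata carries a 'context' meta-property.
    argv_format_args = {
        name: value
        for name, value in launch_params.items()
        if metadata.get(name, {}).get("context", "argv") == "argv"
    }
    return SubprocessKernelLauncher(kernel_cmd, cwd, launch_params=argv_format_args)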
I kind of like the list-of-dicts design. Not a strong preference, but if you're going to generate documentation or an interactive form from it, you want to write an ordered collection. I know recent Python will probably preserve the order, but JSON objects are defined as unordered, and you may want to reserialise it and send it to some other code.
Good point about ordered dicts. The reason I prefer metadata be a dict of dicts is for correlation in the launch methods of the provider. In order to validate parameters, the client-provided parameter should be easily found in the corresponding metadata. By using the parameter name as the key to its metadata, this is quick and easy; much more so than scanning each element asking if it's the parameter in question.
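For illustration, the correlation difference might look like this (the parameter entries are made up, mirroring the example schema later in this thread):

# With a dict of dicts, finding a client-supplied parameter's metadata is a direct lookup...
metadata_by_name = {
    "cpus": {"type": "number", "minimum": 0.5, "maximum": 8.0, "default": 4.0},
    "memory": {"type": "integer", "minimum": 2, "maximum": 1024, "default": 8},
}
cpus_meta = metadata_by_name["cpus"]

# ...whereas a list of dicts requires scanning for the matching entry.
metadata_list = [
    {"name": "cpus", "type": "number", "minimum": 0.5, "maximum": 8.0, "default": 4.0},
    {"name": "memory", "type": "integer", "minimum": 2, "maximum": 1024, "default": 8},
]
cpus_meta = next(m for m in metadata_list if m["name"] == "cpus")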
Codecov Report
@@            Coverage Diff             @@
##           master      #22      +/-   ##
==========================================
+ Coverage   70.95%   71.07%   +0.11%
==========================================
  Files          27       27
  Lines        2045     2043       -2
==========================================
+ Hits         1451     1452       +1
+ Misses        594      591       -3
Continue to review full report at Codecov.
After experimenting with JSON schema, I think the best approach would be for the launch parameters to be defined in native JSON schema. (Originally, I was thinking that we would build a schema so we could entertain custom properties, but I think pure JSON schema is more familiar and tools would adhere to it better. In addition, parameter validation would be more straightforward.) I've updated the test to use this to define the launch parameters:
"metadata": {
  "launch_parameter_schema": {
    "title": "Params_kspec Kernel Provider Launch Parameter Schema",
    "properties": {
      "line_count": {"type": "integer", "minimum": 1, "default": 20, "description": "The number of lines to tail"},
      "follow": {"type": "string", "enum": ["-f", "-F"], "default": "-f", "description": "The follow option to tail"},
      "cpus": {"type": "number", "minimum": 0.5, "maximum": 8.0, "default": 4.0, "description": "The number of CPUs to use for this kernel"},
      "memory": {"type": "integer", "minimum": 2, "maximum": 1024, "default": 8, "description": "The number of GB to reserve for memory for this kernel"}
    },
    "required": ["line_count", "follow"]
  }
}
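One nice side effect of using native JSON Schema is that validation can be delegated to an off-the-shelf validator rather than custom code. A minimal sketch, assuming the jsonschema package is available (the client values are invented):

import jsonschema

# Schema as it would appear under the kernelspec's "metadata" stanza
# (trimmed to two of the properties shown above).
launch_parameter_schema = {
    "title": "Params_kspec Kernel Provider Launch Parameter Schema",
    "properties": {
        "line_count": {"type": "integer", "minimum": 1, "default": 20},
        "follow": {"type": "string", "enum": ["-f", "-F"], "default": "-f"},
    },
    "required": ["line_count", "follow"],
}

# Values collected from the client; validate() raises jsonschema.ValidationError
# if a value is out of range, of the wrong type, or a required name is missing.
client_params = {"line_count": 50, "follow": "-F"}
jsonschema.validate(instance=client_params, schema=launch_parameter_schema)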
Just remembered that Kyle might be interested in this as well: @rgbkrk
👍 for defining in native JSON Schema. You probably haven't been following the Jupyter Telemetry or Metadata work, but defining JSON schemas is foundational to this work. It would be awesome to hook the kernel provider work into the Telemetry and Metadata systems. We could easily emit/store and validate kernel provider events using your schema.
Feel free to resolve the conflicts and merge this.
This updates the launch methods to include an optional parameter consisting of a dictionary of name/value pairs. It also applies these to the SubprocessKernelLauncher's cmd string. Finally, it provides a POC for how parameter metadata may get expressed and processed. Fixes #19
These changes add the ability for kernel provider clients to pass in "launch" parameters. I chose "launch parameters" over "kernel parameters" because many of the parameters that a client will provide will be used to seed the environment in which the kernel will run, and not necessarily be conveyed to the kernel directly.
More important than the launch_params argument added to the launch() methods, the test code illustrates how the metadata corresponding to the available parameters could be implemented. In this case, the provider is a Spec provider, so the metadata is pulled directly from the kernelspec. Other providers may choose to do things differently, but the end result is that the metadata entry returned from find_kernels() would include a launch_params (or whatever we call it) stanza consisting of a list of dicts, where each element is a parameter and its dict describes that parameter.
(Aside: We may instead want a dict of dicts where the key is the parameter's symbolic_name (i.e., the canonical value used in substitution) and the value is a dict of the metadata - including a more formal display_name (used in presentation). I kinda like that better, but that's somewhat outside the scope of this PR.)
Here's the example spec used in the test code...
The client (front-end) would then use the metadata to prompt for values and create a simple dictionary of name/value pairs. These then get re-validated by the provider against the metadata: required values that were not provided but have specified defaults are set, ranges are checked (as appropriate), etc. Once validated, the parameters are used in whichever way the provider has defined.
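A rough sketch of that provider-side flow (apply defaults, re-validate, then substitute into the argv), with hypothetical helper names and assuming the jsonschema package:

import jsonschema

# Hypothetical provider-side handling: fill in defaults for omitted parameters,
# re-validate against the schema, then use the result to format the command line.
def resolve_launch_params(schema, client_params):
    resolved = dict(client_params or {})
    for name, meta in schema.get("properties", {}).items():
        if name not in resolved and "default" in meta:
            resolved[name] = meta["default"]
    jsonschema.validate(instance=resolved, schema=schema)
    return resolved

# e.g. kernel_cmd = ["tail", "{follow}", "-n", "{line_count}", "/var/log/syslog"]
def format_kernel_cmd(kernel_cmd, resolved_params):
    return [arg.format(**resolved_params) for arg in kernel_cmd]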
Copying a few other folks that might be interested: @SylvainCorlay @Zsailer @lresende @rolweber @jasongrout @blink1073
EDIT: replaced previously custom schema with JSON schema referenced in comment below (and added Steve to cc list).