Parameterized kernel specs proposal #87

hbcarlos · 2022-02-10T09:47:55Z

Parameterized kernel specs

In this JEP we propose to parameterize the kernel specs, which simplifies the way some kernels are installed, reduces the number of kernel spec files, and at the same time improves the UI of some of the Jupyter front-ends.

Co-author: @SylvainCorlay
Co-author: @AnastasiaSliusar

Contributors that may be interested in this topic from past conversations:

dhirschfeld · 2022-02-10T10:06:23Z

I imagine @kevin-bates would be interested in this too.

blink1073 · 2022-02-10T14:30:33Z

@echarles has been exploring similar ideas in https://github.com/deshaw/ksmm

echarles · 2022-02-10T14:42:35Z

There is also an issue open in the jupyter-server team-compass, jupyter-server/team-compass#16, to connect people interested in this feature.

So basically, you are adding the following stanza to the kernelspec?

  "metadata": {
      "parameters": {
        "cpp_version": {
          "type": "string",
          "default": "C++14",
          "enum": ["C++11", "C++14", "C++17"],
          "save": true
        }
      }
  }

echarles · 2022-02-10T14:49:31Z

... and that stanza is a json-schema.

For KSMM, we give the option to define env vars (see https://raw.githubusercontent.com/deshaw/ksmm/main/screenshots/general_settings_ss.png). I read that you say "...lists (respectively the command-line arguments and environment variables)...", but I don't see an example of such env var definitions.

kevin-bates · 2022-02-10T15:53:11Z

Thanks for the ping @dhirschfeld and reference to the team-compass issue @echarles.

@hbcarlos - this is great - thank you for opening this issue!

I'm really hoping this proposal can be expanded a bit to also include provisioner_parameters (and perhaps rename the parameters stanza to kernel_parameters) since this proposal appears to focus only on kernel-based parameters. For reference, here's the originally proposed JEP that I closed due to dormancy: https://github.com/kevin-bates/enhancement-proposals/blob/4c0727fbe03653aba789de67c388e2d28cda4cb9/parameterized-launch/parameterized-launch.md. It includes both schemas.

I would love to participate in any way possible. Thank you for opening this.

SylvainCorlay · 2022-02-10T18:11:18Z

For KSMM, we give the option to define env vars (see https://raw.githubusercontent.com/deshaw/ksmm/main/screenshots/general_settings_ss.png). I read that you say "...lists (respectively the command-line arguments and environment variables)...", but I don't see an example of such env var definitions.

Exactly, the env list of the kernelspec can also reference parameters, as stated in the proposal, although the example that we included does not make use of it.

E.g.

{
  "display_name": "C++",
  "argv": [
      "/home/user/micromamba/envs/kernel_spec/bin/xcpp",
      "-f",
      "{connection_file}",
      "-std={parameters.cpp_version}"
  ],
  "env": [
    "XEUS_LOGLEVEL={parameters.xeus_log_level}"
  ],
  "language": "C++",
  "metadata": {
      "parameters": {
        "cpp_version": {
          "type": "string",
          "default": "C++14",
          "enum": ["C++11", "C++14", "C++17"],
          "save": true
        },
        "xeus_log_level": {
          "type": "string",
          "default": "ERROR",
          "enum": ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"],
          "save": true
        }
      }
  }
}
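To illustrate how a launcher might expand these references, here is a minimal Python sketch. It is not the jupyter_client implementation: the `substitute` helper and the regular expression are assumptions made for illustration, and the argv path is shortened to `xcpp`.

```python
import re

# Hypothetical helper, not part of jupyter_client.
# Matches references of the form {parameters.<name>} only.
_PLACEHOLDER = re.compile(r"\{parameters\.([A-Za-z0-9_]+)\}")


def substitute(template: str, values: dict) -> str:
    """Replace {parameters.name} with its value; leave unknown names untouched."""
    return _PLACEHOLDER.sub(
        lambda m: str(values.get(m.group(1), m.group(0))), template
    )


spec = {
    "argv": ["xcpp", "-f", "{connection_file}", "-std={parameters.cpp_version}"],
    "env": ["XEUS_LOGLEVEL={parameters.xeus_log_level}"],
}
values = {"cpp_version": "C++17", "xeus_log_level": "DEBUG"}

# The same substitution is applied to both lists.
argv = [substitute(arg, values) for arg in spec["argv"]]
env = [substitute(item, values) for item in spec["env"]]
print(argv)
print(env)
```

In this sketch `{connection_file}` is left alone because it is not a `{parameters.*}` reference; it would still be filled in by whatever already handles that placeholder today.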

hbcarlos · 2022-02-15T14:48:42Z

I would love to participate in any way possible. Thank you for opening this.

Thanks, @kevin-bates! Help is always welcome!

I'm really hoping this proposal can be expanded a bit to also include provisioner_parameters (and perhaps rename the parameters stanza to kernel_parameters) since this proposal appears to focus only on kernel-based parameters.

This proposal allows specifying parameters independently of how they may be used (as environment variables, command-line arguments). What we are trying to achieve in this JEP is to include and formalize the concept of parameters in the kernel specs, but not their semantics, so that developers can use these parameters for any purpose (provisioner parameters, environment variables, etc.). I'll update the proposal to make it clearer.

kevin-bates

This is a great start - thank you.

I'm not sure if this is necessary here, but should we discuss where and how these parameters will be applied, both when they are provided in the start request payload and when they are not?

I've also been thinking about how the JSON schemas get into existence in the first place. Yes, we definitely need to support their presence in the kernel.json file (and I know @SylvainCorlay has brought up having a separate "sibling file" for the schema), but I'd also like to see a "query capability" whereby the KernelSpecManager asks the configured kernel provisioner "give me your parameter schema". This would allow the provisioner's current environment to perhaps be utilized to come up with the schema. For example, if memory were a parameter that the provisioner applies, then the provisioner could (hypothetically) detect available memory and use some heuristic to determine the range of memory values applicable to that configuration. Even if the "query" returns static information, it would go a long way to ease the kernelspec deployment burden on operators.

Of course, what is returned by the KernelSpecManager is the (possibly merged) result of these interactions, so perhaps we could drive toward that later - we should just try to make sure we don't preclude this option.

(The KernelSpecManager could do the same kind of query against the underlying kernel, although that may be tougher because I'd like to see a given kernel spec have language as a parameter, where the kernel provisioner (or certain provisioners at least) support multiple kernel types. This may be pie-in-the-sky kind of stuff, but dynamic kernelspecs is where we're heading.)

kevin-bates · 2022-02-23T00:49:57Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+Cons:
+
+ - Changes are required in multiple components of the stack, from the protocol specification to the front-end.
+ - Unless we require default values for all parameters, this would be a backward-incompatible change.


I think, at a minimum, we need to require default values for all required parameters and the "source" (i.e., kernel or kernel provisioner) should have a reasonable default for others. Of course, having a default for each parameter would be a "best practice".

Where the default values probably fall down is for things like credentials, but I would argue those kinds of things are probably best retained in a backing store (e.g., configuration file or database), where any inputted values would override the persisted values.

I guess the point is that we don't want to abandon all the jupyter_client-based applications that don't have the ability to present a JSON schema for parameter selection - so some form of reasonable defaulting is necessary.

As much as possible we need to be backwards compatible with existing kernel spec users so I'm in agreement with @kevin-bates on default values. Maybe it's fine to let the kernel fail if it's missing a parameter (or if the parameter is invalid).
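As a rough illustration of the defaulting rule discussed above, the following Python sketch resolves parameter values for a client that cannot present a form. The function name `resolve_parameters` is hypothetical, and the schema layout (a mapping of parameter name to its description) is assumed from the examples earlier in this thread.

```python
from typing import Optional


def resolve_parameters(schema: dict, provided: Optional[dict] = None) -> dict:
    """Use provided values where given, otherwise fall back to schema defaults."""
    provided = provided or {}
    resolved = {}
    for name, prop in schema.items():
        if name in provided:
            value = provided[name]
        elif "default" in prop:
            value = prop["default"]
        else:
            # No value and no default: fail early instead of launching a broken kernel.
            raise ValueError(f"parameter {name!r} has no value and no default")
        if "enum" in prop and value not in prop["enum"]:
            raise ValueError(f"invalid value {value!r} for parameter {name!r}")
        resolved[name] = value
    return resolved


schema = {
    "cpp_version": {
        "type": "string",
        "default": "C++14",
        "enum": ["C++11", "C++14", "C++17"],
    }
}

# An existing client that sends no parameters still gets a usable configuration.
print(resolve_parameters(schema))
print(resolve_parameters(schema, {"cpp_version": "C++17"}))
```

With defaults on every parameter, the first call behaves like today's non-parameterized launch, which is the backward-compatibility property being asked for.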

kevin-bates · 2023-02-02T20:40:16Z

Hi @hbcarlos,

I'm not sure where this JEP stands but I'd like to try to revive it a bit. I think we should discuss some of the issues I raised in the previous comment, namely parameter discovery (which, at the time, I had referred to as "query capability").

While I think having static parameters reside in the kernel.json file is sufficient, the KernelSpecManager should be capable of asking the configured kernel provisioner (including the default LocalProvisioner when one is not specified directly in kernel.json) to provide its parameter schema.

Moreover, this interaction should pass the kernel.json to the provisioner and, using a kernel hint located in the kernel.json, the provisioner would then, similarly, ask the kernel "parameter provider" for its parameter schema. The provisioner would then return both parameter schemas which the KernelSpecManager would reconcile with the existing static parameters to produce an "in-memory" kernel.json file that is then returned to the application asking for this information. In essence, any static parameters would override those returned from the provisioner parameter provider.
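The reconciliation step could be sketched roughly as follows. This is only an illustration in Python: the `reconcile` function, the `memory_gb` parameter, and the order in which discovered schemas are merged are all assumptions; the only rule taken from the text above is that static parameters override discovered ones.

```python
def reconcile(static: dict, *discovered: dict) -> dict:
    """Merge discovered parameter schemas, then let static entries override them."""
    merged: dict = {}
    for schema in discovered:  # e.g. provisioner schema first, then kernel schema
        merged.update(schema)
    merged.update(static)  # static parameters from kernel.json win
    return merged


# Hypothetical schemas returned by the two "parameter providers".
provisioner_schema = {"memory_gb": {"type": "integer", "default": 4}}
kernel_schema = {"cpp_version": {"type": "string", "default": "C++14"}}

# Static parameters already present in kernel.json.
static = {"cpp_version": {"type": "string", "default": "C++17"}}

in_memory = reconcile(static, provisioner_schema, kernel_schema)
print(in_memory)
```

The result plays the role of the "in-memory" kernel.json parameters: the discovered `memory_gb` entry is added, while the static `cpp_version` entry replaces the discovered one.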

By introducing this discovery mechanism, existing kernel.json instances would not have to be overwritten when new versions of kernels are deployed that contain an updated set of parameters. (We'd need to work out what this would mean if a given kernel/provisioner parameter provider were to remove support for a parameter between versions. This makes me wonder if we should allow static parameter schema at all.) Likewise, until the discovered parameters are persisted as static parameters into the kernel.json file, no parameter schema would even need to necessarily exist.

Because a specific kernel package may not be necessarily installed locally, I was thinking of exposing "parameter providers" via entry points. The previously referenced kernel hint would reflect the name of the kernel parameter provider that the provisioner calls to fetch the corresponding kernel parameter schema. This would allow kernel implementations to be decoupled from their parameter provider if that was advantageous. Similarly, we could have provisioner parameter providers, although since the provisioner must be locally available, the configured provisioner package could expose itself as a provisioner parameter provider.

For applications using KernelSpecManager for discovering available kernel options, we'd also want to institute a kernelspec cache so that this parameter discovery is minimized.

The KernelSpecManager that performs this discovery and reconciliation, could be a subclass of the existing manager, if we only wanted this capability in, say, Jupyter Server due to additional dependency issues, or introduce an optional installation dependency on jupyter_client with something like jupyter_client[caching] that includes the extra dependency (probably something akin to watchfiles).

I think by introducing discoverable parameter schemas we would essentially have "dynamic kernelspecs" which would be easier to maintain and be more backward compatible between releases (including those releases that predate parameterized kernel specs).

davidbrochart · 2023-02-03T13:40:05Z

@kevin-bates Things like "entry points", KernelSpecManager and "subclass" sound like implementation details to me, and I think they are out of the scope of this JEP, or JEPs in general, which should remain language-agnostic.

hbcarlos · 2023-02-03T14:42:59Z

Hi @kevin-bates,

Yes, I would like to revive this JEP as well. Give me a couple of days to get up to speed, and I'll get back to you.

kevin-bates · 2023-02-03T16:01:47Z

Things like "entry points", KernelSpecManager and "subclass" sound like implementation details to me, and I think they are out of the scope of this JEP, or JEPs in general, which should remain language-agnostic.

I understand. I just wanted to help paint a picture of how a discovery mechanism might be conceptualized. Somehow a link from the discoverer to the discoverable must be made and my terminology was something concrete that others can understand. I'll try to be more abstract in future responses.

hbcarlos · 2023-02-08T15:51:47Z

Thanks for the comments and the interest you have shown in this, @kevin-bates.

After looking into the Jupyter Client, the Jupyter Protocol, Jupyter kernel management, and the discussions and JEPs about the kernel discovery framework, parameterized kernel launch, kernel providers, etc., my opinion is that your comments in #87 (comment) are out of the scope of this JEP. They might take advantage of the parameterization of the Kernel Specs we propose here. In addition, Kernel Provisioners could be an excellent example to show the necessity of parameterized Kernel Specs.

I just wanted to clarify with you. In this JEP, we want to formalize the introduction of parameters to the Kernel Specs. We are trying to introduce parameters for everyone to take advantage of it, from provisioners to kernels. The idea is to make it as generic as possible to include every possible use case.

kevin-bates · 2023-02-08T16:05:06Z

Thanks for your response @hbcarlos.

I just wanted to clarify with you. In this JEP, we want to formalize the introduction of parameters to the Kernel Specs. We are trying to introduce parameters for everyone to take advantage of it, from provisioners to kernels. The idea is to make it as generic as possible to include every possible use case.

By this comment, it sounds like there will be pending updates to take into account parameters that are kernel-provisioner-relative - which is great. Thanks.

Will these be conveyed in separate stanzas so that things like discovery (and dynamic kernelspecs) could be implemented?

kevin-bates · 2023-02-08T16:13:27Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+        "type": "string",
+        "default": "C++14",
+        "enum": ["C++11", "C++14", "C++17"],
+        "save": true


Would it make sense to have a format_string meta-property so software that launches the kernel can simply build the rest of the argv stanza on the fly? Otherwise, the baked-in argv in the kernel.json dictates what parameters will be used, yet users may want to include others that are defined in the schema. In this example, for instance, the xeus_log_level parameter will never be included despite the user wanting to enable TRACE output.

Suggested change

-        "save": true
+        "save": true,
+        "format_string": "-std={cpp_version}"

Then, only those parameters that have been provided by the user (or are required) would be included in the finalized argv list.
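A minimal Python sketch of how a launcher might assemble argv from such a format_string property follows. The `build_argv` function and its `required` argument are hypothetical names for illustration, and the argv path is shortened to `xcpp`.

```python
def build_argv(base_argv, schema, provided, required=()):
    """Append one formatted argument per parameter that was provided or is required."""
    argv = list(base_argv)
    for name, prop in schema.items():
        fmt = prop.get("format_string")
        if fmt is None:
            continue  # parameter is not meant for the command line
        if name in provided:
            argv.append(fmt.format(**{name: provided[name]}))
        elif name in required:
            # Required but not provided: fall back to the schema default.
            argv.append(fmt.format(**{name: prop["default"]}))
    return argv


schema = {
    "cpp_version": {
        "type": "string",
        "default": "C++14",
        "enum": ["C++11", "C++14", "C++17"],
        "format_string": "-std={cpp_version}",
    }
}
base = ["xcpp", "-f", "{connection_file}"]

print(build_argv(base, schema, {"cpp_version": "C++17"}))
print(build_argv(base, schema, {}))
print(build_argv(base, schema, {}, required=("cpp_version",)))
```

The point of the design is that the kernel.json argv no longer has to list every parameter up front; arguments appear only when they are actually needed.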

The xeus_log_level parameter is used in the environment variable:

env: [ "XEUS_LOGLEVEL={parameters.xeus_log_level}" ],

That's why we are trying to make it as generic as possible so we can use these parameters anywhere.

Maybe the format_string is helpful for flags or optional parameters. I don't have an example right now but imagine a kernel with an optional flag to activate or deactivate LSP (Language Server Protocol) features.

{
  "display_name": "C++",
  "argv": [
      "/home/user/micromamba/envs/kernel_spec/bin/xcpp",
      "-f",
      "{connection_file}",
      "{parameters.lsp}"
  ],
  "env": [
    "XEUS_LOGLEVEL={parameters.xeus_log_level}"
  ],
  "language": "C++",
  "metadata": {
      "parameters": {
        "lsp": {
          "type": "string",
          "default": "True",
          "format_string": "--lsp",
          "save": true
        }
      }
  }
}

To be honest, I don't have a strong opinion on this. More people could chime in.

The xeus_log_level parameter is used in the environment variable:

Ah - I'm sorry, I missed that. I figured environment variables would be classified in a different manner. In that vein, I think there should be a means of adding environment variables, free form, and those that are specified in the schema should be classified as environment variables to assist in UX. These are the kinds of things that integrations typically need and we should enable the ability to add any environment variable to the env of the kernel.

kevin-bates · 2023-02-08T16:15:57Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+  ],
+  "language": "C++"
+  "metadata": {
+    "parameters": {


Since these will include provisioner-relative parameters, I think it would be good to have separate kernel_parameters and provisioner_parameters stanzas - both for end-user applications and programmatic processing.

Cdsdashboards does provisioning atop JupyterHub (~BinderHub but with parameters)

https://github.com/ideonate/cdsdashboards/blob/3d6cb7d4f60f82a7d366a8f439a57ab6f2479070/cdsdashboards/builder/processbuilder.py#L46-L91

README:

User sees a safe user-friendly version of the original notebook - served by Voilà, Streamlit, Dash, Bokeh, Panel, R Shiny etc.
All of this works through a new Dashboards menu item added to JupyterHub's header.

jupyter-repo2docker looks for REES config files like {requirements.txt, environment.yml, Dockerfile} in / and /.binder/; but there are also command-line argument parameters:
https://repo2docker.readthedocs.io/en/latest/usage.html#command-line-api

Since these will include provisioner-relative parameters, I think it would be good to have separate kernel_parameters and provisioner_parameters stanzas - both for end-user applications and programmatic processing.

It is okay for me to distinguish between kernel and provisioner. Nevertheless, I believe we should not distinguish them because provisioner parameters will get a form for free once we implement the form for choosing kernel parameters.

For example, in JupyterLab, when a user selects the kernel, if there is a custom provisioner, they will see and be able to configure parameters (like CPU, GPU, and memory) from the same form.

Nevertheless, I believe we should not distinguish them because provisioner parameters will get a form for free once we implement the form for choosing kernel parameters.

For example, in JupyterLab, when a user selects the kernel, if there is a custom provisioner, they will see and be able to configure parameters (like CPU, GPU, and memory) from the same form.

I understand, but users should know that certain parameters are relative to the provisioned environment while others are relative to the kernel. In addition, their metadata specifications in the schemas will be separated, so their values should remain separated as well.

If they don't separate them, then users will see completely different sets of parameters for the "same kernel" depending on the provisioner and (I believe) it makes sense logically to separate the two. That said, the UX can choose how these should be organized but at least they'd have that option if the two are separated in their schemas and submission values.

I don't think that we need to give specific semantics to the new kernel parameters, since they are completely generic and can have very different semantics depending on the use case (language version, connection parameters, options, or anything else).

westurner · 2023-02-08T16:23:30Z

Specifically which things need to be parameterized at the kernel level?

"Reproducibility" is jeopardized if all parameters are not persisted for repeatability. Which additional files would then also be necessary to archive and distribute in order to reproduce the notebook output?

Variance in parameters like PYTHONHASHSEED and PYTHONOPTIMIZE should also be isolated (for Python notebooks, for example). Should this proposal also specify how users should document such non-kernel simulation parameters?

E.g., cookiecutter has a config schema with default values IIRC, but it doesn't specify e.g. jsonschema or SHACL to validate runtime parameter datatypes or constraints, and I don't think autoescape=on is on for the templates because we trust developers who understand the code to not insecurely pass unescaped parameters to templates. Here, the potential vulnerabilities of additional parameters do include OS command injection (especially if the OS commands are string-parametrized without something like sarge or subprocess.Popen(shell=False)).

Users will want to pass urlargs to the kernel from URL arguments and HTML form data, but that runs ipykernel --$thing and thing=a;cp -Rv /etc /usr/local/etc

How do I determine whether or not an experimental outcome is sensitive to these new unspecified parameters?

What additional risks to reproducible science and users are posed by adding parametrization to OS commands?

westurner · 2023-02-08T16:25:56Z

"Reproducibility" is jeopardized if all parameters are not persisted for repeatability. Which additional files would then also be necessary to archive and distribute in order to reproduce the notebook output

If defaults can change over time, the author of a reproducible ScholarlyArticle must also persist, archive, and publish the default values that were in effect at time t.

hbcarlos · 2023-02-09T11:00:50Z

Specifically which things need to be parameterized at the kernel level?

It is up to the kernel. We just offer the possibility of having parameters.

"Reproducibility" is jeopardized if all parameters are not persisted for repeatability. Which additional files would then also be necessary to archive and distribute in order to reproduce the notebook output?

That is addressed in this JEP. We talk about adding a particular attribute "save": true, to indicate whether that parameter should be saved in the notebook's metadata.
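As a small illustration of the "save" attribute, the Python sketch below picks out the values that would be persisted. The `parameters_to_save` helper, the `session_token` parameter, and the `kernel_parameters` metadata key are assumptions made for this example; the JEP text would have to define where the values are actually stored.

```python
def parameters_to_save(schema: dict, values: dict) -> dict:
    """Keep only the values whose schema entry is marked with "save": true."""
    return {
        name: value
        for name, value in values.items()
        if schema.get(name, {}).get("save") is True
    }


schema = {
    "cpp_version": {"type": "string", "default": "C++14", "save": True},
    "session_token": {"type": "string", "save": False},  # hypothetical parameter
}
chosen = {"cpp_version": "C++17", "session_token": "secret"}

# Hypothetical location in the notebook metadata.
notebook_metadata = {"kernelspec": {"name": "xcpp"}}
notebook_metadata["kernel_parameters"] = parameters_to_save(schema, chosen)
print(notebook_metadata)
```

Only `cpp_version` ends up in the notebook, so reopening it can reuse that choice while the unsaved value has to be supplied again.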

Here, the potential vulnerabilities of additional parameters do include OS command injection (especially if the OS commands are string-parametrized without something like sarge or subprocess.Popen(shell=False)).

We could do a sanity check before launching the kernel.
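One possible shape for such a sanity check is sketched below in Python. The `check_value` function and the allow-list of characters are assumptions chosen for illustration, not an agreed rule: enum parameters must match one of the listed values, and free-form strings are restricted to a conservative character set.

```python
import re

# Hypothetical allow-list: letters, digits and a few punctuation characters.
# Shell metacharacters such as ';', '|', '$', spaces and quotes are rejected.
_SAFE_VALUE = re.compile(r"[A-Za-z0-9_.+=:/-]*")


def check_value(name: str, value, prop: dict) -> None:
    """Raise ValueError if a parameter value should not reach the launch command."""
    if "enum" in prop:
        if value not in prop["enum"]:
            raise ValueError(f"{name}: {value!r} is not one of {prop['enum']}")
        return
    if not isinstance(value, str) or not _SAFE_VALUE.fullmatch(value):
        raise ValueError(f"{name}: {value!r} contains disallowed characters")


# An enum value passes.
check_value("cpp_version", "C++17", {"type": "string", "enum": ["C++14", "C++17"]})

# The injection attempt quoted earlier in this thread is rejected.
try:
    check_value("thing", "a;cp -Rv /etc /usr/local/etc", {"type": "string"})
except ValueError as err:
    print("rejected:", err)
```

Independently of such a check, passing argv as a list to the process launcher (rather than building a shell string) means that a `;` in a value is not interpreted by a shell.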

Users will want to pass urlargs to the kernel from URL arguments and HTML form data, but that runs ipykernel --$thing and thing=a;cp -Rv /etc /usr/local/etc

The user can already open a terminal and run cp -Rv /etc /usr/local/etc.

rgbkrk · 2024-06-14T14:02:59Z

Would you be interested in attending a demo on Monday?

Yes!

SylvainCorlay · 2024-06-14T22:12:54Z

Would you be interested in attending a demo on Monday?

Yes!

This will be discussed at the next SSC working call (which is a public call). I sent an invite so that you have it on your calendar.

vidartf · 2024-06-20T15:50:56Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+
+Upon starting a new kernel instance, a front-end form generated from the JSON schema is prompted to the user to fill the parameter values. Many tools are available to generate such forms, such as react-jsonschema-form.
+
+Some of the chosen parameter values can be saved in e.g. the notebook metadata so that they don't have to be specified every time one opens the notebook.


Having stored arguments in the notebook is potentially a security issue. For arbitrary input fields (e.g. any non-enum field) this could potentially be used as a way to inject strings, e.g. for shell injections. So if we want to include this feature it would need to be fleshed out in more detail to ensure it is secure (or at least can be disabled).

Note: I'm talking here about notebooks that can come from untrusted sources, vs general user controlled arguments.

vidartf · 2024-06-20T15:53:47Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+  "argv": [
+      "/home/user/micromamba/envs/kernel_spec/bin/xcpp",
+      "-f",
+      "{connection_file}",


This format for string templates needs to be fleshed out. Above it was listed as ${}, some places as {}. Also, if I need to use curly braces in my command, I need to be able to escape it.

I think we need to remove the $ from the text. It is clearly just a curly bracket.

@martinRenou mentioned that he might have an example of a case where literal curly braces might be needed, just to highlight what I mean with an escape mechanism being needed, and how it will be strictly a breaking change, but unlikely to actually affect anyone (as long as unrecognized parameter names are left as they are, and don't cause errors).

I mentioned I had an example of a kernel spec with ${} in it, but that was my memory tricking me.

Actually we should fix the mention of -std=${cpp_version} line 33 with -std={cpp_version} to stay coherent with the current {connection_file}.
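To show what an escape mechanism could look like, here is a Python sketch of a template renderer. The doubled-brace escape (`{{` and `}}`) is an assumption borrowed from Python's str.format and has not been decided in this thread; the `render` function is hypothetical. Unrecognized names are left as they are, as suggested above.

```python
import re

# Matches an escaped brace pair or a {name} placeholder.
_TOKEN = re.compile(r"\{\{|\}\}|\{([A-Za-z0-9_.]+)\}")


def render(template: str, namespace: dict) -> str:
    """Expand {name} placeholders; '{{' and '}}' produce literal braces."""
    def replace(match):
        token = match.group(0)
        if token == "{{":
            return "{"
        if token == "}}":
            return "}"
        # Unknown names are left untouched instead of raising an error.
        return str(namespace.get(match.group(1), token))

    return _TOKEN.sub(replace, template)


print(render("-std={cpp_version}", {"cpp_version": "C++17"}))
print(render("{not_a_parameter}", {}))
print(render('--config={{"a": 1}}', {}))
```

Leaving unknown names untouched keeps existing kernel specs working in most cases, while the escape rule covers commands that genuinely need literal braces.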

minrk · 2024-06-25T13:36:37Z

Thanks for recording the demo!

I think the main point that was touched on, and that we need to nail down, is the relationship between parameters stored in the notebook and trusted/untrusted notebooks.

I do think being able to generically set environment variables would be really useful, and allow skipping the schema step for a lot of things, but being able to set PYTHONPATH, PYTHONSTARTUP, LD_PRELOAD, etc. from an untrusted notebook would violate the notebook security model.

Since we currently do start kernels for untrusted notebooks but execute no code, we need to make sure we don't load any 'unsafe' parameters from an untrusted notebook. To me, the main questions there are do we:

1. start kernel as we do now, ignoring parameters from notebook (potentially confusing, potentially leads to inadvertent trust, see below)
2. instead of starting kernel straight away, prompt with parameters filled in from notebook metadata with message about 'are you sure you trust this?'
3. allow kernelspecs to mark specific parameters as 'safe' (default: False), so only 'unsafe' parameters need trust handling
4. make sure kernel parameter metadata is considered for trust - we don't want a situation where we load an untrusted notebook without untrusted parameters, run it so it becomes trusted, then reload with untrusted parameters that have become trusted without explicit trust of the parameters themselves

Currently, since code cell output is the only source of trust, a notebook ends up trusted if it has no output or only plaintext, etc. We need to make sure that parameters are not trusted and prevent a notebook from becoming trusted until they are explicitly approved by the user. I think option 2 - prompting the user and requiring that they approve the parameters before starting a parametrized kernel - is the only reasonably safe option.

I think we may need to update our signing/trust model in nbformat to consider more metadata fields as well.

minrk · 2024-06-25T17:00:35Z

I realize: the other option is to explicitly trust parameters, and tell kernels to only ever define parameters that are always safe (essentially meaning env vars can never be passed through except with special handling)

echarles · 2024-06-25T17:12:53Z

Thanks for recording the demo!

Is there a link for the demo?

Quick question: There has been discussion about having a JSON Schema. What is the status on that discussion?

AnastasiaSliusar · 2024-06-28T13:12:02Z

UPD: Hello everyone, I have implemented the following algorithm, which determines whether a kernel spec file is insecure. Please take a look at the flowcharts.

The algorithm works as a filter over kernel spec files: given a set of kernel spec files, I check each one. A kernel spec file is considered secure if it meets one of the following criteria:

it does not have metadata.parameters. In that case it keeps the current default behavior, without any parameterized-kernel functionality: the user clicks a kernel icon in the Launcher and the kernel starts. Such a kernel spec file is allowed;
none of its metadata.parameters is a free-form input (text input, textarea) in which a user could enter arbitrary information from the front-end. Such a kernel spec file is safe, and we show a dialog window to the user.

If a kernel spec file includes metadata.parameters, it is the new type of kernel spec file, dedicated to launching a kernel with a custom configuration. The new type of kernel spec file can include any structure of JSON schema inside metadata.parameters.

If a kernel spec is not secure, we check whether the --ServerApp.allowed_insecure_kernelspec_params flag was set to true when the app was started. If so, we show a dialog window. If not, we still have a set of kernel spec files, some customizable and some not, and the task is to run a secure kernel without failing. For the new type of kernel spec file, we therefore use the default value of each custom kernel parameter; otherwise the kernel would fail.
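The decision logic described above can be summarized in a short Python sketch. This is not the code from the linked implementation: the `launch_mode` function and its return values are hypothetical, and metadata.parameters is assumed to be a mapping of parameter name to its JSON Schema description, as in the examples earlier in this thread.

```python
def check_is_kernel_secure(kernelspec: dict) -> bool:
    """A spec is secure if it has no parameters or every parameter is an enum."""
    parameters = kernelspec.get("metadata", {}).get("parameters")
    if not parameters:
        return True
    return all("enum" in prop for prop in parameters.values())


def launch_mode(kernelspec: dict, allow_insecure: bool = False) -> str:
    """Return 'launch', 'dialog' or 'defaults' for a given kernel spec."""
    parameters = kernelspec.get("metadata", {}).get("parameters")
    if not parameters:
        return "launch"  # legacy spec: start the kernel as today
    if check_is_kernel_secure(kernelspec) or allow_insecure:
        return "dialog"  # let the user choose parameter values
    return "defaults"  # free-form inputs not allowed: use default values


enum_spec = {"metadata": {"parameters": {
    "cpp_version": {"type": "string", "default": "C++14", "enum": ["C++14", "C++17"]}}}}
free_form_spec = {"metadata": {"parameters": {
    "extra_flags": {"type": "string", "default": ""}}}}

print(launch_mode({"argv": ["xcpp"]}))
print(launch_mode(enum_spec))
print(launch_mode(free_form_spec))
print(launch_mode(free_form_spec, allow_insecure=True))
```

Here `allow_insecure` stands in for the --ServerApp.allowed_insecure_kernelspec_params flag mentioned above.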

@minrk In my opinion, this algorithm can answer some of your questions. Please take a look; I am curious what you think about it. Thank you :)

CC: @SylvainCorlay , @JohanMabille

AnastasiaSliusar · 2024-06-28T13:17:18Z

The new type of kernel spec file can include any structure of JSON schema inside metadata.parameters.

Dear @echarles, if I understand you correctly, your question is about the JSON Schema inside a kernel spec file. According to the JEP description, there will be a new type of kernel spec file that includes a parameters property inside the "metadata" object of kernel.json, and the JSON Schema should be defined inside metadata.parameters by the kernel author.

gabalafou · 2024-07-08T15:30:38Z

In the SSC call today, one thing that came up was whether it might be a good idea to break this into two separate JEPs:

one that handles the kernel parameterization
another that handles storing those parameters in the notebook (and associated trust model changes)

Just throwing the idea out there.

AnastasiaSliusar · 2024-07-10T15:13:23Z

In the SSC call today, one thing that came up was whether it might be a good idea to break this into two separate JEPs:

one that handles the kernel parameterization

another that handles storing those parameters in the notebook (and associated trust model changes)

Just throwing the idea out there.

Thank you @gabalafou. I agree. I updated the PRs that implement this JEP, including the algorithm from #87 (comment). Please take a look:
jupyterlab/jupyterlab#16487
jupyter-server/jupyter_server#1431
jupyter/jupyter_client#1028

Thank you

AnastasiaSliusar · 2024-07-11T11:28:36Z

As for the check_is_kernel_secure block of this flowchart, it verifies whether a kernel spec file is secure by checking whether the JSON Schema description of each parameter includes an enum. The flowchart is as follows: [flowchart image]

vidartf · 2024-07-15T17:45:38Z

I think a workable plan forward here could be:

Have the default behavior be that the server only accepts safe arguments, unless the server was explicitly started with a flag allow_unsafe_kernelspec_args (or something along that name). This should also handle the scenario where the server admin doesn't trust the user (certain arguments might be used by the kernel provisioner and therefore used in a different security context than that of the kernel process itself).
If the server is set to accept unsafe arguments, we could as an additional step recommend that the trust model is expanded to allow more user safety and convenience, but that initially this model would put the full onus on the user (i.e. with a heavy warning). OR we could say that this JEP must include such a change to the trust model in order for users to not have to trade security for functionality. @minrk : thoughts? NOTE: If we do not plan to store any arguments in the notebook document itself, this trust model consideration goes away.
Clients would need to know whether the server accept unsafe arguments or not, in order to present a suitable UI. This would need to be added somehow to an API.
I think we should more carefully think through what @kevin-bates was trying to say here regarding interactions with the kernel provisioners. E.g. here: https://github.com/jupyter/enhancement-proposals/pull/46/files#diff-e6ee537c33eaf4125571232d4379f39c23b40552b477d8ddcb3e6dcee48f2f59R11-R52 the arguments for the provisioner and the kernel are separated into two schemas. That would allow you to say for a specific deployment e.g. that:
- You can run 3 types of kernel here (e.g. C++, Python, R). Each of those kernels have their own parameters (compiler version, env vars, etc, etc).
- You can run these kernels either locally, or on cloud provider A or B. For the cloud providers, you can request specific resources (GPU, amount of memory, etc. etc).
The plan there was to have kernel provisioners be able to populate this schema. This is not a part of this JEP, and it doesn't have to be, but it might be prudent to at least namespace the arguments that are specifically for the kernel process? That at least leaves the door open for doing a separate JEP adding such provisioner parameters in a backwards compatible way. If not, I think people will just bolt on provisioner arguments through this mechanism, which might end up being pretty ugly..
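To make the namespacing idea concrete, here is one hypothetical shape it could take. The stanza names `kernel_parameters` and `provisioner_parameters` echo the names suggested earlier in this thread; the schema contents (`cpp_version`, `gpus`) and the helper `select_values` are invented purely for illustration and are not part of the JEP.

```python
from typing import Any, Dict

KERNEL_NS = "kernel_parameters"            # name suggested earlier in the thread
PROVISIONER_NS = "provisioner_parameters"  # name suggested earlier in the thread

# Hypothetical kernel spec metadata with one schema per namespace.
EXAMPLE_METADATA: Dict[str, Any] = {
    KERNEL_NS: {
        "type": "object",
        "properties": {
            "cpp_version": {"type": "string", "enum": ["C++11", "C++14", "C++17"]},
        },
    },
    PROVISIONER_NS: {
        "type": "object",
        "properties": {
            "gpus": {"type": "integer", "enum": [0, 1, 2]},
        },
    },
}


def select_values(
    metadata: Dict[str, Any], namespace: str, values: Dict[str, Any]
) -> Dict[str, Any]:
    """Keep only the values whose names are declared in the given namespace."""
    declared = metadata.get(namespace, {}).get("properties", {})
    return {name: value for name, value in values.items() if name in declared}


chosen = {"cpp_version": "C++17", "gpus": 1}
kernel_values = select_values(EXAMPLE_METADATA, KERNEL_NS, chosen)
provisioner_values = select_values(EXAMPLE_METADATA, PROVISIONER_NS, chosen)
```

Keeping the two schemas apart means the kernel process only ever sees `kernel_values`, while a provisioner could consume `provisioner_values` in its own security context.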

SylvainCorlay · 2024-08-27T20:51:45Z

I don't think we need to namespace parameters of the kernelspec with the "kernel" prefix, as it would probably be redundant. If a special category of parameters for provisioners is added, they could probably be namespaced.

vidartf · 2024-09-23T15:37:41Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+  ],
+  env: [
+    "XEUS_LOGLEVEL=ERROR"
+  ],


I think these should be dicts?

SylvainCorlay · 2024-09-23T15:52:28Z

jupyter-parameterized-kernel-spec/jupyter-parameterized-kernel-specs.md

+
+## Proposed Enhancement
+
+The solution we are proposing consists of adding parameters to the kernel specs file in the form of a JSON Schema that would be added to the specs metadata. These parameters are then used to populate the `argv` and `env` lists (respectively the command-line arguments and environment variables).


(Pointed out by @vidartf):

The phrase "the argv and env lists" is incorrect because env is a dict. We should fix that sentence.
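The distinction raised here (argv is a list, env is a dict) can be illustrated with a small sketch of how chosen parameter values might be substituted into both. The `{name}` placeholder style and the helper `render_launch_spec` are assumptions made for this example; placeholders that are not parameters, such as `{connection_file}`, are deliberately left untouched.

```python
from typing import Any, Dict, List, Tuple


class _KeepUnknown(dict):
    """Mapping that leaves placeholders without a value unchanged."""

    def __missing__(self, key: str) -> str:
        return "{" + key + "}"


def render_launch_spec(
    argv: List[str], env: Dict[str, str], parameters: Dict[str, Any]
) -> Tuple[List[str], Dict[str, str]]:
    """Substitute parameter values into argv (a list) and env (a dict)."""
    values = _KeepUnknown(parameters)
    rendered_argv = [arg.format_map(values) for arg in argv]
    rendered_env = {key: value.format_map(values) for key, value in env.items()}
    return rendered_argv, rendered_env


argv, env = render_launch_spec(
    ["xcpp", "-f", "{connection_file}", "-std={cpp_version}"],
    {"XEUS_LOGLEVEL": "{log_level}"},
    {"cpp_version": "c++17", "log_level": "ERROR"},
)
```

Here `env` keeps its mapping shape (`{"XEUS_LOGLEVEL": "ERROR"}`) instead of being a list of `NAME=value` strings.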

Parameterized kernel specs proposal

6322120

Update example

37756a1

hbcarlos force-pushed the kernel_specs branch from 2834bb9 to 37756a1 Compare February 15, 2022 17:29

kevin-bates reviewed Feb 23, 2022

Zsailer mentioned this pull request Feb 23, 2022

Meeting Notes 2022 jupyter-server/team-compass#15

Closed

kevin-bates reviewed Feb 8, 2023

Zsailer mentioned this pull request Feb 16, 2023

Meeting Notes 2023 jupyter-server/team-compass#45

Closed

JohanMabille mentioned this pull request Feb 23, 2023

Kernel protocol roadmap jupyter-server/team-compass#50

Open

JohanMabille added the under discussion (RFC) label Apr 2, 2023

gabalafou mentioned this pull request May 16, 2024

SSC meeting minutes 2024 jupyter/software-steering-council-team-compass#22

Open

This was referenced Jun 4, 2024

Parameterized kernel specs AnastasiaSliusar/jupyterlab#1

Open

Parametrizing kernels AnastasiaSliusar/jupyter_client#1

Open

Add support of 'custom_kernel_specs' parameters AnastasiaSliusar/jupyter_server#1

Open

AnastasiaSliusar mentioned this pull request Jun 14, 2024

Parameterized kernel specs jupyterlab/jupyterlab#16487

Draft

3coins mentioned this pull request Jun 20, 2024

Meeting Notes 2024 jupyter-server/team-compass#57

Closed

vidartf reviewed Jun 20, 2024

krassowski mentioned this pull request Jun 26, 2024

Issue patching SessionConnection class in extension after moving to pip based install jupyterlab/jupyterlab#16513

Closed

Update JEP secription

39d1502

AnastasiaSliusar mentioned this pull request Aug 1, 2024

Add support of using custom env variables jupyter-server/jupyter_server#1448

Open

AnastasiaSliusar mentioned this pull request Aug 12, 2024

Add allow_insecure_kernelspec_params to page config jupyterlab/jupyterlab_server#460

Draft

vidartf reviewed Sep 23, 2024

SylvainCorlay reviewed Sep 23, 2024

Fix the description

f5dfa75

renan-r-santos mentioned this pull request Oct 16, 2024

Detect multiple environments renan-r-santos/pixi-kernel#20

Closed

minrk mentioned this pull request Nov 8, 2024

Use zmq-anyio ipython/ipykernel#1291

Open

kevin-bates mentioned this pull request Feb 18, 2025

enhancement to help kernel provisioners introspect notebooks for their dependencies jupyter/jupyter_client#1055

Open

	"save": true
	"save": true,
	"format_string": "-std={cpp_version}"


		Upon starting a new kernel instance, a front-end form generated from the JSON schema is prompted to the user to fill the parameter values. Many tools are available to generate such forms, such as react-jsonschema-form.

		Some of the chosen parameter values can be saved in e.g. the notebook metadata so that they don't have to be specified every time one opens the notebook.


