diff --git a/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/README.md b/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/README.md new file mode 100644 index 00000000000..4a5f387d9a9 --- /dev/null +++ b/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/README.md @@ -0,0 +1,1382 @@ + +# KEP-4595: CEL for CRD AdditionalPrinterColumns + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [Example](#example) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Story 3](#story-3) + - [Story 4](#story-4) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) + - [Complex CEL expressions may impact compilation performance](#complex-cel-expressions-may-impact-compilation-performance) + - [Runtime evaluation errors despite successful compilation](#runtime-evaluation-errors-despite-successful-compilation) +- [Design Details](#design-details) + - [API Changes](#api-changes) + - [Proposed flow of CEL additionalPrinterColumns](#proposed-flow-of-cel-additionalprintercolumns) + - [CEL Compilation](#cel-compilation) + - [Validation](#validation) + - [Implementation](#implementation) + - [CEL vs JSONPath Performance Analysis](#cel-vs-jsonpath-performance-analysis) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Alpha](#alpha) + - [Beta](#beta) + - [GA](#ga) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature 
Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. + +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release 
notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + + + +This enhancement proposes letting users define human-readable printer columns for CustomResourceDefinitions using CEL expressions, as an alternative to JSONPath. + +## Motivation + + + +Currently, when creating a CustomResourceDefinition, you can define a list of `additionalPrinterColumns` that are displayed when querying the custom resources with kubectl. These `additionalPrinterColumns` are defined using JSONPath. If your CustomResourceDefinition is defined as follows, running `kubectl get mycrd myresource` yields the following response. + +```yaml +additionalPrinterColumns: +- name: Desired + type: integer + jsonPath: .spec.replicas +- name: Current + type: integer + jsonPath: .status.replicas +- name: Age + type: date + jsonPath: .metadata.creationTimestamp +``` + +``` +NAME DESIRED CURRENT AGE +myresource 1 1 7s +``` + +This approach has a few limitations: it cannot fully support arrays, cannot process conditionals, cannot compute a column value from multiple fields, and makes it difficult to format dates as a duration relative to another timestamp. + +With the advent of CEL, we can provide an alternative input for `additionalPrinterColumns` that expresses the value in CEL for more complicated table output. This would be added alongside the existing JSONPath support, so users can define each of the `additionalPrinterColumns` for their CRDs using either JSONPath or a CEL expression. + +### Goals + + + +- Enable support for defining `additionalPrinterColumns` using CEL expressions in Custom Resource Definitions (CRD). +- Ensure each column uses only one method: either a CEL expression or JSONPath, not both. 
+- Allow CRDs to define a mix of columns, with some using CEL and others using JSONPath. + +### Non-Goals + + + +- Modifying, replacing, or phasing out JSONPath-based column definitions. +- Expanding CEL’s access scope beyond the current design constraints (e.g., no access to arbitrary `metadata.*` fields beyond `name` and `generateName`). + + Refer to the [caveats](#notesconstraintscaveats-optional) section for context. +- Requiring changes to `kubectl` or other clients. + + +## Proposal + + + +This KEP proposes a new, mutually exclusive sibling field to `additionalPrinterColumns[].jsonPath` called `additionalPrinterColumns[].expression`. This field allows defining printer column values using CEL (Common Expression Language) expressions that evaluate to strings. + +To support this, the [`CustomResourceColumnDefinition`](https://github.com/kubernetes/kubernetes/blob/3044a4ce87abae50d8bf9ef77554fa16f2be2f12/staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/types.go#L237-L257) struct will be extended to accept CEL expressions for printer columns, and the API server will evaluate these expressions dynamically when responding to `Table` requests (e.g., `kubectl get`), producing richer, computed, or combined column outputs. + +### Example + +Given this CRD snippet: + +```yaml +additionalPrinterColumns: +- name: Replicas + type: string + expression: '"%d/%d".format([self.status.replicas, self.spec.replicas])' +- name: Age + type: date + jsonPath: .metadata.creationTimestamp +- name: Status + type: string + expression: 'self.status.replicas == self.spec.replicas ? "READY" : "WAITING"' +``` + +The `kubectl get` output might look like: + +``` +NAME REPLICAS AGE STATUS +myresource 1/1 7s READY +myresource2 0/1 2s WAITING +``` + +This enhancement enables flexible, human-friendly column formatting and logic in `kubectl get` outputs without requiring external tooling or complex `JSONPath` workarounds. 
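To make the two `expression` columns in the example concrete, here is a rough plain-Go sketch of the values they are meant to produce. It deliberately does not use the CEL library or the API server code path; the `resource` type and the `replicasColumn` and `statusColumn` helpers are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// resource is a hypothetical, flattened stand-in for a custom resource.
// It is an assumption for this sketch, not a type from the Kubernetes code base.
type resource struct {
	Name           string
	SpecReplicas   int64
	StatusReplicas int64
}

// replicasColumn mirrors the CEL expression
// "%d/%d".format([self.status.replicas, self.spec.replicas]).
func replicasColumn(r resource) string {
	return fmt.Sprintf("%d/%d", r.StatusReplicas, r.SpecReplicas)
}

// statusColumn mirrors the CEL expression
// self.status.replicas == self.spec.replicas ? "READY" : "WAITING".
func statusColumn(r resource) string {
	if r.StatusReplicas == r.SpecReplicas {
		return "READY"
	}
	return "WAITING"
}

func main() {
	rows := []resource{
		{Name: "myresource", SpecReplicas: 1, StatusReplicas: 1},
		{Name: "myresource2", SpecReplicas: 1, StatusReplicas: 0},
	}
	// Print one table row per resource: name, Replicas column, Status column.
	for _, r := range rows {
		fmt.Printf("%-12s %-8s %s\n", r.Name, replicasColumn(r), statusColumn(r))
	}
}
```

In the proposal itself this logic lives in the CEL expression stored in the CRD, so no Go code is needed from the CRD author; the sketch only shows what each column is expected to evaluate to.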
+ +### User Stories (Optional) + + + +#### Story 1 + +As a Kubernetes user, I want to define `additionalPrinterColumns` that correctly aggregate and display all nested arrays within my CRD, so that `kubectl get` outputs the full list of hosts instead of only showing the first array. Current JSONPath-based columns only print the first matching array, resulting in incomplete data. + +Using CEL expressions for `additionalPrinterColumns` allows combining all nested arrays into a single flattened list, providing complete and accurate output in `kubectl get`. + +```yaml +additionalPrinterColumns: +- name: hosts + jsonPath: .spec.servers[*].hosts + type: string +- name: hosts + type: string + description: "All hosts from all servers" + expression: "self.spec.servers.map(s, s.hosts)" +``` + +Output: +``` +NAME HOSTS HOSTS CEL +foo0 ["foo.example.com","bar.example.com"] [[foo.example.com, bar.example.com], [baz.example.com]] +``` + +In the above example: + +* `spec.servers` is mapped to extract each `hosts` array. +* The resulting list of all hosts is displayed in the column output. + +Once we support the CEL `flatten()` macro in the Kubernetes CEL environment, we can get the exact output with `(self.spec.servers.map(s, s.hosts)).flatten()`. + +**References:** + +* https://github.com/kubernetes/kubectl/issues/517 +* https://github.com/kubernetes/kubernetes/pull/67079 +* https://github.com/kubernetes/kubernetes/pull/101205 +* https://groups.google.com/g/kubernetes-sig-api-machinery/c/GxXWe6T8DoM + +#### Story 2 + +As a Kubernetes user, I want to display the status of a specific condition (e.g., the "Ready" condition) from a list of status conditions in a human-readable column when using `kubectl get`. Currently, `jsonPath` based additionalPrinterColumns cannot directly extract and display a single condition's status from an array of conditions, which limits usability and clarity. 
+ +With CEL based additionalPrinterColumns, I can define a column using an expression that filters and selects the relevant condition, making the output more meaningful. + +**Example:** + +Using the following CRD snippet, I define a `READY` column that uses a CEL expression to extract the status of the "Ready" condition: + +```yaml +apiVersion: apiextensions.k8s.io/v1 +kind: CustomResourceDefinition +... +spec: + ... + versions: + ... + schema: + openAPIV3Schema: + type: object + properties: + status: + type: object + properties: + conditions: + type: array + items: + type: object + properties: + type: + type: string + status: + type: string + ... + additionalPrinterColumns: + - name: READY + type: string + description: 'Status of the Ready condition' + expression: 'self.status.conditions.exists(c, c.type == "Ready") ? self.status.conditions.filter(c, c.type == "Ready")[0].status : "Unknown"' +``` + +Output: +``` +NAME READY +example-resource True +example-resource2 Unknown +``` + +This expression checks if a condition with `type == "Ready"` exists. If so, it returns its status; otherwise, it returns `"Unknown"`. This approach enables clear, user-friendly status reporting for conditions stored as arrays in the CRD. + +**References:** + +* https://github.com/kubernetes/kubernetes/issues/67268 + + +#### Story 3 + +As a Kubernetes user, I want to define an additional printer column that combines multiple fields from a sub-resource into a single human-readable string. The additionalPrinterColumns defined using `jsonPath` can’t concatenate fields, so the output is either limited or unclear. + +With CEL expressions in additionalPrinterColumns, it is possible to format and combine multiple fields cleanly for better readability. 
+ +For example, in a CRD with `.spec.sub.foo` and `.spec.sub.bar`, this column, defined using a CEL expression, combines the two fields with a slash: + +```yaml +additionalPrinterColumns: +- name: "Combined" + type: string + description: "Combined Foo and Bar values" + expression: '"%s/%s".format([self.spec.sub.foo, self.spec.sub.bar])' +``` + +Output: +``` +NAME COMBINED AGE +myresource foo/bar 7s +``` + +This shows output like `foo/bar` in `kubectl get` columns, improving clarity. + +**References:** + +* https://github.com/operator-framework/operator-sdk/issues/3872 + +#### Story 4 + +As a Kubernetes user, I want to format dates as relative durations (e.g., "5m ago" instead of absolute timestamps) in printer columns, making it easier to understand resource age or timing at a glance. + +**Example:** + +```yaml +additionalPrinterColumns: + - name: Duration + type: string + description: Duration between start and completion + expression: 'timestamp(self.status.completionTimestamp) - timestamp(self.status.startTimestamp)' +``` + +Output: +``` +NAME DURATION +sample-job 24h7m10s +``` + +This would allow `kubectl get` to display the elapsed time between start and completion timestamps as a formatted duration. + +**Reference:** + +- https://stackoverflow.com/questions/70557581/kubernetes-crd-show-durations-in-additionalprintercolumns + + +### Notes/Constraints/Caveats (Optional) + + + +As of this writing, when defining `additionalPrinterColumns` using **CEL expressions**, access to fields under `metadata` is **limited**. +Only `metadata.name` and `metadata.generateName` are accessible, as per the current [design decision](https://github.com/kubernetes/kubernetes/blob/55f2bc10435160619d1ece8de49a1c0f8fcdf276/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/model/schemas.go#L39-L73). 
+ +This makes CEL-based columns less flexible than those defined using JSONPath, because columns defined using JSONPath can access additional `metadata` fields such as `creationTimestamp`, `labels`, and `ownerReferences`. + +For example, the following `jsonPath`-based columns defined in the [Cluster API project](https://github.com/kubernetes-sigs/cluster-api/blob/ef10e5aea3d3c9525dd83fa8a15005fc0b97d1b9/test/infrastructure/docker/config/crd/bases/infrastructure.cluster.x-k8s.io_devmachines.yaml#L19-L39) are valid: + +```yaml +additionalPrinterColumns: +- name: Age + type: date + description: Time since creation + jsonPath: .metadata.creationTimestamp +- name: Cluster + type: string + description: Associated Cluster + jsonPath: .metadata.labels['cluster\.x-k8s\.io/cluster-name'] +- name: Machine + type: string + description: Owning Machine + jsonPath: .metadata.ownerReferences[?(@.kind=="Machine")].name +``` + +However, defining the same columns using CEL expressions fails, because any field under `metadata` (except `metadata.name` and `metadata.generateName`) is dropped during the [conversion of the CRD structural schema to a CEL declaration](https://github.com/kubernetes/kubernetes/blob/71f0fc6e72d53d5caf50b1314ca4d754463117f0/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/model/schemas.go#L26-L75): + +```yaml +additionalPrinterColumns: +- name: Age + type: date + description: Time since creation + expression: self.metadata.creationTimestamp +``` + +Error: + +``` +The CustomResourceDefinition "jobs.example.com" is invalid: spec.additionalPrinterColumns[1]: Internal error: CEL compilation failed for self.metadata.creationTimestamp rules: compilation failed: ERROR: :1:14: undefined field 'creationTimestamp' + | self.metadata.creationTimestamp + | .............^ +``` + +A similar discussion is ongoing in [https://github.com/kubernetes/kubernetes/issues/122163](https://github.com/kubernetes/kubernetes/issues/122163). + 
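The `metadata` restriction described in this section can be pictured as a simple allow-list filter. The sketch below is only an illustration under that assumption: `allowedMetadataFields` and `visibleMetadataFields` are hypothetical names, and the real behaviour comes from the schema-to-CEL-declaration conversion linked above, not from code shaped like this.

```go
package main

import (
	"fmt"
	"sort"
)

// allowedMetadataFields lists the only metadata fields that, per the caveat
// in this section, stay visible to CEL expressions. Hypothetical name.
var allowedMetadataFields = map[string]bool{
	"name":         true,
	"generateName": true,
}

// visibleMetadataFields is a hypothetical helper that returns, in sorted
// order, the subset of the given metadata fields a CEL expression could use.
func visibleMetadataFields(fields []string) []string {
	visible := []string{}
	for _, f := range fields {
		if allowedMetadataFields[f] {
			visible = append(visible, f)
		}
	}
	sort.Strings(visible)
	return visible
}

func main() {
	fields := []string{"name", "generateName", "creationTimestamp", "labels", "ownerReferences"}
	// creationTimestamp, labels and ownerReferences are filtered out, which is
	// why self.metadata.creationTimestamp fails to compile in the example above.
	fmt.Println(visibleMetadataFields(fields))
}
```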
+### Risks and Mitigations + + + +#### Complex CEL expressions may impact compilation performance + +With CEL-based `additionalPrinterColumns`, users may write highly complex expressions to fulfill specific use cases. These expressions can lead to longer compilation times or excessive compute cost during CRD creation. + +**Mitigation:** + +A finite CEL cost model is enforced, as is standard with other CEL-enabled features in Kubernetes. This model limits the computational cost during expression compilation. If a CEL expression exceeds the allowed cost, the compilation will time out and fail gracefully. + +For expressions that are within the cost limits but still slow due to complexity, the responsibility lies with the CRD author to keep or drop them. + + +#### Runtime evaluation errors despite successful compilation + +CEL expressions are compiled during CRD creation but evaluated later during API usage, such as `kubectl get`. As a result, runtime data inconsistencies can cause evaluation errors even if compilation was successful. + +For example, if a CEL expression references fields not present in a given Custom Resource instance (due to missing data, schema changes, or optional fields), the evaluation may fail. + +**Mitigation:** + +This behavior is aligned with how `jsonPath`-based `additionalPrinterColumns` currently function. If a `jsonPath` evaluation fails, an empty value is printed in the column. + +The same strategy will be applied for CEL: evaluation failures will result in an empty column, and the underlying error will be logged. This ensures user experience remains consistent and resilient to partial data issues. 
+ +Example: + +```yaml +openAPIV3Schema: + status: + type: object + properties: + startTimestamp: + type: string + format: date-time # Incorrect format + completionTimestamp: + type: string + format: date-time # Incorrect format + duration: + type: string +additionalPrinterColumns: +- name: Duration + type: string + description: Duration between start and completion + expression: 'timestamp(self.status.completionTimestamp) - timestamp(self.status.startTimestamp)' +``` + +In the above example, the format of the fields is incorrect, but the CEL expression is valid. This results in the CEL program returning an error during evaluation at runtime. This happens because the format we've defined, `date-time`, is incorrect. The correct format defined in [supportedFormats](https://github.com/kubernetes/kubernetes/blob/4468565250c940bbf70c2bad07f2aad387454be1/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/validation/formats.go#L26-L51) is `datetime`. The above example would give us the following error: + +``` +NAME DURATION +sample-job no such overload: timestamp(string) +``` + + +## Design Details + + + +Today, CRD additionalPrinterColumns support only JSONPath. This is implemented by the [TableConvertor](https://github.com/kubernetes/kubernetes/blob/8a0f6370e4b53b648050c534f0ee11b776f900a6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go), which converts objects to `metav1.Table`. When a CRD is created, a new TableConvertor object is created along with it. The TableConvertor is what processes the output for additionalPrinterColumns when we query for custom resources.
The JSONPath is validated during the [CRD validation](https://github.com/kubernetes/kubernetes/blob/dae746b59d390c304cc2019d8840f99872a5723a/staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/validation/validation.go#L807-L811) and is [parsed](https://github.com/kubernetes/kubernetes/blob/8a0f6370e4b53b648050c534f0ee11b776f900a6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L50-L53) when the TableConvertor is created. We propose extending the CRD API as well as the `TableConvertor` logic to handle CEL expressions alongside the existing JSONPath logic without changing any of the current behaviour. + +### API Changes + +We extend the `CustomResourceColumnDefinition` type by adding an `Expression` field which takes CEL expressions as a string. + +```diff +type CustomResourceColumnDefinition struct { + ... + JSONPath string + ++ Expression string +} +``` + +```diff +type CustomResourceColumnDefinition struct { + // ... 
+ JSONPath string `json:"jsonPath,omitempty" protobuf:"bytes,6,opt,name=jsonPath"` + ++ Expression string `json:"expression,omitempty" protobuf:"bytes,7,opt,name=expression"` +} +``` + +### Proposed flow of CEL additionalPrinterColumns + +This CEL expression would then be compiled twice: +- During the [CRD validation](https://github.com/kubernetes/kubernetes/blob/b35c5c0a301d326fdfa353943fca077778544ac6/staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/validation/validation.go#L789-L790) and, +- Then again during the [TableConvertor creation](https://github.com/kubernetes/kubernetes/blob/b35c5c0a301d326fdfa353943fca077778544ac6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L39-L41) + +The compiled CEL program would then be later evaluated at runtime when [printing columns during resource listing](https://github.com/kubernetes/kubernetes/blob/b35c5c0a301d326fdfa353943fca077778544ac6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L115-L135). + +### CEL Compilation + +To handle the CEL compilation, we add a new `CompileColumn()` function to the `apiextensions-apiserver/pkg/apiserver/schema/cel` package which would be called during both CRD Validation and from inside the `TableConvertor.New()` function. + +```go +func CompileColumn(expr string, s *schema.Structural, declType *apiservercel.DeclType, perCallLimit uint64, baseEnvSet *environment.EnvSet, envLoader EnvLoader) ColumnCompilationResult { + ... +} +``` + +### Validation + +We expect the additionalPrinterColumns of a CustomResourceDefinition to either have a `jsonPath` or an `expression` field. Currently additionalPrinterColumns are validated from the `ValidateCustomResourceColumnDefinition` function. Once we add the new expression field, we compile the CEL expression here using the `cel.CompileColumn()` function. If the CEL compilation fails at validation, the CRD is not applied. 
+ +```diff +func ValidateCustomResourceColumnDefinition(col *apiextensions.CustomResourceColumnDefinition, fldPath *field.Path) field.ErrorList { + // ... + if len(col.JSONPath) == 0 && len(col.Expression) == 0 { + allErrs = append(allErrs, field.Required(fldPath.Child("JSONPath or expression"), "either JSONPath or CEL expression must be specified")) + } + + if len(col.JSONPath) != 0 { + if errs := validateSimpleJSONPath(col.JSONPath, fldPath.Child("jsonPath")); len(errs) > 0 { + allErrs = append(allErrs, errs...) + } + } + ++ if len(col.Expression) != 0 { ++ // Handle CEL context creation and error handling ++ var celContext *CELSchemaContext ++ celContext = PrinterColumnCELContext(schema) ++ ... + + // CEL compilation during the validation stage + compilationResult := cel.CompileColumn(col.Expression, structuralSchema, model.SchemaDeclType(s, true), celconfig.PerCallLimit, environment.MustBaseEnvSet(environment.DefaultCompatibilityVersion(), true), cel.StoredExpressionsEnvLoader()) + // Based on the CEL compilation result validate the additionalPrinterColumn + if compilationResult.Error != nil { + allErrs = append(allErrs, field.InternalError(fldPath, fmt.Errorf("CEL compilation failed for %s rules: %s", col.Expression, compilationResult.Error))) + } + + ... + } + + return allErrs +} +``` + +### Implementation + +Inside [tableconvertor.go](https://github.com/kubernetes/kubernetes/blob/8a0f6370e4b53b648050c534f0ee11b776f900a6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go): + +- We have the [TableConvertor.New()](https://github.com/kubernetes/kubernetes/blob/dae746b59d390c304cc2019d8840f99872a5723a/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L41) function which creates the TableConvertor object for a CRD.
This is done from the [crdHandler](https://github.com/kubernetes/kubernetes/blob/dae746b59d390c304cc2019d8840f99872a5723a/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go#L810) when the CRD is created or updated. + +- Each column under additionalPrinterColumns is defined in the TableConvertor object with a [columnPrinter interface](https://github.com/kubernetes/kubernetes/blob/dae746b59d390c304cc2019d8840f99872a5723a/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L74-L77). This interface has two methods, `FindResults()` and `PrintResults()`, which would be used by the TableConvertor object to compute and print the additionalPrinterColumns' values when we do a GET operation on the CRD. + +Today for JSONPath additionalPrinterColumns, we parse the JSONPath expression inside the `TableConvertor.New()` function [here](https://github.com/kubernetes/kubernetes/blob/dae746b59d390c304cc2019d8840f99872a5723a/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L49-L69) like so: + +```go +func New(crdColumns []apiextensionsv1.CustomResourceColumnDefinition) (rest.TableConvertor, error) { + ... + path := jsonpath.New(col.Name) + if err := path.Parse(fmt.Sprintf("{%s}", col.JSONPath)); err != nil { + return c, fmt.Errorf("unrecognized column definition %q", col.JSONPath) + } + path.AllowMissingKeys(true) + c.additionalColumns = append(c.additionalColumns, path) +} +``` + +We then call this function in `TableConvertor.New()` to allow handling additionalPrinterColumns defined using CEL expressions: + +```diff ++ func New(crdColumns []apiextensionsv1.CustomResourceColumnDefinition, s *schema.Structural) (rest.TableConvertor, error) { + ... 
++ if len(col.JSONPath) > 0 && len(col.Expression) == 0 { + // existing jsonPath logic ++ } else if len(col.Expression) > 0 && len(col.JSONPath) == 0 { ++ compResult := CompileColumn(col.Expression, s, model.SchemaDeclType(s, true), celconfig.PerCallLimit, environment.MustBaseEnvSet(environment.DefaultCompatibilityVersion(), true), cel.StoredExpressionsEnvLoader()) + ++ if compResult.Error != nil { ++ return c, fmt.Errorf("CEL compilation error %q", compResult.Error) ++ } ++ c.additionalColumns = append(c.additionalColumns, compResult) ++ } +} +``` + +To make all this work, we also introduce the following: + +- A new struct `ColumnCompilationResult`: + +```go +type ColumnCompilationResult struct { + Error error + MaxCost uint64 + MaxCardinality uint64 + FieldPath *field.Path + Program cel.Program +} +``` + +- This struct implements the [columnPrinter](https://github.com/kubernetes/kubernetes/blob/8a0f6370e4b53b648050c534f0ee11b776f900a6/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor.go#L74-L77) interface: + +```go +func (c ColumnCompilationResult) FindResults(data interface{}) ([][]reflect.Value, error) { + ... +} + +func (c ColumnCompilationResult) PrintResults(w io.Writer, results []reflect.Value) error { + ... +} +``` + +- The output of `cel.CompileColumn()` returns a `ColumnCompilationResult` object for each additionalPrinterColumn. + +With all of this we can pass the CEL program to the TableConvertor's `ConvertToTable()` method, which will call `FindResults` and `PrintResults` for all additionalPrinterColumns, regardless of whether they're defined with JSONPath or CEL expressions. + + +### CEL vs JSONPath Performance Analysis + +A big part of the discussions for our proposal was the CEL cost limits since this is the first time CEL is added to the read path. As part of this we've done benchmarking of the time it takes to parse and compile equivalent JSONPath and CEL expressions. 
+ +> **Note**: The following benchmark analysis statistics are only indicative of the performance. The actual numbers may vary across different runs of the same test. + +Refer: +- [Source code for the POC](https://github.com/sreeram-venkitesh/kubernetes/commits/kep-4595-poc/?since=2025-07-20&until=2025-07-22&author=sreeram-venkitesh) +- Scenario 1: Benchmarking overall performance (compilation + evaluation + cost estimation bits et.al) +
+
+  Run on Apple M3 Pro with 12 cores, 18 GB RAM, arm64
+
+  Find the raw output of the benchmark tests, as well as the source code: https://gist.github.com/sreeram-venkitesh/f4aff1ae7957a5a3b9c6c53e869b7403
+
+  The following table provides an average performance analysis across CEL and JSONPath based additionalPrinterColumns:
+
+  | | CEL ([BenchmarkNew_CEL](https://gist.github.com/sreeram-venkitesh/f4aff1ae7957a5a3b9c6c53e869b7403#file-tableconvertor_test-go-L36-L75)) | JSONPath ([BenchmarkNew_JSONPath](https://gist.github.com/sreeram-venkitesh/f4aff1ae7957a5a3b9c6c53e869b7403#file-tableconvertor_test-go-L77-L116)) |
+  |---|---|---|
+  | **Column Definition** | `self.spec.servers.map(s, s.hosts.filter(h, h == "prod.example.com"))` | `.spec.servers[*].hosts[?(@ == "prod.example.com")]` |
+  | **Overall Performance**<br>(Compilation + Evaluation) | • Average iterations: 3,111<br>• Average time per operation: **382,914 ns/op** (~383 µs per op)<br>• Standard deviation: ±42,087 ns (±11%) | • Average iterations: 70,542<br>• Average time per operation: **17,654 ns/op** (~17.7 µs per op)<br>• Standard deviation: ±2,846 ns (±16%) |
+  | **Compilation Performance** | • Cold Start: 2.340 ms<br>• Warmed: 300–400 µs<br>• Most Expensive / Consistent Phases:<br>◦ Env & Cost Estimator: 160–220 µs avg<br>◦ CEL Compilation: 60–120 µs avg<br>◦ Program Generation: 50–80 µs avg<br>• 83% improvement (2.34 ms → ~400 µs) | • Cold Start: ~85 µs<br>• Warmed: 5–8 µs<br>• Most Expensive / Consistent Phases:<br>◦ JSONPath Parsing: 4–85 µs (occasional spikes)<br>• 90% improvement (85 µs → ~8 µs) |
+  | **Evaluation Performance** | **FindResults**<br>• Cold: 103.5 µs<br>• Warmed: 13.5 µs<br>• 81% improvement (103.5 → 13.5 µs)<br><br>**PrintResults**<br>• Cold: 3.9 µs<br>• Warmed: 1.5 µs<br>• 70% improvement (3.9 → 1.5 µs) | **FindResults**<br>• Cold: 1.4 µs<br>• Warmed: 0.85 µs<br>• 29% improvement (1.4 → 0.85 µs)<br><br>**PrintResults**<br>• Cold: 0.29 µs<br>• Warmed: 0.18 µs<br>• 58% improvement (0.29 → 0.18 µs) |
+
+- Scenario 2: Benchmarking evaluation (`findResults()`) performance. + + Based on the review comment [here](https://github.com/kubernetes/enhancements/pull/4602#discussion_r2121919813) - `Benchmark an expensive JSON Path additionalPrinterColumns operation (just the part that finds a value using the JSON Path library)`. +
+
+  Run on a resource-constrained VM: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz, 4 CPU, 4GB RAM, X86_64
+
+  Find the raw output of the benchmark tests, as well as the source code: https://gist.github.com/Priyankasaggu11929/43cc9ece4d6215ee4cfe0d1523a919d6
+
+  The following table provides an average performance analysis across CEL and JSONPath based additionalPrinterColumns (only for the `findResults()` execution durations across the benchmark test iterations, along with the min, max, and average values):
+
+  | | CEL ([BenchmarkNew_CEL_DeepComplex](https://gist.github.com/Priyankasaggu11929/43cc9ece4d6215ee4cfe0d1523a919d6#file-tableconvertor_testgo)) | JSONPath ([BenchmarkNew_JSONPath_DeepComplex](https://gist.github.com/Priyankasaggu11929/43cc9ece4d6215ee4cfe0d1523a919d6#file-tableconvertor_testgo)) |
+  |---|---|---|
+  | **Column Definition** | `self.spec.environments.map(e, e.clusters.map(c, c.nodes.filter(n, n.metrics.memory > 8000).map(n, n.id)))` | `.spec.environments[*].clusters[*].nodes[?(@.metrics.memory > 8000)].id` |
+  | **Evaluation Performance** | **FindResults**<br>• Min: 30.91 µs<br>• Max: 1870.87 µs<br>• Average: 58.38 µs | **FindResults**<br>• Min: 2.19 µs<br>• Max: 1147.24 µs<br>• Average: 8.40 µs |
+
+ +_**Conclusion**_ + +Taking overall performance (compilation + evaluation + cost calculation, etc.) across the scenarios above, CEL is about 20x slower than JSONPath. + +However, the focus of the performance analysis was the **evaluation cost** (see scenario 2): + +- On average, CEL is about 7x slower than JSONPath (58.38 µs vs 8.40 µs) +- In the worst-case scenario (most expensive run), CEL is about 1.6x slower than JSONPath (1870.87 µs vs 1147.24 µs) + + + + +### Test Plan + + + +[x] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +Alpha: + +[staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/validation/validation_test.go](https://github.com/kubernetes/kubernetes/blob/e54719bb6674fac228671e0786d19c2cf27b08a3/staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/validation/validation_test.go) + +- Test that validation passes when we create an additionalPrinterColumn with an expression field containing a valid CEL expression +- Test that validation fails when we create an additionalPrinterColumn with an expression field containing an invalid CEL expression +- Test that existing behaviour of jsonPath is not altered when creating CRDs with only jsonPath additionalPrinterColumns +- Test that validation fails when we create a single additionalPrinterColumn with both jsonPath and expression fields set +- Test that validation passes when we create multiple additionalPrinterColumns where some columns use jsonPath and others use expression +- Test that validation fails when we try to create an additionalPrinterColumn with an expression field while the feature gate is turned off + 
+[staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor_test.go](https://github.com/kubernetes/kubernetes/blob/bc302fa4144d21a338683cd83701661f97be4aba/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor_test.go) + +- Verify that CEL compilation errors are caught at the CRD validation phase +- Verify that CEL compilation at the TableConvertor creation stage succeeds +- Verify that a TableConvertor is created for a CRD with both jsonPath and expression columns + + + +##### Integration tests + + + + +[test/integration/apiserver/crd_additional_printer_columns_test.go](https://github.com/kubernetes/kubernetes/tree/bc302fa4144d21a338683cd83701661f97be4aba/test/integration/apiserver) + +- Verify that CRDs can be created with additionalPrinterColumns that use both jsonPath and expression fields +- Verify that CEL compilation errors are caught at the CRD validation stage +- Verify that existing behaviour is not altered when creating CRDs with only jsonPath additionalPrinterColumns + + + +##### e2e tests + + + + +We will cover all cases in integration and unit tests. If needed, we can add e2e tests before beta graduation. We plan to extend the existing [e2e tests for CRDs](https://github.com/kubernetes/kubernetes/blob/3df3b83226530bda69ffcb7b4450026139b2cd11/test/e2e/apimachinery/custom_resource_definition.go). 
+ +### Graduation Criteria + +#### Alpha + +- Feature implemented behind a feature flag +- Initial benchmarks to compare performance of JSONPath with CEL columns and set an appropriate CEL cost (equivalent to, or at most 2x, the JSONPath cost, as discussed in the [June 11, 2025 SIG API Machinery meeting](https://docs.google.com/document/d/1x9RNaaysyO0gXHIr1y50QFbiL1x8OWnk2v3XnrdkT5Y/edit?tab=t.0#bookmark=id.epfys7yzizcn)) +- Unit tests and integration tests completed and enabled + +#### Beta + +- Gather feedback from developers and surveys +- Add e2e tests +- Add appropriate metrics for additionalPrinterColumns usage and CEL cost usage +- More benchmarking to compare JSONPath and CEL execution and modify CEL cost if needed + +#### GA + +- N examples of real-world usage +- Upgrade/downgrade e2e tests +- Scalability tests +- Allowing time for feedback + +### Upgrade / Downgrade Strategy + + + + + + + +No changes are required for a cluster to make an upgrade and maintain existing behavior. + +If the cluster is downgraded to a version which doesn't support CEL for additionalPrinterColumns: +- Existing additionalPrinterColumns with CEL expressions would be ignored and those columns would not be printed. Any create or update operation on a CRD that tries to use CEL for additionalPrinterColumns would fail. +- Existing additionalPrinterColumns with JSONPath would still work as expected. + +Once the cluster is upgraded back to a version supporting CEL for additionalPrinterColumns, users should be able to create CRDs with additionalPrinterColumns using CEL again. + +### Version Skew Strategy + + + +This feature is implemented in the kube-apiserver component, so skew with other Kubernetes components does not require coordinated behavior. + +Clients should ensure the kube-apiserver is fully rolled out before using the feature. 
+ +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [x] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: `CRDAdditionalPrinterColumnCEL` + - Components depending on the feature gate: `apiextensions-apiserver`, `kube-apiserver` +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? + +###### Does enabling the feature change any default behavior? + + + +No default behaviour will be changed since we still support additionalPrinterColumns with JSONPath. + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +Yes. If the feature is disabled after being used, existing additionalPrinterColumns with JSONPath would work as expected. Columns defined with CEL expressions in existing CRDs would be ignored and would not be printed while the feature is disabled. + +###### What happens if we reenable the feature if it was previously rolled back? + +CRDs which had failed validation previously might now succeed if the CEL expression is valid. Existing CRDs' additionalPrinterColumns defined with CEL expressions would start working again after the feature has been reenabled. + +###### Are there any tests for feature enablement/disablement? + + + +We will have unit and integration tests to make sure that feature enablement and disablement work as intended. + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +This feature will not impact rollouts or already-running workloads. + +###### What specific metrics should inform a rollback? 
+ + + +If enabling this feature increases the latency of `kubectl get ` (or similar) requests, in turn creating load on the apiserver, this would show up in apiserver metrics like `apiserver_request_duration_seconds`. If there are significant spikes in these metrics during these GET operations, you can try disabling the feature or rolling back the cluster version to see if the performance improves. + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +We're planning to test the upgrade -> downgrade -> upgrade path before graduating to beta. + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +No. + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +The cluster admin can check if the `CRDAdditionalPrinterColumnCEL` feature gate is turned on. If yes, the admin can further check whether any CRD has columns defined under the `additionalPrinterColumns` section that use the new `expression` field. + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [x] Other (treat as last resort) + - Details: Users will be able to define additionalPrinterColumns for their custom resources with `expression` instead of `jsonPath`. + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +No. 
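As a rough illustration of the check described under "How can an operator determine if the feature is in use by workloads?", the sketch below scans CRD objects for printer columns that set the `expression` field. It assumes the CRDs have already been loaded as dictionaries, for example from the `items` of `kubectl get crds -o json`; the function name `columns_using_expression` and the sample CRD are hypothetical.

```python
def columns_using_expression(crds):
    """Return (crd, version, column) names for printer columns that set `expression`."""
    found = []
    for crd in crds:
        crd_name = crd.get("metadata", {}).get("name", "")
        for version in crd.get("spec", {}).get("versions", []):
            # additionalPrinterColumns is optional and may be absent or null.
            for col in version.get("additionalPrinterColumns") or []:
                if "expression" in col:
                    found.append((crd_name, version.get("name", ""), col.get("name", "")))
    return found


# Hypothetical CRD with one JSONPath column and one CEL column.
sample_crds = [
    {
        "metadata": {"name": "widgets.example.com"},
        "spec": {
            "versions": [
                {
                    "name": "v1",
                    "additionalPrinterColumns": [
                        {"name": "Replicas", "type": "integer", "jsonPath": ".spec.replicas"},
                        {"name": "Ready", "type": "string", "expression": "self.status.ready"},
                    ],
                },
                {"name": "v1beta1"},
            ]
        },
    }
]

for crd_name, version_name, column_name in columns_using_expression(sample_crds):
    print(f"{crd_name} {version_name} {column_name}")
```

An empty result together with an enabled feature gate would suggest that the feature is available but not yet used by any CRD.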
+ +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +No. + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +No. + +###### Will enabling / using this feature result in introducing new API types? + + + +No. + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +No. + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +No. + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +Performance of CRD reads might be impacted. Benchmarking needs to be done to know the exact difference between using JSONPath and CEL. + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +Since the CEL expressions are compiled and evaluated in the kube-apiserver, depending on the complexity of the CRDs and the expressions defined, we may see a non-negligible increase of CPU usage. We are planning to benchmark this before beta graduation. + +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? + + + +No. + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +The same way any write to apiserver would. + +###### What are other known failure modes? + + + +None. + +###### What steps should be taken if SLOs are not being met to determine the problem? + +Disable the feature. + +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +An alternative to the CEL approach proposed by this KEP would be to extend JSONPath to support arrays and other complex queries. There have been a couple of attempts to implement this previously. 
+ +- [Adding support for complex json paths in AdditionalPrinterColumns #101205](https://github.com/kubernetes/kubernetes/pull/101205) +- [apiextensions: allow complex json paths for additionalPrinterColumns #67079](https://github.com/kubernetes/kubernetes/pull/67079) + +These attempts were not successful because of breaking changes to JSONPath. Now that we have CEL as an option, we can move away from trying to extend JSONPath and embrace CEL, since it covers a much larger ground than what we could achieve with extending JSONPath. + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/kep.yaml b/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/kep.yaml new file mode 100755 index 00000000000..31e86c2f1b1 --- /dev/null +++ b/keps/sig-api-machinery/4595-cel-crd-additionalprintercolumns/kep.yaml @@ -0,0 +1,33 @@ +title: CEL for CRD AdditionalPrinterColumns +kep-number: 4595 +authors: + - "@sreeram-venkitesh" + - "@Priyankasaggu11929" +owning-sig: sig-api-machinery +participating-sigs: +reviewers: + - "@cici37" + - "@deads2k" + - "@jpbetz" + - "@liggitt" +approvers: + - "@jpbetz" +creation-date: "2024-04-26" +last-updated: "v1.34" +status: provisional +stage: alpha + +latest-milestone: "1.34" + +milestone: + alpha: "1.34" + beta: "" + stable: "" + +feature-gates: + - name: CRDAdditionalPrinterColumnCEL + components: + - kube-apiserver +disable-supported: true + +metrics: []