-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1beta1 planning #21
Comments
quick demo: apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
creationTimestamp: "2022-04-21T11:30:27Z"
name: noderesourcetopologies.topology.node.k8s.io
spec:
conversion:
strategy: None
group: topology.node.k8s.io
names:
kind: NodeResourceTopology
listKind: NodeResourceTopologyList
plural: noderesourcetopologies
shortNames:
- nrt
singular: noderesourcetopology
scope: Cluster
versions:
- name: v1beta1
schema:
openAPIV3Schema:
description: NodeResourceTopology describes node resources and their topology.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
status:
description: NodeResourceTopologyStatus defines the observed state of NodeResourceTopology
zones:
description: ZoneList contains an array of Zone objects.
items:
description: Zone represents a resource topology zone, e.g. socket, node, die or core.
properties:
attributes:
description: AttributeList contains an array of AttributeInfo objects.
items:
description: AttributeInfo contains one attribute of a Zone.
properties:
name:
type: string
value:
type: string
required:
- name
- value
type: object
type: array
costs:
description: CostList contains an array of CostInfo objects.
items:
description: CostInfo describes the cost (or distance) between two Zones.
properties:
name:
type: string
value:
format: int64
type: integer
required:
- name
- value
type: object
type: array
name:
type: string
parent:
type: string
resources:
description: ResourceInfoList contains an array of ResourceInfo objects.
items:
description: ResourceInfo contains information about one resource type.
properties:
allocatable:
anyOf:
- type: integer
- type: string
description: Allocatable quantity of the resource, corresponding to allocatable in node status, i.e. total amount of this resource available to be used by pods.
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
available:
anyOf:
- type: integer
- type: string
description: Available is the amount of this resource currently available for new (to be scheduled) pods, i.e. Allocatable minus the resources reserved by currently running pods.
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
capacity:
anyOf:
- type: integer
- type: string
description: Capacity of the resource, corresponding to capacity in node status, i.e. total amount of this resource that the node has.
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
name:
description: Name of the resource.
type: string
required:
- allocatable
- available
- capacity
- name
type: object
type: array
type:
type: string
required:
- name
- type
type: object
type: array
type: object
required:
- zones
type: object
served: true
storage: true |
xref: #22 |
Can we also include TopologyManagerPolicyOptions, I believe it should be of type |
I'm unconvinced about this. We decided to add the TM policy and then we mixed with the scope, so clearly separating the scope makes sense. But arguably adding the TM policy was already a stretch, so I'm not keen to adding even more data which is not directly related to the NUMA zone resource tracking (TM data is not). |
that said, TM config is related to the problem we hare handling here, so we can totally discuss new extra objects to add to the general umbrealla of this API. |
Yeah, was thinking about either. I was confused that NodeTopology is mixed with |
This is something we can (and should!) totally do, I'm very open to explore this direction. We will need to design an object which is not too tied to the TM status, but I'm confident we can come out with something nice for v1beta1. Looking forward to hear your thoughts in this area. |
Sounds good! Is there any design doc or PR where we can keep track of this? |
not yet, we will need to figure out the timeline. Word of warning: this is very unlikely to progress much in the next 2-3 weeks. When the design starts, I'll link it here. We will need a way to collaborate, so it's possible it will be a design doc ora PR to reason about (or both) |
I am inclining towards the more radical approach here :) We should figure out a way to represent Topology Manager policies, scope and the newly introduced policy options. Perhaps something like a Also, I noticed that we left out this field in the CRD spec above. Was that intentional so we have a discussion on this topic? |
I agree with @fromanirh that the use of
@PiotrProkop If you have the time and would like, feel free to submit a PR that would unblock your work. Even though we don't have the bandwidth to drive this work, we will definitely make the time to review it :) |
Adding a note that use of maps in the APIs is discouraged: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#lists-of-named-subobjects-preferred-over-maps so we can't go with the approach I recommended earlier. |
Since |
@swatisehgal @fromanirh should we move this discussion to kubernetes/kubernetes#96275? or the intent is to merge
And if I should ask for it should I do it here or in |
Sure, let's move the discussion to kubernetes/kubernetes#96275. I would be inclined to merge as
Irrespective of the approach we take, we should add these missing policies for the sake of completeness, I will update my PR: 96275 as well as here. |
Sounds good! I will leave comment on kubernetes/kubernetes#96275 to provide visibility why this field is changed. |
note we will also need to change the updating agents (NFD, RTE) |
So probably in this repo the:
Should stay and in |
these are good points. We need to evaluate and explore the tradeoffs here. |
Yes, that's right. All the consumers of the API would have to be updated. |
I think, we should keep "Restricted" and "BestEffort" policies for https://github.com/k8stopologyawareschedwg/noderesourcetopology-api but in the kubernetes PR (API in staging) we should move to policy+scope for the sake of consistency. It will seem a bit odd to have "Restricted" and "BestEffort" policies and their corresponding policies with the scope. I foresee it raising eyebrows. When NFD, RTE and Scheduler Plugin transition to consuming the NRT API from the staging repo we would have to handle it carefully. |
👍 |
I'm very convinced we should remove configuration data (= TM policy, TM scope) from the NRT API proper. |
I agree. We are already relying on NFD to advertise NRT resources, so relying on specific labels makes sense. |
I like the idea of representing this information as labels. It gives us what we need and is much simpler than exposing a new object. Given that NFD is a well recognized software component intended to expose both hardware features and system configurations, I believe it is a good fit for storing configuration information like this. @marquiz WDYT about NFD exposing Topology manager policy and scope as labels? Perhaps, we can expose Topology Manager policy options and CPU Manager policy/policy options as well? |
Continuing this train of thought, I think we should expose not just the TM policy, but something that abstracts a bit the node behavior enforced by that scope. Random brainstorm:
scope is easier:
TL;DR: labels should represent concepts enabled by TM config, should not reflect TM config 1:1 |
What benefit do you see in abstracting the concept as opposed to representing the configurations 1:1? Representing the topology manager configuration or in general kubelet configuration directly without abstracting it gives a clear view of what is being represented in the label (kubelet configuration on the node) and is much more intuitive rather than having to interpret what Abstracting the policies means that NFD has to translate the policies to an abstract label and some component has the responsibility of correctly interpreting it which makes me a bit nervous. |
just a general comment, please stop using term "NUMA" anywhere in API and please don't put it in new ones. It is not technically correct and not matching neither existing on the marked hardware nor upcoming in near future. |
point taken. |
proposed changes for
v1beta1
, to be discussedstatus
- NRT describes the status of resources on a node, so it makes sense to honor the spec/status split and to have the fields reported as statusmake topologyPolicy a single stringreport TM scope alongside policya more radical approach: split the TM configuration data into a new objectlabelsuse top-level attributes added in v1alpha2)podfingerprint as value(from annotation: use top-level attributes added in v1alpha2)node-res-topo
tonrt
The text was updated successfully, but these errors were encountered: