Support per-node or per-daemonset cost adjustments #1664

Open
alter3d opened this issue Sep 13, 2024 · 2 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@alter3d

alter3d commented Sep 13, 2024

Description

What problem are you trying to solve?

Karpenter currently calculates pricing for a set of instances based solely on the cloud provider pricing for the instance. However, sometimes there are external costs to consider that are on a per-node basis -- for example, monitoring or security agents that have a per-node-hour fee that doesn't scale with instance size.

As a result, Karpenter might provision 16 "large" instances at $0.04/hr each ($0.64/hr total) rather than choosing (or consolidating to) one "8xlarge" instance at $0.65/hr. On its face the 16-node option is cheaper, but once we factor in the $0.03/node-hr for our monitoring solution, the math changes drastically: $1.12/hr for the "large" instances versus $0.68/hr for the "8xlarge" instance.
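To make the arithmetic explicit, here is a minimal sketch that reproduces the numbers from this example. The function name `fleet_hourly_cost` and the constant `MONITORING_FEE` are made up for illustration and are not part of Karpenter.

```python
from decimal import Decimal


def fleet_hourly_cost(node_count, instance_price, per_node_fee=Decimal("0")):
    """Hourly cost of a fleet: (instance price + flat per-node fee) x node count."""
    return node_count * (instance_price + per_node_fee)


# Flat per-node-hour fee from the example (e.g. a monitoring agent).
MONITORING_FEE = Decimal("0.03")

# Cloud-provider pricing only, which is what Karpenter compares today.
large_raw = fleet_hourly_cost(16, Decimal("0.04"))
xlarge_raw = fleet_hourly_cost(1, Decimal("0.65"))

# The same fleets once the per-node fee is included.
large_adj = fleet_hourly_cost(16, Decimal("0.04"), MONITORING_FEE)
xlarge_adj = fleet_hourly_cost(1, Decimal("0.65"), MONITORING_FEE)

print(large_raw, xlarge_raw)  # 0.64 0.65
print(large_adj, xlarge_adj)  # 1.12 0.68
```

Without the fee the 16-node fleet wins by $0.01/hr; with it, the single larger node is cheaper by $0.44/hr.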

Supporting a config option for a per-node-hour adjustment at the NodePool level, or perhaps creating a standard annotation that can be added to DaemonSets, would solve many of these cases. The NodePool approach may be more accurate, since the user can account for things like node taints that would keep the DaemonSet from running on some nodes. I think the remaining cases (for example, where provisioning a node for even 30 seconds results in a node-month fee, i.e. SaaS providers with no hourly option) are probably outside of Karpenter's scope.

How important is this feature to you?

"Nice-to-have". It could result in some significant cost savings over time, but it is manageable for now by simply using larger minimum instance sizes in our NodePools, which of course is a tradeoff between wasted compute resources and the external costs.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@alter3d alter3d added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 13, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 13, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jerryjvl

jerryjvl commented Sep 17, 2024

I am currently in a similar situation for much the same underlying reasons (additional third-party costs correlating with the number of nodes running).

I would like to propose a slightly more nuanced configuration, though. Within a given NodePool, not all available InstanceTypes will carry the same add-on cost, because machine architecture affects which DaemonSets we actually need to run on a given node (free vs. paid for similar functionality, etc.).

Also, not all of our cost inputs scale in the same dimension: some are per-core costs and others are per-host costs.

If we could configure an add-on cost in a per-NodePool map keyed by instance type, which the Karpenter cost calculator adds to the raw host costs returned from AWS, then we could build out our own mapping of how these costs affect different hardware.

If these costs are in the same unit as the machine cost lookups (hourly? monthly?), then the implementation could be as simple as adding the value from this per-NodePool map of add-on costs in the inner cost calculation loop. That seems like a good balance between implementation complexity and the functional richness of the added capability.
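As a rough illustration of the idea, the lookup-and-add step could look something like the sketch below. Everything here is an assumption made for the example: `AddOnCost`, `adjusted_price`, the `example.*` instance type names, and all prices are hypothetical, and none of this reflects Karpenter's actual code or configuration schema.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AddOnCost:
    """Hypothetical add-on cost entry, in the same unit as the raw price (here $/hr)."""
    per_node: Decimal = Decimal("0")  # flat fee per host
    per_core: Decimal = Decimal("0")  # fee that scales with vCPU count


def adjusted_price(instance_type, raw_price, vcpus, add_ons):
    """Raw provider price plus any add-on configured for this instance type."""
    add_on = add_ons.get(instance_type, AddOnCost())  # no entry means no adjustment
    return raw_price + add_on.per_node + add_on.per_core * vcpus


# Illustrative per-NodePool map keyed by instance type (made-up values).
ADD_ONS = {
    "example.large": AddOnCost(per_node=Decimal("0.03"), per_core=Decimal("0.01")),
    "example.8xlarge": AddOnCost(per_node=Decimal("0.03")),
}

print(adjusted_price("example.large", Decimal("0.04"), 2, ADD_ONS))     # 0.09
print(adjusted_price("example.8xlarge", Decimal("0.65"), 32, ADD_ONS))  # 0.68
print(adjusted_price("example.other", Decimal("0.10"), 4, ADD_ONS))     # 0.10
```

Keeping both a per-node and a per-core component in each entry covers the two scaling dimensions mentioned above while leaving the calculation a plain addition.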


3 participants