Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S #12069
Copying @deepak-vij @shivramsrivastava
Deploy preview for kubernetes-io-master-staging ready! Built with commit 7eadd6b https://deploy-preview-12069--kubernetes-io-master-staging.netlify.com
{{% capture overview %}} | ||
Poseidon is the [Firmament scheduler](https://github.com/Huawei-PaaS/firmament) integration for Kubernetes. At a very high level, Poseidon/Firmament scheduler augments the current Kubernetes scheduling capabilities by incorporating a new novel flow network graph based scheduling capabilities alongside the default Kubernetes Scheduler. It models the scheduling problem as a constraint-based optimization over a flow network graph – by reducing scheduling to a min-cost max-flow optimization problem. | ||
- I'd rather read one or two sentences, at most, in the overview section.
- Can you break up or clean up this sentence:
At a very high level, the Poseidon scheduler augments the current Kubernetes scheduling capabilities by incorporating a new novel flow network [graph-based scheduling capabilities alongside the default Kubernetes Scheduler?????]
We have simplified the language and broken the long sentences into shorter ones.
Please check.
- I just read the design doc. Is Poseidon really in the alpha state in 1.13 or will it be in 1.14?
- I'm confused: Is Poseidon encapsulating Firmament with the integration? What version of Poseidon includes what version of Firmament? Or the scheduler and integration are separate installs?
- nit, suggested change: Poseidon is the integration of the Firmament scheduler with Kubernetes.
- nit: Unless I have missed something, the overview and the introduction repeat the same information, verbatim.
- From the design doc, a good explanation of firmament and the role of Poseidon: Firmament models workloads and clusters as flow networks and runs min-cost flow optimizations over these networks to make scheduling decisions.
- nit: use of new and novel is redundant?
It incorporates new novel flow network graph based scheduling capabilities alongside the default Kubernetes Scheduler.
Possible change: The Firmament scheduler [or Poseidon integration?] incorporates novel flow network, graph-based scheduling capabilities ...
Do you need the graph-based?
- Poseidon-Firmament is supported from Kubernetes release 1.6 and works with all subsequent K8S releases.
- This is updated in the introduction section. Poseidon is the integration glue for the Firmament scheduler with Kubernetes.
The release process for the Poseidon-Firmament scheduler is explained in detail at this link. In summary, Poseidon and Firmament are released in lock-step: every time there is a change in Firmament or Poseidon, a corresponding release is created. For example, the current Poseidon release is 0.6 and the corresponding Firmament release is 0.6. This information has been added to the Current Project Stage section in the document.
- The overview is now changed.
- Added this and reworded the introduction section.
- Removed new.
- Graph based is definitely necessary.
## Introduction | ||
Poseidon is the [Firmament scheduler](https://github.com/Huawei-PaaS/firmament) integration for Kubernetes. At a very high level, Poseidon/Firmament scheduler augments the current Kubernetes scheduling capabilities by incorporating a new novel flow network graph based scheduling capabilities alongside the default Kubernetes Scheduler. It models the scheduling problem as a constraint-based optimization over a flow network graph – by reducing scheduling to a min-cost max-flow optimization problem. | ||
- Again, break down the descriptions of what this scheduler can do.
- In general, is this the best location for this page? There is information for configuring multiple schedulers under tasks/administer-cluster/configure-multiple-schedulers.
- If this scheduler (Poseidon) and platform (Firmament) are in the alpha state, you should state that explicitly.
- The language has been simplified and the sentences broken into shorter ones.
- This document does not contain any steps that need to be performed. The current location was a tentative one. After reviewing the other pages we feel that the path (/concepts/extend-kubernetes) is a better fit.
- @bsalamat Please review and advise.
- We have added the feature state tag at the top. The alpha status is also mentioned in the Current Project Stage.
Poseidon is the [Firmament scheduler](https://github.com/Huawei-PaaS/firmament) integration for Kubernetes. At a very high level, Poseidon/Firmament scheduler augments the current Kubernetes scheduling capabilities by incorporating a new novel flow network graph based scheduling capabilities alongside the default Kubernetes Scheduler. It models the scheduling problem as a constraint-based optimization over a flow network graph – by reducing scheduling to a min-cost max-flow optimization problem. | ||
Due to the inherent rescheduling capabilities, the new scheduler enables a globally optimal scheduling environment that constantly keeps refining the workloads placements dynamically. | ||
This sentence could use some clarification. I find the last clause confusing. What is a globally optimal scheduling environment? And the environment constantly refines the placement of the workloads?
The content has been modified and simplified.
Due to the inherent rescheduling capabilities, the new scheduler enables a globally optimal scheduling environment that constantly keeps refining the workloads placements dynamically. | ||
Poseidon/Firmament scheduler runs alongside the default Kubernetes Scheduler as an alternate scheduler – multiple schedulers running simultaneously. As part of the Kubernetes multiple schedulers support, each new pod is typically scheduled by the default scheduler, but Kubernetes can be instructed to use another scheduler by specifying the name of another custom scheduler (“Poseidon” in our case) in the PodSpec at the time of pod creation. In this case, the default scheduler will ignore that Pod and allow Poseidon scheduler to schedule the Pod on a relevant node. | ||
Can you show the configuration (YAML) for the Poseidon scheduler alongside the Kubernetes default scheduler?
The configuration YAML has been added.
### Flow graph scheduling based Poseidon/Firmament scheduler provides the following key advantages: | ||
- Workloads (pods) are bulk scheduled for enabling scheduling decisions at massive scale. | ||
- Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads. | ||
- Poseidon/Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon/Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graph below), Poseidon/Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time. |
I am not sure about the graphs. Could the perf results be published on your github page or in the k8s blog?
The performance graphs have been moved to the benchmark section under the poseidon repository.
https://github.com/kubernetes-sigs/poseidon/blob/master/docs/benchmark/README.md
f82c3ac to 43c68bf
@@ -34,7 +36,16 @@ Poseidon/Firmament scheduler runs alongside the default Kubernetes Scheduler as | |||
## Poseidon-Firmament Scheduler - How it works | |||
As part of the Kubernetes multiple schedulers support, each new pod is typically scheduled by the default scheduler, but Kubernetes can be instructed to use another scheduler by specifying the name of another custom scheduler (“Poseidon” in our case) in the PodSpec at the time of pod creation. In this case, the default scheduler will ignore that Pod and allow Poseidon scheduler to schedule the Pod on a relevant node. | |||
As part of the Kubernetes multiple schedulers support, each new pod is typically scheduled by the default scheduler. Kubernetes can be instructed to use another scheduler by specifying the name of another custom scheduler (“poseidon” in our case) in the PodSpec (**schedulerName** field of the PodSpec) at the time of pod creation. In this case, the default scheduler will ignore that Pod and allow Poseidon scheduler to schedule the Pod on a relevant node. |
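For readers following this thread, the opt-in described above can be sketched as a Pod manifest built in Python. This is a minimal illustration, not the configuration added to the page: the `pod_manifest` helper, the pod name, and the `nginx` image are hypothetical, while the `schedulerName` field and the value `poseidon` come from the text above.

```python
import json


def pod_manifest(name, image, scheduler_name="poseidon"):
    """Build a minimal Pod manifest that asks for a specific scheduler.

    Hypothetical helper for illustration only. Pods that omit
    spec.schedulerName are picked up by the default scheduler.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The default scheduler ignores pods whose schedulerName
            # names another scheduler.
            "schedulerName": scheduler_name,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Pod name and image are placeholders.
    print(json.dumps(pod_manifest("nginx-poseidon", "nginx"), indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.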
In Key advantages, you mention the perf graphs. Remove this section?
This has been fixed.
@@ -75,7 +86,7 @@ Although, Poseidon/Firmament scheduler is capable of scheduling various types of | |||
|High Availability|Y|N**|Planned.|
|Real-time metrics based scheduling|N|Y**|Initially supported using Heapster (now deprecated) for placing pods using actual cluster utilization statistics rather than reservations. Plans to switch over to "metric server".|
|Support for Max-Pod per node|Y|Y|Poseidon/Firmament scheduler seamlessly co-exists with K8S default scheduler.|
|Support for Ephemeral Storage, in addition to CPU/Memory|Y|Y|This feature was working earlier. However, for some reason since K8S release 1.10 onwards it does not seem to work as expected. We are looking at resolving the issue soon.|
|Support for Ephemeral Storage, in addition to CPU/Memory|Y|Y||
nit: Can you explain why Poseidon's approach is better? (bulk scheduling approach is superior),
This is due to Poseidon/Firmament bulk scheduling approach superiority versus K8S pod-at-a-time approach.
nit: Could add these items under Use Cases
- For “Big Data/AI” jobs consisting of large no. of tasks, throughput benefits are tremendous. 2) Substantial throughput benefits also for service or batch job scenarios where workload resource requirements are uniform across Replicasets/Deployments/Jobs.|
The benefits of bulk scheduling have now been added to the Latest Throughput Performance Testing Results section.
The throughput benefit is already mentioned under the Use Cases section.
@kbhawkey I am working out of India and am available until 11am California time. That gives us an overlap in which to discuss any other changes.
{{< feature-state for_k8s_version="v1.13" state="alpha" >}} | ||
{{< note >}} | ||
Current release of Poseidon-Firmament scheduler is an alpha release. | ||
{{< /note >}} |
@anvithks, here are some more suggestions. I am just reviewing; doc maintainers approve.
If the note content is moved to the left, I think the rendered output may look better. You may need to test this.
The note is a custom Hugo shortcode. I am not sure if I can change the style. I will check.
Maybe I can just display the text emboldened instead of using note.
Changed it to emboldened text.
It models the scheduling problem as a constraint-based optimization over a flow network graph. This is achieved by reducing scheduling to a min-cost max-flow optimization problem. Due to the inherent rescheduling capabilities, the Poseidon/Firmament scheduler constantly keeps refining the workloads placements dynamically. | ||
It models the scheduling problem as a constraint-based optimization over a flow network graph. This is achieved by reducing scheduling to a min-cost max-flow optimization problem. Due to the inherent rescheduling capabilities, the Poseidon-Firmament scheduler constantly keeps refining the workloads placements dynamically. | ||
Suggestion:
Remove, Due to the inherent rescheduling capabilities
Instead,
The Poseidon-Firmament scheduler dynamically refines the workload placements.
This is done.
- Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads. | ||
- Poseidon/Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon/Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graph below), Poseidon/Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time. | ||
- Based on the extensive performance test results, Poseidon-Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon-Firmament is able to amortize more and more work across workloads. | ||
- Poseidon-Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon-Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graphs [here](https://github.com/kubernetes-sigs/poseidon/blob/master/docs/benchmark/README.md)), Poseidon-Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time. |
- Based on our performance test results, ...
Do you want to compare this scheduler to the default k8s?
Is this still relevant: "As shown in the graph below"?
This was an oversight. Fixed it.
- Workloads (pods) are bulk scheduled for enabling scheduling decisions at massive scale. | ||
- Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads. |
Workloads ... are bulk scheduled to enable scheduling at massive scale.
This is done.
- Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads. | ||
- Poseidon/Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon/Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graph below), Poseidon/Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time. | ||
- Based on the extensive performance test results, Poseidon-Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon-Firmament is able to amortize more and more work across workloads. | ||
- Poseidon-Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon-Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graphs [here](https://github.com/kubernetes-sigs/poseidon/blob/master/docs/benchmark/README.md)), Poseidon-Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time. | ||
- Availability of complex rule constraints. | ||
- Scheduling in Firmament is very dynamic; it keeps cluster resources in a global optimal state during every scheduling run. |
Scheduling in Poseidon-Firmament is dynamic
This is done.
Although, Poseidon/Firmament scheduler is capable of scheduling various types of workloads (service, batch, etc.), following are the few use cases where it excels the most: | ||
Although, Poseidon-Firmament scheduler is capable of scheduling various types of workloads (service, batch, etc.), following are the few use cases where it excels the most: | ||
1. For “Big Data/AI” jobs consisting of large number of tasks, throughput benefits are tremendous. | ||
2. Substantial throughput benefits also for service or batch job scenarios where workload resource requirements are uniform across jobs (Replicasets/Deployments/Jobs). |
- Service or batch jobs where workload resource requirements ...
This is done.
@@ -65,27 +67,29 @@ Although, Poseidon/Firmament scheduler is capable of scheduling various types of | |||
- **Alpha Release - Incubation repo.** at https://github.com/kubernetes-sigs/poseidon. | |||
- Currently, Poseidon-Firmament scheduler **does not provide support for high availability**, our implementation assumes that the scheduler cannot fail. The [design document](https://github.com/kubernetes-sigs/poseidon/blob/master/docs/design/README.md) describes possible ways to enable high availability, but we leave this to future work. | |||
- We are **not aware of any production deployment** of Poseidon-Firmament scheduler at this time. | |||
- Poseidon-Firmament is supported from Kubernetes release 1.6 and works with all subsequent releases. | |||
- The current Poseidon release is 0.6 and the corresponding Firmament release is 0.6. |
nit, extra space: supported from
This is done.
|Taints & Tolerations|Y|Y|| | ||
|Baseline Scheduling capability in accordance to available compute resources (CPU & Memory) on a node|Y|Y**|Not all Predicates & Priorities are supported at this time.| | ||
|Extreme Throughput at scale|Y**|Y|This is due to Poseidon/Firmament bulk scheduling approach superiority versus K8S pod-at-a-time approach. Substantial throughput benefits using Firmament scheduler as long as resource requirements (CPU/Memory) for incoming Pods is uniform across Replicasets/Deployments/Jobs. This is mainly due to efficient amortization of work across Replicasets/Deployments/Jobs . 1) For “Big Data/AI” jobs consisting of large no. of tasks, throughput benefits are tremendous. 2) Substantial throughput benefits also for service or batch job scenarios where workload resource requirements are uniform across Replicasets/Deployments/Jobs.| | ||
|Extreme Throughput at scale|Y**|Y|This is due to Poseidon-Firmament bulk scheduling approach superiority versus K8S pod-at-a-time approach. Substantial throughput benefits using Firmament scheduler as long as resource requirements (CPU/Memory) for incoming Pods is uniform across Replicasets/Deployments/Jobs. This is mainly due to efficient amortization of work across Replicasets/Deployments/Jobs . 1) For “Big Data/AI” jobs consisting of large no. of tasks, throughput benefits are tremendous. 2) Substantial throughput benefits also for service or batch job scenarios where workload resource requirements are uniform across Replicasets/Deployments/Jobs.| |
I would take out superiority
and simply say that the bulk scheduling approach scales or increases workload placement
This is done.
@@ -100,10 +104,15 @@ For developers please refer [here](https://github.com/kubernetes-sigs/poseidon/b | |||
## Latest Throughput Performance Testing Results | |||
As mentioned earlier, workloads (pods) are bulk scheduled for enabling scheduling decisions at massive scale. Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads. | |||
As mentioned earlier, workloads (pods) are bulk scheduled for enabling scheduling decisions at massive scale. Based on the extensive performance test results, Poseidon-Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. Pod-by-pod schedulers (e.g., Kubernetes scheduler) typically process one pod at a time. These schedulers have the following crucial drawbacks: | |||
This sounds very familiar to the content above. Can it be combined with prior content or can this section just summarize the test results and include the link to your test results in the Poseidon github repo?
We have reworded this section. Please check.
…upport for K8S. (kubernetes#11752) * Added documentation about Poseidon-Firmament scheduler * Fixed some style issues. * Udpated the document as per the review comments. * Fixed some typos and updated the document * Updated the document as per the review comments. * Updated the document as per review comments. Added config details. * Updated the document as per the latest review comments. Fixed nits * Made changes as per latest suggestions. * Some more changes added.
61c2f1d to 2a74809
Poseidon-Firmament scheduler is an alternate scheduler alongside the default Kubernetes scheduler. | ||
{{% /capture %}} |
I re-read the page. Looking better.
I still think this sentence could use more tweaks:
Poseidon-Firmament scheduler is an alternate scheduler that can be deployed alongside the default Kubernetes scheduler.
OR
Poseidon-Firmament scheduler is a Kubernetes compatible scheduler that can be used alongside the default Kubernetes scheduler.
Someone else will have to provide a lgtm and approve.
What happens with the doc that was merged to the 1.14 branch?
Changed the line.
I am not sure about this.
When we had merged the PR into the dev-1.14 branch the document was very different and has since been updated.
@bsalamat could you please advise on how to proceed.
Do we have to bring the dev-1.14 branch up-to-date with master ?
/assign @tengqm
/assign @stewart-yu
## Introduction | ||
Poseidon is the integration glue for the [Firmament scheduler](https://github.com/Huawei-PaaS/firmament) with Kubernetes. Poseidon-Firmament scheduler augments the current Kubernetes scheduling capabilities. It incorporates novel flow network graph based scheduling capabilities alongside the default Kubernetes Scheduler. Firmament scheduler models workloads and clusters as flow networks and runs min-cost flow optimizations over these networks to make scheduling decisions. |
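To make the "flow network" wording concrete for reviewers, here is a toy sketch of scheduling as min-cost max-flow. It is not Firmament's actual graph model or solver: the `MinCostFlow` class, the `place_pods` helper, the slot counts, and the placement costs are all assumptions made up for illustration, and the solver is the textbook successive-shortest-paths method.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths; a textbook method, not Firmament's solver."""

    def __init__(self, n):
        self.n = n
        # Each edge is [to, residual_capacity, cost, index_of_reverse_edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        flow = total_cost = 0
        while True:
            # Queue-based Bellman-Ford tolerates negative residual costs.
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] is None:  # no augmenting path left
                return flow, total_cost
            # Find the bottleneck along the path, then push flow through it.
            push, v = float("inf"), t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            total_cost += push * dist[t]


def place_pods(pods, node_slots, cost):
    """pods: names; node_slots: node -> free slots; cost: (pod, node) -> int."""
    source = 0
    pod_id = {p: 1 + i for i, p in enumerate(pods)}
    node_id = {n: 1 + len(pods) + i for i, n in enumerate(node_slots)}
    sink = 1 + len(pods) + len(node_slots)
    net = MinCostFlow(sink + 1)
    for p, pid in pod_id.items():
        net.add_edge(source, pid, 1, 0)
        for n, nid in node_id.items():
            if (p, n) in cost:  # a missing entry means the node is infeasible
                net.add_edge(pid, nid, 1, cost[(p, n)])
    for n, nid in node_id.items():
        net.add_edge(nid, sink, node_slots[n], 0)
    _, total = net.solve(source, sink)
    name_of = {nid: n for n, nid in node_id.items()}
    placement = {}
    for p, pid in pod_id.items():
        for v, cap, _, _ in net.graph[pid]:
            if v in name_of and cap == 0:  # saturated pod -> node edge
                placement[p] = name_of[v]
    return placement, total


if __name__ == "__main__":
    costs = {("a", "n1"): 1, ("a", "n2"): 5, ("b", "n1"): 2,
             ("b", "n2"): 3, ("c", "n1"): 4, ("c", "n2"): 4}
    print(place_pods(["a", "b", "c"], {"n1": 1, "n2": 2}, costs))
```

Because all pending pods are placed by one optimization run, the solver can trade one pod's preferred node against another's, which a pod-at-a-time scheduler cannot do.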
Other tweaks:
Poseidon is a set of services (what is it besides glue? Libraries????) that integrates the Firmament scheduler with Kubernetes.
Changed this line.
Small edits to match style guidelines.
### Flow graph scheduling based Poseidon-Firmament scheduler provides the following key advantages: | ||
- Workloads (pods) are bulk scheduled to enable scheduling at massive scale.
- Based on the extensive performance test results, Poseidon-Firmament scales much better than the Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon-Firmament is able to amortize more and more work across workloads. |
I'm not sure I understand what "amortize more and more work across workloads" means. Do you mean "workers (nodes)" instead of workloads?
Hi Andrew, what this means is that instead of doing filtering and scoring for each individual workload, our scheduler does filtering and scoring once for workloads which have the similar resource requirements. Hence, it amortizes filtering & scoring work across workloads (pods in the case of Kubernetes).
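The amortization described in this reply can be illustrated with a small sketch. It assumes a much simpler model than the real scheduler: pods are hypothetical `(name, cpu, memory)` tuples with positive integer requests, "filtering" is a capacity check, and there is no real scoring. The point is only that nodes are scanned once per group of identical requests rather than once per pod.

```python
from collections import defaultdict


def schedule_in_bulk(pods, nodes):
    """Place pods group by group instead of pod by pod.

    pods:  list of (name, cpu_millicores, memory_mib), all requests > 0
    nodes: dict of node name -> (free_cpu_millicores, free_memory_mib)
    Returns (placements, node_scans); unplaceable pods map to None.
    Illustrative only: names and the capacity model are assumptions.
    """
    free = {name: list(capacity) for name, capacity in nodes.items()}

    # Pods with identical resource requests share one equivalence class.
    groups = defaultdict(list)
    for name, cpu, mem in pods:
        groups[(cpu, mem)].append(name)

    placements = {}
    node_scans = 0
    for (cpu, mem), names in groups.items():
        node_scans += 1  # one pass over the nodes for the whole group
        pending = list(names)
        for node, (free_cpu, free_mem) in free.items():
            # How many pods of this shape still fit on the node.
            slots = min(free_cpu // cpu, free_mem // mem)
            placed, pending = pending[:slots], pending[slots:]
            for pod in placed:
                placements[pod] = node
            free[node][0] -= cpu * len(placed)
            free[node][1] -= mem * len(placed)
        for pod in pending:
            placements[pod] = None
    return placements, node_scans


if __name__ == "__main__":
    pods = [("a", 500, 256), ("b", 500, 256), ("c", 500, 256), ("d", 1000, 512)]
    nodes = {"n1": (1000, 1024), "n2": (2000, 2048)}
    print(schedule_in_bulk(pods, nodes))
```

With four pods in two request shapes, the nodes are scanned twice instead of four times; the saving grows with the number of replicas that share a shape, which matches the uniform Replicasets/Deployments/Jobs case discussed above.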
since this is a Kubernetes extension, not a direct feature, it shouldn't use the regular feature state tagging.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chenopis The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…upport for K8S (kubernetes#12069) * Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S. (kubernetes#11752) * Added documentation about Poseidon-Firmament scheduler * Fixed some style issues. * Udpated the document as per the review comments. * Fixed some typos and updated the document * Updated the document as per the review comments. * Updated the document as per review comments. Added config details. * Updated the document as per the latest review comments. Fixed nits * Made changes as per latest suggestions. * Some more changes added. * Updated as per suggestions. * Changed the release process section. * SIG Docs edits Small edits to match style guidelines. * add plus to feature state * capitalization * revert feature state shortcode since this is a Kubernetes extension, not a direct feature, it shouldn't use the regular feature state tagging. (cherry picked from commit 7730c15)
…12343) * Removed the old version of the Poseidon documentation. Incorrect location. * Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S (#12069) * Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S. (#11752) * Added documentation about Poseidon-Firmament scheduler * Fixed some style issues. * Udpated the document as per the review comments. * Fixed some typos and updated the document * Updated the document as per the review comments. * Updated the document as per review comments. Added config details. * Updated the document as per the latest review comments. Fixed nits * Made changes as per latest suggestions. * Some more changes added. * Updated as per suggestions. * Changed the release process section. * SIG Docs edits Small edits to match style guidelines. * add plus to feature state * capitalization * revert feature state shortcode since this is a Kubernetes extension, not a direct feature, it shouldn't use the regular feature state tagging. (cherry picked from commit 7730c15)
* Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S. (#11752) * Added documentation about Poseidon-Firmament scheduler * Fixed some style issues. * Udpated the document as per the review comments. * Fixed some typos and updated the document * Updated the document as per the review comments. * Document timeout attribute for kms-plugin. (#12158) See 72540. * Official documentation on Poseidon/Firmament, a new multi-scheduler (#12343) * Removed the old version of the Poseidon documentation. Incorrect location. * Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S (#12069) * Official documentation on Poseidon/Firmament, a new multi-scheduler support for K8S. (#11752) * Added documentation about Poseidon-Firmament scheduler * Fixed some style issues. * Udpated the document as per the review comments. * Fixed some typos and updated the document * Updated the document as per the review comments. * Updated the document as per review comments. Added config details. * Updated the document as per the latest review comments. Fixed nits * Made changes as per latest suggestions. * Some more changes added. * Updated as per suggestions. * Changed the release process section. * SIG Docs edits Small edits to match style guidelines. * add plus to feature state * capitalization * revert feature state shortcode since this is a Kubernetes extension, not a direct feature, it shouldn't use the regular feature state tagging. (cherry picked from commit 7730c15) * Remove initializers from doc. It will be removed in 1.14 (#12331) * kubeadm: Document CRI auto detection functionality (#12462) Signed-off-by: Rostislav M. 
Georgiev <[email protected]> * Minor doc change for GAing Pod DNS Config (#12514) * Graduate ExpandInUsePersistentVolumes feature to beta (#10574) * Rename 2018-11-07-grpc-load-balancing-with-linkerd.md.md file (#12594) * Add dynamic percentage of node scoring to user docs (#12235) * Add dynamic percentage of node scoring to user docs * addressed review comments * delete special symbol (#12445) * Update documentation for VolumeSubpathEnvExpansion (#11843) * Update documentation for VolumeSubpathEnvExpansion * Address comments - improve descriptions * Graduate Pod Priority and Preemption to GA (#12428) * Added Instana links to the documentation (#12977) * Added link to the Instana Kubernetes integration * Added Instana link for services section Added Instana and a link to the Kubernetes integration to the analytics services section and broadened the scope to APM, monitoring and analytics. * Oxford comma /flex * More Oxford commas, because they matter * Update kubectl plugins to stable (#12847) * documentation for CSI topology beta (#12889) * Document changes to default RBAC discovery ClusterRole(Binding)s (#12888) * Document changes to default RBAC discovery ClusterRole(Binding)s Documentation for kubernetes/enhancements#789 and kubernetes/kubernetes#73807 * documentation review feedback * CSI raw block to beta (#12931) * Change incorrect string raw to block (#12926) Fixes #12925 * Update documentation on node OS/arch labels (#12976) These labels have been promoted to GA: kubernetes/enhancements#793 * local pv GA doc updates (#12915) * Publish CRD OpenAPI Documentation (#12910) * add documentation for CustomResourcePublishOpenAPI * address comments fix links, ordered lists, style and typo * kubeadm: add document for upgrading from 1.13 to 1.14 (single CP and HA) (#13189) * kubeadm: add document for upgrading from 1.13 to 1.14 - remove doc for upgrading 1.10 -> 1.11 * kubeadm: apply amends to upgrade-1.14 doc * kubeadm: apply amends to upgrade-1.14 doc (part2) * 
kubeadm: apply amends to upgrade-1.14 doc (part3) * kubeadm: add note about "upgrade node experimental-control-plane" + add comment about `upgrade plan` * kubeadm: add missing "You should see output similar to this" * fix bullet indentation (#13214) * mark PodReadinessGate GA (#12800) * Update RuntimeClass documentation for beta (#13043) * Update RuntimeClass documentation for beta * Update feature gate & add upgrade section * formatting fixes * Highlight upgrade action required * Address feedback * CSI ephemeral volume alpha documentation (#10934) * update kubectl documentation (#12867) * update kubectl documentation * add document for Secret/ConfigMap generators * replace `kubectl create -f` by `kubectl apply -f` * Add page for kustomization support in kubectl * fix spelling errors and address comments * Documentation for Windows GMSA feature (#12936) * Documentation for Windows GMSA feature Signed-off-by: Deep Debroy <[email protected]> * Enhancements to GMSA docs Signed-off-by: Deep Debroy <[email protected]> * Fix links Signed-off-by: Deep Debroy <[email protected]> * Fix GMSA link Signed-off-by: Deep Debroy <[email protected]> * Add GMSA feature flag in feature flag list Signed-off-by: Deep Debroy <[email protected]> * Relocate GMSA to container configuration Signed-off-by: Deep Debroy <[email protected]> * Add example for container spec Signed-off-by: Deep Debroy <[email protected]> * Remove changes in Windows index Signed-off-by: Deep Debroy <[email protected]> * Update configure-gmsa.md * Update configure-gmsa.md * Update configure-gmsa.md * Update configure-gmsa.md * Rearrange the steps into two sections and other edits Signed-off-by: Deep Debroy <[email protected]> * Fix links Signed-off-by: Deep Debroy <[email protected]> * Add reference to script to generate GMSA YAMLs Signed-off-by: Deep Debroy <[email protected]> * Some more clarifications for GMSA Signed-off-by: Deep Debroy <[email protected]> * HugePages graduated to GA (#13004) * HugePages 
graduated to GA * fixing nit for build * Docs for node PID limiting (kubernetes/kubernetes#73651) (#12932) * kubeadm: update the reference documentation for 1.14 (#12911) * kubeadm: update list of generated files for 1.14 NOTE: PLACEHOLDERS! these files are generated by SIG Docs each release, but we need them to pass the k/website PR CI. - add join_phase* (new sub phases of join) - add init_phase_upload-certs.md (new upload certs phase for init) - remove alpha-preflight (now both init and join have this) * kubeadm: update reference docs includes for 1.14 - remove includes from alpha.md - add upload-certs to init-phase.md - add join-phase.md and it's phases * kubeadm: update the editorial content of join and init - cleanup master->control-plane node - add some notes about phases and join - remove table about pre-pulling images - remove outdated info about self-hosting * kubeadm: update target release for v1alpha3 removal 1.14 -> 1.15 * kubeadm: copy edits for 1.14 reference docs (part1) * kubeadm: use "shell" for code blocks * kubeadm: update the 1.14 HA guide (#13191) * kubeadm: update the 1.14 HA guide * kubeadm: try to fix note/caution indent in HA page * kubeadm: fix missing sudo and minor amends in HA doc * kubeadm: apply latest amends to the HA doc for 1.14 * fixed a few missed merge conflicts * Admission Webhook new features doc (#12938) - kubernetes/kubernetes#74998 - kubernetes/kubernetes#74477 - kubernetes/kubernetes#74562 * Clarifications and fixes in GMSA doc (#13226) * Clarifications and fixes in GMSA doc Signed-off-by: Deep Debroy <[email protected]> * Update configure-gmsa.md * Reformat to align headings and pre-reqs better Signed-off-by: Deep Debroy <[email protected]> * Reformat to align headings and pre-reqs better Signed-off-by: Deep Debroy <[email protected]> * Reformat to fix bullets Signed-off-by: Deep Debroy <[email protected]> * Reword application of sample gmsa Signed-off-by: Deep Debroy <[email protected]> * Update configure-gmsa.md * 
Address feedback to use active voice Signed-off-by: Deep Debroy <[email protected]> * Address feedback to use active voice Signed-off-by: Deep Debroy <[email protected]> * RunAsGroup documentation for Progressing this to Beta (#12297) * start serverside-apply documentation (#13077) * start serverside-apply documentation * add more concept info on server side apply * Update api concepts * Update api-concepts.md * fix style issues * Document CSI update (#12928) * Document CSI update * Finish CSI documentation Also fix mistake with ExpandInUsePersistentVolumes documented as beta * Overall docs for CSI Migration feature (#12935) * Placeholder docs for CSI Migration feature Signed-off-by: Deep Debroy <[email protected]> * Address CR comments and update feature gates Signed-off-by: Deep Debroy <[email protected]> * Add mappings for CSI plugins Signed-off-by: Deep Debroy <[email protected]> * Add sections for AWS and GCE PD migration Signed-off-by: Deep Debroy <[email protected]> * Add docs for Cinder and CSI Migration info Signed-off-by: Deep Debroy <[email protected]> * Clarify scope to volumes with file system Signed-off-by: Deep Debroy <[email protected]> * Change the format of EBS and Cinder CSI Migration sections to follow the GCE template Signed-off-by: Deep Debroy <[email protected]> * Windows documentation updates for 1.14 (#12929) * Updated the note to indicate doc work for 1.14 * first attempt at md export from gdoc * simplifyig * big attempt * moving DRAFT windows content to PR for review * moving content to PR in markdown for review * updated note tags * Delete windows-contributing.md deleting this file as it is already ported to the github contributor guide * fixed formatting in intro and cluster setup guide * updating formatting for running containers guide * rejiggered end of troubleshooting * fixed minor typos * Clarified the windows binary download step * Update _index.md making updates based on feedback * Update _index.md updating ovn-kubernetes docs * 
Update _index.md * Update _index.md * updating relative docs links updating all the links to be relative links to /docs * Update _index.md * Update _index.md updates for windows services and ovn-kubernetes * formatted for correct step numbering * fix typos * Update _index.md updates for flannel PR in troubleshooting * Update _index.md * Update _index.md updating a few sections like roadmap, services, troubleshooting/filing tickets * Update _index.md * Update _index.md * Update _index.md * Fixed a few whitespace issues * Update _index.md * Update _index.md * Update _index.md * add section on upgrading CoreDNS (#12909) * documentation for kubelet resource metrics endpoint (#12934) * windows docs updates for 1.14 (#13279) * Delete sample-l2bridge-wincni-config.json this file is not used anywhere * Update _index.md * Update _index.md * Update _index.md * Update _index.md * Update _index.md * Rename content/en/docs/getting-started-guides/windows/_index.md to content/en/docs/setup/windows/_index.md moving to new location * Delete flannel-master-kubectl-get-ds.png * Delete flannel-master-kubeclt-get-pods.png * Delete windows-docker-error.png * Add files via upload * Rename _index.md to add-windows-nodes.md * Create _index.md * Update _index.md * Update add-windows-nodes.md * Update add-windows-nodes.md * Create user-guide-windows-nodes.md * Create user-guide-windows-containers.md * Update and rename add-windows-nodes.md to intro-windows-nodes.md * Update user-guide-windows-containers.md * Rename intro-windows-nodes.md to intro-windows-in-kubernetes.md * Update user-guide-windows-nodes.md * Update user-guide-windows-containers.md * Update user-guide-windows-containers.md * Update user-guide-windows-nodes.md * Update user-guide-windows-containers.md * Update _index.md * Update intro-windows-in-kubernetes.md * Update intro-windows-in-kubernetes.md fixing the pause image * Update intro-windows-in-kubernetes.md changing tables from html to MD * Update 
user-guide-windows-nodes.md converting tables from HTML to MD * Update intro-windows-in-kubernetes.md * Update user-guide-windows-nodes.md * Update user-guide-windows-nodes.md * Update user-guide-windows-nodes.md updating the numbering , even though it messes up the notes a little bit. Jim will file a ticket to follow up * Update user-guide-windows-nodes.md * update to windows docs for 1.14 (#13322) * Update intro-windows-in-kubernetes.md * Update intro-windows-in-kubernetes.md * Update intro-windows-in-kubernetes.md * Update intro-windows-in-kubernetes.md * Update intro-windows-in-kubernetes.md * Update user-guide-windows-containers.md * Update user-guide-windows-nodes.md * Update intro-windows-in-kubernetes.md (#13344) * server side apply followup (#13321) * change some parts of serverside apply docs in response to comments * fix typos and wording * Update config.toml (#13365)
Refer to PR #11752.
With reference to the above PR, and after consulting with @bsalamat, this PR is raised against the master branch (the PR has already been merged into the dev1.14 branch).
Added documentation about Poseidon-Firmament scheduler
Fixed some style issues.
Updated the document as per the review comments.
Fixed some typos and updated the document
Updated the document as per the review comments.
(cherry picked from commit 4652684)