Updated the document as per the review comments.
anvithks committed Dec 18, 2018
1 parent 9aa2fa9 commit c5a72db
Showing 1 changed file with 4 additions and 6 deletions.
@@ -26,9 +26,7 @@ Poseidon/Firmament scheduler runs alongside the default Kubernetes Scheduler as
### Flow graph scheduling based Poseidon/Firmament scheduler provides the following key advantages:
- Workloads (pods) are bulk scheduled for enabling scheduling decisions at massive scale.
- Based on the extensive performance test results, Poseidon/Firmament scales much better than Kubernetes default scheduler as the number of nodes increase in a cluster. This is due to the fact that Poseidon/Firmament is able to amortize more and more work across workloads.
-- Poseidon/Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon/Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graph below), Poseidon/Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time.
-<br/><br/>
-It is also important to highlight that Poseidon/Firmament Scheduling algorithm as it is outperforms K8S default scheduling algorithm by a wide margin (up to 30X or so) when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs. In the future, we are planning to further reduce/optimize the bind time by doing bulk bind process in order to make use of bulk scheduling in Poseidon/Firmament in order to realize even better end-to-end throughput performance numbers.
+- Poseidon/Firmament Scheduler outperforms K8S default scheduler by a wide margin when it comes to throughput performance numbers for scenarios where compute resource requirements are somewhat uniform across jobs (Replicasets/Deployments/Jobs). As shown in the graph below, Poseidon/Firmament scheduler end-to-end throughput performance numbers (including bind time) consistently get better and better as the number of nodes in a cluster increase. For example, for a 2,700 nodes cluster (shown in the graph below), Poseidon/Firmament scheduler is 7X (or more) better end-to-end throughput-wise that includes bind time.

- Availability of complex rule constraints.
- Scheduling in Firmament is very dynamic; it keeps cluster resources in a global optimal state during every scheduling run.
@@ -63,11 +61,11 @@ Although, Poseidon/Firmament scheduler is capable of scheduling various types of
|Feature|Kubernetes Default Scheduler|Poseidon/Firmament Scheduler|Notes|
|--- |--- |--- |--- |
|Node Affinity/Anti-Affinity|Y|Y||
-|Pod Affinity/Anti-Affinity - including support for pod anti-affinity symmetry|Y|Y|Currently, throughput numbers are definitely better for default scheduler versus Poseidon/Firmament due to the recent pod affinity/anti-affinity optimizations within K8S (release 1.11 & 1.12). Poseidon/Firmament scheduler currently “drip-feeds” pods into the scheduling algorithm and processing is one-pod-at-a-time, exactly similar to how default scheduler does. We initially tried achieving pod affinity/anti-affinity functionality using native flow network constructs (convex arc costs, “XOR” & “AND” flow network constructs as described in Ionel Gog’s PHD thesis) but we ran into few challenges and could not make it work.Using the flow network constructs we intend on achieving similar throughput benefits for affinity/anti-affinity pods as we are currently seeing for normal pods.|
+|Pod Affinity/Anti-Affinity - including support for pod anti-affinity symmetry|Y|Y|Currently, the default scheduler outperforms the Poseidon/Firmament scheduler pod affinity/anti-affinity functionality. We are working towards resolving this.|
|Taints & Tolerations|Y|Y||
-|Baseline Scheduling capability in accordance to available compute resources (CPU & Memory) on a node|Y|Y**|Not all Predicates & Priorities are supported at this time (although, all the SIG scheduling e2e tests are passing).|
+|Baseline Scheduling capability in accordance to available compute resources (CPU & Memory) on a node|Y|Y**|Not all Predicates & Priorities are supported at this time.|
|Extreme Throughput at scale|Y**|Y|This is due to Poseidon/Firmament bulk scheduling approach superiority versus K8S pod-at-a-time approach. Substantial throughput benefits using Firmament scheduler as long as resource requirements (CPU/Memory) for incoming Pods is uniform across Replicasets/Deployments/Jobs. This is mainly due to efficient amortization of work across Replicasets/Deployments/Jobs . 1) For “Big Data/AI” jobs consisting of large no. of tasks, throughput benefits are tremendous. 2) Substantial throughput benefits also for service or batch job scenarios where workload resource requirements are uniform across Replicasets/Deployments/Jobs.|
-|Optimal Scheduling|Pod-by-Pod scheduler, processes one pod at a time (may result into sub-optimal scheduling)|Bulk Scheduling (Optimal scheduling)|Pod-by-Pod K8S default scheduler may assign tasks to a sub-optimal machine or may end up migrating the first task to another machine – potentially losing all the work the task has done – and replace it with the second task. By contrast, Firmament considers all unscheduled tasks at the same time together with their soft and hard constraints. Thus, avoids unnecessary task migrations and wasting task work.|
+|Optimal Scheduling|Pod-by-Pod scheduler, processes one pod at a time (may result into sub-optimal scheduling)|Bulk Scheduling (Optimal scheduling)|Pod-by-Pod K8S default scheduler may assign tasks to a sub-optimal machine. By contrast, Firmament considers all unscheduled tasks at the same time together with their soft and hard constraints.|
|Colocation Interference Avoidance|N|N**|Planned in Poseidon/Firmament.|
|Priority Pre-emption|Y|N**|Partially exists in Poseidon/Firmament versus extensive support in K8S default scheduler.|
|Inherent Re-Scheduling|N|Y**|Poseidon/Firmament scheduler supports workload re-scheduling. In each scheduling run it considers all the pods, including running pods, and as a result can migrate or evict pods – a globally optimal scheduling environment.|