Switching to new machine types on aaw-dev cluster #2022
Comments
The FinOps contact is Lyly Vu.
Reached out to Lyly to confirm the choice of new virtual machine models and whether there is any special pricing arrangement for other VM models.
Sent more info to FinOps on when and how many machines are expected to be running on the affected nodepools. Waiting for feedback.
Links to the PRs in the infrastructure repos that will implement the machine type changes once FinOps confirms:
@jacek-dudek Please update this issue with content from the emails with FinOps. Let's follow up next fiscal year.
Attaching the email communication with FinOps regarding this, and putting the issue into the backlog until FinOps gets back to us with recommendations:

Thank you for that information. I’ve noted that the following will be used 24/7 for AAW:
Lyly

From: Dudek, Jacek (StatCan) [email protected]
Hello again Lyly,
So here are the expected number of nodes per nodepool:
I expect those numbers to stay the same after the change of machine types. These machines are always up and running whenever the cluster is up, so generally they'll be running 24 hours per day. (We have other nodepools for intermittent workloads started by users.)
Here's a link to the GitHub issue I opened to document this work if you want to comment directly in the issue:
Also, here are links to the GitHub issues where we discussed the reasoning behind the suggested changes:
Regards,

From: Vuu, Lyly (SSC/SPC) [email protected]
Unclassified | Non classifié
Morning Jacek,
First of all, FinOps appreciates your proactiveness in rightsizing your VMs – this is great! Your proposed VM SKU changes to Standard_Dds_v5 and Standard_Eds_v5 are in line with commonly used SKUs in the tenant, so these choices are acceptable. We have Reserved Instances (RIs) for these SKUs, although we cannot control where these RIs get applied (Azure automatically applies the RI discount where it is most beneficial).
Can you please tell me how many instances you’re expecting of each, and how many hours they are expected to run per day?
Cc-ing my StatCan FinOps counterparts (Sarah and Ravi) as we’re currently in the midst of doing an RI review and the information you’re providing will be relevant.
Thanks,

From: Dudek, Jacek (StatCan) [email protected]
Hello Lyly,
I help maintain a Kubernetes cluster for the Advanced Analytics Workspace platform (AAW) at Statistics Canada. Recently we reviewed the virtual machine types that are being used for the clusters hosting that platform and examined the resource utilization. We noticed that most of these machines are underutilized and we may be able to switch to lower performance models.
Currently I'm proposing VM model changes across three nodepools in the development cluster as follows:
I expect the number of machines in each nodepool to remain the same, and so would expect a lower monthly cost based on calculations using Microsoft Azure's pricing calculator. I wanted to confirm with you whether we have any sort of special pricing arrangements on other VM models that might come into play in my cost estimations and would be a better choice.
Please let me know if you require more information to make an assessment.
Regards,
Reach out to the Cloud Native team and/or FinOps to discuss switching to more cost-effective machines for the system, cloudmainsys, and general nodepools on the aaw-dev cluster. This is a follow-up to issues #1965 and #1993.
Proposed machine models for each nodepool are:
cloudmainsys, system: Standard_D2ds_v5
general: Standard_E4ds_v5
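For anyone redoing the cost comparison mentioned in the email thread, here is a rough sketch of the arithmetic for always-on nodepools. It is not the calculation that was actually sent to FinOps. The function names, the 730 hours-per-month figure, and the node count and hourly rates in the example call are all placeholder assumptions; real rates should come from the Azure pricing calculator, and Reserved Instance discounts are not modelled.

```python
# Assumption: 730 is used as the average number of hours in a month.
HOURS_PER_MONTH = 730


def monthly_cost(node_count: int, hourly_rate: float, hours_per_day: float = 24.0) -> float:
    """Estimate the monthly cost of one nodepool.

    node_count    -- number of VMs in the nodepool
    hourly_rate   -- pay-as-you-go price per VM per hour (placeholder input)
    hours_per_day -- how many hours per day the VMs run (24 for always-on pools)
    """
    return node_count * hourly_rate * HOURS_PER_MONTH * (hours_per_day / 24.0)


def monthly_savings(node_count: int, current_rate: float, proposed_rate: float,
                    hours_per_day: float = 24.0) -> float:
    """Difference in monthly cost when only the VM model (hourly rate) changes."""
    current = monthly_cost(node_count, current_rate, hours_per_day)
    proposed = monthly_cost(node_count, proposed_rate, hours_per_day)
    return current - proposed


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; not the real node counts or prices.
    savings = monthly_savings(node_count=3, current_rate=0.40, proposed_rate=0.25)
    print(f"Estimated monthly savings: {savings:.2f}")
```

Since the node counts are expected to stay the same, only the hourly rate differs between the current and proposed models, which is why the sketch takes a single `node_count` per nodepool.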