Workflow fails if there is an existing Batch pool using unsupported VM size #421

Closed
tonybendis opened this issue Jul 20, 2022 · 0 comments · Fixed by #472
Labels
Bug Something isn't working
Describe the bug
When deploying with an existing Batch account, the deployment fails while running the test workflow if that account already has a pool with an unsupported VM size.

Steps to Reproduce
Create a Batch account and a pool using the Basic_A0 VM size. Deploy a new CoA instance using the --BatchAccountName parameter. The deployment fails with "Test workflow failed", and the storage account contains a file in the workflows/failed directory with the following content:
..."FailureReason": "UnknownError",
"SystemLogs": [
"Object reference not set to an instance of an object.",
" at TesApi.Web.BatchScheduler.<>c__DisplayClass44_0.<CheckBatchAccountQuotas...

Additional context
This can also be reproduced by creating a pool with an unsupported VM size in an existing Cromwell on Azure instance and then submitting a workflow.

Solution
A possible fix is to ignore unsupported VM sizes when calculating the number of vCPUs in use, although this may underreport usage if those pools have running nodes. Alternatively, use an API to get the current list of VM sizes for this calculation, while keeping the current hardcoded list of supported VM sizes for deciding which VM size to use for task execution. Or combine both methods, since it is possible to create pools with VM sizes that Batch does not list as supported.
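
The combined approach could look roughly like the sketch below. It is written in Python purely for illustration and is not the TesApi code. `Pool`, `SUPPORTED_VCPUS`, `vcpus_in_use`, and the `fallback` table (standing in for whatever an API lookup of current VM sizes would return) are all hypothetical names, and the vCPU values are illustrative. A VM size missing from both tables is skipped and reported instead of causing a null dereference.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Pool:
    """Hypothetical stand-in for a Batch pool: VM size name and node count."""
    vm_size: str
    node_count: int


# Hypothetical hardcoded table of supported VM sizes -> vCPUs per node.
# Values are illustrative only.
SUPPORTED_VCPUS: Dict[str, int] = {
    "Standard_D2_v3": 2,
    "Standard_D4_v3": 4,
}


def vcpus_in_use(
    pools: Iterable[Pool],
    supported: Dict[str, int],
    fallback: Optional[Dict[str, int]] = None,
) -> Tuple[int, List[str]]:
    """Sum vCPUs over all pools without failing on unknown VM sizes.

    Lookup order: the hardcoded supported table first, then the optional
    fallback table (assumed to come from an API listing current VM sizes).
    Sizes found in neither are skipped and returned so the caller can log
    that usage may be underreported.
    """
    total = 0
    skipped: List[str] = []
    for pool in pools:
        vcpus = supported.get(pool.vm_size)
        if vcpus is None and fallback is not None:
            vcpus = fallback.get(pool.vm_size)
        if vcpus is None:
            # Unknown size: ignore it instead of dereferencing a missing entry.
            skipped.append(pool.vm_size)
            continue
        total += vcpus * pool.node_count
    return total, skipped


if __name__ == "__main__":
    pools = [Pool("Standard_D2_v3", 3), Pool("Basic_A0", 2)]

    # Option 1: ignore unsupported sizes (may underreport usage).
    print(vcpus_in_use(pools, SUPPORTED_VCPUS))

    # Combined: also consult a fallback table for sizes not in the hardcoded list.
    print(vcpus_in_use(pools, SUPPORTED_VCPUS, fallback={"Basic_A0": 1}))
```

Returning the skipped sizes alongside the total keeps the underreporting risk visible: the caller can log a warning naming the pools that were left out of the quota calculation.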
