Description
Bug report
On Nextflow version `23.04.2`, a few Google Batch users have encountered failures when running jobs in `us-central1`. If a machine family is specified for `machineType`, the job fails and Nextflow reports the error `[GOOGLE BATCH] Cannot select machine type using cloud info for task: <process_name> | null`.
It seems a fix for this was already pushed in #3961, and an updated release for the nf-google plugin in Nextflow is required.
Expected behavior and actual behavior
With `google.batch.spot = true` and a VM family specified with `machineType = 'n2-*'`, jobs should be submitted exclusively to `n2-*` instance types, or fall back to letting Google Batch decide on a custom type based on the CPU and memory allocated to the task, especially if the pricing info can't be retrieved from the cloud info API for that region.
The issue currently appears only in the `us-central1` region. In `europe-north1` (Finland), machine type selection works fine.
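The expected fallback described above can be sketched roughly as follows. This is only an illustration in Python, not the nf-google plugin's actual implementation: the function name `select_machine_type` and its `available` parameter are hypothetical, and the real selection also takes CPU, memory, and pricing into account. The point it shows is that a wildcard such as `n2-*` should either be resolved to a concrete type or dropped, and never passed to Google Batch as-is.

```python
from fnmatch import fnmatch
from typing import Iterable, Optional


def select_machine_type(pattern: str, available: Optional[Iterable[str]]) -> Optional[str]:
    """Hypothetical helper: resolve a machineType pattern to a concrete type.

    Returns None when no concrete type can be chosen, meaning the caller
    should leave the machine type unset and let Google Batch decide.
    """
    # A pattern without a wildcard is already a concrete machine type.
    if "*" not in pattern:
        return pattern
    # The cloud info lookup returned nothing (the null case seen in the log):
    # fall back instead of iterating over a missing list.
    if available is None:
        return None
    # Simplification: pick the first match in sorted order. The real logic
    # would rank candidates by CPU, memory, and price.
    matches = sorted(t for t in available if fnmatch(t, pattern))
    return matches[0] if matches else None
```

With this shape, a missing cloud info response for a region leads to an unset machine type rather than a job submitted with the literal value `n2-*`.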
Steps to reproduce the problem
Run `nextflow run nf-core/rnaseq -r 3.12.0 -profile test` with the following configuration:
    google {
        project = 'tower-cloud-testing'
        location = 'us-central1'
        batch {
            spot = true
        }
    }
    process {
        machineType = 'n2-*'
    }
Program output
Trimmed GCP batch job log for one of the failed jobs:
    allocationPolicy:
      instances:
      - policy:
          machineType: n2-*
          provisioningModel: SPOT
    status:
      state: FAILED
      statusEvents:
      - description: Job state is set from QUEUED to SCHEDULED for job projects/687213979415/locations/us-central1/jobs/nf-8b82962a-1687270780854.
        eventTime: '2023-06-20T14:19:49.943016717Z'
        type: STATUS_CHANGED
      - description: "Job gets no longer retryable information Batch Error: code - CODE_GCE_BAD_REQUEST,\
          \ description - googleapi: Error 400: Invalid value for field 'resource.properties.machineType':\
          \ 'n2-*'. Machine type 'n2-*' must be a valid resource name., invalid, already\
          \ retried 3 times, errors record CODE_GCE_BAD_REQUEST."
Nextflow log excerpt:
DEBUG n.c.g.batch.GoogleBatchTaskHandler - [GOOGLE BATCH] Cannot select machine type using cloud info for task: `NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF (genes.gtf.gz)` | Cannot invoke "java.lang.Iterable.iterator()" because "self" is null
Environment
- Nextflow version: 23.04.2 build 5870
- Operating system: Linux
- nf-google plugin version: 1.7.3