Specify compute resources using process labels #219
- The wildcard/default compute resource label needs to be specified in the config file BEFORE loading any tool-specific compute resource labels, or it will override these settings.
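A minimal sketch of that ordering, assuming the labels share a common prefix so that a regex selector can act as the wildcard. The label names and resource values here are hypothetical, not taken from this PR:

```groovy
// Hypothetical label names and values, for illustration only.
process {
    // Wildcard/default: declared first, so the more specific selectors below can refine it.
    withLabel: 'compute_resources__.*' {
        cpus = 2
        memory = '60 GB'
        time = '1h'
    }
    // Tool-specific label: declared after the wildcard so its settings are the ones kept.
    withLabel: 'compute_resources__cellranger_count' {
        cpus = 20
        memory = '120 GB'
        time = '24h'
    }
}
```

If the two blocks were swapped, a process carrying the `cellranger_count` label would also match the wildcard selector, and the wildcard values would be applied last.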
I think in general it's a nice idea because it cleans up the config quite nicely.
I'm wondering, however, how far we should go with the labels. For SCENIC, they go down to the process level, but for Cell Ranger they do not. Is there a reason for that? Usually `mkfastq` will take much less time than `count`, for instance.
It's a good point about label granularity. But looking at the existing code and the resource usage from previous runs, these seem to be more or less sufficient. There were only a few unique values. Scenic is by far the most CPU/memory intensive and so needs specific labels for each of its three main processes. The Cell Ranger tools have one label per tool, and I just took these from the original
- Added as an optional profile, `cluster_retry`. Currently it only adds retry options for the labels found in the main repo.
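For readers unfamiliar with retry profiles, a rough sketch of what one could look like. Only the profile name `cluster_retry` comes from this PR; the label name, retry count, and memory scaling are assumptions:

```groovy
// Sketch only: the label name and all values are hypothetical.
profiles {
    cluster_retry {
        process {
            withLabel: 'compute_resources__default' {
                // Resubmit a failed task instead of stopping the pipeline.
                errorStrategy = 'retry'
                maxRetries = 2
                // Request more memory on each attempt (task.attempt starts at 1).
                memory = { 8.GB * task.attempt }
            }
        }
    }
}
```

A profile like this would be enabled at run time with `-profile cluster_retry`.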
Ok, I think the bulk of the PR is essentially done. I've added labels to all processes from all submodules, some with submodule-specific labels. The last thing to do is add some documentation...
Nicely cleaning the config & the code :)
Specifying all computing resources in `clusterOptions` is specific to grid computing systems and won't work with other executors (Google Pipelines, Kubernetes, AWS, etc.). Using the standard Nextflow way of specifying cpu, memory, etc. is more compatible with systems outside the VSC.

Major changes:
- `conf/compute_resources.config`.
- `clusterOptions` is still present for grid-specific options (the cluster account `-A` parameter would go here).
- `nextflow config ...` and can be edited by the user (instead of being hard-coded into the processes).
- The executor (`local`/`pbs`/other) is defined globally in the config and applies to all processes, but tool-specific configs can override this (to have a mix of `local` and `pbs` processes). (#154)
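As a rough illustration of the changes listed above, a config along these lines combines standard Nextflow directives, a grid-only `clusterOptions` entry, and a per-label executor override. The label names, resource values, and account name are hypothetical and not taken from this PR:

```groovy
// Sketch only: label names, values, and the account name are hypothetical.
process {
    // Global executor, applied to every process unless overridden below.
    executor = 'local'

    withLabel: 'compute_resources__default' {
        // Standard directives, understood by all executors.
        cpus = 2
        memory = '8 GB'
        time = '1h'
    }

    withLabel: 'compute_resources__scenic' {
        // Tool-specific override: send these processes to the PBS cluster.
        executor = 'pbs'
        cpus = 20
        memory = '120 GB'
        time = '24h'
        // Grid-specific options only, e.g. the cluster account.
        clusterOptions = '-A my_account'
    }
}
```

Because the resources live in the config rather than in the process definitions, users can adjust them without touching the pipeline code.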
Submodule update progress:
- channels (no processes in this submodule)