Description
New feature
For the rnasplice pipeline, I wanted to provide my previous Salmon output as an input, as suggested here.
So I prepared an input sheet like this:
sample,condition,salmon_results
xyz,case,s3://bucket/path/salmon/xyz
where the xyz directory is in the Glacier storage class but has been restored:
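For reference, the restoration status of an individual object can be checked with aws s3api head-object (the bucket and key below are placeholders matching the sample sheet, not real paths):

```shell
# Inspect the Restore field of a single object; once restoration completes
# it reports ongoing-request="false" together with an expiry-date.
aws s3api head-object \
    --bucket bucket \
    --key path/salmon/xyz/quant.sf \
    --query 'Restore'
```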
xyz/
├── aux_info
│ ├── ambig_info.tsv
│ ├── expected_bias.gz
│ ├── fld.gz
│ ├── meta_info.json
│ ├── observed_bias_3p.gz
│ └── observed_bias.gz
├── cmd_info.json
├── lib_format_counts.json
├── libParams
│ └── flenDist.txt
├── logs
│ └── salmon_quant.log
├── quant.genes.sf
└── quant.sf
Each of the files inside xyz has been individually restored and tested for download; downloading an individual file does not require the --force-glacier-transfer option.
But the sample sheet above causes the nxf_s3_download() function to attempt a recursive download of the xyz directory, which fails unless the --force-glacier-transfer option is specified.
Example:
aws s3 cp s3://bucket/path/salmon/xyz ./xyz/ --recursive does not work
Error:
warning: Skipping file s3://bucket/path/salmon/xyz/aux_info/ambig_info.tsv. Object is of storage class GLACIER. Unable to perform download operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 download help for additional parameter options to ignore or force these transfers
aws s3 cp s3://bucket/path/salmon/xyz ./xyz/ --recursive --force-glacier-transfer works
Usage scenario
Provide restored glacier directory paths as input
Suggest implementation
Similar to a change made here, where --quiet was changed to --only-show-errors, the --force-glacier-transfer option could be added by default when the S3 object is not in the standard storage class.
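A rough sketch of how the flag could be threaded into the download command. The function and variable names here (nxf_s3_download_opts, NXF_FORCE_GLACIER_TRANSFER) are illustrative only and are not part of the actual Nextflow helper script:

```shell
# Hypothetical helper that assembles the options passed to `aws s3 cp`.
nxf_s3_download_opts() {
    local opts="--only-show-errors"
    # Append --force-glacier-transfer when the (hypothetical) setting is on.
    if [[ "$NXF_FORCE_GLACIER_TRANSFER" == "true" ]]; then
        opts="$opts --force-glacier-transfer"
    fi
    echo "$opts"
}

# nxf_s3_download() would then invoke something like:
#   aws s3 cp $(nxf_s3_download_opts) --recursive "$source" "$target"
```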
Or,
Introduce an optional setting similar to aws.client.glacierAutoRetrieval, say aws.client.glacierForceTransfer, which when true will add --force-glacier-transfer to the nxf_s3_download() function
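If this second route is taken, usage might mirror the existing aws.client settings in nextflow.config (glacierForceTransfer is the hypothetical name proposed above, not an existing option):

```groovy
// nextflow.config -- sketch of the proposed (hypothetical) setting
aws {
    client {
        // when true, nxf_s3_download() would add --force-glacier-transfer
        // to its `aws s3 cp` invocations
        glacierForceTransfer = true
    }
}
```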