Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: make TaskParameter type field optional #76

Closed
pditommaso opened this issue Aug 3, 2017 · 13 comments
Closed

Proposal: make TaskParameter type field optional #76

pditommaso opened this issue Aug 3, 2017 · 13 comments

Comments

@pditommaso
Copy link

pditommaso commented Aug 3, 2017

Currently the TaskParameter type is a required field. The caller needs to specify if the path is a file or directory both for input and output parameters.

However I have the impression this information add almost no value, because the server implementation needs in any case query the remote resource the apply the required operation (upload/download).

On the other hand, requiring this information as mandatory severely limits the ability of tools that dynamically discovers the type of output paths, such as Nextflow, to implement this specification.

My proposal is to make this field optional. When it is specified the semantic remains identical, and the server validates the inputs/outputs verifying the type matches. When the type is not specified the server just applies the operation to the path regardless if it is a file or directory.

@buchanae
Copy link
Contributor

buchanae commented Aug 3, 2017

We've considered whether we should just remove this field. Previously there were other fields which made it necessary, but I believe they have been removed. If possible, I would prefer removing it instead of making it optional.

@geoffjentry
Copy link
Contributor

Agree that I'd prefer to try to remove things vs. optionalizing them where possible.

@adamstruck
Copy link
Member

+1 for removing the field.

@pditommaso
Copy link
Author

pditommaso commented Aug 4, 2017

I'm fine to remove it as TaskParameter. It may still be needed in the task view in relation to #77 (comment) .

@buchanae
Copy link
Contributor

As I understand, one argument for keeping this field is that it adds a level of type checking to the system: if you define the input/output as a Directory and the system encounters a File, the system should let the user know something unexpected occurred by returning an error.

A counterargument is that this type checking responsibility should be handled by workflow engines or other TES clients. In the case of Nextflow, this allows them to forgo type checking, which is a core feature of their system. Engines which want simple file/directory type checking would implement this by communicating with the storage layer directly.

Hope I captured the arguments accurately.

From a technical standpoint, we (the Funnel dev team) think it's possible to implement a system with a pre-defined input/output file type in Funnel, but we'd like the chance to get it fully implemented in order to discover any edge cases.

@pditommaso
Copy link
Author

pditommaso commented Oct 24, 2017

There's are three options here: 1) keep mandatory, 2) make it optional, 3) removed it.

I've explained in the original comment why having it as a mandatory field is a problem. I'm fine both with 2 and 3. Though I tend to think the best solution would be 2 (optional), because it would give the opportunity to enforce the type checking if needed/requested.

Also it could be useful in relation to #77 (eg. outputs all files/directories matching a glob pattern).

@adamstruck
Copy link
Member

I am also fine with both options 2 and 3. I'd like to get some version of this and #77 into the spec soon.

@geoffjentry
Copy link
Contributor

@erikvdbergh Is this something you all have an opinion on?

@erikvdbergh
Copy link

I would want to keep it around but making it optional is fine. I'm curious @pditommaso how NextFlow handles HTTP inputs when this is not defined? A lot of http endpoints don't allow directory listing so for us it is a quick way to throw it back with an error.

I understand that for FTP, S3 and other endpoints this is a moot field, so I'd agree with making it optional. Keeping it around would also prevent clobbering of directories with files if the output type is unclear or in case of some errors.

@pditommaso
Copy link
Author

pditommaso commented Jun 13, 2018

how NextFlow handles HTTP inputs when this is not defined?

Nextflow does not allow glob patter for HTTP input files, exactly for the reason you have mentioned. However the proposal for an optional GLOB file type is related to *output* files.

@wleepang
Copy link

I'm curious if there are any updates on this issue. Reading through the thread it looks like there is consensus on have the field be optional. What would be the appropriate next steps?

@kellrott
Copy link
Member

kellrott commented Sep 2, 2021

The current proposal for v1.1 is to make the field optional, and then request the server fill it out during runtime once it accesses the actual elements.

@kellrott kellrott added this to the 1.1 milestone Oct 12, 2021
@uniqueg
Copy link
Contributor

uniqueg commented Jul 19, 2022

Related PR #155

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants