[WIP][ENH] Created BoutiquesTask for simiplified execution of Boutiques tools in Pydra #210

ValHayot · 2020-03-01T18:49:17Z

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Summary

Boutiques tools can be executed in Pydra, by, for example, executing a command such as :

pydra.engine.task.BoutiquesTask("zenodo.3267250", name="fsl_bet", bosh_args=[
             "-v{0}:{0}".format(os.path.abspath(args.bids_dir)),
             "-v{0}:{0}".format(os.path.abspath(args.output_dir))],
             infile=wf.lzin.infile,
             maskfile=wf.lzin.maskfile
)

Checklist

All tests passing
I have added tests to cover my changes
I have updated documentation (if necessary)
My code follows the code style of this project
(we are using black: you can pip install pre-commit,
run pre-commit install in the pydra directory
and black will be run automatically with each commit)

Acknowledgment

I acknowledge that this contribution will be available under the Apache 2 license.

…handled

djarecka · 2020-03-03T18:56:59Z

Thanks @ValHayot for the PR. I wanted to finally start working on Boutiques :)

@satra and I were originally thinking about slightly different approach - reading boutiques spec and running the ShellCommandTask. Perhaps we could have both, but I'm curious what do you think and if you were also considering it

ValHayot · 2020-03-04T19:23:34Z

Either works, but I'd imagine that there'd be a bit of redundancy with ShellCommandTasks. For instance, Boutiques already does all the output file validation, so I'm not quite sure why the user should have to provide an OutputSpec to get more than just rc, stdout and stderr when using Boutiques. However, the most straightforward way to get the output file names and their keybindings is through the Python interface, I believe.

I'm going to CC @glatard as he might have something to add.

glatard · 2020-03-04T19:30:25Z

Running a Boutiques task through ShellCommand would require forking a shell to call the Boutiques interpreter that would then start another Python interpreter. I thought it would be lighter to just go through the Boutiques Python API. But of course it adds a dependency.

satra · 2020-03-05T01:59:25Z

i think this is a good starting point. some comments below.

so I'm not quite sure why the user should have to provide an OutputSpec to get more than just rc, stdout and stderr

@ValHayot - how would i select say the brain output of bet and the mask output of bet for downstream tasks. pydra is a dataflow engine so the key aspect of pydra is to be able to construct workflows.

@glatard - going through the api is fine for now. the key thing we wanted to see is if we could give people a quick way to create neuroimaging workflows using the boutiques descriptors of tools. eventually we will convert the nipype interfaces to pydra tasks, but we are going to break away the tools into their own packages. i would also make boutiques an optional add on rather than a pydra dependency.

pydra would put you in a cached working directory when the task is run. so in the above example, the second part of bosh_args would have to be filled in during the run.

djarecka · 2020-03-30T03:52:46Z

@ValHayot - I've been trying to work on you example, and I have several questions. Perhaps, two most important for the beginning:

is there a reason why have you used boutiques.descriptor2func instead of a command line tool?
i'm getting an error, that it can't find container engine. I do have docker, do I have to have something specific? Do you always have to run in a container?

Perhaps we could also chat one day if you find some time?

ValHayot · 2020-04-16T05:19:06Z

Hey @djarecka , @satra , sorry for taking so long to get back to you. I wanted to fix my PR before replying so I could show you what I meant, and then got busy.

Basically, when you run a task, it produces an ExecutorOutput object, which contains the output files and error codes. Now of course, in my example, I left the OutputSpec to be "out" and just assigned a list of ExecutorOutput filenames to out (see: line 356). However, if you want to set the OutputSpec in the init to ensure that further down the line it doesn't crash because it was not defined prior to execution, parsing the output-files section of the Zenodo descriptor would enable you to do so.
Although, perhaps you'd like users to explicitly define the outputs such that it's clearer to them how they should be transferring the data over to the next task?

Feel free to contact me. Should be a bit more available now :)

djarecka · 2020-04-16T16:01:53Z

@ValHayot - thanks for your answer! I was parsing output-files from the zenodo file, but the problem is that not all output is created, it depends on the arguments used with the command. So it would be great to know which output is expected to exist after running the task.

How can I get ExecutorOutput when I run bosh as a cli? I'm almost sure that last week I saw some files that were created by boutiques after running the task that had the list of outputs, but yesterday I couldn't find anything, and I'm not sure anymore if I saw these files, or just wanted to see ;-)

glatard · 2020-04-19T22:38:46Z

@djarecka unfortunately the executor output object isn't saved from the CLI, but we could have than done quickly if you need it. In this way, output files wouldn't be known before they are produced though.

If you to know which files will be produced before the tool is run, evaluate can do it like this, but with limitations. For instance:

bosh evaluate zenodo.2602109 '{"in_file": "data.nii.gz", "out_file": "out"}' output-files

would return all possible output files since the tool interface doesn't include conditional outputs.

djarecka · 2020-04-21T15:48:51Z

@glatard - I'm thinking what should be the best way to provide output_specification...

One way would be to expect user to specify which output she/he expects and simply return error if the output is not created.
The other way is to read the zenodo file and get all of the output-files (as I'm doing now), and at the end check which one are present, and raise an error only when the output that is not created is a part of a workflow connection, but this is also not perfect...

@satra - do you have opinion?

satra · 2020-04-21T16:22:16Z

i think there are two questions implicit in this discussion:

what is the list of output keys? this would be used to allow users to connect outputs of the task to downstream tasks.
what are the values of these keys after execution? this would be used after execution to set the results structure of the task and thus provide values to downstream nodes.

@glatard and @djarecka - one is at the point of task creation, the other is after execution. i think (1) would come from the spec and (2) would come from some serialization of the executor object.

djarecka · 2020-04-21T16:39:48Z

@satra - so, I was more thinking about the first issue, that's why having the list of the output after execution doesn't really help. But the zenodo spec does not give you answer to this questions, since you have all possible outputs, doesn't take into account the list of argument provided to the command. That's why I've started thinking that maybe the user should be required to specify list of outputs (that would have to be a subset of list generated from the zenodo file)

glatard · 2020-04-21T16:58:03Z

@djarecka just to be clear: it is possible for a Boutiques spec developer to map outputs to input values, this is done through conditional outputs. It is also possible that they specify outputs to be "optional", that is to appear without a specified relation with inputs.

satra · 2020-04-21T17:02:32Z

But the zenodo spec does not give you answer to this questions

i think the zenodo spec gives you the answer to (1) just like nipype outputspec. but it does not give the answer to 2. nipype allows connecting any output independent of what the inputs are. whether that output carries a value is dependent on the execution of the interface. nipype does not impose the list_outputs condition on the workflow creator.

djarecka · 2020-04-21T17:16:44Z

@glatard - but I understand that the developer doesn't have to provide conditional outputs, at least in the fsl_bet example I don't see it

@satra - if you are saying that it's ok to connect output that is not created during the execution of the task, then I could indeed create output_spec using zenodo spec. However, will likely have to make some other changes, since right now that would raise an error if the file can't be found

glatard · 2020-04-21T17:33:15Z

Yes you are right, conditional outputs are optional.

ValHayot added 4 commits February 29, 2020 13:22

added a pydra wrapper for boutiques tools

67e3dee

removed some unused imports

c7e0d29

some reformating and modifications to how input and output specs are …

5a5bfee

…handled

changed bosh_args default value

b2740f7

djarecka mentioned this pull request Apr 9, 2020

Adding BoshTask #231

Merged

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP][ENH] Created BoutiquesTask for simiplified execution of Boutiques tools in Pydra #210

[WIP][ENH] Created BoutiquesTask for simiplified execution of Boutiques tools in Pydra #210

ValHayot commented Mar 1, 2020

djarecka commented Mar 3, 2020

ValHayot commented Mar 4, 2020

glatard commented Mar 4, 2020

satra commented Mar 5, 2020

djarecka commented Mar 30, 2020

ValHayot commented Apr 16, 2020 •

edited

Loading

djarecka commented Apr 16, 2020

glatard commented Apr 19, 2020

djarecka commented Apr 21, 2020

satra commented Apr 21, 2020

djarecka commented Apr 21, 2020

glatard commented Apr 21, 2020

satra commented Apr 21, 2020

djarecka commented Apr 21, 2020

glatard commented Apr 21, 2020

[WIP][ENH] Created BoutiquesTask for simiplified execution of Boutiques tools in Pydra #210

Are you sure you want to change the base?

[WIP][ENH] Created BoutiquesTask for simiplified execution of Boutiques tools in Pydra #210

Conversation

ValHayot commented Mar 1, 2020

Types of changes

Summary

Checklist

Acknowledgment

djarecka commented Mar 3, 2020

ValHayot commented Mar 4, 2020

glatard commented Mar 4, 2020

satra commented Mar 5, 2020

djarecka commented Mar 30, 2020

ValHayot commented Apr 16, 2020 • edited Loading

djarecka commented Apr 16, 2020

glatard commented Apr 19, 2020

djarecka commented Apr 21, 2020

satra commented Apr 21, 2020

djarecka commented Apr 21, 2020

glatard commented Apr 21, 2020

satra commented Apr 21, 2020

djarecka commented Apr 21, 2020

glatard commented Apr 21, 2020

ValHayot commented Apr 16, 2020 •

edited

Loading