splitting list of torch.Tensor causes "repeating" behavior of the output #761

Open · wilke0818 opened this issue Jul 15, 2024 · 7 comments
Labels: bug (Something isn't working)

@wilke0818

  • What version of Pydra are you using?
    • 0.23
  • What were you trying to do?
    • Pass torch.Tensors to my function by splitting over them as a list.
  • What did you expect to happen?
    • Expected each tensor to be passed to my function, the function to run, and a result to be returned for each tensor.
    • For example, with a function that just adds 2 to the values in a tensor, splitting the list of tensors [torch.Tensor([1]), torch.Tensor([2])] into this function should produce the output [torch.Tensor([3]), torch.Tensor([4])].
  • What actually happened?
    • Results "repeat": the output of the function (tested with multiple functions) matches the result of the first split for every element.
    • To continue the example above, [torch.Tensor([1]), torch.Tensor([2])] gives [torch.Tensor([3]), torch.Tensor([3])].
    • Similarly, [torch.Tensor([1,2]), torch.Tensor([3,4])] gives [torch.Tensor([3,4]), torch.Tensor([3,4])].
  • Can you replicate this behavior?
    • Yes, the toy example below matches the description above (it is simplified from what I was actually trying to do, but I expect a fix for this toy example will address the rest).
import torch
import pydra

@pydra.mark.task
def test_task(fake_audio_input):
    # Toy task: add 2 to every value in the input tensor
    return fake_audio_input + 2

wf = pydra.Workflow(name='wf_test', input_spec=['audio_input'])

# Split the workflow over the list of tensors, then combine the results
wf.split('audio_input', audio_input=[torch.tensor([1]), torch.tensor([2])]).combine('audio_input')

wf.add(test_task(name='testing', fake_audio_input=wf.lzin.audio_input))
# wf.combine('audio_input')
wf.set_output([('wf_out', wf.testing.lzout.out)])

with pydra.Submitter(plugin='cf') as sub:
    sub(wf)
print(wf.done)
# results = wf.result(return_inputs='val')
results = wf(plugin='cf')
print(results)
# output: [Result(output=Output(wf_out=tensor([3])), runtime=None, errored=False), Result(output=Output(wf_out=tensor([3])), runtime=None, errored=False)]

Similarly, using other tensor structures doesn't change the behavior:

wf.split('audio_input', audio_input=[torch.tensor([1,2]), torch.tensor([3,4])]).combine('audio_input')
# output: [Result(output=Output(wf_out=tensor([3, 4])), runtime=None, errored=False), Result(output=Output(wf_out=tensor([3, 4])), runtime=None, errored=False)]

or

wf.split('audio_input', audio_input=[torch.tensor([[1],[2]]), torch.tensor([[3],[4]])]).combine('audio_input')
# output: [Result(output=Output(wf_out=tensor([[3], [4]])), runtime=None, errored=False), Result(output=Output(wf_out=tensor([[3], [4]])), runtime=None, errored=False)]

Notably, using numpy does give the expected behavior (likely as a result of #340)

wf.split('audio_input', audio_input=[np.array([1,2]), np.array([3,4])]).combine('audio_input')
# output: [Result(output=Output(wf_out=array([3, 4])), runtime=None, errored=False), Result(output=Output(wf_out=array([5, 6])), runtime=None, errored=False)]
@wilke0818 added the bug (Something isn't working) label on Jul 15, 2024
@wilke0818 (Author)

A further look suggests that the checksum/hashing process is the root cause: the checksums/hashes for tensors are the same even when the values of the tensors differ, whereas this is not the case for numpy arrays.
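
For reference, the collision can be seen directly with pydra's hashing helper (a quick sketch, assuming hash_function is the input-hashing entry point exposed by pydra.utils.hash in 0.23):

import torch
from pydra.utils.hash import hash_function

# Two tensors with different values...
print(hash_function(torch.tensor([1])))
print(hash_function(torch.tensor([2])))
# ...print the same digest, so the cache treats them as identical inputs and
# reuses the first split's result for every element.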

@effigies (Contributor)

If that's the case, then what you probably need to do is register a serializer for the type:

from typing import Iterator

from pydra.utils.hash import register_serializer, Cache

@register_serializer(torch.Tensor)
def bytes_repr_tensor(obj: torch.Tensor, cache: Cache) -> Iterator[bytes]:
    # Some efficient method for turning the object into a byte sequence to hash

See https://github.com/nipype/pydra/blob/master/pydra/utils/hash.py for examples.

If you have an approach that will work with all array-likes, we could update bytes_repr_numpy to apply more broadly.
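
One possible completion of the stub above, as a sketch only: it assumes that hashing the dtype, shape, and dense CPU buffer of the tensor is an acceptable byte representation, and that the tensor's dtype has a numpy equivalent.

from typing import Iterator

import torch

from pydra.utils.hash import register_serializer, Cache

@register_serializer(torch.Tensor)
def bytes_repr_tensor(obj: torch.Tensor, cache: Cache) -> Iterator[bytes]:
    # Prefix with dtype and shape so tensors with identical raw bytes but
    # different layouts hash differently
    yield f"torch.Tensor(dtype={obj.dtype}, shape={tuple(obj.shape)}):".encode()
    # detach/cpu/contiguous give a dense host-memory buffer to read from;
    # .numpy() assumes the dtype is representable in numpy
    yield obj.detach().cpu().contiguous().numpy().tobytes()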

@wilke0818 (Author)

Hi @effigies, I have coded up a solution locally, but when pushing I get:

remote: Permission to nipype/pydra.git denied to wilke0818.
fatal: unable to access 'https://github.com/nipype/pydra.git/': The requested URL returned error: 403

Is this related to the repo or to my account? I am not sure whether I need to be added to the repository to make contributions.

@effigies (Contributor)

You'll need to fork the repository, push to a branch on your own fork, and then create a pull request.

See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/getting-started/about-collaborative-development-models.

@satra (Contributor) commented Jul 18, 2024

It should be the case that, without registering any additional serializers, pydra behaves appropriately when hashing an arbitrary object (however inefficiently). This is the second example where that has broken down, which seems like a pydra bug.

@effigies (Contributor) commented Jul 18, 2024

C extension objects are going to be difficult, as they may not be introspectable (or distinguishable) in the same way as pure Python objects, using __dict__ or __slots__. I believe this is why numpy arrays were special-cased in the beginning.

In #762 I have suggested that we identify numpy array API objects with a protocol, which should cover many of these use cases.
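
For illustration only, such a protocol check could look roughly like the sketch below. It keys off the __array__ conversion protocol, which torch.Tensor also implements; the actual proposal in #762 may identify array API objects differently, and the function name here is hypothetical.

from typing import Any, Iterator, Protocol, runtime_checkable

import numpy as np

@runtime_checkable
class SupportsArray(Protocol):
    def __array__(self) -> Any: ...

def bytes_repr_arraylike(obj: SupportsArray) -> Iterator[bytes]:
    # Convert anything exposing __array__ (e.g. torch.Tensor) to a numpy array
    # and hash its metadata plus raw buffer, mirroring what bytes_repr_numpy
    # does for plain ndarrays
    arr = np.asarray(obj)
    yield f"{type(obj).__module__}.{type(obj).__name__}(dtype={arr.dtype}, shape={arr.shape}):".encode()
    yield arr.tobytes()

# isinstance(torch.tensor([1]), SupportsArray) is True, so the hash machinery
# could route such objects through this representation instead of the generic
# object fallback.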

@satra (Contributor) commented Jul 18, 2024

Perhaps this issue is more about how we are detecting the types of objects, then. Maybe, if we are not confident, we can and should fall back to hashing the pickled bytestream (that should generally work). I believe that if this code had reached the "pickle and hash the bytestream" part, @wilke0818 wouldn't have seen the erroneous behavior he noticed.
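
A minimal sketch of that fallback, assuming it would sit alongside the existing serializers in pydra/utils/hash.py (the function name is illustrative):

import pickle
from typing import Iterator

from pydra.utils.hash import Cache

def bytes_repr_pickle_fallback(obj: object, cache: Cache) -> Iterator[bytes]:
    # Slower and pickle-protocol dependent, but value-sensitive: two tensors
    # with different data produce different pickled bytestreams, so they would
    # no longer collide in the cache
    yield f"{type(obj).__module__}.{type(obj).__name__}:pickle:".encode()
    yield pickle.dumps(obj, protocol=4)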
