Delay the initialization of CUDA tensor converter #3419

mthrok · 2023-06-08T05:01:04Z

The StreamReader decoding process consists of three steps:

1. Decode the incoming AVPacket into an AVFrame.
2. Pass the AVFrame through AVFilter to perform post-processing.
3. Convert the resulting AVFrame into a tensor.
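The three steps above can be sketched roughly as follows. This is a minimal, pure-Python illustration and not the actual StreamReader code: `Packet`, `Frame`, `decode`, `make_resize_filter`, `convert`, and `run_pipeline` are hypothetical names, and a nested list stands in for the output tensor.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Packet:
    """Hypothetical stand-in for an AVPacket (compressed data)."""
    width: int
    height: int


@dataclass
class Frame:
    """Hypothetical stand-in for a decoded AVFrame."""
    width: int
    height: int


def decode(packet: Packet) -> Frame:
    """Step 1: decode a packet into a frame of the same size."""
    return Frame(packet.width, packet.height)


def make_resize_filter(width: int, height: int) -> Callable[[Frame], Frame]:
    """Step 2: build a post-processing filter (here, a resize)."""
    def apply(frame: Frame) -> Frame:
        return Frame(width, height)
    return apply


def convert(frame: Frame) -> List[List[int]]:
    """Step 3: convert a frame into a height x width buffer (tensor stand-in)."""
    return [[0] * frame.width for _ in range(frame.height)]


def run_pipeline(packet: Packet, post_process: Callable[[Frame], Frame]) -> List[List[int]]:
    """Chain the three steps in order."""
    return convert(post_process(decode(packet)))


out = run_pipeline(Packet(1920, 1080), make_resize_filter(320, 240))
```

In this sketch the output shape is decided entirely by the filter in step 2, so it is known as soon as the filter is configured.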

The internals of StreamReader were refactored in #3188 so that the above pipeline is initialized when the output stream is defined, which allows the output stream shape to be retrieved at that point.

For the CPU decoder, this works fine because resizing happens in step 2, so the resulting shape can be retrieved.
However, this is problematic for the GPU decoder, because resizing is currently done through a GPU decoder option (step 1) and there seems to be no interface to retrieve the output shape. The refactor therefore introduced a regression, which is described in #3405.

AVFilter internally adapts to changes in the input frame size. This commit changes the conversion process to behave similarly, so that it waits until the first frame arrives before finalizing the frame shape.
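The idea of deferring shape finalization can be illustrated with a small sketch. It is not the converter added in this PR: `Frame` and `LazyConverter` are hypothetical names, a nested list stands in for a CUDA tensor, and rejecting a later size change is a choice made only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for a decoded AVFrame."""
    width: int
    height: int
    fill: int = 0


class LazyConverter:
    """Leaves the output shape undecided until the first frame is seen."""

    def __init__(self) -> None:
        # Nothing is known about the frame size at construction time.
        self.shape: Optional[Tuple[int, int]] = None

    def convert(self, frame: Frame) -> List[List[int]]:
        if self.shape is None:
            # First frame: finalize the shape from what the decoder produced.
            self.shape = (frame.height, frame.width)
        elif self.shape != (frame.height, frame.width):
            # Assumption of this sketch only: later size changes are rejected.
            raise ValueError(
                f"expected {self.shape}, got {(frame.height, frame.width)}"
            )
        return [[frame.fill] * frame.width for _ in range(frame.height)]


conv = LazyConverter()
shape_before = conv.shape          # still undecided
out = conv.convert(Frame(width=4, height=2, fill=7))
shape_after = conv.shape           # decided by the first frame
```

Because the shape is taken from the frame itself, the converter does not need to know whether resizing happened in the decoder (step 1) or in the filter (step 2).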

Fix #3405

pytorch-bot · 2023-06-08T05:01:06Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/audio/3419

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 3 Pending

As of commit f6cc0ac:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2023-06-08T12:34:38Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary:
The StreamReader decoding process consists of three steps:

1. Decode the incoming AVPacket into an AVFrame.
2. Pass the AVFrame through AVFilter to perform post-processing.
3. Convert the resulting AVFrame into a tensor.

The internals of StreamReader were refactored in pytorch#3188 so that the above pipeline is initialized when the output stream is defined, which allows the output stream shape to be retrieved at that point.

For the CPU decoder, this works fine because resizing happens in step 2, so the resulting shape can be retrieved. However, this is problematic for the GPU decoder, because resizing is currently done through a GPU decoder option (step 1) and there seems to be no interface to retrieve the output shape. The refactor therefore introduced a regression, which is described in pytorch#3405.

AVFilter internally adapts to changes in the input frame size. This commit changes the conversion process to behave similarly, so that it waits until the first frame arrives before finalizing the frame shape.

Fix pytorch#3405

Pull Request resolved: pytorch#3419
Differential Revision: D46557505
Pulled By: mthrok
fbshipit-source-id: 4cfe83a5b02fc372a5bc9fd5930711360dad0e93

facebook-github-bot · 2023-06-08T13:32:17Z

This pull request was exported from Phabricator. Differential Revision: D46557505

facebook-github-bot · 2023-06-08T16:21:01Z

@mthrok merged this pull request in 7dff24c.

github-actions · 2023-06-08T16:21:05Z

Hey @mthrok.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py).

Some guidance:

Use 'module: ops' for operations under 'torchaudio/{transforms, functional}', and ML-related components under 'torchaudio/csrc' (e.g. RNN-T loss).

Things in "examples" directory:

'recipe' is applicable to training recipes under the 'examples' folder,
'tutorial' is applicable to tutorials under the “examples/tutorials” folder
'example' is applicable to everything else (e.g. C++ examples)
'module: docs' is applicable to code documentations (not to tutorials).

Regarding examples in code documentations, please also use 'module: docs'.

Please use 'other' tag only when you’re sure the changes are not much relevant to users, or when all other tags are not applicable. Try not to use it often, in order to minimize efforts required when we prepare release notes.

When preparing release notes, please make sure 'documentation' and 'tutorials' occur as the last sub-categories under each primary category like 'new feature', 'improvements' or 'prototype'.

Things related to build are by default excluded from the release note, except when it impacts users. For example:
* Drop support of Python 3.7.
* Add support of Python 3.X.
* Change the way a third party library is bound (so that user needs to install it separately).

facebook-github-bot added the CLA Signed label Jun 8, 2023

mthrok force-pushed the fix-gpu-resize branch from 7734905 to c859eed Compare June 8, 2023 05:06

mthrok force-pushed the fix-gpu-resize branch from c859eed to f6cc0ac Compare June 8, 2023 13:32

facebook-github-bot closed this in 7dff24c Jun 8, 2023

facebook-github-bot added the Merged label Jun 8, 2023

mthrok added C++ module: IO bug fix labels Jun 8, 2023

mthrok deleted the fix-gpu-resize branch June 8, 2023 17:04
