Create fast path for returning YUVPlanes to numpy arrays #1393

hmaarrfk · 2024-05-04T14:13:28Z

No description provided.

hmaarrfk · 2024-05-04T14:25:41Z

Thank you for considering.

WyattBlue · 2024-05-05T06:13:39Z

The problem with this patch is that when the fast path is used, it disrupts the structure of the underlying pixel format. Any code that relies on that behavior is going to break. This was discussed in #788.

hmaarrfk · 2024-05-05T10:43:00Z

Yes. I remember now. Ultimately this function doesn't give that guarantee. It already changes between a view and a copy.

Would you consider the addition of a keyword argument to specify the user's intention?

WyattBlue · 2024-05-06T02:30:06Z

Any library that uses to_ndarray() for VideoFrames probably relies, without realizing it, on the underlying data format being preserved. auto-editor, for example, depends on frame.to_ndarray().to_bytes() to return the underlying bytes so the ffmpeg cli can treat them as raw frames. This allows for efficient video rendering.

What is the point of to_ndarray() being fast if it returns data in an undefined layout? Do you have a particular use case?

hmaarrfk · 2024-05-06T12:03:36Z

> Any library that uses to_ndarray() for VideoFrames probably relies, without realizing it, on the underlying data format being preserved.

The data "format" hasn't changed.

What has changed is the fact that VideoFrame and the returned numpy array share the same memory. If you think this is an issue, I'm happy to add a different function, but in its current form on main, to_ndarray does the following:

"yuv420p", "yuvj420p" -- data copy
"yuyv422" -- data copy
"gbrp10be", "gbrp12be", "gbrp14be", "gbrp16be", "gbrp10le", "gbrp12le", "gbrp14le", "gbrp16le" -- data copy
"gbrpf32be", "gbrpf32le" -- data copy
"rgb24", "bgr24" -- data view
"gray", "gray8", "rgb8", "bgr8" -- data view

so rgb24 and gray formats are returning views. I presume it was because it was easier to guarantee correctness then.

I'm asking here that yuv420p and yuvj420p return "maybe views" if the underlying image data is well organized (according to the checks in the if statement).
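To make the view/copy distinction concrete, here is a minimal NumPy-only sketch. The `frame_buffer` bytearray is a hypothetical stand-in for a decoded frame's memory, not a real av.VideoFrame; it only illustrates that a view observes later changes to the underlying buffer while a copy does not.

```python
import numpy as np

# Hypothetical stand-in for the pixel memory owned by a decoded frame.
frame_buffer = bytearray(range(24))

view = np.frombuffer(frame_buffer, dtype=np.uint8)  # shares memory with frame_buffer
copy = view.copy()                                  # owns separate memory

# Mutate the underlying buffer after both arrays were created.
frame_buffer[0] = 255

shares_view = np.shares_memory(view, np.frombuffer(frame_buffer, dtype=np.uint8))
shares_copy = np.shares_memory(copy, view)

# The view sees the new value; the copy keeps the old one.
print(view[0], copy[0], shares_view, shares_copy)
```

This is the behavior difference a caller would need to be aware of if a format switched from "data copy" to "maybe view".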

> auto-editor, for example, depends on frame.to_ndarray().to_bytes() to return the underlying bytes so the ffmpeg cli can treat them as raw frames.

This should still work. I can add a test for this use case.
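A test along those lines could look roughly like the sketch below. It does not use PyAV: `packed` is a hypothetical, tightly packed yuv420p buffer, and the two arrays stand in for what a fast path (view) and the existing path (copy) might return. Note that NumPy spells the method `tobytes()`.

```python
import numpy as np

width, height = 8, 4

# Hypothetical packed yuv420p buffer: Y (w*h bytes), then U and V (w*h/4 bytes each).
packed = np.arange(width * height * 3 // 2, dtype=np.uint8)

as_view = packed.reshape(height * 3 // 2, width)         # reshape of contiguous data: no copy
as_copy = packed.copy().reshape(height * 3 // 2, width)  # explicit copy

view_shares = np.shares_memory(as_view, packed)
copy_shares = np.shares_memory(as_copy, packed)

# Whether the array is a view or a copy, the serialized bytes are the same.
same_bytes = as_view.tobytes() == as_copy.tobytes()
print(view_shares, copy_shares, same_bytes)
```

The sketch only covers the tightly packed case; it says nothing about frames whose planes are padded or stored in separate allocations.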

> What is the point of to_ndarray() being fast if it returns data in an undefined layout?

I utilize the well defined layout of the YUV420 format in FFMPEG to avoid the memory copy.

For an 8x4 array of pixels, YUV420 and YUV420J would store:

yyyyyyyy
yyyyyyyy
yyyyyyyy
yyyyyyyy
uuuuuuuu
vvvvvvvv

From wikipedia: https://en.wikipedia.org/wiki/Chroma_subsampling
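The diagram above can be reproduced with a little arithmetic. This is a sketch assuming 8-bit samples, even dimensions, and a tightly packed buffer with no row padding; `yuv420p_layout` is a hypothetical helper, not a PyAV or FFmpeg function.

```python
def yuv420p_layout(width, height):
    """Return {plane: (byte offset, byte size)} for a tightly packed 8-bit yuv420p frame."""
    y_size = width * height
    # Chroma planes are subsampled by 2 both horizontally and vertically.
    chroma_size = (width // 2) * (height // 2)
    return {
        "y": (0, y_size),
        "u": (y_size, chroma_size),
        "v": (y_size + chroma_size, chroma_size),
    }


layout = yuv420p_layout(8, 4)
total = sum(size for _, size in layout.values())
print(layout, total)
```

For the 8x4 case this gives 32 bytes of Y followed by 8 bytes of U and 8 bytes of V, 48 bytes in total, which is the six rows of eight shown above.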

So the "fastpath" checks that the memory is all contiguous, and simply returns a pointer to the underlying data from the FFMPEG VideoFrame object.

Avoiding this copy is pretty important since for 4k (or 8k! which I am interested in) video decoding:

3840 x 2160 x 1.5 = 12MB

per frame.
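The figure can be checked directly. The helper below is a hypothetical illustration assuming 8-bit yuv420p (1.5 bytes per pixel) and no padding; the 8k resolution of 7680 x 4320 is an assumption, since the thread does not state one.

```python
def yuv420p_frame_bytes(width, height, bytes_per_sample=1):
    # One full-resolution luma plane plus two quarter-resolution chroma planes:
    # 1 + 1/4 + 1/4 = 1.5 samples per pixel.
    return width * height * 3 // 2 * bytes_per_sample


frame_4k = yuv420p_frame_bytes(3840, 2160)
frame_8k = yuv420p_frame_bytes(7680, 4320)  # assumed 8k resolution

print(frame_4k, frame_8k)
```

That is about 12.4 MB per 4k frame, and four times as much per 8k frame, for every copy made.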

Copying this around unnecessarily creates REAL problems with your CPU cache and just slows things down a lot.

As for my use case, it revolves around 4k and 8k video encoding / decoding. And as such benefits from what appear to be micro-optimizations.

I can add a test if you want, but generally speaking, I remember that without my check, something already failed. I recall it being due to some odd image shape in the testing suite which was interesting to learn about.

WyattBlue · 2024-05-07T10:05:09Z

Applying the fastpath does break auto-editor, though:

It wouldn't be an issue as long as the fastpath is not applied by default. I'm sympathetic to your use-case. I do think having a new default argument is a little janky, with the **kwargs already there.

I think the way forward is to make a new method, to_ndarray2(). It would handle yuv420p and yuvj420p like your code does now. Making it a new method has the benefit of side-stepping any backwards compatibility problems.

hmaarrfk · 2024-05-07T10:43:27Z

> Applying the fastpath does break auto-editor, though:

This is really strange....

I find one place where you use the "naked" to_ndarray
https://github.com/WyattBlue/auto-editor/blob/8f11ab54fade4fa5ab4cd7cb43ae54a4212dbff3/auto_editor/analyze.py#L392

but nothing jumps out at me as getting mutated.

(In fact you could save yourself some time by doing current_frame = frame.to_ndarray().astype('int16') but that is beside the point)

I would love to recreate that screenshot to see what might be going on?

I have a feeling that maybe I'm not adding enough safeguards on the fastpath.

Can you teach me how you got to your screenshot?

hmaarrfk · 2024-05-07T10:44:05Z

The other place is https://github.com/WyattBlue/auto-editor/blob/8f11ab54fade4fa5ab4cd7cb43ae54a4212dbff3/auto_editor/render/video.py#L370 but also seems "safe"

hmaarrfk · 2024-05-07T11:11:21Z

I suspect the reason I don't see this bug is that you are using elaborate filters and filter graphs, which is not the workflow I go through.

I added two checks for data continuity:

                y_plane._buffer_ptr + y_plane.buffer_size * bytes_per_pixel == u_plane._buffer_ptr and
                u_plane._buffer_ptr + u_plane.buffer_size * bytes_per_pixel == v_plane._buffer_ptr

I think they should resolve things, but I need to check that it is still "correct" in the sense that the fast path is still triggered.
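As a standalone illustration of what such a check tests, here is a sketch in plain Python. `PlaneInfo` is a hypothetical stand-in for a plane's start address and size, and unlike the snippet above it assumes the size is already expressed in bytes.

```python
from dataclasses import dataclass


@dataclass
class PlaneInfo:
    """Hypothetical stand-in for a plane: start address and size, both in bytes."""
    ptr: int
    size: int


def planes_are_contiguous(y, u, v):
    # Each plane must start exactly where the previous one ends.
    return y.ptr + y.size == u.ptr and u.ptr + u.size == v.ptr


# Tightly packed: Y at 1000 (32 bytes), U right after it, V right after U.
packed = planes_are_contiguous(PlaneInfo(1000, 32), PlaneInfo(1032, 8), PlaneInfo(1040, 8))

# Same planes, but with 64 bytes of padding after Y and after U.
gapped = planes_are_contiguous(PlaneInfo(1000, 32), PlaneInfo(1096, 8), PlaneInfo(1168, 8))

print(packed, gapped)
```

A complete fast-path condition would also have to confirm that rows inside each plane are unpadded (line size equal to the row width in bytes); this sketch covers only the gaps between planes.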

hmaarrfk · 2024-05-07T12:05:33Z

I need to recheck my buffer continuity check, it doesn't seem to be triggering "true" on what I thought it should.

hmaarrfk · 2024-05-07T12:19:21Z

Hmmm, it seems that the array isn't guaranteed to be contiguous, even in the "nice" cases

ipdb> u_plane.buffer_ptr - (y_plane.buffer_ptr + y_plane.buffer_size)
4864
ipdb> v_plane.buffer_ptr - (u_plane.buffer_ptr + u_plane.buffer_size)
1728

how is any of my code working.....

hmaarrfk · 2024-05-07T12:53:06Z

It seems that maybe the function I'm looking for is get_buffer2, which would allow me to create the buffer that I need to ensure that memory isn't copied.
https://ffmpeg.org/doxygen/6.0/structAVCodecContext.html#aef79333a4c6abf1628c55d75ec82bede

hmaarrfk · 2024-05-07T12:53:20Z

Thank you for your thoroughness in testing!

hmaarrfk marked this pull request as ready for review May 4, 2024 14:18

WyattBlue added the "tests requested" (Please add tests to your PR) label May 6, 2024

WyattBlue removed the "tests requested" (Please add tests to your PR) label May 7, 2024

hmaarrfk force-pushed the patch-4 branch from 28c6ac0 to ba30068 Compare May 7, 2024 11:23

hmaarrfk marked this pull request as draft May 7, 2024 12:04

hmaarrfk added 2 commits May 7, 2024 08:31

Create fast path for returning YUVPlanes to numpy arrays

534bb01

Add a check for buffer continuity

08b3a7e

hmaarrfk force-pushed the patch-4 branch from ba30068 to 08b3a7e Compare May 7, 2024 12:32
