Some H265 encoded videos return an error when seeking to particular points in time #179

ahmadsharif1 · 2024-08-13T16:43:11Z

🐛 Describe the bug

# First generate a test video:
conda install -c conda-forge x265

# Download and build ffmpeg
git clone https://git.ffmpeg.org/ffmpeg.git
cd ffmpeg
./configure --enable-nonfree --enable-gpl --prefix=$(readlink -f ../bin) --enable-libx265  --enable-rpath --extra-ldflags=-Wl,-rpath=$CONDA_PREFIX/lib --enable-filter=drawtext --enable-libfontconfig --enable-libfreetype --enable-libharfbuzz
make -j$(nproc) && make install
ffmpeg -f lavfi -i color=size=128x128:duration=1:rate=10:color=blue -vf "drawtext=fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'" -vcodec libx265 -pix_fmt yuv420p -g 2 -crf 10 h265_video.mp4 -y

# Now use torchcodec to seek into this file at timestamp 0.5 and write to a bmp file:
$ cat test.py

from torchcodec.decoders._simple_video_decoder import SimpleVideoDecoder
import sys
from PIL import Image

# `tensor` is expected to be a uint8 PyTorch tensor with shape (3, H, W)
# and values in the range [0, 255], as returned by the decoder
def save_tensor_as_bmp(tensor, filename):
    # Convert the tensor to a numpy array (already 8-bit, so no scaling is needed)
    numpy_array = tensor.byte().cpu().numpy()

    # Reorder dimensions from (3, H, W) to (H, W, 3)
    numpy_array = numpy_array.transpose(1, 2, 0)

    # Create a PIL image from the numpy array
    image = Image.fromarray(numpy_array)

    # Save the image as a BMP file
    image.save(filename, format='BMP')



def main():
    video_path = sys.argv[1]
    ts = float(sys.argv[2])
    print(video_path)
    decoder = SimpleVideoDecoder(video_path)
    print(f"Getting frame at {ts=}")
    frame = decoder.get_frame_displayed_at(seconds=ts).data
    bmp_file = f"{video_path}.time{ts}.bmp"
    print(f"Saving to bmp file: {bmp_file}")
    save_tensor_as_bmp(frame, bmp_file)


if __name__ == "__main__":
    main()

# Run the test script like so:

python test.py h265_video.mp4 0.5

This actually fails right now (it throws an exception "no more frames to decode").

With #178 it will get "fixed" in the sense that at least we won't throw an exception, but we will return the wrong frame: if you run it you will get a bmp file with "Frame 6" instead of "Frame 5". That is a bug because the frame with "Frame 5" is the one that is displayed from timestamp=0.5 (inclusive) to timestamp=0.6 (exclusive).
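To make the expected result concrete, here is a minimal sketch of the arithmetic, assuming a constant frame rate, a first frame at timestamp 0, and frame numbers starting at 0 (as in the test video generated above). The helper name `displayed_frame_index` is hypothetical and is not part of torchcodec.

```python
from fractions import Fraction
import math


def displayed_frame_index(seconds, fps):
    """Index of the frame on screen at `seconds`.

    Assumes constant frame rate and a first frame at t=0, so frame n is
    displayed during [n / fps, (n + 1) / fps).
    """
    # Go through str() so that 0.5 becomes exactly 1/2 instead of its
    # binary floating-point approximation.
    exact_seconds = Fraction(str(seconds))
    return math.floor(exact_seconds * Fraction(fps))


print(displayed_frame_index(0.5, 10))   # 5
print(displayed_frame_index(0.59, 10))  # 5
print(displayed_frame_index(0.6, 10))   # 6
```

For the 10 fps test video this gives frame 5 for any timestamp in [0.5, 0.6), which is why a bmp showing "Frame 6" is the wrong answer.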

The underlying cause of this buggy behavior is an FFMPEG bug with H265 videos. When we call avformat_seek_file() with max_ts set to the int64 timebase value corresponding to time=0.5, it seeks past our frame to the next frame.
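For readers unfamiliar with timebase values, here is a rough sketch of how a timestamp in seconds maps to the integer that gets passed as max_ts. The time base `1/10240` is only an illustrative assumption (the real value comes from the stream), `seconds_to_pts` is a hypothetical helper, and rounding down is my choice here so that the result never lands after the requested time.

```python
from fractions import Fraction
import math


def seconds_to_pts(seconds, time_base):
    """Convert seconds to an integer timestamp in `time_base` units.

    `time_base` is the duration of one tick in seconds, e.g. Fraction(1, 10240).
    Rounds down so the result is never later than the requested time.
    """
    return math.floor(Fraction(str(seconds)) / time_base)


# Assumed time base, for illustration only.
time_base = Fraction(1, 10240)
print(seconds_to_pts(0.5, time_base))  # 5120
print(seconds_to_pts(0.6, time_base))  # 6144
```

With these assumed numbers, a seek with max_ts=5120 should never land on the frame whose timestamp is 6144, which is what the bug report describes happening.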

I have filed a bug upstream about this:

https://trac.ffmpeg.org/ticket/11137

Until that bug is resolved, what we can do is to use our own index to seek into the file as opposed to letting FFMPEG seek for us. I will do that in a subsequent PR.

Versions

This bug is for torchcodec v0.0.2


ahmadsharif1 · 2024-08-15T15:16:20Z

This issue is fixed for the test cases I tried.

I still believe this is a bug in FFMPEG and I have filed a bug on their tracker. For now we work around it by using our own keyframe index to always seek to the last keyframe before the user-requested timestamp. (Previously we were seeking to the user-requested timestamp itself.)
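The workaround can be sketched roughly as follows. This is not the torchcodec implementation, only an illustration under stated assumptions: timestamps are integers in time-base units, both lists are sorted ascending, frames are decoded in presentation order (no reordering), and a keyframe exactly at the requested timestamp counts as a valid seek target. Both function names are hypothetical.

```python
from bisect import bisect_right


def last_keyframe_at_or_before(keyframe_pts, target_pts):
    """Return the pts of the last keyframe whose pts is <= target_pts.

    `keyframe_pts` must be sorted in ascending order.
    """
    i = bisect_right(keyframe_pts, target_pts)
    if i == 0:
        raise ValueError("no keyframe at or before the requested timestamp")
    return keyframe_pts[i - 1]


def frames_to_decode(frame_pts, keyframe_pts, target_pts):
    """List the pts values to decode, from the seek keyframe up to the
    frame displayed at target_pts (the last frame with pts <= target_pts).
    """
    start = last_keyframe_at_or_before(keyframe_pts, target_pts)
    end = frame_pts[bisect_right(frame_pts, target_pts) - 1]
    return [pts for pts in frame_pts if start <= pts <= end]


# Illustrative index: 10 frames, a keyframe every 2 frames (like -g 2),
# in an assumed time base of 1/1000 seconds.
frame_pts = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
keyframe_pts = [0, 200, 400, 600, 800]

print(last_keyframe_at_or_before(keyframe_pts, 500))   # 400
print(frames_to_decode(frame_pts, keyframe_pts, 500))  # [400, 500]
```

Seeking to a known keyframe and then decoding forward avoids asking FFMPEG to resolve a non-keyframe timestamp, which is where the seek was overshooting.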

FFMPEG seems to respect max_ts when the seek target is a keyframe. The documentation doesn't say so explicitly, and it is not clear on this point in general:

https://ffmpeg.org/doxygen/7.0/group__lavf__decoding.html#ga3b40fc8d2fda6992ae6ea2567d71ba30

ahmadsharif1 mentioned this issue Aug 14, 2024

Use our own index to seek more accurately when it is available #180

Merged

ahmadsharif1 closed this as completed Aug 15, 2024
