[Bugfix][Multimodal] PyAV video backend returns keyframes labeled as targets #42586
Merged
Isotr0py merged 4 commits on May 14, 2026
Signed-off-by: Ranran <hzz5361@psu.edu>
Contributor
Code Review
This pull request optimizes video frame decoding in vllm/multimodal/video.py by reusing the decoder iterator when target frames advance monotonically, which avoids redundant decoding of GOP prefixes associated with per-frame seeking. The docstring for decode_frames was also updated to reflect this shift from keyframe decoding to forward decoding to PTS. There are no review comments to address, and I have no feedback to provide.
Isotr0py
reviewed
May 14, 2026
Member
Isotr0py
left a comment
Can you add a regression test for this?
Contributor
Author
@Isotr0py Added a regression test.
Isotr0py
reviewed
May 14, 2026
Isotr0py
approved these changes
May 14, 2026
Bug
From #39986
backend="pyav" returns the keyframe at or before each requested target, labeled as the target. container.seek() defaults to backward=True (snaps to the nearest keyframe ≤ pts); the previous loop took the very next decoded frame without advancing to the actual target. Affected workloads: any media_io_kwargs={"video": {"backend": "pyav"}} on long-GOP clips.
Fix
Decode forward until frame.pts >= pts. Reuse the open decoder while targets advance monotonically (the common np.linspace case). Only re-seek on rewind or stream exhaustion, so the GOP prefix isn't re-decoded once per target.
Test
Synthesised single-keyframe 200-frame H.264 fixture (green channel = frame index) decoded by both backends: mean|Δ| = 0.0.
Self-contained CPU repro runs in <1s: no model, no dataset, no GPU. Happy to land it as a regression test under tests/multimodal/ if reviewers want; the existing pyav tests only check shape/dtype/count, never pixel content vs the opencv reference, which is why this slipped through.
Minimal Repro
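The behaviour described under Bug and Fix can be modeled without PyAV or vLLM. What follows is an illustrative pure-Python sketch, not the PR's repro script and not the code in vllm/multimodal/video.py. ToyContainer, Frame, decode_buggy, and decode_forward are hypothetical names introduced here; the toy assumes pts equals the frame index and that seek snaps back to the nearest keyframe at or before the target.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class Frame:
    pts: int
    key_frame: bool


class ToyContainer:
    """Hypothetical stand-in for a video container; this is NOT the PyAV API."""

    def __init__(self, n_frames: int, gop: int) -> None:
        # One keyframe every `gop` frames; pts is simply the frame index.
        self.frames = [Frame(i, i % gop == 0) for i in range(n_frames)]
        self._pos = 0

    def seek(self, pts: int) -> None:
        # Backward seek: land on the nearest keyframe with pts <= target.
        self._pos = max(f.pts for f in self.frames if f.key_frame and f.pts <= pts)

    def decode(self) -> Iterator[Frame]:
        # Yield frames in order, starting from the current position.
        while self._pos < len(self.frames):
            frame = self.frames[self._pos]
            self._pos += 1
            yield frame


def decode_buggy(container: ToyContainer, targets: List[int]) -> List[int]:
    """Seek, then take the very next frame: returns the keyframe, not the target."""
    picked = []
    for pts in targets:
        container.seek(pts)
        picked.append(next(container.decode()).pts)
    return picked


def decode_forward(container: ToyContainer, targets: List[int]) -> List[int]:
    """Decode forward to the target; re-seek only on rewind or exhaustion."""
    picked = []
    stream: Optional[Iterator[Frame]] = None
    last: Optional[Frame] = None
    for pts in targets:
        rewind = last is not None and pts < last.pts
        if stream is None or rewind:
            container.seek(pts)
            stream = container.decode()
            last = None
        # Advance the already-open decoder until it reaches the target.
        while last is None or last.pts < pts:
            nxt = next(stream, None)
            if nxt is None:
                # Stream exhausted: keep the last frame and force a re-seek next time.
                stream = None
                break
            last = nxt
        if last is not None:
            picked.append(last.pts)
    return picked


targets = [0, 66, 133, 199]  # evenly spaced indices over 200 frames
print(decode_buggy(ToyContainer(200, gop=200), targets))    # [0, 0, 0, 0]
print(decode_forward(ToyContainer(200, gop=200), targets))  # [0, 66, 133, 199]
```

With a single keyframe, every backward seek in the toy lands on frame 0, so the buggy loop returns frame 0 for every target, while the forward loop walks the one open stream to each target in turn.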