Optimize nemotron VL image/video preprocessing #40283

Merged
tomeras91 merged 2 commits into vllm-project:main from netanel-haber:optimize-nemotron-image-video-preprocessing
Apr 19, 2026

Contributor

@netanel-haber netanel-haber commented Apr 19, 2026

Purpose

Compile and reorganize image/video preprocessing for nemotron nano VL, reducing the amount of CPU time and memory needed.

  • Fused resize + normalize + cast under @torch.compile: a CPU kernel for permute → bicubic resize → /255 → (x - mean)/std → dtype cast.
  • The dtype conversion is part of the fusion, which avoids a separate autocast later.
  • The contiguous conversion is also fused, which avoids a separate host-to-host (H2H) copy later.
  • Skip torch.cat on the single-image / single-video path to avoid a redundant copy.
  • Batch the tokenizer call for video frame separators.
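
The fused path in the first three bullets can be sketched roughly as below. This is a minimal illustration, not the code merged in this PR: the function name `resize_normalize_cast`, its signature, the bfloat16 default, and the placeholder mean/std values are all assumptions, and the real `_bicubic_resize_and_normalize` may differ (for example in antialiasing or clamping).

```python
import torch
import torch.nn.functional as F


def resize_normalize_cast(
    frames_nhwc: torch.Tensor,   # uint8, shape (N, H, W, C)
    size: tuple[int, int],       # target (height, width)
    mean: torch.Tensor,          # shape (C,)
    std: torch.Tensor,           # shape (C,)
    out_dtype: torch.dtype = torch.bfloat16,  # assumed default, not from the PR
) -> torch.Tensor:
    """Hypothetical sketch: permute -> bicubic -> /255 -> (x-mean)/std -> dtype."""
    x = frames_nhwc.permute(0, 3, 1, 2).float()          # NHWC -> NCHW
    x = F.interpolate(x, size=size, mode="bicubic", align_corners=False)
    x = x / 255.0
    x = (x - mean.view(1, -1, 1, 1)) / std.view(1, -1, 1, 1)
    # Casting and making the result contiguous inside the same function lets
    # the compiler fold them into the kernel instead of separate later passes.
    return x.to(out_dtype).contiguous()


# torch.compile is lazy: compilation only happens on the first call.
fused_resize_normalize_cast = torch.compile(resize_normalize_cast)

if __name__ == "__main__":
    frames = torch.randint(0, 256, (2, 8, 8, 3), dtype=torch.uint8)
    mean = torch.tensor([0.5, 0.5, 0.5])  # placeholder values
    std = torch.tensor([0.5, 0.5, 0.5])   # placeholder values
    out = resize_normalize_cast(frames, (4, 4), mean, std)
    print(out.shape, out.dtype, out.is_contiguous())
```

The eager function and the compiled wrapper compute the same thing; the compiled one is where the fusion described above would come from.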
Benchmark: 1 video of 512x512x512, H100
Before:  apply_hf_processor_ms  898.57  898.63  4.58  905.18
After:   apply_hf_processor_ms  254.21  254.56  3.35  260.79
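
To put the two rows side by side, here is a small sketch that computes the before/after ratio for each column. The PR does not label the four columns, so they are compared position by position; the third column is much smaller than the others and is probably a spread measure, but that is an assumption.

```python
# Timings copied from the apply_hf_processor_ms rows in the PR description.
# Column meanings are not stated in the PR, so columns are paired by position.
before = [898.57, 898.63, 4.58, 905.18]
after = [254.21, 254.56, 3.35, 260.79]


def speedup(before_ms: float, after_ms: float) -> float:
    """Ratio of the old value to the new value (higher means faster)."""
    return before_ms / after_ms


ratios = [speedup(b, a) for b, a in zip(before, after)]

for b, a, r in zip(before, after, ratios):
    print(f"{b:8.2f} -> {a:8.2f}  ({r:.2f}x)")
```

On the three large columns the ratio comes out at roughly 3.5x.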

@netanel-haber:
LGTM. I ran evals: VoxPopuli (audio+text), InfoVQA_VAL (image+text), and DailyOmni (video+audio+text) are on par before and after the change.
This was originally @milesial's PR (#40093). I moved it to my fork only to fix DCO and push it through, since he is currently AFK. There are no other changes.

Signed-off-by: milesial <milesial@users.noreply.github.com>

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request optimizes the nano_nemotron_vl processor by introducing a compiled _bicubic_resize_and_normalize function that fuses resizing, normalization, and dtype casting. It also adds _pil_to_nhwc_tensor for efficient image conversion and refactors get_video_repl to use batch tokenization for frame separators. Preprocessing logic for both images and videos has been updated to reduce unnecessary tensor concatenations and support broader configuration of normalization parameters. I have no feedback to provide.
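
The batch tokenization of frame separators mentioned in this review can be illustrated with a toy example. Everything here is hypothetical: `ToyTokenizer` stands in for a real tokenizer, and the separator text `Frame N:` is invented for the sketch and is not taken from `get_video_repl`. The point is only that one call over a list of strings replaces one call per frame.

```python
class ToyTokenizer:
    """Hypothetical stand-in tokenizer: whitespace split, counts its calls."""

    def __init__(self) -> None:
        self.calls = 0
        self.vocab: dict[str, int] = {}

    def _encode_one(self, text: str) -> list[int]:
        # Assign ids in first-seen order; enough for a deterministic demo.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in text.split()]

    def __call__(self, texts: list[str]) -> list[list[int]]:
        self.calls += 1
        return [self._encode_one(t) for t in texts]


def frame_separators(num_frames: int) -> list[str]:
    # Invented separator format, for illustration only.
    return [f"Frame {i + 1}:" for i in range(num_frames)]


def encode_per_frame(tok: ToyTokenizer, num_frames: int) -> list[list[int]]:
    """One tokenizer call per frame separator."""
    return [tok([sep])[0] for sep in frame_separators(num_frames)]


def encode_batched(tok: ToyTokenizer, num_frames: int) -> list[list[int]]:
    """A single tokenizer call for all frame separators."""
    return tok(frame_separators(num_frames))


if __name__ == "__main__":
    slow, fast = ToyTokenizer(), ToyTokenizer()
    ids_slow = encode_per_frame(slow, 4)
    ids_fast = encode_batched(fast, 4)
    print(ids_slow == ids_fast, slow.calls, fast.calls)
```

With a real tokenizer, the per-call overhead is what makes the batched form cheaper for videos with many frames.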

Member

@tomeras91 tomeras91 left a comment


Thanks for the optimizations!

@tomeras91 tomeras91 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Apr 19, 2026
@tomeras91 tomeras91 enabled auto-merge (squash) April 19, 2026 12:40
@tomeras91 tomeras91 merged commit 982beae into vllm-project:main Apr 19, 2026
47 checks passed
bnellnm pushed a commit to neuralmagic/vllm that referenced this pull request Apr 20, 2026
Signed-off-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
baonudesifeizhai pushed a commit to baonudesifeizhai/vllm that referenced this pull request Apr 23, 2026
Signed-off-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Apr 27, 2026
Signed-off-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>
Lafunamor pushed a commit to Lafunamor/vllm that referenced this pull request May 1, 2026
Signed-off-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Signed-off-by: Adrian <info@zzit.ch>
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
Signed-off-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed)


3 participants