test: add nanov3 prefill decode test by ZhiyuLi-Nvidia · Pull Request #2141 · NVIDIA-NeMo/RL

ZhiyuLi-Nvidia · 2026-03-23T12:49:16Z

What does this PR do ?

Add a prefill/decode logprob consistency test for Nano v3.

vLLM prefill and decode should produce consistent logprobs. They did not with vllm<0.17.0; this is fixed in vllm==0.17.0. This test guards vLLM prefill/decode logprob consistency.

Issues

The inconsistency was reported in #2100.

It is fixed by the vLLM version bump.

Background:
I'd expect bumping vLLM to 0.17.0 to resolve the issue.
Some findings are as follows:
With the old vLLM version, vLLM is inconsistent with itself, while Megatron is fine:

vLLM decode  vs vLLM prefill:     TME = 322.490753   DIVERGED   <== vllm is self-conflicting 
vLLM decode  vs Megatron prefill: TME = 313.127106   DIVERGED   <== what we saw
vLLM prefill vs Megatron prefill: TME = 1.030557     HEALTHY   <==  megatron is good and aligns well with vllm
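The three pairwise comparisons above can be sketched as follows. TME is not defined in this PR; the sketch assumes it is a token-level multiplicative error, i.e. the mean of exp(|logprob_a - logprob_b|) over tokens, where 1.0 means the two sources agree exactly. The function names and the logprob values are hypothetical and only illustrate how comparing every pair points at the source that disagrees with the other two.

```python
import math


def token_mult_error(logprobs_a, logprobs_b):
    """Mean of exp(|a - b|) over tokens; 1.0 means identical logprobs.

    Assumption: this is what "TME" refers to; the PR does not define it.
    """
    if len(logprobs_a) != len(logprobs_b) or not logprobs_a:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(math.exp(abs(a - b)) for a, b in zip(logprobs_a, logprobs_b))
    return total / len(logprobs_a)


def pairwise_errors(sources):
    """Compute the error for every unordered pair of named logprob sources."""
    names = list(sources)
    return {
        (x, y): token_mult_error(sources[x], sources[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Made-up logprobs for illustration only (not data from this PR).
sources = {
    "vllm_decode": [-0.10, -6.00, -0.30],
    "vllm_prefill": [-0.11, -1.20, -0.29],
    "megatron_prefill": [-0.10, -1.22, -0.30],
}
errors = pairwise_errors(sources)
for pair, err in errors.items():
    print(pair, round(err, 4))
```

In this toy data the two pairs that involve `vllm_decode` have a large error while the prefill-vs-prefill pair stays close to 1.0, which is the same pattern as the old-vLLM numbers above.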

Here, vLLM prefill means passing the prompt plus all generated tokens to vLLM and letting it compute the logprobs; it is similar to a single forward pass in training.
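To make the decode/prefill distinction concrete, here is a minimal sketch on a toy recurrent model in plain Python. It does not use vLLM or Megatron; `step`, `decode`, and `prefill_score` are hypothetical names, and the toy update rule is arbitrary. The point is only that scoring already-generated tokens in one pass over prompt + generated tokens should reproduce the logprobs recorded during token-by-token generation.

```python
import math

VOCAB = 4  # toy vocabulary size


def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def step(state, token):
    """Toy recurrent update: returns (new_state, next-token logprobs)."""
    new_state = math.tanh(0.5 * state + 0.3 * (token + 1))
    logits = [new_state * (v + 1) * 0.7 - 0.1 * v * v for v in range(VOCAB)]
    return new_state, log_softmax(logits)


def decode(prompt, num_new_tokens):
    """Generate greedily, one token at a time, recording each token's logprob."""
    if not prompt:
        raise ValueError("prompt must be non-empty")
    state = 0.0
    for tok in prompt:
        state, logprobs = step(state, tok)
    generated, gen_logprobs = [], []
    for _ in range(num_new_tokens):
        tok = max(range(VOCAB), key=lambda v: logprobs[v])
        generated.append(tok)
        gen_logprobs.append(logprobs[tok])
        state, logprobs = step(state, tok)  # carry the state forward
    return generated, gen_logprobs


def prefill_score(prompt, generated):
    """Score the generated tokens in one pass over prompt + generated."""
    full = list(prompt) + list(generated)
    state, scores = 0.0, []
    for pos, tok in enumerate(full[:-1]):
        state, logprobs = step(state, tok)
        if pos + 1 >= len(prompt):  # the next position is a generated token
            scores.append(logprobs[full[pos + 1]])
    return scores


prompt = [1, 2, 0]
generated, decode_lp = decode(prompt, num_new_tokens=5)
prefill_lp = prefill_score(prompt, generated)
max_gap = max(abs(d - p) for d, p in zip(decode_lp, prefill_lp))
print("max |decode - prefill| logprob gap:", max_gap)
```

In a real engine the two paths run different kernels and cache code, so small numerical differences are expected; a large gap suggests a bug in one of the paths.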
After bumping vLLM to 0.17.0 using the container /lustre/fsw/portfolios/coreai/users/terryk/enroot-images/gitlab-master.nvidia.com/terryk/images/nemo-rl:hemil-automodel-transformers-v5-9db945aa4.squashfs:

  vLLM decode  vs vLLM prefill:     TME = 1.032336
  vLLM decode  vs Megatron prefill: TME = 1.032111
  vLLM prefill vs Megatron prefill: TME = 1.031372

All 64 healthy.
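A guard over per-sample results could look like the sketch below. It assumes "64" counts individually checked samples and that each sample yields one error value where 1.0 is perfect agreement. The threshold of 1.05 and both function names are hypothetical; the PR does not state the threshold the test uses.

```python
def find_unhealthy(tme_per_sample, threshold=1.05):
    """Return indices of samples whose error exceeds the threshold.

    `not tme <= threshold` also flags NaN values as unhealthy.
    The default threshold is an illustrative assumption, not the PR's value.
    """
    return [i for i, tme in enumerate(tme_per_sample) if not tme <= threshold]


def assert_all_healthy(tme_per_sample, threshold=1.05):
    """Fail with the offending sample indices if any sample diverged."""
    bad = find_unhealthy(tme_per_sample, threshold)
    assert not bad, (
        f"{len(bad)} of {len(tme_per_sample)} samples diverged: indices {bad}"
    )


# Illustrative values only: 64 samples that all sit near 1.03.
assert_all_healthy([1.03] * 64)
```

Reporting the indices rather than a single aggregate makes it easier to see whether a failure affects every sample or only a few.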

I think the root cause is related to the prefill/decode kernels for Mamba and the KV cache, which were fixed by the vLLM bump.

Usage

See tests.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
[x] Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

copy-pr-bot · 2026-03-23T12:49:19Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>

ZhiyuLi-Nvidia · 2026-03-23T13:10:07Z

/ok to test 30f2253

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>

ZhiyuLi-Nvidia · 2026-03-23T13:22:52Z

/ok to test d255ec9

terrykong · 2026-03-23T16:07:28Z

Approved, but let's see where the v5 PR is. If it's almost done by the time this CI finishes, let's just do this in a follow-up PR so we don't slow that PR down. Check with @yuki-97 on her preference for merging this one.

yuki-97 · 2026-03-24T02:34:52Z

Thanks @ZhiyuLi-Nvidia @terrykong!
Since CI passes and the PR is independent of the other code, it should be safe. I'll merge it directly into the bump PR.

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

ZhiyuLi-Nvidia requested review from a team and terrykong as code owners March 23, 2026 12:49

github-actions bot added documentation Improvements or additions to documentation CI Relating to CI labels Mar 23, 2026

test: add vLLM prefill-decode logprob consistency CI test for Nano v3

30f2253

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>

ZhiyuLi-Nvidia force-pushed the zhiyul/add-nanov3-prefill-decode-test branch from 61485e3 to 30f2253 Compare March 23, 2026 12:57

github-actions bot removed the CI Relating to CI label Mar 23, 2026

ZhiyuLi-Nvidia changed the title ~~Zhiyul/add nanov3 prefill decode test~~ test: add nanov3 prefill decode test Mar 23, 2026

ZhiyuLi-Nvidia added CI:L1 Run doctests, unit tests, and functional tests CI Relating to CI and removed CI:L1 Run doctests, unit tests, and functional tests CI Relating to CI labels Mar 23, 2026

copy-pr-bot bot had a problem deploying to nemo-ci March 23, 2026 13:10 Error

lint

d255ec9

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>

copy-pr-bot bot temporarily deployed to nemo-ci March 23, 2026 13:23 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci March 23, 2026 14:40 Inactive

terrykong approved these changes Mar 23, 2026

View reviewed changes

terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Mar 23, 2026

copy-pr-bot bot temporarily deployed to nemo-ci March 23, 2026 16:15 Inactive

yuki-97 merged commit 1179183 into hemil/automodel-transformers-v5 Mar 24, 2026
47 of 49 checks passed

yuki-97 deleted the zhiyul/add-nanov3-prefill-decode-test branch March 24, 2026 02:35

yuki-97 pushed a commit that referenced this pull request Mar 24, 2026

test: add nanov3 prefill decode test (#2141)

93c3c99

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

terrykong pushed a commit that referenced this pull request Mar 24, 2026

test: add nanov3 prefill decode test (#2141)

99b5247

Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Terry Kong <terryk@nvidia.com>

yuki-97 merged 2 commits into hemil/automodel-transformers-v5 from zhiyul/add-nanov3-prefill-decode-test


3 participants
