docs: Add missing arguments to DeepScaler evaluation by butsugiri · Pull Request #502 · NVIDIA-NeMo/RL

butsugiri · 2025-06-11T07:13:20Z

What does this PR do ?

This PR fixes the documentation for the DeepScaler experiments.

Currently, necessary arguments are missing from the documented evaluation command, which leads to poor evaluation results:

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=2048 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.0333 (1.0/30)
============================================================

Specifying cot.txt (as is done in the training setup) improves the score from 1/30 to 4/30:

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=2048 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.1333 (4.0/30)
============================================================

Allowing generation of more than 2048 tokens improves the score further, to 11/30 (this PR):

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=8192 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.3667 (11.0/30)
============================================================
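The three scores above follow directly from the number of solved problems. As a minimal sketch (the run labels and the helper name `pass_at_1` are illustrative, not part of NeMo RL), with one sample per prompt pass@1 is simply the fraction of the 30 AIME 2024 problems answered correctly:

```python
# Correct-answer counts copied from the three evaluation logs above.
# The labels are shorthand for this sketch, not names used by the evaluation script.
NUM_PROBLEMS = 30
runs = {
    "arguments missing": 1,
    "cot.txt specified": 4,
    "max_new_tokens=8192 (this PR)": 11,
}


def pass_at_1(num_correct: int, num_problems: int = NUM_PROBLEMS) -> float:
    """With num_tests_per_prompt=1, pass@1 is the fraction of problems solved."""
    return num_correct / num_problems


for label, correct in runs.items():
    print(f"{label}: score={pass_at_1(correct):.4f} ({correct}/{NUM_PROBLEMS})")
```

Each step therefore corresponds to a concrete number of additional problems solved: three more with cot.txt, and seven more again with the larger generation budget.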

Issues

n/a

Usage

n/a

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests? --> n/a
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests --> n/a
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs. --> n/a

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

SahilJain314 · 2025-06-26T22:19:43Z

Thanks for the PR! Slipped past us for a bit.

abukharin-nv

LGTM! I would also suggest increasing max_len to 32K, but that is kind of a subjective choice.

butsugiri · 2025-06-30T01:32:22Z

@abukharin-nv
Thank you for reviewing my PR! I set max_len to 32768 and am updating the PR accordingly. It gave the following results:

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=32768 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.3667 (11.0/30)
============================================================
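To compare runs like these without reading the logs by eye, the summary blocks can be parsed into dictionaries. This is only a sketch: the `parse_summary` helper and the regular expression are assumptions based on the log format shown in this thread, not an API provided by NeMo RL.

```python
import re

# Summary lines copied from the 8192-token and 32768-token runs in this thread.
RUN_8K = """\
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=8192 temperature=0.0 top_p=1.0 top_k=-1
metric='pass@1' num_tests_per_prompt=1
score=0.3667 (11.0/30)
"""
RUN_32K = RUN_8K.replace("max_new_tokens=8192", "max_new_tokens=32768")


def parse_summary(text: str) -> dict[str, str]:
    """Collect key=value pairs from a summary block; quoted values lose their quotes."""
    pairs = re.findall(r"(\w+)=('[^']*'|\S+)", text)
    return {key: value.strip("'") for key, value in pairs}


for log in (RUN_8K, RUN_32K):
    summary = parse_summary(log)
    print(summary["max_new_tokens"], summary["score"])
```

For this checkpoint the two runs report the same score, so raising the limit from 8192 to 32768 did not change the AIME 2024 result.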

Please let me know if there's anything I can improve.

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

add missing arguments

cf2c168

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

github-actions bot added the documentation label (Improvements or additions to documentation) Jun 11, 2025

butsugiri changed the title ~~Add missing arguments to DeepScaler evaluation~~ docs: Add missing arguments to DeepScaler evaluation Jun 11, 2025

parthchadha requested a review from abukharin-nv June 26, 2025 22:19

abukharin-nv previously approved these changes Jun 27, 2025

butsugiri dismissed abukharin-nv’s stale review via 3e00e33 June 30, 2025 01:33

8k --> 32k

d1564af

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

butsugiri force-pushed the fix-deepscaler-arguments branch from 3e00e33 to d1564af June 30, 2025 01:35

SahilJain314 approved these changes Jun 30, 2025

SahilJain314 enabled auto-merge June 30, 2025 04:10

butsugiri temporarily deployed to nemo-ci with GitHub Actions June 30, 2025 04:10 (Inactive)

SahilJain314 added this pull request to the merge queue Jun 30, 2025

Merged via the queue into NVIDIA-NeMo:main with commit 0b5550f Jun 30, 2025
12 of 14 checks passed

xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jun 30, 2025

docs: Add missing arguments to DeepScaler evaluation (NVIDIA-NeMo#502)

4c4ca53

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jul 2, 2025

docs: Add missing arguments to DeepScaler evaluation (NVIDIA-NeMo#502)

cf828d6

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

therealnaveenkamal pushed a commit to therealnaveenkamal/RL that referenced this pull request Jul 7, 2025

docs: Add missing arguments to DeepScaler evaluation (NVIDIA-NeMo#502)

c9189a3

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

YzjiaoNvd pushed a commit to YzjiaoNvd/NeMo-RL that referenced this pull request Jul 14, 2025

docs: Add missing arguments to DeepScaler evaluation (NVIDIA-NeMo#502)

eed804e

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025

docs: Add missing arguments to DeepScaler evaluation (#502)

0450ac2

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025

docs: Add missing arguments to DeepScaler evaluation (NVIDIA-NeMo#502)

461eedc

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>

SahilJain314 merged 2 commits into NVIDIA-NeMo:main from butsugiri:fix-deepscaler-arguments
