Skip to content

Adding long context benchmark MRCR#634

Merged
Kipok merged 22 commits intomainfrom
fayejf/mrcr
Aug 11, 2025
Merged

Adding long context benchmark MRCR#634
Kipok merged 22 commits intomainfrom
fayejf/mrcr

Conversation

@fayejf
Copy link
Collaborator

@fayejf fayejf commented Aug 1, 2025

OpenAI MRCR (Multi-round co-reference resolution): Long context multiple needle in a haystack benchmark

  1. Prepare data
    By dafault it prepares all 2400 samples up to 1M tokens.
ns prepare_data \
    --data_dir=/workspace/ns-data \
    --cluster=fei-ord \
    mrcr

Or you can prepare subset.

ns prepare_data \
    --data_dir=/workspace/ns-data \
    --cluster=fei-ord \
    mrcr --max_context_window 131072 --needles_subset 2 --setup needle2_128k
  1. Run evaluation
    Specific eval split or use what saved in __init__.py (default is all)
model=Meta-Llama-3.1-8B-Instruct
split=needle2_64k
ns eval \
    --cluster=fei-ord \
    --data_dir=/workspace/ns-data \
    --server_type=vllm \
    --model=/hf_models/$model \
    --server_gpus=8 \
    --benchmarks=mrcr:0 \
    --split=$split \
    --output_dir=/workspace/results/mrcr/split/$model 

fayejf and others added 7 commits July 31, 2025 10:24
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
@fayejf fayejf requested a review from Kipok August 1, 2025 05:25
Copy link
Collaborator

@Kipok Kipok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just a few small changes are needed

from tqdm import tqdm
import tempfile

subprocess.run(["pip install tiktoken"], check=True, shell=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move it inside the function where it's needed. Otherwise this is going to run on every import even when the script isn't called

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! changed.

output_file = data_dir / f"{setup}.jsonl"

with open(data_dir / "__init__.py", "w", encoding="utf-8") as init_file:
init_file.write(f"EVAL_SPLIT = '{setup}'\n")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

best not to override init dynamically here. Users can always provide --split argument to change this, so no need to change defaults

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Changed!

fayejf and others added 7 commits August 4, 2025 16:31
revert test

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
@fayejf fayejf requested a review from Kipok August 5, 2025 16:22
Signed-off-by: Igor Gitman <igitman@nvidia.com>
@Kipok
Copy link
Collaborator

Kipok commented Aug 6, 2025

Or is it supposed to use that messages list directly (so complete the last turn of a large multi-turn generation)? In that case you should put it as a list in "messages" key and then set ++prompt_format=openai in generation args

fayejf and others added 2 commits August 6, 2025 16:06
Signed-off-by: fayejf <fayejf07@gmail.com>
@fayejf
Copy link
Collaborator Author

fayejf commented Aug 6, 2025

Or is it supposed to use that messages list directly (so complete the last turn of a large multi-turn generation)? In that case you should put it as a list in "messages" key and then set ++prompt_format=openai in generation args

Wait do we support that? I wanted it but I didn't know.
I think people follow this way complete the last turn of a large multi-turn generation. But does this support for all models in nemo-skills?

messages = json.loads(row["prompt"])
completion = client.chat.completions.create(
    model=MODEL,
    messages=messages,
)
response = completion.choices[0].message.content

@Kipok
Copy link
Collaborator

Kipok commented Aug 6, 2025

yes, that should be supported with the parameters I shared

fayejf and others added 4 commits August 8, 2025 14:13
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Copy link
Collaborator

@Kipok Kipok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

@Kipok Kipok merged commit 2c84e05 into main Aug 11, 2025
4 checks passed
@fayejf fayejf deleted the fayejf/mrcr branch August 11, 2025 21:50
SeanNaren pushed a commit to SeanNaren/NeMo-Skills that referenced this pull request Aug 15, 2025
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
shtoshni pushed a commit that referenced this pull request Aug 15, 2025
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Shubham Toshniwal <stoshniwal@nvidia.com>
SeanNaren pushed a commit to SeanNaren/NeMo-Skills that referenced this pull request Aug 18, 2025
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
SeanNaren pushed a commit to SeanNaren/NeMo-Skills that referenced this pull request Aug 18, 2025
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Oct 1, 2025
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants