25 changes: 14 additions & 11 deletions README.md
@@ -150,17 +150,20 @@ Purpose: Training-ready environments with curated datasets.
> Each resource server includes example data, configuration files, and tests. See each server's README for details.

<!-- START_TRAINING_SERVERS_TABLE -->
| Resource Server | Domain | Dataset | Description | Value | Config | Train | Validation | License |
| -------------------------- | --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | ----- | ---------- | ---------------------------------------------- |
| Google Search | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-web_search-mcqa'>Nemotron-RL-knowledge-web_search-mcqa</a> | Multi-choice question answering problems with search tools integrated | Improve knowledge-related benchmarks with search tools | <a href='resources_servers/google_search/configs/google_search.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Math Advanced Calculations | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-math-advanced_calculations'>Nemotron-RL-math-advanced_calculations</a> | An instruction following math environment with counter-intuitive calculators | Improve instruction following capabilities in specific math environments | <a href='resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Workplace Assistant | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant'>Nemotron-RL-agent-workplace_assistant</a> | Workplace assistant multi-step tool-using environment | Improve multi-step tool use capability | <a href='resources_servers/workplace_assistant/configs/workplace_assistant.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Mini Swe Agent | coding | <a href='https://huggingface.co/datasets/SWE-Gym/SWE-Gym'>SWE-Gym</a> | A software development with mini-swe-agent orchestration | Improve software development capabilities, like SWE-bench | <a href='resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml'>config</a> | ✓ | ✓ | MIT |
| Instruction Following | instruction_following | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following'>Nemotron-RL-instruction_following</a> | Instruction following datasets targeting IFEval and IFBench style instruction following capabilities | Improve IFEval and IFBench | <a href='resources_servers/instruction_following/configs/instruction_following.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Structured Outputs | instruction_following | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs'>Nemotron-RL-instruction_following-structured_outputs</a> | Check if responses are following structured output requirements in prompts | Improve instruction following capabilities | <a href='resources_servers/structured_outputs/configs/structured_outputs_json.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Equivalence Llm Judge | knowledge | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-openQA'>Nemotron-RL-knowledge-openQA</a> | Short answer questions with LLM-as-a-judge | Improve knowledge-related benchmarks like GPQA / HLE | <a href='resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Mcqa | knowledge | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-mcqa'>Nemotron-RL-knowledge-mcqa</a> | Multi-choice question answering problems | Improve benchmarks like MMLU / GPQA / HLE | <a href='resources_servers/mcqa/configs/mcqa.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Math With Judge | math | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-math-OpenMathReasoning'>Nemotron-RL-math-OpenMathReasoning</a> | Math dataset with math-verify and LLM-as-a-judge | Improve math capabilities including AIME 24 / 25 | <a href='resources_servers/math_with_judge/configs/math_with_judge.yaml'>config</a> | ✓ | ✓ | Creative Commons Attribution 4.0 International |
| Resource Server | Domain | Dataset | Description | Value | Config | Train | Validation | License |
| -------------------------- | --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | ----- | ---------- | --------------------------------------------------------- |
| Calendar | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-calendar_scheduling'>Nemotron-RL-agent-calendar_scheduling</a> | - | - | <a href='resources_servers/calendar/configs/calendar.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Google Search | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-web_search-mcqa'>Nemotron-RL-knowledge-web_search-mcqa</a> | Multi-choice question answering problems with search tools integrated | Improve knowledge-related benchmarks with search tools | <a href='resources_servers/google_search/configs/google_search.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Math Advanced Calculations | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-math-advanced_calculations'>Nemotron-RL-math-advanced_calculations</a> | An instruction following math environment with counter-intuitive calculators | Improve instruction following capabilities in specific math environments | <a href='resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Workplace Assistant | agent | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant'>Nemotron-RL-agent-workplace_assistant</a> | Workplace assistant multi-step tool-using environment | Improve multi-step tool use capability | <a href='resources_servers/workplace_assistant/configs/workplace_assistant.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Code Gen | coding | <a href='https://huggingface.co/datasets/nvidia/nemotron-RL-coding-competitive_coding'>nemotron-RL-coding-competitive_coding</a> | - | - | <a href='resources_servers/code_gen/configs/code_gen.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Mini Swe Agent | coding | <a href='https://huggingface.co/datasets/SWE-Gym/SWE-Gym'>SWE-Gym</a> | A software development with mini-swe-agent orchestration | Improve software development capabilities, like SWE-bench | <a href='resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml'>config</a> | ✓ | ✓ | MIT |
| Instruction Following | instruction_following | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following'>Nemotron-RL-instruction_following</a> | Instruction following datasets targeting IFEval and IFBench style instruction following capabilities | Improve IFEval and IFBench | <a href='resources_servers/instruction_following/configs/instruction_following.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Structured Outputs | instruction_following | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs'>Nemotron-RL-instruction_following-structured_outputs</a> | Check if responses are following structured output requirements in prompts | Improve instruction following capabilities | <a href='resources_servers/structured_outputs/configs/structured_outputs_json.yaml'>config</a> | ✓ | ✓ | Apache 2.0 |
| Equivalence Llm Judge | knowledge | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-openQA'>Nemotron-RL-knowledge-openQA</a> | Short answer questions with LLM-as-a-judge | Improve knowledge-related benchmarks like GPQA / HLE | <a href='resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Mcqa | knowledge | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-mcqa'>Nemotron-RL-knowledge-mcqa</a> | Multi-choice question answering problems | Improve benchmarks like MMLU / GPQA / HLE | <a href='resources_servers/mcqa/configs/mcqa.yaml'>config</a> | ✓ | - | Apache 2.0 |
| Math With Judge | math | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-math-OpenMathReasoning'>Nemotron-RL-math-OpenMathReasoning</a> | Math dataset with math-verify and LLM-as-a-judge | Improve math capabilities including AIME 24 / 25 | <a href='resources_servers/math_with_judge/configs/math_with_judge.yaml'>config</a> | ✓ | ✓ | Creative Commons Attribution 4.0 International |
| Math With Judge | math | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-math-stack_overflow'>Nemotron-RL-math-stack_overflow</a> | - | - | <a href='resources_servers/math_with_judge/configs/math_stack_overflow.yaml'>config</a> | ✓ | ✓ | Creative Commons Attribution-ShareAlike 4.0 International |
<!-- END_TRAINING_SERVERS_TABLE -->

## 📖 Documentation
14 changes: 6 additions & 8 deletions resources_servers/calendar/configs/calendar.yaml
@@ -18,18 +18,16 @@ calendar_simple_agent:
   - name: train
     type: train
     jsonl_fpath: resources_servers/calendar/data/train.jsonl
-    gitlab_identifier:
-      dataset_name: calendar
-      version: 0.0.1
-      artifact_fpath: calendar/train.jsonl
+    huggingface_identifier:
+      repo_id: nvidia/Nemotron-RL-agent-calendar_scheduling
+      artifact_fpath: train.jsonl
     license: Apache 2.0
   - name: validation
     type: validation
     jsonl_fpath: resources_servers/calendar/data/validation.jsonl
-    gitlab_identifier:
-      dataset_name: calendar
-      version: 0.0.1
-      artifact_fpath: calendar/validation.jsonl
+    huggingface_identifier:
+      repo_id: nvidia/Nemotron-RL-agent-calendar_scheduling
+      artifact_fpath: validation.jsonl
     license: Apache 2.0
   - name: example
     type: example
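The hunk above swaps each `gitlab_identifier` for a `huggingface_identifier` made of a `repo_id` and an `artifact_fpath`. As a minimal sketch of what that pair points at, the snippet below builds the public Hugging Face Hub download URL for one entry. The helper name `hf_dataset_file_url` and its `revision` parameter are hypothetical, and the project's own loader is not shown in this diff and may resolve the identifier differently.

```python
from urllib.parse import quote

HF_BASE_URL = "https://huggingface.co"


def hf_dataset_file_url(identifier: dict, revision: str = "main") -> str:
    """Build the Hub download URL for one file in a dataset repository.

    `identifier` mirrors the `huggingface_identifier` mapping in the YAML:
    it needs `repo_id` and `artifact_fpath`. `revision` is an assumption
    (the configs in this diff do not pin one), defaulting to `main`.
    """
    repo_id = identifier["repo_id"]
    artifact_fpath = identifier["artifact_fpath"]
    # Dataset files are served under /datasets/<repo_id>/resolve/<revision>/<path>.
    return (
        f"{HF_BASE_URL}/datasets/{repo_id}"
        f"/resolve/{quote(revision, safe='')}/{quote(artifact_fpath)}"
    )


# Values copied from the calendar train entry in the hunk above.
calendar_train = {
    "repo_id": "nvidia/Nemotron-RL-agent-calendar_scheduling",
    "artifact_fpath": "train.jsonl",
}

print(hf_dataset_file_url(calendar_train))
```

Because the repository is now addressed by `repo_id` alone, the old `dataset_name` and `version` keys have no counterpart in the new mapping.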
5 changes: 2 additions & 3 deletions resources_servers/code_gen/configs/code_gen.yaml
@@ -22,9 +22,8 @@ code_gen_simple_agent:
   - name: opencodereasoning_filtered_train
     type: train
     jsonl_fpath: resources_servers/code_gen/data/opencodereasoning_filtered_25k_train.jsonl
-    gitlab_identifier:
-      dataset_name: opencodereasoning_filtered
-      version: 0.0.1
+    huggingface_identifier:
+      repo_id: nvidia/nemotron-RL-coding-competitive_coding
       artifact_fpath: opencodereasoning_filtered_25k_train.jsonl
     license: Apache 2.0
     num_repeats: 1
@@ -25,16 +25,14 @@ math_with_judge_simple_agent:
   - name: train
     type: train
     jsonl_fpath: resources_servers/math_with_judge/data/math_stack_overflow_train.jsonl
-    gitlab_identifier:
-      dataset_name: math_stack_overflow
-      version: 0.0.1
+    huggingface_identifier:
+      repo_id: nvidia/Nemotron-RL-math-stack_overflow
       artifact_fpath: math_stack_overflow_problems.jsonl
     license: Creative Commons Attribution-ShareAlike 4.0 International
   - name: validation
     type: validation
     jsonl_fpath: resources_servers/math_with_judge/data/aime24_validation.jsonl
-    gitlab_identifier:
-      dataset_name: aime24
-      version: 0.0.1
-      artifact_fpath: aime24.jsonl
+    huggingface_identifier:
+      repo_id: nvidia/Nemotron-RL-math-OpenMathReasoning
+      artifact_fpath: aime24_validation.jsonl
     license: Apache 2.0
@@ -27,21 +27,13 @@ math_with_judge_simple_agent:
   - name: train
     type: train
     jsonl_fpath: resources_servers/math_with_judge/data/train.jsonl
-    gitlab_identifier:
-      dataset_name: math_open_math_reasoning
-      version: 0.0.1
-      artifact_fpath: open_math_reasoning_problems.jsonl
     huggingface_identifier:
       repo_id: nvidia/Nemotron-RL-math-OpenMathReasoning
       artifact_fpath: open_math_reasoning_problems.jsonl
     license: Creative Commons Attribution 4.0 International
   - name: validation
     type: validation
     jsonl_fpath: resources_servers/math_with_judge/data/aime24_validation.jsonl
-    gitlab_identifier:
-      dataset_name: aime24
-      version: 0.0.1
-      artifact_fpath: aime24.jsonl
     huggingface_identifier:
       repo_id: nvidia/Nemotron-RL-math-OpenMathReasoning
       artifact_fpath: aime24_validation.jsonl
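All four config hunks follow one pattern: `train` and `validation` entries lose `gitlab_identifier` and end up with a `huggingface_identifier` that carries `repo_id` and `artifact_fpath`. A reviewer could check that pattern mechanically. The sketch below is hypothetical: it works on plain dicts that mirror the YAML entries, and `check_dataset_entry` is an illustrative name, not a function from this repository. It assumes `example` entries need no remote identifier, which this diff does not confirm.

```python
REQUIRED_HF_KEYS = ("repo_id", "artifact_fpath")
REMOTE_TYPES = ("train", "validation")  # assumption: only these need a remote source


def check_dataset_entry(entry: dict) -> list[str]:
    """Return a list of migration problems for one dataset entry (empty if clean)."""
    problems = []
    name = entry.get("name", "<unnamed>")
    if entry.get("type") not in REMOTE_TYPES:
        return problems  # e.g. `example` entries are skipped
    if "gitlab_identifier" in entry:
        problems.append(f"{name}: leftover gitlab_identifier")
    hf = entry.get("huggingface_identifier")
    if hf is None:
        problems.append(f"{name}: missing huggingface_identifier")
    else:
        for key in REQUIRED_HF_KEYS:
            if not hf.get(key):
                problems.append(f"{name}: huggingface_identifier.{key} is empty or missing")
    return problems


# Entry as it looks after this change (values from the last hunk above).
migrated = {
    "name": "train",
    "type": "train",
    "huggingface_identifier": {
        "repo_id": "nvidia/Nemotron-RL-math-OpenMathReasoning",
        "artifact_fpath": "open_math_reasoning_problems.jsonl",
    },
}

# Entry in the pre-change shape, with only the GitLab identifier.
legacy = {
    "name": "train",
    "type": "train",
    "gitlab_identifier": {
        "dataset_name": "math_open_math_reasoning",
        "version": "0.0.1",
        "artifact_fpath": "open_math_reasoning_problems.jsonl",
    },
}

print(check_dataset_entry(migrated))
print(check_dataset_entry(legacy))
```

Running a check like this over every `datasets` list would flag any config the migration missed.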