diff --git a/README.md b/README.md
index 941c4b185..8dedd912b 100644
--- a/README.md
+++ b/README.md
@@ -155,7 +155,7 @@ Purpose: Training-ready environments with curated datasets.
| Google Search | agent | Nemotron-RL-knowledge-web_search-mcqa | Multi-choice question answering problems with search tools integrated | Improve knowledge-related benchmarks with search tools | config | ✓ | - | Apache 2.0 |
| Math Advanced Calculations | agent | Nemotron-RL-math-advanced_calculations | An instruction following math environment with counter-intuitive calculators | Improve instruction following capabilities in specific math environments | config | ✓ | - | Apache 2.0 |
| Workplace Assistant | agent | Nemotron-RL-agent-workplace_assistant | Workplace assistant multi-step tool-using environment | Improve multi-step tool use capability | config | ✓ | ✓ | Apache 2.0 |
-| Mini Swe Agent | coding | SWE-bench_Verified | A software development with mini-swe-agent orchestration | Improve software development capabilities, like SWE-bench | config | ✓ | ✓ | MIT |
+| Mini Swe Agent | coding | SWE-Gym | A software development environment with mini-swe-agent orchestration | Improve software development capabilities, like SWE-bench | config | ✓ | ✓ | MIT |
| Instruction Following | instruction_following | Nemotron-RL-instruction_following | Instruction following datasets targeting IFEval and IFBench style instruction following capabilities | Improve IFEval and IFBench | config | ✓ | - | Apache 2.0 |
| Structured Outputs | instruction_following | Nemotron-RL-instruction_following-structured_outputs | Check if responses are following structured output requirements in prompts | Improve instruction following capabilities | config | ✓ | ✓ | Apache 2.0 |
| Equivalence Llm Judge | knowledge | Nemotron-RL-knowledge-openQA | Short answer questions with LLM-as-a-judge | Improve knowledge-related benchmarks like GPQA / HLE | config | ✓ | - | Apache 2.0 |
diff --git a/docs/about/concepts/key-terminology.md b/docs/about/concepts/key-terminology.md
index 5ea705bb5..e219de2e9 100644
--- a/docs/about/concepts/key-terminology.md
+++ b/docs/about/concepts/key-terminology.md
@@ -68,15 +68,18 @@ Reward / Reward Signal
SFT (Supervised Fine-Tuning)
Training approach using examples of good model behavior. Shows successful rollouts as training data.
-DPO (Direct Preference Optimization)
- Training approach using pairs of rollouts where one is preferred over another. Teaches better vs worse responses.
-
RL (Reinforcement Learning)
Training approach where models learn through trial-and-error interaction with environments using reward signals.
Online vs Offline Training
- - **Online**: Model learns while interacting with environment in real-time (RL)
- - **Offline**: Model learns from pre-collected rollout data (SFT/DPO)
+ - **Online**: Model learns while interacting with environment in real-time
+ - **Offline**: Model learns from pre-collected rollout data
+
+DPO (Direct Preference Optimization)
+ An offline training approach using pairs of rollouts where one is preferred over the other. Teaches better vs. worse responses.
+
+GRPO (Group Relative Policy Optimization)
+ Reinforcement learning algorithm that optimizes policies by comparing groups of rollouts relative to each other. Used for online RL training with language models.
```
## Interaction Patterns
diff --git a/docs/contribute/rl-framework-integration/index.md b/docs/contribute/rl-framework-integration/index.md
index 663a24921..7e3ab0329 100644
--- a/docs/contribute/rl-framework-integration/index.md
+++ b/docs/contribute/rl-framework-integration/index.md
@@ -8,7 +8,7 @@ These guides cover how to integrate NeMo Gym into a new RL training framework. U
- Contributing NeMo Gym integration for a training framework that does not have one yet
:::{tip}
-Just want to train models? Use {ref}`NeMo RL ` instead.
+Just want to train models? Use {ref}`NeMo RL ` instead.
:::
## Prerequisites
diff --git a/docs/index.md b/docs/index.md
index 95976559b..0cc5cb287 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -108,6 +108,7 @@ Collect and view rollouts
::::
+
## Tutorials
Hands-on tutorials to build and customize your training environments.
@@ -120,23 +121,23 @@ Hands-on tutorials to build and customize your training environments.
:link-type: doc
Implement or integrate existing tools and define task verification logic.
+++
-{bdg-secondary}`custom-environments` {bdg-secondary}`tools`
+{bdg-primary}`beginner` {bdg-secondary}`30 min` {bdg-secondary}`custom-environments` {bdg-secondary}`tools`
:::
-:::{grid-item-card} {octicon}`database;1.5em;sd-mr-1` Offline Training (SFT, DPO)
-:link: tutorials/offline-training-w-rollouts
-:link-type: doc
-Train with SFT or DPO using collected rollouts.
+:::{grid-item-card} {octicon}`workflow;1.5em;sd-mr-1` Offline Training with Rollouts
+:link: offline-training-w-rollouts
+:link-type: ref
+Transform rollouts into training data for {term}`supervised fine-tuning (SFT) ` and {term}`direct preference optimization (DPO) `.
+++
{bdg-secondary}`sft` {bdg-secondary}`dpo`
:::
-:::{grid-item-card} {octicon}`zap;1.5em;sd-mr-1` RL Training with NeMo RL
-:link: tutorials/rl-training-with-nemo-rl
-:link-type: doc
-Train with GRPO using NeMo RL and NeMo Gym.
+:::{grid-item-card} {octicon}`workflow;1.5em;sd-mr-1` GRPO with NeMo RL
+:link: training-nemo-rl-grpo-index
+:link-type: ref
+Learn how to set up NeMo Gym and NeMo RL training environments, run tests, prepare data, and launch single-node and multi-node training runs.
+++
-{bdg-secondary}`grpo` {bdg-secondary}`nemo-rl`
+{bdg-primary}`training` {bdg-secondary}`rl` {bdg-secondary}`grpo`
:::
::::
@@ -200,8 +201,8 @@ Rollout Collection
tutorials/index.md
tutorials/creating-resource-server
+tutorials/nemo-rl-grpo/index.md
tutorials/offline-training-w-rollouts
-tutorials/rl-training-with-nemo-rl
```
```{toctree}
diff --git a/docs/reference/faq.md b/docs/reference/faq.md
index 8f62a6f21..30d925a70 100644
--- a/docs/reference/faq.md
+++ b/docs/reference/faq.md
@@ -14,7 +14,7 @@ Tests are strongly encouraged and you must have at least one test for every serv
# How To: Upload and download a dataset from HuggingFace
-The huggingface client requires that your credentials are in `env.yaml`, along with some other pertinent details needed to upload to the designated place.
+The huggingface client requires that your credentials are in `env.yaml`, along with some other pertinent details needed to upload to the designated place.
```yaml
hf_token: {your huggingface token}
hf_organization: {your huggingface org}
@@ -22,19 +22,19 @@ hf_collection_name: {your collection}
hf_collection_slug: {your collection slug} # alphanumeric string found at the end of a collection URI
# optional:
-hf_dataset_prefix: str # field to override the default value "NeMo-Gym" prepended to the dataset name
+hf_dataset_prefix: str # field to override the default value "Nemotron-RL" prepended to the dataset name
```
Naming convention for Huggingface datasets is as follows.
-`{hf_organization}/{hf_dataset_prefix}-{domain}–{resource_server_name}-{your dataset name}`
+`{hf_organization}/{hf_dataset_prefix}-{domain}-{resource_server OR dataset_name}`
E.g.:
-`NVIDIA/Nemo-Gym-Math-math_with_judge-dapo17k`
+`nvidia/Nemotron-RL-math-OpenMathReasoning`
-You will only need to manually input the `{your dataset name}` portion of the above when inputting the `dataset_name` flag in the upload command (refer to the command below). Everything preceding it will be automatically populated using your config prior to upload.
+You will only need to supply the `{dataset_name}` portion of the above via the `dataset_name` flag in the upload command (refer to the command below). Everything preceding it is populated automatically from your config prior to upload. Note that `dataset_name` is optional and, if provided, replaces the `resource_server` portion of the name.
To upload to Huggingface, use the below command:
```bash
@@ -47,6 +47,45 @@ ng_upload_dataset_to_hf \
Because of the required dataset nomenclature, the resource server config path is required when uploading. Specifically, `domain` is used in the naming of a dataset in Huggingface.
+By default, the `split` parameter for uploading is set to `train`, which will run a check on the required fields `{"responses_create_params", "reward_profiles", "expected_answer"}`. Specifying `validation` or `test` bypasses this check:
+
+```bash
+resource_config_path="resources_servers/multineedle/configs/multineedle.yaml"
+ng_upload_dataset_to_hf \
+ +dataset_name={your dataset name} \
+ +input_jsonl_fpath=data/multineedle_benchmark_validation.jsonl \
+ +resource_config_path=${resource_config_path} \
+ +split=validation
+```
+
+## Uploading with Pull Request workflow
+When uploading to an organization repository where you don't have direct write access (e.g., nvidia/), use the `+create_pr=true` flag to create a Pull Request instead of pushing directly. You can also customize the commit message and description.
+
+If you want to specify the revision (branch name), add the `+revision={your branch name}` flag. Omitting `create_pr` (or setting it to `false`) assumes you are committing to an existing branch; including it assumes the branch is brand new.
+
+```bash
+ng_upload_dataset_to_hf \
+ +dataset_name=OpenMathReasoning \
+ +input_jsonl_fpath=data/validation.jsonl \
+ +resource_config_path=${resource_config_path} \
+ +split=validation \
+ +create_pr=true \
+ +revision=my-branch-name \
+ +commit_message="Add validation set" \
+ +commit_description="Includes 545 examples"
+```
+
+The command will output a link to the created Pull Request:
+```bash
+[Nemo-Gym] - Pull Request created: https://huggingface.co/datasets/nvidia/Nemotron-RL-math-OpenMathReasoning/discussions/1
+```
+
+:::{note}
+The `commit_message` and `commit_description` parameters work for both direct pushes and Pull Requests. If not provided, HuggingFace auto-generates a commit message based on the filename.
+:::
+
+
+## Deleting Datasets from Gitlab
You can optionally pass a `+delete_from_gitlab=true` flag to the above command, which will delete the model and all of its artifacts from Gitlab. By default, this is set to `False`.
```bash
resource_config_path="resources_servers/multineedle/configs/multineedle.yaml"
@@ -59,7 +98,7 @@ ng_upload_dataset_to_hf \
There will be a confirmation dialog to confirm the deletion:
```bash
-[Nemo-Gym] - Dataset uploaded successful
+[Nemo-Gym] - Dataset upload successful
[Nemo-Gym] - Found model 'fs-test' in the registry. Are you sure you want to delete it from Gitlab? [y/N]:
```
@@ -83,13 +122,28 @@ ng_delete_dataset_from_gitlab \
Gitlab model names are case sensitive. There can be models named 'My_Model' and 'my_model' living simultaneously in the registry. When uploading to Huggingface with the intention of deleting Gitlab artifacts, be sure the casing of your Huggingface dataset name matches that of Gitlab's.
:::
+
+## Downloading Datasets from Huggingface
Downloading a dataset from Huggingface is straightforward:
+
+**For structured datasets (with train/validation/test splits):**
```bash
ng_download_dataset_from_hf \
- +repo_id=NVIDIA/NeMo-Gym-Instruction_Following-multineedle-{your dataset name} \
- +artifact_fpath=multineedle_benchmark.jsonl \
- +output_fpath=data/multineedle_benchmark_hf.jsonl
+ +repo_id=nvidia/Nemotron-RL-knowledge-mcqa \
+ +output_dirpath=data/mcqa \
+ +split=train
```
+The `split` parameter is optional. If omitted, all available splits will be downloaded as separate JSONL files.
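+
+For example, the same download without the `+split` flag fetches every available split (a sketch; output file names follow the split names):
+
+```bash
+ng_download_dataset_from_hf \
+    +repo_id=nvidia/Nemotron-RL-knowledge-mcqa \
+    +output_dirpath=data/mcqa
+```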
+
+
+**For raw file repositories (with specific JSONL files):**
+```bash
+ng_download_dataset_from_hf \
+ +repo_id=nvidia/Nemotron-RL-instruction_following \
+ +output_dirpath=data/instruction_following \
+ +artifact_fpath=instruction_following.jsonl
+```
+Use `artifact_fpath` when the HuggingFace repo contains raw/arbitrary JSONL files rather than structured dataset splits. You cannot specify both `split` and `artifact_fpath`.
# How To: Prepare and validate data for PR submission or RL training
@@ -120,6 +174,9 @@ example_multi_step_simple_agent:
dataset_name: example_multi_step
version: 0.0.1
artifact_fpath: example_multi_step/train.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-instruction_following
+ artifact_fpath: instruction_following.jsonl
license: Apache 2.0
- name: validation
type: validation
@@ -130,6 +187,9 @@ example_multi_step_simple_agent:
dataset_name: example_multi_step
version: 0.0.1
artifact_fpath: example_multi_step/validation.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-instruction_following
+ artifact_fpath: if_validation.jsonl
license: Apache 2.0
- name: example
type: example
@@ -142,7 +202,8 @@ A dataset object consists of:
- Type: train, validation, or example. Train and validation are as used in NeMo RL or other train frameworks. More information about the example type is in the next section.
- Jsonl fpath: the local file path to your jsonl file for this dataset.
- Num repeats: optionally repeat each row when preparing or collating data. Defaults to 1 if unspecified.
-- Gitlab identifier: The remote path to the dataset as held in the Gitlab dataset registry. This field is required for train and validation datasets. (Not required for example datasets since those are required to be committed to Git).
+- Gitlab identifier: (NVIDIA internal) The remote path to the dataset as held in the Gitlab dataset registry. This field is required for train and validation datasets. (Not required for example datasets since those are required to be committed to Git).
+- HuggingFace identifier: (Public) The remote path to the dataset on HuggingFace. Contains `repo_id` (required) and optionally `artifact_fpath` for raw file repos. If `artifact_fpath` is omitted, the datasets library will infer the `split` from the dataset `type`.
- License: The license of that dataset. Required for train and validation datasets and not required for example datasets, similar in principle to the Gitlab identifier.
- Start idx, end idx: used for slicing your dataset.
```yaml
@@ -153,6 +214,9 @@ A dataset object consists of:
dataset_name: example_multi_step
version: 0.0.1
artifact_fpath: example_multi_step/validation.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/example_multi_step
+ artifact_fpath: example_validation.jsonl
license: Apache 2.0
```
@@ -165,11 +229,32 @@ responses_api_models/openai_model/configs/openai_model.yaml"
ng_prepare_data "+config_paths=[$config_paths]" \
+output_dirpath=data/example_multi_step \
+mode=example_validation
+```
-# Run NeMo Gym servers the exact same way with the same configs!
+To download missing datasets automatically, add `+should_download=true`. By default, datasets are downloaded from HuggingFace:
+```bash
+ng_prepare_data "+config_paths=[$config_paths]" \
+ +output_dirpath=data/example_multi_step \
+ +mode=train_preparation \
+ +should_download=true
+```
+
+For NVIDIA internal users, you can download from GitLab instead:
+
+```bash
+ng_prepare_data "+config_paths=[$config_paths]" \
+ +output_dirpath=data/example_multi_step \
+ +mode=train_preparation \
+ +should_download=true \
+ +data_source=gitlab
+```
+
+Run NeMo Gym servers the exact same way with the same configs!
+```bash
ng_run "+config_paths=[$config_paths]"
```
+
The `ng_prepare_data` command will:
1. Attempt to load all the datasets you specified from disk. Missing datasets will be reported before any processing is done.
2. For each dataset, read example by example. Check the format and report the filepaths and indices/ranges of offending examples if any.
diff --git a/docs/tutorials/creating-resource-server.md b/docs/tutorials/creating-resource-server.md
index 9f8ffd255..46789eb9e 100644
--- a/docs/tutorials/creating-resource-server.md
+++ b/docs/tutorials/creating-resource-server.md
@@ -2,7 +2,7 @@
# Creating a Resource Server
-Learn how to create a custom resource server to implement tools, verifiers, and business logic for your AI agents.
+Learn how to create a custom resource server to implement tools, verifiers, and business logic for your training environment.
::::{grid} 2
:gutter: 3
@@ -24,19 +24,19 @@ Learn how to create a custom resource server to implement tools, verifiers, and
## What is a Resource Server?
-Resource servers are the backbone of tool-based agent interactions in NeMo Gym. They provide:
+Resource servers are the backbone of tool-based interactions in NeMo Gym. They provide:
-- **Tool implementations**: APIs that agents can call to perform actions or retrieve information
-- **Verification logic**: Functions to evaluate agent performance and compute rewards
-- **Business logic abstraction**: Clean separation between agent logic and domain-specific functionality
+- **Tool implementations**: APIs that models can call to perform actions or retrieve information
+- **Verification logic**: Functions to evaluate model performance and compute rewards
+- **Business logic abstraction**: Clean separation between model logic and domain-specific functionality
-Each resource server must implement a `verify` function that evaluates the agent's interactions and returns a reward signal for reinforcement learning.
+Each resource server must implement a `verify` function that evaluates the model's interactions and returns a reward signal for reinforcement learning.
---
## 1. Initialize the Resource Server
-Resource servers live in the `resources_servers/` directory. Create a weather server that provides weather information to agents.
+Resource servers live in the `resources_servers/` directory. Create a weather server that provides weather information to models.
Run the initialization command from the repository root:
@@ -153,9 +153,9 @@ class MyWeatherResourcesServer(SimpleResourcesServer):
async def verify(self, body: BaseVerifyRequest) -> BaseVerifyResponse:
"""
- Verification function: Evaluate agent performance.
+ Verification function: Evaluate rollout performance.
- This function is called after an agent completes an interaction.
+ This function is called after a rollout completes.
Return a reward between 0.0 and 1.0.
For this simple example, we always return 1.0 (success).
@@ -175,7 +175,7 @@ if __name__ == "__main__":
3. **Server Class**: Extends `SimpleResourcesServer` and implements tools and verification
4. **`setup_webserver()`**: Registers FastAPI routes for your tools
5. **Tool Methods**: Async functions that implement the actual tool logic
-6. **`verify()`**: **Required** method that evaluates agent performance and returns a reward
+6. **`verify()`**: **Required** method that evaluates task performance and returns a reward
---
@@ -293,15 +293,15 @@ policy_base_url: https://api.openai.com/v1
policy_model_name: gpt-4o-mini
```
-### Test the Agent
+### Test the Resource Server
-After the servers start, test your agent in a new terminal:
+After the servers start, test your resource server in a new terminal:
```bash
python responses_api_agents/simple_agent/client.py
```
-The agent should be able to use your `get_weather` tool to answer questions about weather!
+The model should be able to use your `get_weather` tool to answer questions about weather!
---
@@ -319,7 +319,7 @@ Your resource server needs example data for testing and validation. Create `reso
### Generate Example Rollouts
-Collect rollouts by running the agent against your example inputs. This generates interaction traces showing how agents use your tools:
+Collect rollouts by running against your example inputs. This generates interaction traces showing how models use your tools:
```bash
ng_collect_rollouts +agent_name=my_weather_tool_simple_agent \
@@ -331,7 +331,7 @@ ng_collect_rollouts +agent_name=my_weather_tool_simple_agent \
```
:::{note}
-Ensure your servers are running (from step 6) before collecting rollouts. The command processes each input example, runs it through the agent, and saves the complete interaction including tool calls and verification rewards to `example_rollouts.jsonl`.
+Ensure your servers are running (from step 6) before collecting rollouts. The command processes each input example, runs it through the servers, and saves the complete interaction including tool calls and verification rewards to `example_rollouts.jsonl`.
:::
---
@@ -386,14 +386,14 @@ Before submitting a PR, ensure you have:
## Advanced: Custom Verification
-For more sophisticated verification, you can implement custom logic in the `verify` function. Here's an example that checks if the agent used the correct tool:
+For more sophisticated verification, you can implement custom logic in the `verify` function. Here's an example that checks if the model used the correct tool:
```python
async def verify(self, body: BaseVerifyRequest) -> BaseVerifyResponse:
"""
- Advanced verification: Check if agent called the get_weather tool.
+ Advanced verification: Check if model called the get_weather tool.
"""
- # Check if the agent made a function call
+ # Check if the model made a function call
used_tool = False
for output in body.response.output:
if output.type == "function_call" and output.name == "get_weather":
@@ -420,7 +420,7 @@ Now that you have a working resource server:
1. **Add training data**: Collect rollouts and prepare datasets for RL training
2. **Add complex verification**: Add reward shaping and detailed performance metrics
3. **Scale up**: Add more tools and more sophisticated business logic
-4. **Integrate with RL**: Use {doc}`rl-training-with-nemo-rl` to train agents on your tasks
+4. **Integrate with RL**: Use {ref}`RL Training with NeMo RL using GRPO ` to train models on your tasks
::::{grid} 2
:gutter: 3
@@ -432,9 +432,9 @@ Learn how to collect and process rollouts for training data.
:::
:::{grid-item-card} {octicon}`rocket;1.5em;sd-mr-1` Train with NeMo RL
-:link: rl-training-with-nemo-rl
-:link-type: doc
-Train agents using your resource server with NeMo RL.
+:link: training-nemo-rl-grpo-index
+:link-type: ref
+Train models using your resource server with NeMo RL.
:::
::::
@@ -480,8 +480,8 @@ You've learned how to:
✅ Configure the required `domain` field
✅ Add tools and verification logic
✅ Write and run tests
-✅ Run your server with an agent
+✅ Run your server with a model
✅ Create required data artifacts
-Resource servers are the foundation for building custom RL environments in NeMo Gym. Experiment with different tool implementations and verification strategies to create engaging tasks for your agents!
+Resource servers are the foundation for building custom RL environments in NeMo Gym. Experiment with different tool implementations and verification strategies to create engaging tasks for your models!
diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
index bd6cda0a7..40764fb96 100644
--- a/docs/tutorials/index.md
+++ b/docs/tutorials/index.md
@@ -2,7 +2,7 @@
# NeMo Gym Tutorials
-Hands-on learning experiences that guide you through building, training, and deploying AI agents with NeMo Gym.
+Hands-on tutorials to build and customize your training environments.
:::{tip}
**New to NeMo Gym?** Begin with the {doc}`Get Started <../get-started/index>` section for a guided tutorial from installation through your first verified agent. Return here afterward to learn about advanced topics like additional rollout collection methods and training data generation. You can find the project repository on [GitHub](https://github.com/NVIDIA-NeMo/Gym).
@@ -19,9 +19,9 @@ Create custom resource servers and implement tool-based agent interactions.
:::{grid-item-card} {octicon}`tools;1.5em;sd-mr-1` Creating a Resource Server
:link: creating-resource-server
:link-type: doc
-Build custom resource servers with tools, verification logic, and business logic for your AI agents.
+Implement or integrate existing tools and define task verification logic.
+++
-{bdg-primary}`beginner` {bdg-secondary}`30 min`
+{bdg-primary}`beginner` {bdg-secondary}`30 min` {bdg-secondary}`custom-environments` {bdg-secondary}`tools`
:::
::::
@@ -43,12 +43,21 @@ Transform rollouts into training data for {term}`supervised fine-tuning (SFT) `
+The Workplace Assistant is an **agentic {term}`tool-use ` {term}`training environment `** that tests a model's ability to execute business tasks in a simulated workplace setting.
+
+:::{card}
+
+**Goal**: Understand the training environment and how tasks are structured and verified.
+
+^^^
+
+**In this section, you will learn**:
+
+1. How tasks are structured for multi-step tool calling
+2. The available databases and tools
+3. How the environment verifies task completion
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-index
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Back to Tutorial Overview
+:::
+
+---
+
+## How the Model Completes Tasks
+
+For each task, the model must:
+
+1. Understand the user's intent from natural language
+2. Determine which tools to call and in what order
+3. Infer correct parameters (for example, look up email addresses or find matching customer records)
+4. Execute all necessary steps to complete the task
+
+The model has up to **6 tool-calling steps** to accomplish each task.
+
+---
+
+## Available Databases and Tools
+
+Each task is a natural language request that the model must complete using the available tools. All tasks share the same set of tools that allow the model to retrieve more information or perform actions. Each {term}`task instance ` uses isolated database instances so actions from different rollouts don't interfere.
+
+- **Databases**: Email, Calendar, Analytics, Project Management, Customer Relationship Manager (CRM)
+- **Tools**: Distributed across these databases
+- **Tasks**: Common business activities (such as sending emails, scheduling meetings, and managing projects)
+
+All tasks are available in the [Workplace Assistant HuggingFace dataset](https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant).
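+
+If you want to inspect the tasks locally, one option (a sketch, assuming the dataset exposes standard splits to the `datasets` library) is:
+
+```python
+from datasets import load_dataset
+
+# Download the task dataset and print one example (split name is an assumption)
+ds = load_dataset("nvidia/Nemotron-RL-agent-workplace_assistant", split="train")
+print(ds[0])
+```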
+
+---
+
+## Task Examples
+
+::::{tab-set}
+
+:::{tab-item} Single-Step Task
+
+**User query**: "Send an email to john.smith@atlas.com with the subject 'Team Meeting' and body 'Let's meet tomorrow at 2pm to discuss the project.'"
+
+**Expected tool call**:
+```python
+email_send_email(
+ recipient="john.smith@atlas.com",
+ subject="Team Meeting",
+ body="Let's meet tomorrow at 2pm to discuss the project."
+)
+```
+
+The tool adds a new email to the emails database.
+
+:::
+
+:::{tab-item} Multi-Step Task
+
+**User query**: "John is taking over all of Akira's leads that are interested in software. Can you reassign them in the CRM?"
+
+**Expected output sequence**:
+
+1. `company_directory_find_email_address(name="Akira")` → Returns `"akira.tanaka@atlas.com"`
+2. `company_directory_find_email_address(name="John")` → Returns `"john.smith@atlas.com"`
+3. `customer_relationship_manager_search_customers(assigned_to_email="akira.tanaka@atlas.com", product_interest="software", status="lead")` → Returns 3 matching leads
+4. `customer_relationship_manager_update_customer(customer_id="00000095", field="assigned_to_email", new_value="john.smith@atlas.com")`
+5. `customer_relationship_manager_update_customer(customer_id="00000080", field="assigned_to_email", new_value="john.smith@atlas.com")`
+6. `customer_relationship_manager_update_customer(customer_id="00000035", field="assigned_to_email", new_value="john.smith@atlas.com")`
+
+:::
+
+:::{tab-item} Input Format
+
+Each task is a `responses_create_params` object:
+
+```json
+{
+ "responses_create_params": {
+ "input": [
+ {
+ "role": "system",
+ "content": "Today's date is Thursday, 2023-11-30 and the current time is 23:59:00. Remember the current date and time when answering queries. Meetings must not start before 9am or end after 6pm."
+ },
+ {
+ "role": "user",
+ "content": "John is taking over all of Akira's leads that are interested in software. Can you reassign them in the CRM?"
+ }
+ ],
+ "tools": [
+ {
+ "type": "function",
+ "name": "email_send_email",
+ "description": "Sends an email to the specified recipient.",
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "recipient": {
+ "type": "string",
+ "description": "Email address of the recipient"
+ },
+ "subject": {
+ "type": "string",
+ "description": "Subject line of the email"
+ },
+ "body": {
+ "type": "string",
+ "description": "Body content of the email"
+ }
+ },
+ "required": ["recipient", "subject", "body"],
+ "additionalProperties": false
+ },
+ "strict": false
+ }
+ ]
+ }
+}
+```
+
+The full task includes all 27 tools across the 5 databases.
+
+:::
+
+::::
+
+---
+
+## How Verification Works
+
+The environment is implemented as a FastAPI-based resource server that executes tools and runs verification. It uses **state-matching verification**: instead of requiring an exact tool-call sequence, it compares final database states.
+
+::::{tab-set}
+
+:::{tab-item} Why State-Matching?
+
+- **Flexibility**: Multiple valid solution paths exist for the same task
+- **Robustness**: Model can recover from mistakes mid-trajectory
+- **Goal-oriented**: Focuses on outcomes, not specific procedures
+
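+A toy sketch (not the server code) of why order-insensitive state comparison works:
+
+```python
+# Two different tool-call orders produce the same final CRM state,
+# so state-matching scores both trajectories as correct.
+def apply(actions):
+    state = {}
+    for customer_id, assignee in actions:
+        state[customer_id] = assignee
+    return state
+
+order_a = [("00000095", "john"), ("00000080", "john")]
+order_b = [("00000080", "john"), ("00000095", "john")]
+assert apply(order_a) == apply(order_b)
+```
+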
+:::
+
+:::{tab-item} Verification Process
+
+```python
+async def verify(self, body: WorkbenchVerifyRequest) -> WorkbenchVerifyResponse:
+ ground_truth = body.ground_truth
+ response = body.response.output
+
+ total_score = 0.0
+
+ # Convert list of ResponseFunctionToolCall objects into list of dictionaries
+ predicted_function_calls = []
+
+ for message in response:
+ if message.type == "function_call":
+ predicted_function_calls.append(message.model_dump())
+
+ predicted_chat_content = []
+
+ for message in response:
+ if message.type == "output_text":
+ predicted_chat_content.append(message.model_dump())
+
+ total_score += is_correct(predicted_function_calls, ground_truth, None) * 1.0
+ return WorkbenchVerifyResponse(**body.model_dump(), reward=total_score)
+```
+
+The `is_correct` function implements the state-matching logic:
+
+```python
+def is_correct(predicted_actions, ground_truth_actions, error):
+ ...
+
+ # Execute both sequences in fresh environments
+ predict_env = execute_actions_and_reset_state(predicted_actions)
+ ground_truth_env = execute_actions_and_reset_state(ground_truth_actions)
+
+ ... # Extract specific state info
+
+ # Compare final states of all 5 databases
+ return (
+ predicted_calendar_state.equals(ground_truth_calendar_state)
+ and predicted_email_state.equals(ground_truth_email_state)
+ and predicted_analytics_state.equals(ground_truth_analytics_state)
+ and predicted_project_management_state.equals(ground_truth_project_management_state)
+ and predicted_customer_relationship_manager_state.equals(ground_truth_customer_relationship_manager_state)
+ )
+```
+
+:::
+
+:::{tab-item} Error Handling
+
+Tool execution errors are returned to the model rather than terminating the rollout, allowing it to self-correct:
+
+```python
+async def route_to_python_function(self, path, body, request):
+ ...
+ tool_env = self.session_id_to_tool_env[session_id]
+ args = body.model_dump(exclude_unset=True)
+
+ try:
+ function = tool_env["functions"][path]
+ result = function(**args)
+ return WorkbenchResponse(output=result)
+ except Exception as e:
+ # Return error to model so it can self-correct
+ return WorkbenchResponse(output=f"Error executing tool '{path}': {str(e)}")
+```
+
+:::
+
+::::
diff --git a/docs/tutorials/nemo-rl-grpo/gym-configuration.md b/docs/tutorials/nemo-rl-grpo/gym-configuration.md
new file mode 100644
index 000000000..d6cf032e2
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/gym-configuration.md
@@ -0,0 +1,90 @@
+(training-nemo-rl-grpo-gym-configuration)=
+
+# Gym Configuration
+
+Before running GRPO training, you need to configure how NeMo RL connects to NeMo Gym. The training config file contains Gym-specific parameters that control data loading, environment interaction, and validation.
+
+:::{card}
+
+**Goal**: Understand the Gym configuration parameters for RL training.
+
+^^^
+
+**In this section, you will learn**:
+
+1. How to configure data paths for training and validation
+2. How to enable and configure NeMo Gym in NeMo RL
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-about-workplace-assistant
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Previous: About Workplace Assistant
+:::
+
+---
+
+## Configuration File Location
+
+The full training configuration file is located at:
+
+```
+examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml
+```
+
+---
+
+## Gym Configuration Sections
+
+There are two Gym-specific sections in the NeMo RL training config: `data` and `env`.
+
+### Data Section
+
+```yaml
+data:
+ train_jsonl_fpath: 3rdparty/Gym-workspace/Gym/data/workplace_assistant/train.jsonl
+ validation_jsonl_fpath: 3rdparty/Gym-workspace/Gym/data/workplace_assistant/validation.jsonl
+```
+
+| Parameter | Description |
+|-----------|-------------|
+| `train_jsonl_fpath` | Path to training dataset (prepared in {doc}`Setup <setup>`) |
+| `validation_jsonl_fpath` | Path to validation dataset |
+
+### Environment Section
+
+```yaml
+env:
+ should_use_nemo_gym: true
+ nemo_gym:
+ config_paths:
+ - responses_api_models/vllm_model/configs/vllm_model_for_training.yaml
+ - resources_servers/workplace_assistant/configs/workplace_assistant.yaml
+ workplace_assistant_simple_agent:
+ responses_api_agents:
+ simple_agent:
+ max_steps: 6
+```
+
+| Parameter | Description |
+|-----------|-------------|
+| `should_use_nemo_gym` | Set to `true` to enable Gym |
+| `nemo_gym` | Everything under this key is a Gym config |
+| `nemo_gym.config_paths` | Gym config files: vLLM model config and Workplace Assistant agent/resources config |
+| `max_steps` | Maximum tool-calling steps per task (6 for Workplace Assistant) |
+
+:::{important}
+The `vllm_model_for_training.yaml` config is required for NeMo RL training integration.
+:::
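+
+The role of `max_steps` can be pictured with a small sketch. This is illustrative only, not the Gym agent implementation: the agent loop runs at most `max_steps` tool-calling iterations, and the rollout ends early if the model produces a final answer.
+
+```python
+# Illustrative sketch (NOT the Gym simple_agent implementation):
+# how a max_steps budget bounds a multi-step tool-calling loop.
+def run_episode(model_step, max_steps=6):
+    """model_step returns either ("tool", call) or ("final", answer)."""
+    transcript = []
+    for _ in range(max_steps):
+        kind, payload = model_step(transcript)
+        transcript.append((kind, payload))
+        if kind == "final":
+            return payload, transcript
+    # Budget exhausted before a final answer was produced
+    return None, transcript
+```
+
+A model that answers within the budget returns its final answer; one that keeps calling tools is cut off after `max_steps` steps.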
+
+---
+
+:::{button-ref} training-nemo-rl-grpo-nemo-rl-configuration
+:color: primary
+:ref-type: ref
+
+Next: NeMo RL Configuration →
+:::
diff --git a/docs/tutorials/nemo-rl-grpo/index.md b/docs/tutorials/nemo-rl-grpo/index.md
new file mode 100644
index 000000000..fedba76ad
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/index.md
@@ -0,0 +1,155 @@
+(training-nemo-rl-grpo-index)=
+
+# RL Training with NeMo RL using GRPO
+
+This tutorial trains NVIDIA [Nemotron Nano 9B v2](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2) to improve its **{term}`multi-step` {term}`tool-calling`** capability using the **{term}`GRPO (Group Relative Policy Optimization)`** algorithm on the **Workplace Assistant** environment.
+
+Workplace Assistant is a realistic office simulation (calendar, email, project management, etc.) with complex multi-step tasks, providing a strong data distribution for training enterprise-ready tool-using assistants.
+
+:::{card}
+
+**Goal**: Train a model for multi-step tool calling using GRPO on the Workplace Assistant environment.
+
+^^^
+
+**In this tutorial, you will**:
+
+1. Set up NeMo RL and NeMo Gym for {term}`reinforcement learning` training
+2. Understand the Workplace Assistant environment and its multi-step tool calling tasks
+3. Configure and run GRPO training on Nemotron Nano v2 9B
+4. Monitor training progress via Weights & Biases (W&B)
+
+:::
+
+> **TL;DR:** Want to jump straight to running commands? Skip to {doc}`Setup <setup>`.
+
+---
+
+## Before You Begin
+
+Make sure you have these prerequisites ready:
+
+- ✅ **Hardware**: 1+ nodes with 8× NVIDIA GPUs (80GB+ each, such as H100 or A100)
+ - Single-node testing: 1 node with 8 GPUs
+ - Multi-node production: 8+ nodes with 8 GPUs each recommended
+ - RAM: 64 GB+ per node
+- ✅ **Storage**: 100 GB+ free disk space on a shared filesystem
+- ✅ **Software**: Linux, Python 3.12+, Git, Slurm for multi-node training
+- ✅ **Familiarity**: Python, LLM fine-tuning, basic RL concepts (in-depth RLVR/GRPO knowledge not required)
+
+:::{note}
+NeMo Gym does not require GPUs. GPUs are only necessary for GRPO training with NeMo RL.
+:::
+
+**Optional accounts**:
+
+- **Weights & Biases (W&B)**: For experiment tracking ([sign up](https://wandb.ai/signup), [get API key](https://wandb.ai/authorize)). Training proceeds without W&B if not configured.
+- **HuggingFace**: For downloading models ([create token](https://huggingface.co/settings/tokens)). Recommended to avoid rate limits.
+
+**Total time estimate**: ~3-5 hours (including environment setup, data preparation, and training)
+
+---
+
+## Tutorial Steps
+
+Follow these steps sequentially to complete the tutorial:
+
+::::{grid} 1
+:gutter: 2
+
+:::{grid-item-card} 1. About the Workplace Assistant Training Environment
+:link: training-nemo-rl-grpo-about-workplace-assistant
+:link-type: ref
+
+Understand the dataset you will train on and its multi-step tool calling tasks.
++++
+{bdg-secondary}`background`
+:::
+
+:::{grid-item-card} 2. Gym Configuration
+:link: training-nemo-rl-grpo-gym-configuration
+:link-type: ref
+
+Understand the Gym configuration component in the NeMo RL training config file.
++++
+{bdg-secondary}`configuration`
+:::
+
+:::{grid-item-card} 3. NeMo RL Configuration
+:link: training-nemo-rl-grpo-nemo-rl-configuration
+:link-type: ref
+
+Understand the GRPO and NeMo RL configuration components in the training config file.
++++
+{bdg-secondary}`configuration`
+:::
+
+:::{grid-item-card} 4. Setup
+:link: training-nemo-rl-grpo-setup
+:link-type: ref
+
+Clone repositories, install dependencies, and prepare the training data.
++++
+{bdg-primary}`prerequisite`
+:::
+
+:::{grid-item-card} 5. Single Node Training
+:link: training-nemo-rl-grpo-single-node-training
+:link-type: ref
+
+Perform a single node GRPO training run with success criteria.
++++
+{bdg-primary}`training`
+:::
+
+:::{grid-item-card} 6. Multi-Node Training
+:link: training-nemo-rl-grpo-multi-node-training
+:link-type: ref
+
+Scale to multi-node GRPO training for production.
++++
+{bdg-primary}`training`
+:::
+
+::::
+
+---
+
+## What's Next?
+
+After completing this tutorial, explore these options:
+
+::::{grid} 1 1 2 2
+:gutter: 3
+
+:::{grid-item-card} {octicon}`package;1.5em;sd-mr-1` Use Other Training Environments
+:link: https://github.com/NVIDIA-NeMo/Gym#-available-resource-servers
+
+Browse available resource servers on GitHub to find other training environments.
++++
+{bdg-secondary}`github` {bdg-secondary}`resource-servers`
+:::
+
+:::{grid-item-card} {octicon}`tools;1.5em;sd-mr-1` Build a Custom Training Environment
+:link: ../creating-resource-server
+:link-type: doc
+
+Create your own resource server with custom tools and verification logic.
++++
+{bdg-secondary}`tutorial` {bdg-secondary}`custom-tools`
+:::
+
+::::
+
+```{toctree}
+:caption: NeMo RL GRPO
+:hidden:
+:maxdepth: 1
+
+about-workplace-assistant.md
+gym-configuration.md
+nemo-rl-configuration.md
+setup.md
+single-node-training.md
+multi-node-training.md
+```
\ No newline at end of file
diff --git a/docs/tutorials/nemo-rl-grpo/multi-node-training.md b/docs/tutorials/nemo-rl-grpo/multi-node-training.md
new file mode 100644
index 000000000..d10adc011
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/multi-node-training.md
@@ -0,0 +1,143 @@
+(training-nemo-rl-grpo-multi-node-training)=
+
+# Multi-Node Training
+
+Your single-node test run confirmed that the environment, model, and training loop all work correctly. Now you can scale to multiple nodes to accelerate GRPO training for production runs.
+
+:::{card}
+
+**Goal**: Scale GRPO training to multiple nodes for production training.
+
+^^^
+
+**In this section, you will**:
+
+1. Launch a multi-node training job using Slurm batch mode
+2. Monitor training metrics in Weights & Biases
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-single-node-training
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Previous: Single Node Training
+:::
+
+---
+
+## Before You Begin
+
+:::{important}
+**Complete the {doc}`Single Node Training <single-node-training>` first. Do not skip it.** The single-node setup validates that your environment is configured correctly before attempting multi-node training.
+:::
+
+Make sure you have:
+
+- ✅ Successfully completed 3 training steps on a single node
+- ✅ Access to the Slurm login/head node (not inside the interactive container)
+- ✅ Weights & Biases API key for experiment tracking
+
+---
+
+## 1. Launch Multi-Node Training
+
+**Estimated time**: Several hours (depending on configuration)
+
+For production training, scale to multiple nodes by changing `cluster.num_nodes`. This example uses **batch mode**, where the `COMMAND` variable specifies what to run automatically when the job starts.
+
+:::{note}
+Run this command from the **Slurm login/head node**, not from inside the interactive container. This submits a new batch job that runs independently.
+:::
+
+```bash
+cd /path/to/nemo/rl
+
+# Submit multi-node job
+# Set these environment variables before running:
+# WANDB_API_KEY: Your Weights & Biases API key for logging
+# EXP_NAME: Experiment name
+# NUM_ACTOR_NODES: Number of GPU nodes to use (2, 4, 8, etc.)
+# CONTAINER_IMAGE_PATH: The container to use.
+# SLURM_ACCOUNT: Slurm account
+# SLURM_PARTITION: Slurm partition
+WANDB_API_KEY={your W&B API key} \
+EXP_NAME=nemo_gym_grpo/nemotron_nano_v2_9b/2nodes/workplace_assistant_001 \
+NUM_ACTOR_NODES=2 \
+REPO_LOCATION=$PWD \
+CONTAINER_IMAGE_PATH=nvcr.io/nvidia/nemo-rl:v0.4.0.nemotron_nano_v3 \
+SLURM_ACCOUNT={your Slurm account} \
+SLURM_PARTITION={your Slurm partition} \
+ examples/nemo_gym/launch_nemo_gym_multinode_training.sh \
+ --config=examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml \
+ logger.wandb.project="$USER-nemo-gym-rl-integration"
+```
+
+:::{tip}
+If you are using enroot following the steps in the {doc}`Setup <setup>` doc and downloaded the container locally, use the local container filepath instead:
+
+```bash
+CONTAINER_IMAGE_PATH=$PWD/../nvcr.io/nvidia/nemo-rl:v0.4.0.nemotron_nano_v3 \
+```
+:::
+
+**✅ Success Check**: The Slurm job is submitted and begins running on multiple nodes.
+
+---
+
+## 2. Monitor Training Progress
+
+Monitor these metrics in W&B to track progress:
+
+| Metric | Description |
+|--------|-------------|
+| `train:reward_mean` | The average reward of your model on this training environment. The reward may be noisy, but it should go up. |
+| `val:accuracy` | The validation performance of your model on this training environment. This should go up steadily. |
+
+The best checkpoint (highest `val:accuracy`) is retained based on `checkpointing.keep_top_k: 3`. You can find checkpoints at the following path:
+
+```bash
+ls results/$EXP_NAME
+```
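+
+The retention policy can be sketched as follows. This is an illustrative model of keep-top-k behavior, not the NeMo RL checkpointing code: the `k` checkpoints with the highest validation accuracy are kept.
+
+```python
+# Illustrative sketch of a keep-top-k checkpoint policy (not the
+# actual NeMo RL implementation): retain the k checkpoints with the
+# highest validation accuracy.
+def keep_top_k(checkpoints, k=3):
+    """checkpoints: dict mapping training step -> validation accuracy."""
+    ranked = sorted(checkpoints, key=checkpoints.get, reverse=True)
+    return set(ranked[:k])
+```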
+
+**✅ Success Check**: Training is successful when:
+
+- Reward mean increases consistently over steps
+- Validation accuracy consistently improves
+- No OOM (Out of Memory) errors occur
+- Checkpoints are saved at specified intervals
+
+---
+
+## 3. Measure Real-World Improvement
+
+The Workplace Assistant environment's tool-calling tasks correlate with performance on the [Berkeley Function Calling Leaderboard (BFCL) v3](https://gorilla.cs.berkeley.edu/leaderboard.html) benchmark. To measure improvement, evaluate the Nemotron Nano v2 9B model on BFCL v3 before and after training, and compare the results. You should observe measurable improvement in tool-calling accuracy.
+
+You can run BFCL v3 evaluations using [NeMo Evaluator](https://github.com/NVIDIA-NeMo/Evaluator), which supports BFCL v3. Refer to the [NeMo Evaluator docs](https://github.com/NVIDIA-NeMo/Evaluator#-supported-benchmarks-and-evaluation-harnesses) for full setup instructions and supported benchmarks.
+
+**✅ Success Check**: BFCL v3 scores improve after training compared to the baseline model.
+
+---
+
+## What's Next?
+
+Congratulations! You've trained Nemotron Nano 9B v2 for multi-step tool calling using GRPO on the Workplace Assistant environment.
+
+::::{grid} 1 1 2 2
+:gutter: 3
+
+:::{grid-item-card} {octicon}`package;1.5em;sd-mr-1` Use Other Training Environments
+:link: https://github.com/NVIDIA-NeMo/Gym#-available-resource-servers
+
+Browse available resource servers on GitHub to find other training environments.
+:::
+
+:::{grid-item-card} {octicon}`tools;1.5em;sd-mr-1` Build a Custom Training Environment
+:link: ../creating-resource-server
+:link-type: doc
+
+Create your own resource server with custom tools and verification logic.
+:::
+
+::::
\ No newline at end of file
diff --git a/docs/tutorials/nemo-rl-grpo/nemo-rl-configuration.md b/docs/tutorials/nemo-rl-grpo/nemo-rl-configuration.md
new file mode 100644
index 000000000..0a4c86501
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/nemo-rl-configuration.md
@@ -0,0 +1,82 @@
+(training-nemo-rl-grpo-nemo-rl-configuration)=
+
+# NeMo RL Configuration
+
+With the Gym configuration in place, the next step is understanding the core training parameters. These control the GRPO algorithm, model behavior, and optimization settings that determine how your model learns.
+
+:::{card}
+
+**Goal**: Understand the GRPO and model hyperparameters for RL training.
+
+^^^
+
+**In this section, you will learn**:
+
+1. Model configuration parameters
+2. GRPO hyperparameters
+3. Optimizer settings
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-gym-configuration
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Previous: Gym Configuration
+:::
+
+---
+
+## Configuration File Location
+
+The full training configuration file is located at:
+
+```
+examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml
+```
+
+---
+
+## Model Configuration
+
+| Parameter | Value | Description |
+|-----------|-------|-------------|
+| `model_name` | nvidia/NVIDIA-Nemotron-Nano-9B-v2 | Base model |
+| `max_total_sequence_length` | 32768 | Maximum context length |
+| `precision` | bfloat16 | Training precision |
+| `tensor_model_parallel_size` | 8 | Tensor parallelism across GPUs |
+
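+For intuition, tensor parallelism can be sketched as splitting a linear layer's weight columns across ranks. This is a toy single-process model, not Megatron's implementation: each rank computes the output features for its column shard, and concatenating the shards reproduces the full output.
+
+```python
+# Toy sketch of column-wise tensor parallelism (NOT Megatron code):
+# each "rank" owns a shard of the weight columns and computes its
+# slice of the output; concatenation recovers the full result.
+def tp_linear(x, weight_cols, tp_size):
+    """x: input vector; weight_cols: list of columns, each len(x) long."""
+    shard = len(weight_cols) // tp_size
+    outs = []
+    for rank in range(tp_size):
+        cols = weight_cols[rank * shard:(rank + 1) * shard]
+        # Each rank computes dot products for its own columns only
+        outs.extend(sum(xi * c for xi, c in zip(x, col)) for col in cols)
+    return outs
+```
+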
+---
+
+## GRPO Hyperparameters
+
+| Parameter | Value | Description |
+|-----------|-------|-------------|
+| `num_prompts_per_step` | 4 | Number of prompts per training step |
+| `num_generations_per_prompt` | 4 | Rollouts generated per prompt |
+| `max_num_steps` | 10 | Total training steps |
+| `use_leave_one_out_baseline` | true | Variance reduction technique |
+| `normalize_rewards` | true | Normalize rewards across batch |
+
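+For intuition, the leave-one-out baseline can be sketched as a toy calculation (not the NeMo RL implementation): each rollout's advantage is its reward minus the mean reward of the *other* rollouts generated for the same prompt, which reduces variance without biasing the gradient.
+
+```python
+# Toy sketch of group-relative advantages with a leave-one-out
+# baseline (illustrative only -- not the NeMo RL implementation).
+def loo_advantages(rewards):
+    """Advantage of each rollout = reward minus mean of the others."""
+    n = len(rewards)
+    total = sum(rewards)
+    return [r - (total - r) / (n - 1) for r in rewards]
+```
+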
+---
+
+## Optimizer Settings
+
+| Parameter | Value | Description |
+|-----------|-------|-------------|
+| `optimizer` | Adam | Optimizer type |
+| `lr` | 5.0e-6 | Learning rate |
+| `min_lr` | 5.0e-7 | Minimum learning rate |
+| `weight_decay` | 0.01 | Weight decay |
+| `adam_beta1` / `adam_beta2` | 0.9 / 0.999 | Adam hyperparameters |
+| `clip_grad` | 1.0 | Gradient clipping threshold |
+
+---
+
+:::{button-ref} training-nemo-rl-grpo-setup
+:color: primary
+:ref-type: ref
+
+Next: Setup →
+:::
diff --git a/docs/tutorials/nemo-rl-grpo/setup.md b/docs/tutorials/nemo-rl-grpo/setup.md
new file mode 100644
index 000000000..f27af66fe
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/setup.md
@@ -0,0 +1,210 @@
+(training-nemo-rl-grpo-setup)=
+
+# Setup
+
+Now that you understand the configuration parameters for GRPO training, it's time to set up your environment. This involves launching containers, installing dependencies, and preparing your training data—the foundation for everything that follows.
+
+:::{card}
+
+**Goal**: Set up your environment for GRPO training with NeMo RL and NeMo Gym.
+
+^^^
+
+**In this section, you will**:
+
+1. Launch an interactive GPU session
+2. Clone and install NeMo RL and NeMo Gym
+3. Run sanity tests to validate the setup
+4. Prepare the Workplace Assistant dataset
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-nemo-rl-configuration
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Previous: NeMo RL Configuration
+:::
+
+---
+
+## Before You Begin
+
+Make sure you have:
+
+- ✅ Access to a Slurm cluster with GPU nodes
+- ✅ A shared filesystem accessible from all nodes
+- ✅ HuggingFace token for downloading models
+
+---
+
+## 1. Enter a GPU Node
+
+**Estimated time**: ~5 minutes
+
+Launch an interactive Slurm session to run training commands. Refer to the [NeMo RL Cluster Setup documentation](https://docs.nvidia.com/nemo/rl/latest/cluster.html#interactive-launching) for more details.
+
+If this is your first time downloading this Docker image, the `srun` command below will take 5-10 minutes.
+
+:::{tip}
+If you are using enroot as a containerization framework, you can pull the container after defining `$CONTAINER_IMAGE_PATH`:
+
+```bash
+mkdir -p "$(dirname "$CONTAINER_IMAGE_PATH")"
+enroot import -o "$CONTAINER_IMAGE_PATH" "docker://${CONTAINER_IMAGE_PATH}"
+# Swap to local container path
+CONTAINER_IMAGE_PATH=./$CONTAINER_IMAGE_PATH
+```
+:::
+
+```bash
+# Use the official NeMo RL container from NGC
+# See: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo-rl
+CONTAINER_IMAGE_PATH=nvcr.io/nvidia/nemo-rl:v0.4.0.nemotron_nano_v3
+
+NUM_ACTOR_NODES=1
+ACCOUNT=
+PARTITION=
+
+CONTAINER_WORKDIR=$PWD
+MOUNTS="$PWD:$PWD"
+srun \
+ --nodes=${NUM_ACTOR_NODES} \
+ --ntasks=1 \
+ --account=${ACCOUNT} \
+ --partition=${PARTITION} \
+ --time=04:00:00 \
+ --gres=gpu:8 \
+ --no-container-mount-home \
+ --container-name=nemo-gym \
+ --container-mounts="${MOUNTS}" \
+ --container-image="${CONTAINER_IMAGE_PATH}" \
+ --container-workdir=$CONTAINER_WORKDIR \
+ --pty /bin/bash
+```
+
+**✅ Success Check**: You should be inside the container with a bash prompt.
+
+---
+
+## 2. Clone and Setup NeMo RL + NeMo Gym
+
+**Estimated time**: ~5-10 minutes
+
+For the first setup on your local filesystem:
+
+```bash
+# Clone NeMo RL repository
+git clone https://github.com/NVIDIA-NeMo/RL
+cd RL
+
+# Clone NeMo Gym as a submodule
+git clone https://github.com/NVIDIA-NeMo/Gym.git 3rdparty/Gym-workspace/Gym
+```
+
+Every time you enter a new container:
+
+```bash
+# CD into your NeMo RL folder
+cd /path/to/nemo/rl
+
+# Initialize all submodules (Megatron, AutoModel, etc.)
+git submodule update --init --recursive
+
+# Activate the NeMo RL virtual environment
+source /opt/nemo_rl_venv/bin/activate
+
+# Install dependencies. This may take 5-10 minutes!
+uv sync --group={build,docs,dev,test} --extra nemo_gym
+uv run nemo_rl/utils/prefetch_venvs.py
+```
+
+**✅ Success Check**: No errors during installation and `uv sync` completes successfully.
+
+---
+
+## 3. Run Sanity Tests
+
+**Estimated time**: ~5-10 minutes
+
+Download the model used in the following tests:
+
+```bash
+HF_HOME=$PWD/.cache/ \
+HF_TOKEN={your HF token} \
+ hf download Qwen/Qwen3-0.6B
+```
+
+Validate your setup before training:
+
+```bash
+HF_HOME=$PWD/.cache/ \
+ ./examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
+```
+
+**✅ Success Check**: All tests pass without errors.
+
+:::{tip}
+If you've run these tests before and encounter HuggingFace rate limit errors, try using the following command:
+
+```bash
+HF_HUB_OFFLINE=1 \
+HF_HOME=$PWD/.cache/ \
+ ./examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
+```
+:::
+
+---
+
+## 4. Prepare NeMo Gym Data
+
+**Estimated time**: ~5 minutes
+
+The Workplace Assistant dataset must be downloaded from HuggingFace and prepared for training. This runs `ng_prepare_data` to download and validate the dataset, and to add an `agent_ref` property to each example that tells NeMo Gym which agent server should handle that example.
+
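+The `agent_ref` annotation can be pictured with a small sketch. This is hypothetical and only for intuition: the example fields other than `agent_ref` are invented, and `ng_prepare_data` does more than this (downloading and validating the dataset).
+
+```python
+# Hypothetical illustration of the kind of annotation ng_prepare_data
+# performs: tagging each JSONL example with an `agent_ref` so NeMo Gym
+# knows which agent server should handle it. Example fields other than
+# `agent_ref` are invented for this sketch.
+import json
+
+def tag_examples(jsonl_lines, agent_ref):
+    tagged = []
+    for line in jsonl_lines:
+        example = json.loads(line)
+        example["agent_ref"] = agent_ref
+        tagged.append(json.dumps(example))
+    return tagged
+```
+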
+Clone and setup the Gym Python environment:
+
+```bash
+# Setup Gym local venv
+cd 3rdparty/Gym-workspace/Gym
+uv venv --python 3.12 --allow-existing .venv
+source .venv/bin/activate
+uv sync --active --extra dev
+```
+
+Add your HuggingFace token so NeMo Gym can download datasets from HuggingFace. This command stores the token in `env.yaml`, which is excluded from Git, so it is never committed or pushed:
+
+```bash
+echo "hf_token: {your HF token}" >> env.yaml
+```
+
+Prepare the data:
+
+```bash
+config_paths="responses_api_models/vllm_model/configs/vllm_model_for_training.yaml,\
+resources_servers/workplace_assistant/configs/workplace_assistant.yaml"
+
+ng_prepare_data "+config_paths=[${config_paths}]" \
+ +output_dirpath=data/workplace_assistant \
+ +mode=train_preparation \
+ +should_download=true \
+ +data_source=huggingface
+```
+
+Return to the NeMo RL directory and Python environment:
+
+```bash
+cd ../../.. && source /opt/nemo_rl_venv/bin/activate
+```
+
+**✅ Success Check**: Dataset files are created in `3rdparty/Gym-workspace/Gym/data/workplace_assistant/`.
+
+---
+
+:::{button-ref} training-nemo-rl-grpo-single-node-training
+:color: primary
+:ref-type: ref
+
+Next: Single Node Training →
+:::
diff --git a/docs/tutorials/nemo-rl-grpo/single-node-training.md b/docs/tutorials/nemo-rl-grpo/single-node-training.md
new file mode 100644
index 000000000..16af09153
--- /dev/null
+++ b/docs/tutorials/nemo-rl-grpo/single-node-training.md
@@ -0,0 +1,155 @@
+(training-nemo-rl-grpo-single-node-training)=
+
+# Single Node Training
+
+With your environment set up and data prepared, you're ready to run training. But before committing to a multi-hour, multi-node job, it's important to verify everything works correctly on a single node first.
+
+:::{card}
+
+**Goal**: Run a single-node GRPO training session to validate your environment.
+
+^^^
+
+**In this section, you will**:
+
+1. Download the Nemotron Nano 9B v2 model
+2. Configure the model's chat template
+3. Clean up existing processes
+4. Run a test training session with 3 steps
+
+:::
+
+:::{button-ref} training-nemo-rl-grpo-setup
+:color: secondary
+:outline:
+:ref-type: ref
+
+← Previous: Setup
+:::
+
+---
+
+## Before You Begin
+
+Make sure you have:
+
+- ✅ Completed the {doc}`Setup <setup>` instructions
+- ✅ Access to a running container session with GPUs
+- ✅ (Optional) Weights & Biases API key for experiment tracking
+
+:::{tip}
+Coming back from a break on a pre-existing filesystem setup? Run these commands once you enter the container:
+
+```bash
+source /opt/nemo_rl_venv/bin/activate
+uv sync --group={build,docs,dev,test} --extra nemo_gym
+uv run nemo_rl/utils/prefetch_venvs.py
+```
+:::
+
+---
+
+## 1. Download the Model
+
+**Estimated time**: ~5-10 minutes
+
+Download NVIDIA [Nemotron Nano 9B v2](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2):
+
+```bash
+HF_HOME=$PWD/.cache/ \
+HF_TOKEN={your HF token} \
+ hf download nvidia/NVIDIA-Nemotron-Nano-9B-v2
+```
+
+**✅ Success Check**: Model files are downloaded to `.cache/hub/models--nvidia--NVIDIA-Nemotron-Nano-9B-v2/`.
+
+---
+
+## 2. Configure the Chat Template
+
+**Estimated time**: ~1 minute
+
+The Nemotron Nano 9B v2 model ships with a custom chat template that must be adjusted for RL training. The commands below edit the cached copy of the template: the first disables thinking mode, and the second removes the template's special handling of a trailing assistant message:
+
+```bash
+tokenizer_config_path=$(find $PWD/.cache/hub/models--nvidia--NVIDIA-Nemotron-Nano-9B-v2 -name tokenizer_config.json)
+sed -i 's/enable_thinking=true/enable_thinking=false/g' $tokenizer_config_path
+sed -i 's/{%- if messages\[-1\]\['\''role'\''\] == '\''assistant'\'' -%}{%- set ns.last_turn_assistant_content = messages\[-1\]\['\''content'\''\].strip() -%}{%- set messages = messages\[:-1\] -%}{%- endif -%}//g' $tokenizer_config_path
+```
+
+**✅ Success Check**: The `sed` commands complete without errors.
+
+---
+
+## 3. Clean Up Existing Processes
+
+**Estimated time**: ~1 minute
+
+Clean up any existing or leftover Ray/vLLM processes:
+
+```bash
+pkill -f VllmAsyncGenerationWorker
+ray stop --force
+python -c "import ray; ray.shutdown()"
+```
+
+**✅ Success Check**: Commands complete without errors. It is okay if some processes are not found.
+
+---
+
+## 4. Run Training
+
+**Estimated time**: ~15-30 minutes
+
+By default, this runs only 3 training steps (`grpo.max_num_steps=3`) as a small test run in preparation for multi-node training. If you are using a single node for the full training run, you can remove this value. The full training will take several hours.
+
+```bash
+# Set experiment name with timestamp
+EXP_NAME="$(date +%Y%m%d)/nemo_gym_grpo/nemotron_nano_v2_9b/workplace_assistant_001"
+mkdir -p results/$EXP_NAME
+
+# Configuration file path
+CONFIG_PATH=examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml
+
+# Launch training
+# Set these environment variables before running:
+# WANDB_API_KEY: Your Weights & Biases API key for logging
+# logger.wandb.project: Fill in your username
+TORCH_CUDA_ARCH_LIST="9.0 10.0" \
+HF_HOME=$PWD/.cache/ \
+HF_HUB_OFFLINE=1 \
+WANDB_API_KEY={your W&B API key} \
+uv run python examples/nemo_gym/run_grpo_nemo_gym.py \
+ --config=$CONFIG_PATH \
+ ++logger.wandb.project="{your username}-nemo-gym-rl-integration" \
+ ++logger.wandb.name=$EXP_NAME \
+ ++logger.log_dir=results/$EXP_NAME \
+ ++policy.generation.vllm_cfg.tool_parser_plugin=$(find $PWD/.cache -name nemotron_toolcall_parser_no_streaming.py) \
+ ++grpo.max_num_steps=3 \
+ ++checkpointing.checkpoint_dir=results/$EXP_NAME &> results/$EXP_NAME/output.log &
+
+# Watch the logs
+tail -f results/$EXP_NAME/output.log
+```
+
+:::{tip}
+The end of the command above does the following:
+
+```bash
+&> results/$EXP_NAME/output.log &
+```
+
+1. `&> results/$EXP_NAME/output.log`: Redirects both stdout and stderr to a log file at `results/$EXP_NAME/output.log` that you can inspect.
+2. `&`: The final ampersand runs the job in the background, freeing your terminal for other work. List background jobs with the `jobs` command. To stop the training run, bring it to the foreground with `fg`, then press Ctrl+C as usual.
+:::
+
+**✅ Success Check**: Training completes 3 steps on single node without any issues. Check the logs for errors and verify that training steps are progressing.
+
+---
+
+:::{button-ref} training-nemo-rl-grpo-multi-node-training
+:color: primary
+:ref-type: ref
+
+Next: Multi-Node Training →
+:::
diff --git a/docs/tutorials/offline-training-w-rollouts.md b/docs/tutorials/offline-training-w-rollouts.md
index 5b724fc93..db6f7c7a7 100644
--- a/docs/tutorials/offline-training-w-rollouts.md
+++ b/docs/tutorials/offline-training-w-rollouts.md
@@ -1,4 +1,6 @@
-# Offline Training with Rollouts (SFT/DPO) - Experimental
+(offline-training-w-rollouts)=
+
+# Offline Training with Rollouts (SFT/DPO) - Experimental
:::{warning}
This tutorial is **experimental** and may contain bugs. Proceed with caution.
diff --git a/docs/tutorials/rl-training-with-nemo-rl.md b/docs/tutorials/rl-training-with-nemo-rl.md
deleted file mode 100644
index 6c9d5069a..000000000
--- a/docs/tutorials/rl-training-with-nemo-rl.md
+++ /dev/null
@@ -1,174 +0,0 @@
-(rl-training-with-nemo-rl)=
-
-# RL Training with NeMo RL - Experimental
-
-:::{warning}
-This tutorial is **experimental** and may contain bugs. Proceed with caution.
-:::
-
-**Goal**: Train a model with NeMo RL. Learn how to set up NeMo Gym + NeMo RL training environment, run tests, prepare data, and launch single and multi-node training runs!
-
-Multinode Slurm script and run command are at the bottom of this document. Complete the single-node setup first before proceeding to multi-node training. Throughout this tutorial, you may see mentions of "Penguin", which refers to Gym's codename before it was fully open-sourced.
-
-## Single GPU node setup to ensure correctness
-
-### SSH or enter into a GPU node
-
-Here is an example command to enter into a GPU node hosted on a Slurm cluster.
-```bash
-srun \
- --no-container-mount-home \
- --container-mounts=/shared/filesystem:/shared/filesystem \
- --container-image=/path/to/nemo-rl/container \
- --gres=gpu:8 \
- --nodes=1 --ntasks=1 --time 04:00:00 \
- --pty /bin/bash
-```
-
-### Setup NeMo RL and NeMo Gym
-
-```bash
-# CD into your preferred workspace
-# cd /shared/filesystem/$USER
-
-# Clone NeMo RL
-git clone https://github.com/NVIDIA-NeMo/RL
-cd RL
-
-# Clone NeMo Gym
-git clone https://github.com/NVIDIA-NeMo/Gym.git 3rdparty/Penguin-workspace/Penguin
-
-# Pull necessary submodules (for example, megatron, automodel, and so on). Nothing Gym-specific.
-git submodule update --init --recursive
-
-# Initial setup
-source /opt/nemo_rl_venv/bin/activate
-uv sync --group={build,docs,dev,test} --extra penguin
-
-# This will take 10 to 15 minutes
-# We add the HF token here to avoid HF rate limits
-HF_HOME=.cache/ \
-HF_TOKEN={your HF token} \
- ./examples/penguin/run_penguin_single_node_sanity_tests.sh
-
-# If you used Gym previously, to run these tests properly, you may need to set `NRL_FORCE_REBUILD_VENVS=true` on an initial run or something.
-# If you've run these tests before and are getting HF rate limit errors, you can add `HF_HUB_OFFLINE=1`
-```
-
-### Prepare NeMo Gym data
-
-You will need to use Gym's data preparation command `ng_prepare_data` to prepare the data you intend to train on, including data that you already have locally. The `ng_prepare_data` command will add an `agent_ref` property to each example that tells NeMo Gym which agent server to route that example to!
-
-Note: The `ng_prepare_data` command below includes the full set of configuration yaml paths (including the model yaml path). The configs you use to prepare data are the same configs you use for training.
-
-This command will output the data into the `data/bytedtsinghua_dapo17k`, which later configs will point to.
-
-```bash
-# Setup Penguin local venv
-cd 3rdparty/Penguin-workspace/Penguin
-uv venv --python 3.12 --allow-existing
-source .venv/bin/activate
-uv sync --active --extra dev
-
-# Prepare data
-config_paths="responses_api_models/openai_model/configs/openai_model.yaml,\
-resources_servers/math_with_judge/configs/bytedtsinghua_dapo17k.yaml"
-ng_prepare_data "+config_paths=[${config_paths}]" \
- +output_dirpath=data/bytedtsinghua_dapo17k \
- +mode=train_preparation +should_download=true
-
-# Return to NeMo RL directory and Python env
-cd ../../.. && source /opt/nemo_rl_venv/bin/activate
-```
-
-### Single node training
-
-Launch a single node training job training Qwen 3 4B Instruct using the library judge math verifier on the DAPO 17K math dataset. We find that Qwen 3 4B Instruct is the smallest model that still provides experimental signal. We use the DAPO 17K math dataset since it is a solid baseline set by the DAPO team. You should see training start and the reward should end up around 0.8 or greater.
-
-Prerequisites for the command below:
-
-1. A W&B API key
-2. Run the above `ng_prepare_data` command.
-
-
-```bash
-# Run example training config for single node
-pkill -f VllmAsyncGenerationWorker
-ray stop --force
-python -c "import ray; ray.shutdown()"
-EXP_NAME="$(date +%Y%m%d)/penguin_grpo/qwen3_4binstruct/dapo17k_bytedtsinghua_test_001"
-CONFIG_PATH=examples/penguin/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml
-HF_HOME=.cache/ \
-WANDB_API_KEY={your W&B API key} \
-NRL_FORCE_REBUILD_VENVS=true \
-uv run python examples/penguin/run_grpo_penguin.py \
- --config=$CONFIG_PATH \
- logger.wandb.project="{your username}-nemo-gym-rl-integration" \
- logger.wandb.name=$EXP_NAME \
- logger.log_dir=results/$EXP_NAME \
- grpo.val_at_start=false \
- ++grpo.num_prompts_per_step=4 \
- ++grpo.max_num_steps=3 \
- ++policy.dtensor_cfg.clear_cache_every_n_steps=1 \
- ++cluster.num_nodes=1 \
- checkpointing.checkpoint_dir=results/$EXP_NAME &
-```
-
-## Multi node
-
-We will run a multi-node training job on a Slurm cluster. First, we will write our Slurm job launch script and then run it.
-
-### Submit script
-
-Place this script (named, for example, `temp_penguin_submit.sh`) in the root NeMo RL dir.
-
-```bash
-# ----- PARAMETERS -----
-# WANDB_API_KEY, EXP_NAME, NUM_ACTOR_NODES, REPO_LOCATION
-
-# ----- CONSTANTS -----
-CONTAINER_IMAGE_PATH=/path/to/nemo-rl/container
-
-read -r -d '' COMMAND < "DownloadJsonlDatasetHuggingFaceConfig":
+ if not self.output_dirpath and not self.output_fpath:
+ raise ValueError("Either output_dirpath or output_fpath must be provided")
+ if self.output_dirpath and self.output_fpath:
+ raise ValueError("Cannot specify both output_dirpath and output_fpath")
+ if self.artifact_fpath and self.split:
+ raise ValueError(
+ "Cannot specify both artifact_fpath and split. Use artifact_fpath for targeting a raw file, or split for structured datasets."
+ )
+ # Prevent output_fpath without split when not using artifact_fpath
+ if self.output_fpath and not self.split and not self.artifact_fpath:
+ raise ValueError(
+ "When using output_fpath without artifact_fpath, split must be specified. Use output_dirpath to download all splits."
+ )
+ return self
DatasetType = Union[Literal["train"], Literal["validation"], Literal["example"]]
@@ -306,6 +357,7 @@ class DatasetConfig(BaseModel):
num_repeats: int = Field(default=1, ge=1)
gitlab_identifier: Optional[JsonlDatasetGitlabIdentifer] = None
+ huggingface_identifier: Optional[JsonlDatasetHuggingFaceIdentifer] = None
license: Optional[
Union[
Literal["Apache 2.0"],
@@ -320,7 +372,6 @@ class DatasetConfig(BaseModel):
@model_validator(mode="after")
def check_train_validation_sets(self) -> "DatasetConfig":
if self.type in ["train", "validation"]:
- assert self.gitlab_identifier is not None, f"A Gitlab path is required for {self.name}"
assert self.license is not None, f"A license is required for {self.name}"
return self
diff --git a/nemo_gym/dataset_orchestrator.py b/nemo_gym/dataset_orchestrator.py
index b1cd0b2de..4beeffc52 100644
--- a/nemo_gym/dataset_orchestrator.py
+++ b/nemo_gym/dataset_orchestrator.py
@@ -20,7 +20,7 @@
UploadJsonlDatasetHuggingFaceMaybeDeleteConfig,
)
from nemo_gym.gitlab_utils import delete_model_from_gitlab, is_model_in_gitlab
-from nemo_gym.hf_utils import download_jsonl_dataset as download_jsonl_dataset_from_hf
+from nemo_gym.hf_utils import download_hf_dataset_as_jsonl
from nemo_gym.hf_utils import upload_jsonl_dataset as upload_jsonl_dataset_to_hf
from nemo_gym.server_utils import get_global_config_dict
@@ -73,7 +73,13 @@ def upload_jsonl_dataset_to_hf_and_delete_gitlab_cli() -> None: # pragma: no co
def download_jsonl_dataset_from_hf_cli() -> None: # pragma: no cover
global_config = get_global_config_dict()
config = DownloadJsonlDatasetHuggingFaceConfig.model_validate(global_config)
- download_jsonl_dataset_from_hf(config)
+
+ if config.artifact_fpath:
+ print(f"Downloading file '{config.artifact_fpath}' from '{config.repo_id}'...")
+ else:
+ print(f"Downloading '{config.split or 'all'}' split(s) from '{config.repo_id}'...")
+
+ download_hf_dataset_as_jsonl(config)
def delete_jsonl_dataset_from_gitlab_cli() -> None: # pragma: no cover
diff --git a/nemo_gym/hf_utils.py b/nemo_gym/hf_utils.py
index 7f842b0a5..d26fa530d 100644
--- a/nemo_gym/hf_utils.py
+++ b/nemo_gym/hf_utils.py
@@ -12,27 +12,26 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import json
-from os import environ
+import shutil
from pathlib import Path
import yaml
+from datasets import load_dataset
from huggingface_hub import HfApi, hf_hub_download
from huggingface_hub.utils import HfHubHTTPError
from scripts.update_resource_servers import visit_resource_server
from nemo_gym.config_types import DownloadJsonlDatasetHuggingFaceConfig, UploadJsonlDatasetHuggingFaceConfig
-from nemo_gym.server_utils import get_global_config_dict
def create_huggingface_client(token: str) -> HfApi: # pragma: no cover
- environ["HF_TOKEN"] = token
- client = HfApi()
+ client = HfApi(token=token)
return client
def check_jsonl_format(file_path: str) -> bool: # pragma: no cover
"""Check for the presence of the expected keys in the dataset"""
- required_keys = {"responses_create_params", "reward_profiles", "expected_answer"}
+ required_keys = {"responses_create_params"}
missing_keys_info = []
try:
@@ -48,12 +47,73 @@ def check_jsonl_format(file_path: str) -> bool: # pragma: no cover
return False
except (FileNotFoundError, json.JSONDecodeError) as e:
- print(f"[Nemo-Gym] - Error reading or prasing the JSON file: {e}")
+ print(f"[Nemo-Gym] - Error reading or parsing the JSON file: {e}")
return False
return True
+def download_hf_dataset_as_jsonl(
+ config: DownloadJsonlDatasetHuggingFaceConfig,
+) -> None: # pragma: no cover
+ """
+    Download an HF dataset and save it as JSONL.
+ If `artifact_fpath` is provided, downloads that specific file using `hf_hub_download`.
+ Otherwise, uses datasets.load_dataset() to handle structured datasets.
+ """
+ try:
+ # artifact_fpath - download raw jsonl file
+ if config.artifact_fpath:
+ downloaded_path = hf_hub_download(
+ repo_id=config.repo_id,
+ filename=config.artifact_fpath,
+ repo_type="dataset",
+ token=config.hf_token,
+ )
+            # The validator guarantees exactly one of output_dirpath / output_fpath is set
+            output_file = (
+                Path(config.output_dirpath) / Path(config.artifact_fpath).name
+                if config.output_dirpath
+                else Path(config.output_fpath)
+            )
+ output_file.parent.mkdir(parents=True, exist_ok=True)
+
+ # We copy the downloaded file from the cache to the target path
+ # to allow renaming (e.g., artifact_fpath="something.jsonl" -> output_fpath="train.jsonl")
+ shutil.copy(downloaded_path, output_file)
+ print(f"[Nemo-Gym] - Downloaded {config.artifact_fpath} to: {output_file}")
+ return
+
+ # no artifact_fpath - use load_dataset() with split (if provided)
+ if config.output_fpath:
+ # Exact output path specified
+ output_file = Path(config.output_fpath)
+ output_file.parent.mkdir(parents=True, exist_ok=True)
+ ds = load_dataset(config.repo_id, split=config.split, token=config.hf_token)
+ ds.to_json(str(output_file))
+ print(f"[Nemo-Gym] - Downloaded {config.split} split to: {output_file}")
+ else:
+ # Output directory specified
+ output_dir = Path(config.output_dirpath)
+ output_dir.mkdir(parents=True, exist_ok=True)
+
+ if config.split:
+ ds = load_dataset(config.repo_id, split=config.split, token=config.hf_token)
+ output_file = output_dir / f"{config.split}.jsonl"
+ ds.to_json(str(output_file))
+ print(f"[Nemo-Gym] - Downloaded {config.split} split to: {output_file}")
+ else:
+ # Download all
+ ds = load_dataset(config.repo_id, token=config.hf_token)
+ for split_name, split_data in ds.items():
+ output_file = output_dir / f"{split_name}.jsonl"
+ split_data.to_json(str(output_file))
+ print(f"[Nemo-Gym] - Downloaded {split_name} split to: {output_file}")
+
+ except Exception as e:
+ print(f"[Nemo-Gym] - Error downloading/converting dataset: {e}")
+ raise
+
+
def upload_jsonl_dataset(
config: UploadJsonlDatasetHuggingFaceConfig,
) -> None: # pragma: no cover
@@ -61,15 +121,19 @@ def upload_jsonl_dataset(
with open(config.resource_config_path, "r") as f:
data = yaml.safe_load(f)
- domain = d.title() if (d := visit_resource_server(data).to_dict().get("domain")) else None
+ domain = d.lower() + "-" if (d := visit_resource_server(data).to_dict().get("domain")) else ""
resource_server = config.resource_config_path.split("/")[1]
- dataset_name = config.dataset_name
+ dataset_name = config.dataset_name or resource_server
prefix = config.hf_dataset_prefix + "-" if config.hf_dataset_prefix else ""
- repo_id = f"{config.hf_organization}/{prefix}{domain}-{resource_server}-{dataset_name}"
- collection_id = f"{config.hf_organization}/{config.hf_collection_name}-{config.hf_collection_slug}"
+ collection_id = (
+ f"{config.hf_organization}/{config.hf_collection_name.lower().replace(' ', '-')}-{config.hf_collection_slug}"
+ )
+
+ repo_id = f"{config.hf_organization}/{prefix}{domain}{dataset_name}"
- # Dataset format check
- if not check_jsonl_format(config.input_jsonl_fpath):
+ # Dataset format check - only strict check for training data
+ is_training = config.split.lower() == "train"
+ if is_training and not check_jsonl_format(config.input_jsonl_fpath):
print("[Nemo-Gym] - JSONL file format check failed.")
return
@@ -78,8 +142,11 @@ def upload_jsonl_dataset(
client.create_repo(repo_id=repo_id, token=config.hf_token, repo_type="dataset", private=True, exist_ok=True)
print(f"[Nemo-Gym] - Repo '{repo_id}' is ready for use")
except HfHubHTTPError as e:
- print(f"[Nemo-Gym] - Error creating repo: {e}")
- raise
+ if config.create_pr and "403" in str(e):
+ print(f"[Nemo-Gym] - Repo '{repo_id}' exists (no create permission, will create PR)")
+ else:
+ print(f"[Nemo-Gym] - Error creating repo: {e}")
+ raise
# Collection id + addition
try:
@@ -97,33 +164,21 @@ def upload_jsonl_dataset(
# File upload
try:
- client.upload_file(
+ commit_info = client.upload_file(
path_or_fileobj=config.input_jsonl_fpath,
path_in_repo=Path(config.input_jsonl_fpath).name,
repo_id=repo_id,
token=config.hf_token,
repo_type="dataset",
+ create_pr=config.create_pr,
+ revision=config.revision,
+ commit_message=config.commit_message,
+ commit_description=config.commit_description,
)
- print("[Nemo-Gym] - Dataset uploaded successful")
+ if config.create_pr:
+ print(f"[Nemo-Gym] - Pull Request created: {commit_info.pr_url}")
+ else:
+ print("[Nemo-Gym] - Dataset upload successful")
except HfHubHTTPError as e:
print(f"[Nemo-Gym] - Error uploading file: {e}")
raise
-
-
-def download_jsonl_dataset(config: DownloadJsonlDatasetHuggingFaceConfig) -> None: # pragma: no cover
- try:
- downloaded_path = hf_hub_download(
- repo_id=config.repo_id,
- repo_type="dataset",
- filename=config.artifact_fpath,
- token=config.hf_token,
- )
- Path(config.output_fpath).write_bytes(Path(downloaded_path).read_bytes())
- except HfHubHTTPError as e:
- print(f"[Nemo-Gym] - Error downloading file: {e}")
-
-
-def download_jsonl_dataset_cli() -> None: # pragma: no cover
- global_config = get_global_config_dict()
- config = DownloadJsonlDatasetHuggingFaceConfig.model_validate(global_config)
- download_jsonl_dataset(config)
diff --git a/nemo_gym/train_data_utils.py b/nemo_gym/train_data_utils.py
index 84609c5fb..d45724639 100644
--- a/nemo_gym/train_data_utils.py
+++ b/nemo_gym/train_data_utils.py
@@ -13,6 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import json
+import sys
from abc import abstractmethod
from collections import Counter, defaultdict
from math import sqrt
@@ -34,6 +35,7 @@
DatasetConfig,
DatasetType,
DownloadJsonlDatasetGitlabConfig,
+ DownloadJsonlDatasetHuggingFaceConfig,
ServerInstanceConfig,
)
from nemo_gym.gitlab_utils import download_jsonl_dataset
@@ -42,6 +44,9 @@
GlobalConfigDictParserConfig,
get_global_config_dict,
)
+from nemo_gym.hf_utils import (
+ download_hf_dataset_as_jsonl,
+)
class TrainDataProcessorConfig(BaseNeMoGymCLIConfig):
@@ -67,6 +72,10 @@ class TrainDataProcessorConfig(BaseNeMoGymCLIConfig):
default=False,
description="Whether to automatically download missing datasets from remote registries (default: False).",
)
+ data_source: Literal["gitlab", "huggingface"] = Field(
+ default="huggingface",
+ description="Where to download missing datasets from: 'gitlab' (NVIDIA internal) or 'huggingface' (external).",
+ )
@property
def in_scope_dataset_types(self) -> List[DatasetType]:
@@ -451,16 +460,56 @@ def load_datasets(
"Missing local datasets. You must provide local datasets since download is disabled. Run with `+should_download=true` to enable downloading."
)
+ if not local_datasets_not_found:
+ return
+ backend = config.data_source
+ is_valid, error_msg = validate_backend_credentials(backend)
+        if not is_valid:
+            print(f"Cannot download datasets: {error_msg}")
+            sys.exit(1)
+        global_config = get_global_config_dict()
+
for (
server_name,
datasets,
) in local_datasets_not_found.items(): # pragma: no cover
for d in datasets:
- download_config = DownloadJsonlDatasetGitlabConfig.model_validate(
- d.gitlab_identifier.model_dump() | {"output_fpath": d.jsonl_fpath}
- )
- print(f"Downloading dataset `{d.name}` from `{server_name}` using {download_config}")
- download_jsonl_dataset(download_config)
+ try:
+ if backend == "gitlab":
+ if d.gitlab_identifier is None:
+ print(f"Dataset `{d.name}` missing gitlab_identifier for GitLab backend")
+ continue
+
+ download_config = DownloadJsonlDatasetGitlabConfig.model_validate(
+ d.gitlab_identifier.model_dump() | {"output_fpath": d.jsonl_fpath}
+ )
+ print(
+ f"Downloading dataset `{d.name}` for `{server_name}` from {backend} using {download_config}"
+ )
+ download_jsonl_dataset(download_config)
+
+ elif backend == "huggingface":
+ hf_identifier = d.huggingface_identifier
+
+ if hf_identifier is None:
+ print(f"Dataset `{d.name}` missing huggingface_identifier for HuggingFace backend")
+ continue
+
+ download_config = DownloadJsonlDatasetHuggingFaceConfig.model_validate(
+ {
+ "repo_id": hf_identifier.repo_id,
+ "artifact_fpath": hf_identifier.artifact_fpath,
+ "output_fpath": d.jsonl_fpath,
+ # Only pass split if artifact_fpath is not set
+ **({"split": d.type} if not hf_identifier.artifact_fpath else {}),
+ "hf_token": global_config.get("hf_token"),
+ }
+ )
+ print(f"Downloading '{d.type}' split from {hf_identifier.repo_id} to {d.jsonl_fpath}...")
+ download_hf_dataset_as_jsonl(download_config)
+
+ except Exception as e:
+ print(f"Failed to download dataset `{d.name}` from {backend}: {e}")
########################################
# Validate samples and aggregate metrics
@@ -512,89 +561,91 @@ def _validate_aggregate_metrics(self, aggregate_metrics_dict: Dict, metrics_fpat
"""
Returns the conflicting metrics fpath if invalid. Else returns None
"""
- if metrics_fpath.exists():
- with open(metrics_fpath) as f:
- previous_aggregate_metrics_dict = json.load(f)
-
- def numeric_close(a: float, b: float) -> bool:
- """Helper to compare numbers with a tolerance"""
- if a == b:
- return True
+ if not metrics_fpath.exists():
+ return
+
+ with open(metrics_fpath) as f:
+ previous_aggregate_metrics_dict = json.load(f)
+
+ def numeric_close(a: float, b: float) -> bool:
+ """Helper to compare numbers with a tolerance"""
+ if a == b:
+ return True
+ try:
+ a_f = float(a)
+ b_f = float(b)
+ except Exception:
+ return False
+ scale = max(abs(a_f), abs(b_f)) # Adjuster for tolerance
+
+ # may need to adjust this threshold:
+ tol = 5e-3 if scale >= 1 else 5e-4 # Higher threshold for larger numbers
+ return abs(a_f - b_f) <= max(tol, 1e-9) # Allow small differences
+
+ def diff_values(prev_v, new_v, path: str, diffs: List[str]) -> None:
+ """
+ Recursively compare values at the given path.
+ Keys from previous dict must be present in new dict.
+ Additional fields in new dict are allowed.
+ """
+ if isinstance(prev_v, dict) and isinstance(new_v, dict):
+ for k in prev_v.keys():
+ sub_path = f"{path}.{k}" if path else k
+ if k not in new_v:
+ diffs.append(f"Missing key in new metrics: {sub_path}")
+ continue
+ diff_values(prev_v[k], new_v[k], sub_path, diffs)
+ return
+
+ # Lists: Check for equality regardless of order
+ if isinstance(prev_v, list) and isinstance(new_v, list):
+ if len(prev_v) != len(new_v):
+ diffs.append(f"List length differs at {path}: {len(prev_v)} != {len(new_v)}")
+ return
try:
- a_f = float(a)
- b_f = float(b)
- except Exception:
- return False
- scale = max(abs(a_f), abs(b_f)) # Adjuster for tolerance
-
- # may need to adjust this threshold:
- tol = 5e-3 if scale >= 1 else 5e-4 # Higher threshold for larger numbers
- return abs(a_f - b_f) <= max(tol, 1e-9) # Allow small differences
-
- def diff_values(prev_v, new_v, path: str, diffs: List[str]) -> None:
- """
- Recursively compare values at the given path.
- Keys from previous dict must be present in new dict.
- Additional fields in new dict are allowed.
- """
- if isinstance(prev_v, dict) and isinstance(new_v, dict):
- for k in prev_v.keys():
- sub_path = f"{path}.{k}" if path else k
- if k not in new_v:
- diffs.append(f"Missing key in new metrics: {sub_path}")
- continue
- diff_values(prev_v[k], new_v[k], sub_path, diffs)
+ prev_counter = Counter(prev_v)
+ new_counter = Counter(new_v)
+ if prev_counter != new_counter:
+ diffs.append(f"Multiset mismatch at {path}: {prev_counter} != {new_counter}")
return
-
- # Lists: Check for equality regardless of order
- if isinstance(prev_v, list) and isinstance(new_v, list):
- if len(prev_v) != len(new_v):
- diffs.append(f"List length differs at {path}: {len(prev_v)} != {len(new_v)}")
- return
- try:
- prev_counter = Counter(prev_v)
- new_counter = Counter(new_v)
- if prev_counter != new_counter:
- diffs.append(f"Multiset mismatch at {path}: {prev_counter} != {new_counter}")
- return
- except TypeError:
- # Manual fallback for unhashable elements
- used = set()
- for i, pv in enumerate(prev_v):
- found = False
- for j, nv in enumerate(new_v):
- if j in used:
- continue
- sub_diffs = []
- diff_values(pv, nv, f"{path}[{i}]", sub_diffs)
- if not sub_diffs:
- used.add(j)
- found = True
- break
- if not found:
- diffs.append(f"No matching element for {path}[{i}] in new metrics (unordered)")
- return
-
- if isinstance(prev_v, float) and isinstance(new_v, float):
- if not numeric_close(prev_v, new_v):
- diffs.append(f"Numeric mismatch at {path}: {prev_v} != {new_v}")
+ except TypeError:
+ # Manual fallback for unhashable elements
+ used = set()
+ for i, pv in enumerate(prev_v):
+ found = False
+ for j, nv in enumerate(new_v):
+ if j in used:
+ continue
+ sub_diffs = []
+ diff_values(pv, nv, f"{path}[{i}]", sub_diffs)
+ if not sub_diffs:
+ used.add(j)
+ found = True
+ break
+ if not found:
+ diffs.append(f"No matching element for {path}[{i}] in new metrics (unordered)")
return
- if prev_v != new_v:
- diffs.append(f"Value differs at {path}: {prev_v} != {new_v}")
+ if isinstance(prev_v, float) and isinstance(new_v, float):
+ if not numeric_close(prev_v, new_v):
+ diffs.append(f"Numeric mismatch at {path}: {prev_v} != {new_v}")
+ return
- diffs: List[str] = []
- diff_values(previous_aggregate_metrics_dict, aggregate_metrics_dict, path="", diffs=diffs)
+ if prev_v != new_v:
+ diffs.append(f"Value differs at {path}: {prev_v} != {new_v}")
- if diffs:
- print("Differences found in aggregate metrics:")
- pprint(diffs)
+ diffs: List[str] = []
+ diff_values(previous_aggregate_metrics_dict, aggregate_metrics_dict, path="", diffs=diffs)
- conflicting_metrics_fpath = metrics_fpath.with_name(f"{metrics_fpath.stem}_conflict.json")
- with open(conflicting_metrics_fpath, "w") as f:
- json.dump(aggregate_metrics_dict, f, indent=4)
+ if diffs:
+ print("Differences found in aggregate metrics:")
+ pprint(diffs)
- return conflicting_metrics_fpath
+ conflicting_metrics_fpath = metrics_fpath.with_name(f"{metrics_fpath.stem}_conflict.json")
+ with open(conflicting_metrics_fpath, "w") as f:
+ json.dump(aggregate_metrics_dict, f, indent=4)
+
+ return conflicting_metrics_fpath
def validate_samples_and_aggregate_metrics(
self, server_instance_configs: List[ServerInstanceConfig]
@@ -729,6 +780,34 @@ def collate_samples(
print(f"View your final data!{final_fpaths_str}")
+def validate_backend_credentials(backend: str) -> tuple[bool, str]:
+    """Check that the credentials required by the chosen backend are present in the global config"""
+ global_config = get_global_config_dict()
+
+ if backend == "gitlab":
+ required = ["mlflow_tracking_uri", "mlflow_tracking_token"]
+ missing = [k for k in required if k not in global_config or not global_config[k]]
+ if missing:
+            return False, (
+                f"GitLab backend selected but missing credentials: {missing}\n"
+                "Add to env.yaml:\n"
+                "  mlflow_tracking_uri: <your MLflow tracking URI>\n"
+                "  mlflow_tracking_token: <your MLflow tracking token>"
+            )
+
+ elif backend == "huggingface":
+ required = ["hf_token"]
+ missing = [k for k in required if k not in global_config or not global_config[k]]
+ if missing:
+            return False, (
+                f"HuggingFace backend selected but missing credentials: {missing}\n"
+                "Add to env.yaml:\n"
+                "  hf_token: <your Hugging Face token>"
+            )
+
+ return True, ""
+
+
def prepare_data(): # pragma: no cover
global_config_dict = get_global_config_dict(
global_config_dict_parser_config=GlobalConfigDictParserConfig(
diff --git a/pyproject.toml b/pyproject.toml
index ecab3fe20..e211d6f24 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -153,6 +153,11 @@ dependencies = [
# Updated: Fri Nov 07, 2025 with psutil==6.1.1
# License: BSD 3-Clause https://github.com/giampaolo/psutil/blob/master/LICENSE
"psutil",
+
+ # HuggingFace datasets: for loading and converting parquet datasets
+    # Updated: Thu Dec 04, 2025 with datasets==4.4.1
+ # License: Apache 2.0 https://github.com/huggingface/datasets/blob/main/LICENSE
+ "datasets",
]
[dependency-groups]
diff --git a/resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml b/resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml
index a541aa639..55bfb96d2 100644
--- a/resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml
+++ b/resources_servers/equivalence_llm_judge/configs/equivalence_llm_judge.yaml
@@ -138,4 +138,5 @@ equivalence_llm_judge_simple_agent:
type: train
license: Apache 2.0
jsonl_fpath: resources_servers/equivalence_llm_judge/data/train.jsonl
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-openQA
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-knowledge-openQA
diff --git a/resources_servers/equivalence_llm_judge/data/example_metrics.json b/resources_servers/equivalence_llm_judge/data/example_metrics.json
index 723a669e7..29b267109 100644
--- a/resources_servers/equivalence_llm_judge/data/example_metrics.json
+++ b/resources_servers/equivalence_llm_judge/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/equivalence_llm_judge/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": "TBD",
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/equivalence_llm_judge/data/example_openqa_metrics.json b/resources_servers/equivalence_llm_judge/data/example_openqa_metrics.json
new file mode 100644
index 000000000..7605c4d83
--- /dev/null
+++ b/resources_servers/equivalence_llm_judge/data/example_openqa_metrics.json
@@ -0,0 +1,50 @@
+{
+ "name": "example_openqa",
+ "type": "example",
+ "jsonl_fpath": "resources_servers/equivalence_llm_judge/data/example_openqa.jsonl",
+ "num_repeats": 1,
+ "gitlab_identifier": null,
+ "huggingface_identifier": null,
+ "license": "TBD",
+ "Number of examples": 5,
+ "Number of tools": {
+ "Total # non-null values": 0,
+ "Average": 0.0,
+ "Min": 0.0,
+ "Max": 0.0,
+ "Median": 0.0,
+ "Standard deviation": 0.0
+ },
+ "Json-dumped number of words (proxy for token count)": {
+ "Total # non-null values": 5,
+ "Average": 41.2,
+ "Min": 32.0,
+ "Max": 53.0,
+ "Median": 40.0,
+ "Standard deviation": 7.73
+ },
+ "Number of turns": {
+ "Total # non-null values": 5,
+ "Average": 1.0,
+ "Min": 1.0,
+ "Max": 1.0,
+ "Median": 1.0,
+ "Standard deviation": 0.0
+ },
+ "Temperature": {
+ "Total # non-null values": 0,
+ "Average": 0.0,
+ "Min": 0.0,
+ "Max": 0.0,
+ "Median": 0.0,
+ "Standard deviation": 0.0
+ },
+ "expected_answer": {
+ "unique_count": 5,
+ "total_count": 5
+ },
+ "uuid": {
+ "unique_count": 5,
+ "total_count": 5
+ }
+}
\ No newline at end of file
diff --git a/resources_servers/example_multi_step/data/example_metrics.json b/resources_servers/example_multi_step/data/example_metrics.json
index c0465fe93..163f16c63 100644
--- a/resources_servers/example_multi_step/data/example_metrics.json
+++ b/resources_servers/example_multi_step/data/example_metrics.json
@@ -4,6 +4,7 @@
"jsonl_fpath": "resources_servers/example_multi_step/data/example.jsonl",
"num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
@@ -70,5 +71,4 @@
"Median": 299.0,
"Standard deviation": 0.0
}
-}
-
+}
\ No newline at end of file
diff --git a/resources_servers/google_search/configs/google_search.yaml b/resources_servers/google_search/configs/google_search.yaml
index 66e75c16b..bbc2c0eab 100644
--- a/resources_servers/google_search/configs/google_search.yaml
+++ b/resources_servers/google_search/configs/google_search.yaml
@@ -24,8 +24,10 @@ google_search_simple_agent:
dataset_name: search_STEM_syn_gpqa_v1_2_difficulty_filtered
version: 0.0.1
artifact_fpath: MCQA_syn_gpqa_1_2_difficulty_filtered_responses_api.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-knowledge-web_search-mcqa
+ artifact_fpath: mcqa_search.jsonl
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-web_search-mcqa
- name: example
type: example
jsonl_fpath: resources_servers/google_search/data/example.jsonl
diff --git a/resources_servers/google_search/data/example_metrics.json b/resources_servers/google_search/data/example_metrics.json
index 1b1f950a0..1ab222034 100644
--- a/resources_servers/google_search/data/example_metrics.json
+++ b/resources_servers/google_search/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/google_search/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/google_search/data/train_metrics.json b/resources_servers/google_search/data/train_metrics.json
index fa4b34675..7a3d38319 100644
--- a/resources_servers/google_search/data/train_metrics.json
+++ b/resources_servers/google_search/data/train_metrics.json
@@ -2,6 +2,7 @@
"name": "train",
"type": "train",
"jsonl_fpath": "resources_servers/google_search/data/train.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": {
"dataset_name": "search_STEM_syn_gpqa_v1_2_difficulty_filtered",
"version": "0.0.1",
@@ -13,24 +14,44 @@
"Total # non-null values": 2907,
"Average": 2.0,
"Min": 2.0,
- "Max": 2.0
+ "Max": 2.0,
+ "Median": 2.0,
+ "Standard deviation": 0.0
},
"Json-dumped number of words (proxy for token count)": {
"Total # non-null values": 2907,
- "Average": 282.97557619539043,
+ "Average": 282.98,
"Min": 207.0,
- "Max": 439.0
+ "Max": 439.0,
+ "Median": 280.91,
+ "Standard deviation": 35.33
},
"Number of turns": {
"Total # non-null values": 2907,
"Average": 1.0,
"Min": 1.0,
- "Max": 1.0
+ "Max": 1.0,
+ "Median": 1.0,
+ "Standard deviation": 0.0
},
"Temperature": {
"Total # non-null values": 2907,
- "Average": 0.5999999999999719,
+ "Average": 0.6,
"Min": 0.6,
- "Max": 0.6
+ "Max": 0.6,
+ "Median": 0.6,
+ "Standard deviation": 0.0
+ },
+ "expected_answer": {
+ "unique_count": 4,
+ "total_count": 2907
+ },
+ "task_difficulty_qwen3_32b_avg_8": {
+ "Total # non-null values": 2907,
+ "Average": 0.374,
+ "Min": 0.25,
+ "Max": 0.5,
+ "Median": 0.374,
+ "Standard deviation": 0.103
}
}
\ No newline at end of file
diff --git a/resources_servers/instruction_following/configs/instruction_following.yaml b/resources_servers/instruction_following/configs/instruction_following.yaml
index 090b12c3e..96da26c1a 100644
--- a/resources_servers/instruction_following/configs/instruction_following.yaml
+++ b/resources_servers/instruction_following/configs/instruction_following.yaml
@@ -24,8 +24,10 @@ instruction_following_simple_agent:
dataset_name: instruction_following
version: 0.0.1
artifact_fpath: instruction_following.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-instruction_following
+ artifact_fpath: instruction_following.jsonl
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following
- name: example
type: example
jsonl_fpath: resources_servers/instruction_following/data/example.jsonl
diff --git a/resources_servers/instruction_following/data/example_metrics.json b/resources_servers/instruction_following/data/example_metrics.json
index c77501e88..e7b98c9bf 100644
--- a/resources_servers/instruction_following/data/example_metrics.json
+++ b/resources_servers/instruction_following/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/instruction_following/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml b/resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml
index 2f13d0328..e5e15cc40 100644
--- a/resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml
+++ b/resources_servers/math_advanced_calculations/configs/math_advanced_calculations.yaml
@@ -24,8 +24,9 @@ math_advanced_calculations_simple_agent:
dataset_name: math_advanced_calculations
version: 0.0.1
artifact_fpath: train.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-math-advanced_calculations
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-math-advanced_calculations
- name: example
type: example
jsonl_fpath: resources_servers/math_advanced_calculations/data/example.jsonl
diff --git a/resources_servers/math_advanced_calculations/data/example_metrics.json b/resources_servers/math_advanced_calculations/data/example_metrics.json
index 74d9c36f5..1f3228a86 100644
--- a/resources_servers/math_advanced_calculations/data/example_metrics.json
+++ b/resources_servers/math_advanced_calculations/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/math_advanced_calculations/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/math_with_judge/configs/math_with_judge.yaml b/resources_servers/math_with_judge/configs/math_with_judge.yaml
index eba000bab..9998bc15c 100644
--- a/resources_servers/math_with_judge/configs/math_with_judge.yaml
+++ b/resources_servers/math_with_judge/configs/math_with_judge.yaml
@@ -31,8 +31,10 @@ math_with_judge_simple_agent:
dataset_name: math_open_math_reasoning
version: 0.0.1
artifact_fpath: open_math_reasoning_problems.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-math-OpenMathReasoning
+ artifact_fpath: open_math_reasoning_problems.jsonl
license: Creative Commons Attribution 4.0 International
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-math-OpenMathReasoning
- name: validation
type: validation
jsonl_fpath: resources_servers/math_with_judge/data/aime24_validation.jsonl
@@ -40,6 +42,9 @@ math_with_judge_simple_agent:
dataset_name: aime24
version: 0.0.1
artifact_fpath: aime24.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-math-OpenMathReasoning
+ artifact_fpath: aime24_validation.jsonl
license: Apache 2.0
- name: example
type: example
diff --git a/resources_servers/math_with_judge/data/example_metrics.json b/resources_servers/math_with_judge/data/example_metrics.json
index a56ece776..bc686e973 100644
--- a/resources_servers/math_with_judge/data/example_metrics.json
+++ b/resources_servers/math_with_judge/data/example_metrics.json
@@ -4,6 +4,7 @@
"jsonl_fpath": "resources_servers/math_with_judge/data/example.jsonl",
"num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/mcqa/configs/mcqa.yaml b/resources_servers/mcqa/configs/mcqa.yaml
index 97a0116db..c937a97d0 100644
--- a/resources_servers/mcqa/configs/mcqa.yaml
+++ b/resources_servers/mcqa/configs/mcqa.yaml
@@ -24,8 +24,9 @@ mcqa_simple_agent:
dataset_name: syn_gpqa_v1.1
version: 1.1.2
artifact_fpath: filtered_decontaminated.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-knowledge-mcqa
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-mcqa
- name: example
type: example
jsonl_fpath: resources_servers/mcqa/data/example.jsonl
diff --git a/resources_servers/mcqa/data/example_metrics.json b/resources_servers/mcqa/data/example_metrics.json
index 38b182e0e..97f64483b 100644
--- a/resources_servers/mcqa/data/example_metrics.json
+++ b/resources_servers/mcqa/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/mcqa/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/mcqa/data/example_with_template_metadata_metrics.json b/resources_servers/mcqa/data/example_with_template_metadata_metrics.json
index 76dbd4d4a..990ca8f25 100644
--- a/resources_servers/mcqa/data/example_with_template_metadata_metrics.json
+++ b/resources_servers/mcqa/data/example_with_template_metadata_metrics.json
@@ -4,6 +4,7 @@
"jsonl_fpath": "resources_servers/mcqa/data/example_with_template_metadata.jsonl",
"num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml b/resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml
index 7526ea7e7..a537a690e 100644
--- a/resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml
+++ b/resources_servers/mini_swe_agent/configs/mini_swe_agent.yaml
@@ -20,11 +20,12 @@ mini_swe_simple_agent:
- name: train
type: train
jsonl_fpath: resources_servers/mini_swe_agent/data/train.jsonl
- dataset_url: https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified
gitlab_identifier:
dataset_name: mini_swe_agent
version: 0.0.1
artifact_fpath: train.jsonl
+ huggingface_identifier:
+ repo_id: SWE-Gym/SWE-Gym
license: MIT
- name: validation
type: validation
@@ -33,6 +34,8 @@ mini_swe_simple_agent:
dataset_name: mini_swe_agent
version: 0.0.1
artifact_fpath: validation.jsonl
+ huggingface_identifier:
+ repo_id: princeton-nlp/SWE-bench_Verified
license: MIT
- name: example
type: example
diff --git a/resources_servers/mini_swe_agent/data/example_metrics.json b/resources_servers/mini_swe_agent/data/example_metrics.json
index 0ec14ec2c..5a7e4113a 100644
--- a/resources_servers/mini_swe_agent/data/example_metrics.json
+++ b/resources_servers/mini_swe_agent/data/example_metrics.json
@@ -3,6 +3,7 @@
"type": "example",
"jsonl_fpath": "resources_servers/mini_swe_agent/data/example.jsonl",
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/structured_outputs/configs/structured_outputs_json.yaml b/resources_servers/structured_outputs/configs/structured_outputs_json.yaml
index 78a84a14a..2c43a5772 100644
--- a/resources_servers/structured_outputs/configs/structured_outputs_json.yaml
+++ b/resources_servers/structured_outputs/configs/structured_outputs_json.yaml
@@ -24,8 +24,10 @@ structured_outputs_simple_agent:
dataset_name: structured_outputs_251027_nano_v3_sdg_json_train
version: 0.0.2
artifact_fpath: structured_outputs_251027_nano_v3_sdg_json_train.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-instruction_following-structured_outputs
+ artifact_fpath: structured_outputs_251027_nano_v3_sdg_json_train.jsonl
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs
- name: validation
type: validation
jsonl_fpath: resources_servers/structured_outputs/data/structured_outputs_251027_nano_v3_sdg_json_val.jsonl
@@ -33,6 +35,9 @@ structured_outputs_simple_agent:
dataset_name: structured_outputs_251027_nano_v3_sdg_json_val
version: 0.0.2
artifact_fpath: structured_outputs_251027_nano_v3_sdg_json_val.jsonl
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-instruction_following-structured_outputs
+ artifact_fpath: structured_outputs_251027_nano_v3_sdg_json_val.jsonl
license: Apache 2.0
- name: example
type: example
diff --git a/resources_servers/structured_outputs/data/example_metrics.json b/resources_servers/structured_outputs/data/example_metrics.json
index 868317963..e878b7a0f 100644
--- a/resources_servers/structured_outputs/data/example_metrics.json
+++ b/resources_servers/structured_outputs/data/example_metrics.json
@@ -4,6 +4,7 @@
"jsonl_fpath": "resources_servers/structured_outputs/data/example.jsonl",
"num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/workplace_assistant/configs/workplace_assistant.yaml b/resources_servers/workplace_assistant/configs/workplace_assistant.yaml
index 5377b3e77..bae8340db 100644
--- a/resources_servers/workplace_assistant/configs/workplace_assistant.yaml
+++ b/resources_servers/workplace_assistant/configs/workplace_assistant.yaml
@@ -20,18 +20,15 @@ workplace_assistant_simple_agent:
- name: train
type: train
jsonl_fpath: resources_servers/workplace_assistant/data/train.jsonl
- gitlab_identifier:
- dataset_name: workplace_assistant
- version: 0.0.4
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-agent-workplace_assistant
artifact_fpath: train.jsonl
license: Apache 2.0
- dataset_url: https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant
- name: validation
type: validation
jsonl_fpath: resources_servers/workplace_assistant/data/validation.jsonl
- gitlab_identifier:
- dataset_name: workplace_assistant
- version: 0.0.4
+ huggingface_identifier:
+ repo_id: nvidia/Nemotron-RL-agent-workplace_assistant
artifact_fpath: validation.jsonl
license: Apache 2.0
- name: example
diff --git a/resources_servers/workplace_assistant/data/example_metrics.json b/resources_servers/workplace_assistant/data/example_metrics.json
index 1e8e35e3b..44386b6f6 100644
--- a/resources_servers/workplace_assistant/data/example_metrics.json
+++ b/resources_servers/workplace_assistant/data/example_metrics.json
@@ -2,7 +2,9 @@
"name": "example",
"type": "example",
"jsonl_fpath": "resources_servers/workplace_assistant/data/example.jsonl",
+ "num_repeats": 1,
"gitlab_identifier": null,
+ "huggingface_identifier": null,
"license": null,
"Number of examples": 5,
"Number of tools": {
diff --git a/resources_servers/workplace_assistant/data/train_metrics.json b/resources_servers/workplace_assistant/data/train_metrics.json
index 0e1d105fe..2b66a67c2 100644
--- a/resources_servers/workplace_assistant/data/train_metrics.json
+++ b/resources_servers/workplace_assistant/data/train_metrics.json
@@ -2,9 +2,10 @@
"name": "train",
"type": "train",
"jsonl_fpath": "resources_servers/workplace_assistant/data/train.jsonl",
- "gitlab_identifier": {
- "dataset_name": "workplace_assistant",
- "version": "0.0.4",
+ "num_repeats": 1,
+ "gitlab_identifier": null,
+ "huggingface_identifier": {
+ "repo_id": "nvidia/Nemotron-RL-agent-workplace_assistant",
"artifact_fpath": "train.jsonl"
},
"license": "Apache 2.0",
diff --git a/resources_servers/workplace_assistant/data/validation_metrics.json b/resources_servers/workplace_assistant/data/validation_metrics.json
index eb1957ac5..063c2616c 100644
--- a/resources_servers/workplace_assistant/data/validation_metrics.json
+++ b/resources_servers/workplace_assistant/data/validation_metrics.json
@@ -2,9 +2,10 @@
"name": "validation",
"type": "validation",
"jsonl_fpath": "resources_servers/workplace_assistant/data/validation.jsonl",
- "gitlab_identifier": {
- "dataset_name": "workplace_assistant",
- "version": "0.0.4",
+ "num_repeats": 1,
+ "gitlab_identifier": null,
+ "huggingface_identifier": {
+ "repo_id": "nvidia/Nemotron-RL-agent-workplace_assistant",
"artifact_fpath": "validation.jsonl"
},
"license": "Apache 2.0",
diff --git a/responses_api_models/vllm_model/app.py b/responses_api_models/vllm_model/app.py
index b263dd965..fdd713708 100644
--- a/responses_api_models/vllm_model/app.py
+++ b/responses_api_models/vllm_model/app.py
@@ -14,7 +14,7 @@
# limitations under the License.
import re
from time import time
-from typing import ClassVar, Dict, List, Optional, Tuple, Union
+from typing import Any, ClassVar, Dict, List, Optional, Tuple, Union
from uuid import uuid4
from aiohttp.client_exceptions import ClientResponseError
@@ -66,6 +66,9 @@ class VLLMModelConfig(BaseResponsesAPIModelConfig):
uses_reasoning_parser: bool
replace_developer_role_with_system: bool = False
+ # Corresponds to the extra_body argument of the OpenAI client.
+ extra_body: Optional[Dict[str, Any]] = None
+
def model_post_init(self, context):
if isinstance(self.base_url, str):
self.base_url = [self.base_url]
@@ -199,6 +202,9 @@ async def chat_completions(
else:
raise NotImplementedError
+ if self.config.extra_body:
+ create_params = self.config.extra_body | create_params
+
try:
chat_completion_dict = await client.create_chat_completion(**create_params)
except ClientResponseError as e:
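The `create_params = self.config.extra_body | create_params` line in the hunk above relies on Python's dict-union precedence: keys from the right-hand operand win, so per-request parameters always override configured `extra_body` defaults. A minimal sketch of that behavior (keys and values here are illustrative, not taken from the repo):

```python
# Configured defaults, as would come from VLLMModelConfig.extra_body.
config_extra_body = {"temperature": 0.2, "chat_template_kwargs": {"enable_thinking": True}}

# Per-request parameters built inside chat_completions().
create_params = {"model": "my-vllm-model", "temperature": 0.7}

# dict union: the RIGHT operand wins on key collisions, so the request's
# temperature (0.7) overrides the configured default (0.2), while keys
# present only in extra_body pass through unchanged.
merged = config_extra_body | create_params
print(merged["temperature"])            # 0.7 -- request value wins
print(merged["chat_template_kwargs"])   # {'enable_thinking': True} -- default survives
```

This is why the config dict is placed on the left of the union: `extra_body` supplies fallbacks without ever clobbering what the request explicitly sets.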
diff --git a/scripts/update_resource_servers.py b/scripts/update_resource_servers.py
index 1bcd0015a..4aecb3f71 100644
--- a/scripts/update_resource_servers.py
+++ b/scripts/update_resource_servers.py
@@ -54,12 +54,12 @@ class AgentDatasetsMetadata:
license: str | None = None
types: list[str] = field(default_factory=list)
- dataset_url: Optional[str] = None
+ huggingface_repo_id: Optional[str] = None
def to_dict(self) -> dict[str, str | list[str] | None]: # pragma: no cover
"""Convert to dict for backward compatibility."""
return {
- "dataset_url": self.dataset_url,
+ "huggingface_repo_id": self.huggingface_repo_id,
"license": self.license,
"types": self.types,
}
@@ -69,7 +69,7 @@ def to_dict(self) -> dict[str, str | list[str] | None]: # pragma: no cover
class ConfigMetadata:
"""Combined metadata from YAML configuration file."""
- dataset_url: Optional[str] = None
+ huggingface_repo_id: Optional[str] = None
domain: Optional[str] = None
description: Optional[str] = None
verified: bool = False
@@ -89,7 +89,7 @@ def from_yaml_data(
verified=resource.verified,
verified_url=resource.verified_url,
value=resource.value,
- dataset_url=agent.dataset_url,
+ huggingface_repo_id=agent.huggingface_repo_id,
license=agent.license,
types=agent.types,
)
@@ -108,8 +108,8 @@ class ServerInfo:
yaml_file: Path
@property
- def dataset_url(self) -> str | None: # pragma: no cover
- return self.config_metadata.dataset_url
+ def huggingface_repo_id(self) -> str | None: # pragma: no cover
+ return self.config_metadata.huggingface_repo_id
@property
def domain(self) -> str | None: # pragma: no cover
@@ -154,10 +154,12 @@ def get_validation_mark(self) -> str: # pragma: no cover
return "✓" if "validation" in set(self.config_metadata.types) else "-"
def get_dataset_link(self) -> str: # pragma: no cover
- if not self.config_metadata.dataset_url:
+ if not self.config_metadata.huggingface_repo_id:
return "-"
- dataset_name = self.config_metadata.dataset_url.split("/")[-1]
- return f"[{dataset_name}]({self.config_metadata.dataset_url})"
+ repo_id = self.config_metadata.huggingface_repo_id
+ dataset_name = repo_id.split("/")[-1]
+ dataset_url = f"https://huggingface.co/datasets/{repo_id}"
+ return f"[{dataset_name}]({dataset_url})"
def get_config_link(self, use_filename: bool = True) -> str: # pragma: no cover
return f"{self.config_filename if use_filename else 'config'}"
@@ -170,7 +172,6 @@ def visit_resource_server(data: dict, level: int = 1) -> ResourceServerMetadata:
"""Extract resource server metadata from YAML data."""
resource = ResourceServerMetadata()
if level == 4:
- resource.dataset_url = data.get("dataset_url")
resource.domain = data.get("domain")
resource.description = data.get("description")
resource.verified = data.get("verified", False)
@@ -201,7 +202,9 @@ def visit_agent_datasets(data: dict) -> AgentDatasetsMetadata: # pragma: no cov
agent.types.append(entry.get("type"))
if entry.get("type") == "train":
agent.license = entry.get("license")
- agent.dataset_url = entry.get("dataset_url")
+ hf_id = entry.get("huggingface_identifier")
+ if hf_id and isinstance(hf_id, dict):
+ agent.huggingface_repo_id = hf_id.get("repo_id")
return agent
@@ -223,7 +226,9 @@ def extract_config_metadata(yaml_path: Path) -> ConfigMetadata: # pragma: no co
- name: train
type: {example_type_1}
license: {example_license_1}
- dataset_url: {example_dataset_url}
+ huggingface_identifier:
+ repo_id: {example_repo_id_1}
+ artifact_fpath: {example_artifact_fpath_1}
- name: validation
type: {example_type_2}
license: {example_license_2}
@@ -260,7 +265,7 @@ def get_example_and_training_server_info() -> tuple[list[ServerInfo], list[Serve
server_name = subdir.name
is_example_only = server_name.startswith("example_")
- if not is_example_only and not yaml_data.dataset_url:
+ if not is_example_only and not yaml_data.huggingface_repo_id:
continue
display_name = (
diff --git a/tests/unit_tests/test_train_data_utils.py b/tests/unit_tests/test_train_data_utils.py
index e58331d9a..350d589ab 100644
--- a/tests/unit_tests/test_train_data_utils.py
+++ b/tests/unit_tests/test_train_data_utils.py
@@ -83,6 +83,7 @@ def test_load_and_validate_server_instance_configs_sanity(self, monkeypatch: Mon
"jsonl_fpath": "resources_servers/example_multi_step/data/example.jsonl",
"num_repeats": 1,
"gitlab_identifier": None,
+ "huggingface_identifier": None,
"license": None,
}
],
diff --git a/uv.lock b/uv.lock
index c740df1b5..2f2697bdf 100644
--- a/uv.lock
+++ b/uv.lock
@@ -656,6 +656,31 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/0e/70/4bd71d09b7d7f7bc9b4d0ceb20a020fd4f667d82aafc43e4d115bd41989e/databricks_sdk-0.65.0-py3-none-any.whl", hash = "sha256:594e61138071d7ae830412cfd3fbc5bd16aba9b67a423f44f4c13ca70c493a9f", size = 705907, upload-time = "2025-09-02T10:50:40.619Z" },
]
+[[package]]
+name = "datasets"
+version = "4.4.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+ { name = "dill" },
+ { name = "filelock" },
+ { name = "fsspec", extra = ["http"] },
+ { name = "httpx" },
+ { name = "huggingface-hub" },
+ { name = "multiprocess" },
+ { name = "numpy" },
+ { name = "packaging" },
+ { name = "pandas" },
+ { name = "pyarrow" },
+ { name = "pyyaml" },
+ { name = "requests" },
+ { name = "tqdm" },
+ { name = "xxhash" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/93/bf/0dae295d6d1ba0b1a200a9dd216838464b5bbd05da01407cb1330b377445/datasets-4.4.1.tar.gz", hash = "sha256:80322699aa8c0bbbdb7caa87906da689c3c2e29523cff698775c67f28fdab1fc", size = 585341, upload-time = "2025-11-05T16:00:38.162Z" }
+wheels = [
+ { url = "https://files.pythonhosted.org/packages/3b/5e/6f8d874366788ad5d549e9ba258037d974dda6e004843be1bda794571701/datasets-4.4.1-py3-none-any.whl", hash = "sha256:c1163de5211e42546079ab355cc0250c7e6db16eb209ac5ac6252f801f596c44", size = 511591, upload-time = "2025-11-05T16:00:36.365Z" },
+]
+
[[package]]
name = "devtools"
version = "0.12.2"
@@ -670,6 +695,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/d1/ae/afb1487556e2dc827a17097aac8158a25b433a345386f0e249f6d2694ccb/devtools-0.12.2-py3-none-any.whl", hash = "sha256:c366e3de1df4cdd635f1ad8cbcd3af01a384d7abda71900e68d43b04eb6aaca7", size = 19411, upload-time = "2023-09-03T16:56:59.049Z" },
]
+[[package]]
+name = "dill"
+version = "0.4.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/12/80/630b4b88364e9a8c8c5797f4602d0f76ef820909ee32f0bacb9f90654042/dill-0.4.0.tar.gz", hash = "sha256:0633f1d2df477324f53a895b02c901fb961bdbf65a17122586ea7019292cbcf0", size = 186976, upload-time = "2025-04-16T00:41:48.867Z" }
+wheels = [
+ { url = "https://files.pythonhosted.org/packages/50/3d/9373ad9c56321fdab5b41197068e1d8c25883b3fea29dd361f9b55116869/dill-0.4.0-py3-none-any.whl", hash = "sha256:44f54bf6412c2c8464c14e8243eb163690a9800dbe2c367330883b19c7561049", size = 119668, upload-time = "2025-04-16T00:41:47.671Z" },
+]
+
[[package]]
name = "distlib"
version = "0.4.0"
@@ -888,6 +922,11 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/47/71/70db47e4f6ce3e5c37a607355f80da8860a33226be640226ac52cb05ef2e/fsspec-2025.9.0-py3-none-any.whl", hash = "sha256:530dc2a2af60a414a832059574df4a6e10cce927f6f4a78209390fe38955cfb7", size = 199289, upload-time = "2025-09-02T19:10:47.708Z" },
]
+[package.optional-dependencies]
+http = [
+ { name = "aiohttp" },
+]
+
[[package]]
name = "gitdb"
version = "4.0.12"
@@ -1776,6 +1815,23 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/fd/69/b547032297c7e63ba2af494edba695d781af8a0c6e89e4d06cf848b21d80/multidict-6.6.4-py3-none-any.whl", hash = "sha256:27d8f8e125c07cb954e54d75d04905a9bba8a439c1d84aca94949d4d03d8601c", size = 12313, upload-time = "2025-08-11T12:08:46.891Z" },
]
+[[package]]
+name = "multiprocess"
+version = "0.70.18"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+ { name = "dill" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/72/fd/2ae3826f5be24c6ed87266bc4e59c46ea5b059a103f3d7e7eb76a52aeecb/multiprocess-0.70.18.tar.gz", hash = "sha256:f9597128e6b3e67b23956da07cf3d2e5cba79e2f4e0fba8d7903636663ec6d0d", size = 1798503, upload-time = "2025-04-17T03:11:27.742Z" }
+wheels = [
+ { url = "https://files.pythonhosted.org/packages/ba/d8/0cba6cf51a1a31f20471fbc823a716170c73012ddc4fb85d706630ed6e8f/multiprocess-0.70.18-py310-none-any.whl", hash = "sha256:60c194974c31784019c1f459d984e8f33ee48f10fcf42c309ba97b30d9bd53ea", size = 134948, upload-time = "2025-04-17T03:11:20.223Z" },
+ { url = "https://files.pythonhosted.org/packages/4b/88/9039f2fed1012ef584751d4ceff9ab4a51e5ae264898f0b7cbf44340a859/multiprocess-0.70.18-py311-none-any.whl", hash = "sha256:5aa6eef98e691281b3ad923be2832bf1c55dd2c859acd73e5ec53a66aae06a1d", size = 144462, upload-time = "2025-04-17T03:11:21.657Z" },
+ { url = "https://files.pythonhosted.org/packages/bf/b6/5f922792be93b82ec6b5f270bbb1ef031fd0622847070bbcf9da816502cc/multiprocess-0.70.18-py312-none-any.whl", hash = "sha256:9b78f8e5024b573730bfb654783a13800c2c0f2dfc0c25e70b40d184d64adaa2", size = 150287, upload-time = "2025-04-17T03:11:22.69Z" },
+ { url = "https://files.pythonhosted.org/packages/ee/25/7d7e78e750bc1aecfaf0efbf826c69a791d2eeaf29cf20cba93ff4cced78/multiprocess-0.70.18-py313-none-any.whl", hash = "sha256:871743755f43ef57d7910a38433cfe41319e72be1bbd90b79c7a5ac523eb9334", size = 151917, upload-time = "2025-04-17T03:11:24.044Z" },
+ { url = "https://files.pythonhosted.org/packages/3b/c3/ca84c19bd14cdfc21c388fdcebf08b86a7a470ebc9f5c3c084fc2dbc50f7/multiprocess-0.70.18-py38-none-any.whl", hash = "sha256:dbf705e52a154fe5e90fb17b38f02556169557c2dd8bb084f2e06c2784d8279b", size = 132636, upload-time = "2025-04-17T03:11:24.936Z" },
+ { url = "https://files.pythonhosted.org/packages/6c/28/dd72947e59a6a8c856448a5e74da6201cb5502ddff644fbc790e4bd40b9a/multiprocess-0.70.18-py39-none-any.whl", hash = "sha256:e78ca805a72b1b810c690b6b4cc32579eba34f403094bbbae962b7b5bf9dfcb8", size = 133478, upload-time = "2025-04-17T03:11:26.253Z" },
+]
+
[[package]]
name = "mypy"
version = "1.17.1"
@@ -1839,6 +1895,7 @@ name = "nemo-gym"
source = { editable = "." }
dependencies = [
{ name = "aiohttp" },
+ { name = "datasets" },
{ name = "devtools" },
{ name = "fastapi" },
{ name = "gradio" },
@@ -1887,6 +1944,7 @@ docs = [
requires-dist = [
{ name = "aiohttp" },
{ name = "coverage", extras = ["toml"], marker = "extra == 'dev'" },
+ { name = "datasets" },
{ name = "devtools" },
{ name = "fastapi" },
{ name = "gradio" },
@@ -3704,6 +3762,89 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/00/5c/c34575f96a0a038579683c7f10fca943c15c7946037d1d254ab9db1536ec/wrapt-2.0.0-py3-none-any.whl", hash = "sha256:02482fb0df89857e35427dfb844319417e14fae05878f295ee43fa3bf3b15502", size = 43998, upload-time = "2025-10-19T23:47:52.858Z" },
]
+[[package]]
+name = "xxhash"
+version = "3.6.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/02/84/30869e01909fb37a6cc7e18688ee8bf1e42d57e7e0777636bd47524c43c7/xxhash-3.6.0.tar.gz", hash = "sha256:f0162a78b13a0d7617b2845b90c763339d1f1d82bb04a4b07f4ab535cc5e05d6", size = 85160, upload-time = "2025-10-02T14:37:08.097Z" }
+wheels = [
+ { url = "https://files.pythonhosted.org/packages/9a/07/d9412f3d7d462347e4511181dea65e47e0d0e16e26fbee2ea86a2aefb657/xxhash-3.6.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:01362c4331775398e7bb34e3ab403bc9ee9f7c497bc7dee6272114055277dd3c", size = 32744, upload-time = "2025-10-02T14:34:34.622Z" },
+ { url = "https://files.pythonhosted.org/packages/79/35/0429ee11d035fc33abe32dca1b2b69e8c18d236547b9a9b72c1929189b9a/xxhash-3.6.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b7b2df81a23f8cb99656378e72501b2cb41b1827c0f5a86f87d6b06b69f9f204", size = 30816, upload-time = "2025-10-02T14:34:36.043Z" },
+ { url = "https://files.pythonhosted.org/packages/b7/f2/57eb99aa0f7d98624c0932c5b9a170e1806406cdbcdb510546634a1359e0/xxhash-3.6.0-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:dc94790144e66b14f67b10ac8ed75b39ca47536bf8800eb7c24b50271ea0c490", size = 194035, upload-time = "2025-10-02T14:34:37.354Z" },
+ { url = "https://files.pythonhosted.org/packages/4c/ed/6224ba353690d73af7a3f1c7cdb1fc1b002e38f783cb991ae338e1eb3d79/xxhash-3.6.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:93f107c673bccf0d592cdba077dedaf52fe7f42dcd7676eba1f6d6f0c3efffd2", size = 212914, upload-time = "2025-10-02T14:34:38.6Z" },
+ { url = "https://files.pythonhosted.org/packages/38/86/fb6b6130d8dd6b8942cc17ab4d90e223653a89aa32ad2776f8af7064ed13/xxhash-3.6.0-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2aa5ee3444c25b69813663c9f8067dcfaa2e126dc55e8dddf40f4d1c25d7effa", size = 212163, upload-time = "2025-10-02T14:34:39.872Z" },
+ { url = "https://files.pythonhosted.org/packages/ee/dc/e84875682b0593e884ad73b2d40767b5790d417bde603cceb6878901d647/xxhash-3.6.0-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f7f99123f0e1194fa59cc69ad46dbae2e07becec5df50a0509a808f90a0f03f0", size = 445411, upload-time = "2025-10-02T14:34:41.569Z" },
+ { url = "https://files.pythonhosted.org/packages/11/4f/426f91b96701ec2f37bb2b8cec664eff4f658a11f3fa9d94f0a887ea6d2b/xxhash-3.6.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:49e03e6fe2cac4a1bc64952dd250cf0dbc5ef4ebb7b8d96bce82e2de163c82a2", size = 193883, upload-time = "2025-10-02T14:34:43.249Z" },
+ { url = "https://files.pythonhosted.org/packages/53/5a/ddbb83eee8e28b778eacfc5a85c969673e4023cdeedcfcef61f36731610b/xxhash-3.6.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bd17fede52a17a4f9a7bc4472a5867cb0b160deeb431795c0e4abe158bc784e9", size = 210392, upload-time = "2025-10-02T14:34:45.042Z" },
+ { url = "https://files.pythonhosted.org/packages/1e/c2/ff69efd07c8c074ccdf0a4f36fcdd3d27363665bcdf4ba399abebe643465/xxhash-3.6.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:6fb5f5476bef678f69db04f2bd1efbed3030d2aba305b0fc1773645f187d6a4e", size = 197898, upload-time = "2025-10-02T14:34:46.302Z" },
+ { url = "https://files.pythonhosted.org/packages/58/ca/faa05ac19b3b622c7c9317ac3e23954187516298a091eb02c976d0d3dd45/xxhash-3.6.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:843b52f6d88071f87eba1631b684fcb4b2068cd2180a0224122fe4ef011a9374", size = 210655, upload-time = "2025-10-02T14:34:47.571Z" },
+ { url = "https://files.pythonhosted.org/packages/d4/7a/06aa7482345480cc0cb597f5c875b11a82c3953f534394f620b0be2f700c/xxhash-3.6.0-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:7d14a6cfaf03b1b6f5f9790f76880601ccc7896aff7ab9cd8978a939c1eb7e0d", size = 414001, upload-time = "2025-10-02T14:34:49.273Z" },
+ { url = "https://files.pythonhosted.org/packages/23/07/63ffb386cd47029aa2916b3d2f454e6cc5b9f5c5ada3790377d5430084e7/xxhash-3.6.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:418daf3db71e1413cfe211c2f9a528456936645c17f46b5204705581a45390ae", size = 191431, upload-time = "2025-10-02T14:34:50.798Z" },
+ { url = "https://files.pythonhosted.org/packages/0f/93/14fde614cadb4ddf5e7cebf8918b7e8fac5ae7861c1875964f17e678205c/xxhash-3.6.0-cp312-cp312-win32.whl", hash = "sha256:50fc255f39428a27299c20e280d6193d8b63b8ef8028995323bf834a026b4fbb", size = 30617, upload-time = "2025-10-02T14:34:51.954Z" },
+ { url = "https://files.pythonhosted.org/packages/13/5d/0d125536cbe7565a83d06e43783389ecae0c0f2ed037b48ede185de477c0/xxhash-3.6.0-cp312-cp312-win_amd64.whl", hash = "sha256:c0f2ab8c715630565ab8991b536ecded9416d615538be8ecddce43ccf26cbc7c", size = 31534, upload-time = "2025-10-02T14:34:53.276Z" },
+ { url = "https://files.pythonhosted.org/packages/54/85/6ec269b0952ec7e36ba019125982cf11d91256a778c7c3f98a4c5043d283/xxhash-3.6.0-cp312-cp312-win_arm64.whl", hash = "sha256:eae5c13f3bc455a3bbb68bdc513912dc7356de7e2280363ea235f71f54064829", size = 27876, upload-time = "2025-10-02T14:34:54.371Z" },
+ { url = "https://files.pythonhosted.org/packages/33/76/35d05267ac82f53ae9b0e554da7c5e281ee61f3cad44c743f0fcd354f211/xxhash-3.6.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:599e64ba7f67472481ceb6ee80fa3bd828fd61ba59fb11475572cc5ee52b89ec", size = 32738, upload-time = "2025-10-02T14:34:55.839Z" },
+ { url = "https://files.pythonhosted.org/packages/31/a8/3fbce1cd96534a95e35d5120637bf29b0d7f5d8fa2f6374e31b4156dd419/xxhash-3.6.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:7d8b8aaa30fca4f16f0c84a5c8d7ddee0e25250ec2796c973775373257dde8f1", size = 30821, upload-time = "2025-10-02T14:34:57.219Z" },
+ { url = "https://files.pythonhosted.org/packages/0c/ea/d387530ca7ecfa183cb358027f1833297c6ac6098223fd14f9782cd0015c/xxhash-3.6.0-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:d597acf8506d6e7101a4a44a5e428977a51c0fadbbfd3c39650cca9253f6e5a6", size = 194127, upload-time = "2025-10-02T14:34:59.21Z" },
+ { url = "https://files.pythonhosted.org/packages/ba/0c/71435dcb99874b09a43b8d7c54071e600a7481e42b3e3ce1eb5226a5711a/xxhash-3.6.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:858dc935963a33bc33490128edc1c12b0c14d9c7ebaa4e387a7869ecc4f3e263", size = 212975, upload-time = "2025-10-02T14:35:00.816Z" },
+ { url = "https://files.pythonhosted.org/packages/84/7a/c2b3d071e4bb4a90b7057228a99b10d51744878f4a8a6dd643c8bd897620/xxhash-3.6.0-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:ba284920194615cb8edf73bf52236ce2e1664ccd4a38fdb543506413529cc546", size = 212241, upload-time = "2025-10-02T14:35:02.207Z" },
+ { url = "https://files.pythonhosted.org/packages/81/5f/640b6eac0128e215f177df99eadcd0f1b7c42c274ab6a394a05059694c5a/xxhash-3.6.0-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:4b54219177f6c6674d5378bd862c6aedf64725f70dd29c472eaae154df1a2e89", size = 445471, upload-time = "2025-10-02T14:35:03.61Z" },
+ { url = "https://files.pythonhosted.org/packages/5e/1e/3c3d3ef071b051cc3abbe3721ffb8365033a172613c04af2da89d5548a87/xxhash-3.6.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:42c36dd7dbad2f5238950c377fcbf6811b1cdb1c444fab447960030cea60504d", size = 193936, upload-time = "2025-10-02T14:35:05.013Z" },
+ { url = "https://files.pythonhosted.org/packages/2c/bd/4a5f68381939219abfe1c22a9e3a5854a4f6f6f3c4983a87d255f21f2e5d/xxhash-3.6.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f22927652cba98c44639ffdc7aaf35828dccf679b10b31c4ad72a5b530a18eb7", size = 210440, upload-time = "2025-10-02T14:35:06.239Z" },
+ { url = "https://files.pythonhosted.org/packages/eb/37/b80fe3d5cfb9faff01a02121a0f4d565eb7237e9e5fc66e73017e74dcd36/xxhash-3.6.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:b45fad44d9c5c119e9c6fbf2e1c656a46dc68e280275007bbfd3d572b21426db", size = 197990, upload-time = "2025-10-02T14:35:07.735Z" },
+ { url = "https://files.pythonhosted.org/packages/d7/fd/2c0a00c97b9e18f72e1f240ad4e8f8a90fd9d408289ba9c7c495ed7dc05c/xxhash-3.6.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:6f2580ffab1a8b68ef2b901cde7e55fa8da5e4be0977c68f78fc80f3c143de42", size = 210689, upload-time = "2025-10-02T14:35:09.438Z" },
+ { url = "https://files.pythonhosted.org/packages/93/86/5dd8076a926b9a95db3206aba20d89a7fc14dd5aac16e5c4de4b56033140/xxhash-3.6.0-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:40c391dd3cd041ebc3ffe6f2c862f402e306eb571422e0aa918d8070ba31da11", size = 414068, upload-time = "2025-10-02T14:35:11.162Z" },
+ { url = "https://files.pythonhosted.org/packages/af/3c/0bb129170ee8f3650f08e993baee550a09593462a5cddd8e44d0011102b1/xxhash-3.6.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f205badabde7aafd1a31e8ca2a3e5a763107a71c397c4481d6a804eb5063d8bd", size = 191495, upload-time = "2025-10-02T14:35:12.971Z" },
+ { url = "https://files.pythonhosted.org/packages/e9/3a/6797e0114c21d1725e2577508e24006fd7ff1d8c0c502d3b52e45c1771d8/xxhash-3.6.0-cp313-cp313-win32.whl", hash = "sha256:2577b276e060b73b73a53042ea5bd5203d3e6347ce0d09f98500f418a9fcf799", size = 30620, upload-time = "2025-10-02T14:35:14.129Z" },
+ { url = "https://files.pythonhosted.org/packages/86/15/9bc32671e9a38b413a76d24722a2bf8784a132c043063a8f5152d390b0f9/xxhash-3.6.0-cp313-cp313-win_amd64.whl", hash = "sha256:757320d45d2fbcce8f30c42a6b2f47862967aea7bf458b9625b4bbe7ee390392", size = 31542, upload-time = "2025-10-02T14:35:15.21Z" },
+ { url = "https://files.pythonhosted.org/packages/39/c5/cc01e4f6188656e56112d6a8e0dfe298a16934b8c47a247236549a3f7695/xxhash-3.6.0-cp313-cp313-win_arm64.whl", hash = "sha256:457b8f85dec5825eed7b69c11ae86834a018b8e3df5e77783c999663da2f96d6", size = 27880, upload-time = "2025-10-02T14:35:16.315Z" },
+ { url = "https://files.pythonhosted.org/packages/f3/30/25e5321c8732759e930c555176d37e24ab84365482d257c3b16362235212/xxhash-3.6.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:a42e633d75cdad6d625434e3468126c73f13f7584545a9cf34e883aa1710e702", size = 32956, upload-time = "2025-10-02T14:35:17.413Z" },
+ { url = "https://files.pythonhosted.org/packages/9f/3c/0573299560d7d9f8ab1838f1efc021a280b5ae5ae2e849034ef3dee18810/xxhash-3.6.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:568a6d743219e717b07b4e03b0a828ce593833e498c3b64752e0f5df6bfe84db", size = 31072, upload-time = "2025-10-02T14:35:18.844Z" },
+ { url = "https://files.pythonhosted.org/packages/7a/1c/52d83a06e417cd9d4137722693424885cc9878249beb3a7c829e74bf7ce9/xxhash-3.6.0-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:bec91b562d8012dae276af8025a55811b875baace6af510412a5e58e3121bc54", size = 196409, upload-time = "2025-10-02T14:35:20.31Z" },
+ { url = "https://files.pythonhosted.org/packages/e3/8e/c6d158d12a79bbd0b878f8355432075fc82759e356ab5a111463422a239b/xxhash-3.6.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:78e7f2f4c521c30ad5e786fdd6bae89d47a32672a80195467b5de0480aa97b1f", size = 215736, upload-time = "2025-10-02T14:35:21.616Z" },
+ { url = "https://files.pythonhosted.org/packages/bc/68/c4c80614716345d55071a396cf03d06e34b5f4917a467faf43083c995155/xxhash-3.6.0-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3ed0df1b11a79856df5ffcab572cbd6b9627034c1c748c5566fa79df9048a7c5", size = 214833, upload-time = "2025-10-02T14:35:23.32Z" },
+ { url = "https://files.pythonhosted.org/packages/7e/e9/ae27c8ffec8b953efa84c7c4a6c6802c263d587b9fc0d6e7cea64e08c3af/xxhash-3.6.0-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0e4edbfc7d420925b0dd5e792478ed393d6e75ff8fc219a6546fb446b6a417b1", size = 448348, upload-time = "2025-10-02T14:35:25.111Z" },
+ { url = "https://files.pythonhosted.org/packages/d7/6b/33e21afb1b5b3f46b74b6bd1913639066af218d704cc0941404ca717fc57/xxhash-3.6.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fba27a198363a7ef87f8c0f6b171ec36b674fe9053742c58dd7e3201c1ab30ee", size = 196070, upload-time = "2025-10-02T14:35:26.586Z" },
+ { url = "https://files.pythonhosted.org/packages/96/b6/fcabd337bc5fa624e7203aa0fa7d0c49eed22f72e93229431752bddc83d9/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:794fe9145fe60191c6532fa95063765529770edcdd67b3d537793e8004cabbfd", size = 212907, upload-time = "2025-10-02T14:35:28.087Z" },
+ { url = "https://files.pythonhosted.org/packages/4b/d3/9ee6160e644d660fcf176c5825e61411c7f62648728f69c79ba237250143/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:6105ef7e62b5ac73a837778efc331a591d8442f8ef5c7e102376506cb4ae2729", size = 200839, upload-time = "2025-10-02T14:35:29.857Z" },
+ { url = "https://files.pythonhosted.org/packages/0d/98/e8de5baa5109394baf5118f5e72ab21a86387c4f89b0e77ef3e2f6b0327b/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:f01375c0e55395b814a679b3eea205db7919ac2af213f4a6682e01220e5fe292", size = 213304, upload-time = "2025-10-02T14:35:31.222Z" },
+ { url = "https://files.pythonhosted.org/packages/7b/1d/71056535dec5c3177eeb53e38e3d367dd1d16e024e63b1cee208d572a033/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:d706dca2d24d834a4661619dcacf51a75c16d65985718d6a7d73c1eeeb903ddf", size = 416930, upload-time = "2025-10-02T14:35:32.517Z" },
+ { url = "https://files.pythonhosted.org/packages/dc/6c/5cbde9de2cd967c322e651c65c543700b19e7ae3e0aae8ece3469bf9683d/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5f059d9faeacd49c0215d66f4056e1326c80503f51a1532ca336a385edadd033", size = 193787, upload-time = "2025-10-02T14:35:33.827Z" },
+ { url = "https://files.pythonhosted.org/packages/19/fa/0172e350361d61febcea941b0cc541d6e6c8d65d153e85f850a7b256ff8a/xxhash-3.6.0-cp313-cp313t-win32.whl", hash = "sha256:1244460adc3a9be84731d72b8e80625788e5815b68da3da8b83f78115a40a7ec", size = 30916, upload-time = "2025-10-02T14:35:35.107Z" },
+ { url = "https://files.pythonhosted.org/packages/ad/e6/e8cf858a2b19d6d45820f072eff1bea413910592ff17157cabc5f1227a16/xxhash-3.6.0-cp313-cp313t-win_amd64.whl", hash = "sha256:b1e420ef35c503869c4064f4a2f2b08ad6431ab7b229a05cce39d74268bca6b8", size = 31799, upload-time = "2025-10-02T14:35:36.165Z" },
+ { url = "https://files.pythonhosted.org/packages/56/15/064b197e855bfb7b343210e82490ae672f8bc7cdf3ddb02e92f64304ee8a/xxhash-3.6.0-cp313-cp313t-win_arm64.whl", hash = "sha256:ec44b73a4220623235f67a996c862049f375df3b1052d9899f40a6382c32d746", size = 28044, upload-time = "2025-10-02T14:35:37.195Z" },
+ { url = "https://files.pythonhosted.org/packages/7e/5e/0138bc4484ea9b897864d59fce9be9086030825bc778b76cb5a33a906d37/xxhash-3.6.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:a40a3d35b204b7cc7643cbcf8c9976d818cb47befcfac8bbefec8038ac363f3e", size = 32754, upload-time = "2025-10-02T14:35:38.245Z" },
+ { url = "https://files.pythonhosted.org/packages/18/d7/5dac2eb2ec75fd771957a13e5dda560efb2176d5203f39502a5fc571f899/xxhash-3.6.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:a54844be970d3fc22630b32d515e79a90d0a3ddb2644d8d7402e3c4c8da61405", size = 30846, upload-time = "2025-10-02T14:35:39.6Z" },
+ { url = "https://files.pythonhosted.org/packages/fe/71/8bc5be2bb00deb5682e92e8da955ebe5fa982da13a69da5a40a4c8db12fb/xxhash-3.6.0-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:016e9190af8f0a4e3741343777710e3d5717427f175adfdc3e72508f59e2a7f3", size = 194343, upload-time = "2025-10-02T14:35:40.69Z" },
+ { url = "https://files.pythonhosted.org/packages/e7/3b/52badfb2aecec2c377ddf1ae75f55db3ba2d321c5e164f14461c90837ef3/xxhash-3.6.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4f6f72232f849eb9d0141e2ebe2677ece15adfd0fa599bc058aad83c714bb2c6", size = 213074, upload-time = "2025-10-02T14:35:42.29Z" },
+ { url = "https://files.pythonhosted.org/packages/a2/2b/ae46b4e9b92e537fa30d03dbc19cdae57ed407e9c26d163895e968e3de85/xxhash-3.6.0-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:63275a8aba7865e44b1813d2177e0f5ea7eadad3dd063a21f7cf9afdc7054063", size = 212388, upload-time = "2025-10-02T14:35:43.929Z" },
+ { url = "https://files.pythonhosted.org/packages/f5/80/49f88d3afc724b4ac7fbd664c8452d6db51b49915be48c6982659e0e7942/xxhash-3.6.0-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:3cd01fa2aa00d8b017c97eb46b9a794fbdca53fc14f845f5a328c71254b0abb7", size = 445614, upload-time = "2025-10-02T14:35:45.216Z" },
+ { url = "https://files.pythonhosted.org/packages/ed/ba/603ce3961e339413543d8cd44f21f2c80e2a7c5cfe692a7b1f2cccf58f3c/xxhash-3.6.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0226aa89035b62b6a86d3c68df4d7c1f47a342b8683da2b60cedcddb46c4d95b", size = 194024, upload-time = "2025-10-02T14:35:46.959Z" },
+ { url = "https://files.pythonhosted.org/packages/78/d1/8e225ff7113bf81545cfdcd79eef124a7b7064a0bba53605ff39590b95c2/xxhash-3.6.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c6e193e9f56e4ca4923c61238cdaced324f0feac782544eb4c6d55ad5cc99ddd", size = 210541, upload-time = "2025-10-02T14:35:48.301Z" },
+ { url = "https://files.pythonhosted.org/packages/6f/58/0f89d149f0bad89def1a8dd38feb50ccdeb643d9797ec84707091d4cb494/xxhash-3.6.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:9176dcaddf4ca963d4deb93866d739a343c01c969231dbe21680e13a5d1a5bf0", size = 198305, upload-time = "2025-10-02T14:35:49.584Z" },
+ { url = "https://files.pythonhosted.org/packages/11/38/5eab81580703c4df93feb5f32ff8fa7fe1e2c51c1f183ee4e48d4bb9d3d7/xxhash-3.6.0-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:c1ce4009c97a752e682b897aa99aef84191077a9433eb237774689f14f8ec152", size = 210848, upload-time = "2025-10-02T14:35:50.877Z" },
+ { url = "https://files.pythonhosted.org/packages/5e/6b/953dc4b05c3ce678abca756416e4c130d2382f877a9c30a20d08ee6a77c0/xxhash-3.6.0-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:8cb2f4f679b01513b7adbb9b1b2f0f9cdc31b70007eaf9d59d0878809f385b11", size = 414142, upload-time = "2025-10-02T14:35:52.15Z" },
+ { url = "https://files.pythonhosted.org/packages/08/a9/238ec0d4e81a10eb5026d4a6972677cbc898ba6c8b9dbaec12ae001b1b35/xxhash-3.6.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:653a91d7c2ab54a92c19ccf43508b6a555440b9be1bc8be553376778be7f20b5", size = 191547, upload-time = "2025-10-02T14:35:53.547Z" },
+ { url = "https://files.pythonhosted.org/packages/f1/ee/3cf8589e06c2164ac77c3bf0aa127012801128f1feebf2a079272da5737c/xxhash-3.6.0-cp314-cp314-win32.whl", hash = "sha256:a756fe893389483ee8c394d06b5ab765d96e68fbbfe6fde7aa17e11f5720559f", size = 31214, upload-time = "2025-10-02T14:35:54.746Z" },
+ { url = "https://files.pythonhosted.org/packages/02/5d/a19552fbc6ad4cb54ff953c3908bbc095f4a921bc569433d791f755186f1/xxhash-3.6.0-cp314-cp314-win_amd64.whl", hash = "sha256:39be8e4e142550ef69629c9cd71b88c90e9a5db703fecbcf265546d9536ca4ad", size = 32290, upload-time = "2025-10-02T14:35:55.791Z" },
+ { url = "https://files.pythonhosted.org/packages/b1/11/dafa0643bc30442c887b55baf8e73353a344ee89c1901b5a5c54a6c17d39/xxhash-3.6.0-cp314-cp314-win_arm64.whl", hash = "sha256:25915e6000338999236f1eb68a02a32c3275ac338628a7eaa5a269c401995679", size = 28795, upload-time = "2025-10-02T14:35:57.162Z" },
+ { url = "https://files.pythonhosted.org/packages/2c/db/0e99732ed7f64182aef4a6fb145e1a295558deec2a746265dcdec12d191e/xxhash-3.6.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:c5294f596a9017ca5a3e3f8884c00b91ab2ad2933cf288f4923c3fd4346cf3d4", size = 32955, upload-time = "2025-10-02T14:35:58.267Z" },
+ { url = "https://files.pythonhosted.org/packages/55/f4/2a7c3c68e564a099becfa44bb3d398810cc0ff6749b0d3cb8ccb93f23c14/xxhash-3.6.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:1cf9dcc4ab9cff01dfbba78544297a3a01dafd60f3bde4e2bfd016cf7e4ddc67", size = 31072, upload-time = "2025-10-02T14:35:59.382Z" },
+ { url = "https://files.pythonhosted.org/packages/c6/d9/72a29cddc7250e8a5819dad5d466facb5dc4c802ce120645630149127e73/xxhash-3.6.0-cp314-cp314t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:01262da8798422d0685f7cef03b2bd3f4f46511b02830861df548d7def4402ad", size = 196579, upload-time = "2025-10-02T14:36:00.838Z" },
+ { url = "https://files.pythonhosted.org/packages/63/93/b21590e1e381040e2ca305a884d89e1c345b347404f7780f07f2cdd47ef4/xxhash-3.6.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:51a73fb7cb3a3ead9f7a8b583ffd9b8038e277cdb8cb87cf890e88b3456afa0b", size = 215854, upload-time = "2025-10-02T14:36:02.207Z" },
+ { url = "https://files.pythonhosted.org/packages/ce/b8/edab8a7d4fa14e924b29be877d54155dcbd8b80be85ea00d2be3413a9ed4/xxhash-3.6.0-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b9c6df83594f7df8f7f708ce5ebeacfc69f72c9fbaaababf6cf4758eaada0c9b", size = 214965, upload-time = "2025-10-02T14:36:03.507Z" },
+ { url = "https://files.pythonhosted.org/packages/27/67/dfa980ac7f0d509d54ea0d5a486d2bb4b80c3f1bb22b66e6a05d3efaf6c0/xxhash-3.6.0-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:627f0af069b0ea56f312fd5189001c24578868643203bca1abbc2c52d3a6f3ca", size = 448484, upload-time = "2025-10-02T14:36:04.828Z" },
+ { url = "https://files.pythonhosted.org/packages/8c/63/8ffc2cc97e811c0ca5d00ab36604b3ea6f4254f20b7bc658ca825ce6c954/xxhash-3.6.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:aa912c62f842dfd013c5f21a642c9c10cd9f4c4e943e0af83618b4a404d9091a", size = 196162, upload-time = "2025-10-02T14:36:06.182Z" },
+ { url = "https://files.pythonhosted.org/packages/4b/77/07f0e7a3edd11a6097e990f6e5b815b6592459cb16dae990d967693e6ea9/xxhash-3.6.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:b465afd7909db30168ab62afe40b2fcf79eedc0b89a6c0ab3123515dc0df8b99", size = 213007, upload-time = "2025-10-02T14:36:07.733Z" },
+ { url = "https://files.pythonhosted.org/packages/ae/d8/bc5fa0d152837117eb0bef6f83f956c509332ce133c91c63ce07ee7c4873/xxhash-3.6.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:a881851cf38b0a70e7c4d3ce81fc7afd86fbc2a024f4cfb2a97cf49ce04b75d3", size = 200956, upload-time = "2025-10-02T14:36:09.106Z" },
+ { url = "https://files.pythonhosted.org/packages/26/a5/d749334130de9411783873e9b98ecc46688dad5db64ca6e04b02acc8b473/xxhash-3.6.0-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:9b3222c686a919a0f3253cfc12bb118b8b103506612253b5baeaac10d8027cf6", size = 213401, upload-time = "2025-10-02T14:36:10.585Z" },
+ { url = "https://files.pythonhosted.org/packages/89/72/abed959c956a4bfc72b58c0384bb7940663c678127538634d896b1195c10/xxhash-3.6.0-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:c5aa639bc113e9286137cec8fadc20e9cd732b2cc385c0b7fa673b84fc1f2a93", size = 417083, upload-time = "2025-10-02T14:36:12.276Z" },
+ { url = "https://files.pythonhosted.org/packages/0c/b3/62fd2b586283b7d7d665fb98e266decadf31f058f1cf6c478741f68af0cb/xxhash-3.6.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:5c1343d49ac102799905e115aee590183c3921d475356cb24b4de29a4bc56518", size = 193913, upload-time = "2025-10-02T14:36:14.025Z" },
+ { url = "https://files.pythonhosted.org/packages/9a/9a/c19c42c5b3f5a4aad748a6d5b4f23df3bed7ee5445accc65a0fb3ff03953/xxhash-3.6.0-cp314-cp314t-win32.whl", hash = "sha256:5851f033c3030dd95c086b4a36a2683c2ff4a799b23af60977188b057e467119", size = 31586, upload-time = "2025-10-02T14:36:15.603Z" },
+ { url = "https://files.pythonhosted.org/packages/03/d6/4cc450345be9924fd5dc8c590ceda1db5b43a0a889587b0ae81a95511360/xxhash-3.6.0-cp314-cp314t-win_amd64.whl", hash = "sha256:0444e7967dac37569052d2409b00a8860c2135cff05502df4da80267d384849f", size = 32526, upload-time = "2025-10-02T14:36:16.708Z" },
+ { url = "https://files.pythonhosted.org/packages/0f/c9/7243eb3f9eaabd1a88a5a5acadf06df2d83b100c62684b7425c6a11bcaa8/xxhash-3.6.0-cp314-cp314t-win_arm64.whl", hash = "sha256:bb79b1e63f6fd84ec778a4b1916dfe0a7c3fdb986c06addd5db3a0d413819d95", size = 28898, upload-time = "2025-10-02T14:36:17.843Z" },
+]
+
[[package]]
name = "yappi"
version = "1.6.10"