Schema overrides for tool-calling by gwarmstrong · Pull Request #1118 · NVIDIA-NeMo/Skills

gwarmstrong · 2025-12-16T19:52:54Z

Schema Overrides for Tool Calling

Summary

This PR adds support for schema overrides, which allow customizing how tool schemas are presented to the model without modifying the underlying tool implementation.

How It Works

Schema overrides transform the tool schema sent to the model, then remap the model's tool calls back to the original tool and parameter names for execution.

Original Schema                              Model Sees                        Model Calls
┌───────────────────────────────┐           ┌──────────────────────┐           ┌─────────────────────┐
│ name: "stateful_python_..."   │  ──────▶  │ name: "python"       │  ──────▶  │ python(script="...")│
│ params: {code: ...}           │  override │ params: {script:... }│           └─────────┬───────────┘
└───────────────────────────────┘           └──────────────────────┘                     │
                                                                                         │ remap
      Execution                                                                          ▼
┌───────────────────────────────┐                                           ┌─────────────────────┐
│ stateful_python_...(code="…") │  ◀──────────────────────────────────────  │ code="..."          │
└───────────────────────────────┘                                           └─────────────────────┘

Configuration

YAML Config File

Create a config file (e.g., schema_overrides.yaml):

# Schema overrides keyed by provider class name, then tool name
schema_overrides:
  PythonTool:
    stateful_python_code_exec:
      name: "python"
      description: "Run Python code and return the result"
      parameters:
        code:
          name: "script"
          description: "Python script to run"

Mapping Example

The PythonTool has the following original schema:

{
  "name": "stateful_python_code_exec",
  "description": "Call this function to execute Python code in a stateful Jupyter notebook environment. Python will respond with the output of the execution or time out after 120.0 seconds.",
  "input_schema": {
    "properties": {
      "code": {"type": "string", "description": "Code to execute"}
    },
    "required": ["code"]
  }
}

With the above overrides, the model sees:

{
  "name": "python",
  "description": "Run Python code and return the result",
  "input_schema": {
    "properties": {
      "script": {"type": "string", "description": "Python script to run"}
    },
    "required": ["script"]
  }
}

When the model calls python(script="print(1)"), the framework automatically remaps it to stateful_python_code_exec(code="print(1)") for execution.
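The schema transformation in this example can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the PR's implementation: `sketch_apply_override` is a hypothetical helper, and the shape of the returned rename map is an assumption. It performs no validation, so an override naming a missing parameter would raise a `KeyError` here.

```python
import copy
from typing import Dict, Tuple

# Original schema and override, copied from the example above (description shortened).
ORIGINAL = {
    "name": "stateful_python_code_exec",
    "description": "Call this function to execute Python code ...",
    "input_schema": {
        "properties": {"code": {"type": "string", "description": "Code to execute"}},
        "required": ["code"],
    },
}
OVERRIDE = {
    "name": "python",
    "description": "Run Python code and return the result",
    "parameters": {"code": {"name": "script", "description": "Python script to run"}},
}


def sketch_apply_override(tool: dict, override: dict) -> Tuple[dict, Dict[str, str]]:
    """Hypothetical helper: return (model-facing schema, {renamed_param: original_param})."""
    shown = copy.deepcopy(tool)  # never mutate the original schema
    if "name" in override:
        shown["name"] = override["name"]
    if "description" in override:
        shown["description"] = override["description"]

    schema = shown["input_schema"]
    renamed_to_original: Dict[str, str] = {}
    for original, param_override in override.get("parameters", {}).items():
        new_name = param_override.get("name", original)
        prop = schema["properties"].pop(original)
        if "description" in param_override:
            prop["description"] = param_override["description"]
        schema["properties"][new_name] = prop
        # Keep the "required" list in sync with the rename.
        if "required" in schema:
            schema["required"] = [new_name if r == original else r for r in schema["required"]]
        renamed_to_original[new_name] = original
    return shown, renamed_to_original


model_facing, param_map = sketch_apply_override(ORIGINAL, OVERRIDE)
print(model_facing["name"], param_map)
```

The rename map is what makes the reverse direction possible: it records which model-facing parameter corresponds to which original parameter.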

Usage

Command Line

Use --config-path and --config-name to load your override config:

ns generate \
    --cluster=slurm \
    --model=Qwen/Qwen3-8B \
    --server_type=vllm \
    --server_gpus=1 \
    --server_args='--enable-auto-tool-choice --tool-call-parser hermes' \
    --input_file=data.jsonl \
    --output_dir=outputs \
    --with_sandbox=true \
    ++tool_modules=[nemo_skills.mcp.servers.python_tool.PythonTool] \
    --config-path=/nemo_run/code/configs \
    --config-name=schema_overrides

Python API

from nemo_skills.pipeline.cli import generate, wrap_arguments

generate(
    ctx=wrap_arguments(
        "++tool_modules=[nemo_skills.mcp.servers.python_tool.PythonTool] "
        "--config-path /nemo_run/code/configs "
        "--config-name schema_overrides"
    ),
    cluster='slurm',
    model='Qwen/Qwen3-8B',
    server_type='vllm',
    server_gpus=1,
    server_args='--enable-auto-tool-choice --tool-call-parser hermes',
    input_file='data.jsonl',
    output_dir='outputs',
    with_sandbox=True,
)

Config File Location

Option A: Commit to git - Files committed to your repository are automatically packaged and available at /nemo_run/code on the cluster. See Code Packaging for details.

Option B: Mount from cluster storage - Mount config files from a known location on your cluster filesystem. Update your cluster config to mount the directory containing your override files, then reference the mounted path:

--config-path=/mounted/configs \
--config-name=schema_overrides

Output Format

The generation output includes the transformed tool schema in the tools field:

{
  "conversation": [
    {"role": "user", "content": "Calculate 2+2"},
    {"role": "assistant", "content": "...", "tool_calls": [...]},
    {"role": "tool", "content": "4"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "python",
        "description": "Run Python code and return the result",
        "parameters": {
          "properties": {"script": {"type": "string", "description": "Python script to run"}},
          "required": ["script"]
        }
      }
    }
  ],
  "num_tool_calls": 1,
  "generation": "..."
}
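To check which schema the model actually saw, the `tools` field can be read back from an output record. The sketch below assumes each record is a single JSON object shaped like the example above; the helper name `tool_names_seen_by_model` is hypothetical, and how records are laid out in the output file is not stated in this PR.

```python
import json

# A record shaped like the example above, abbreviated to the relevant fields.
SAMPLE_RECORD = json.dumps(
    {
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "python",
                    "description": "Run Python code and return the result",
                    "parameters": {
                        "properties": {
                            "script": {"type": "string", "description": "Python script to run"}
                        },
                        "required": ["script"],
                    },
                },
            }
        ],
        "num_tool_calls": 1,
    }
)


def tool_names_seen_by_model(record: dict) -> list:
    """Hypothetical helper: list the function names found in a record's tools field."""
    return [
        tool["function"]["name"]
        for tool in record.get("tools", [])
        if tool.get("type") == "function"
    ]


record = json.loads(SAMPLE_RECORD)
names = tool_names_seen_by_model(record)
print(names)
```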

Validation

Schema overrides are validated at startup:

- Non-existent parameters: Attempting to override a parameter that doesn't exist in the original schema raises an error.
- Hidden parameters: Parameters removed via hide_args cannot be overridden, because they are no longer in the schema.

# This will fail if 'nonexistent_param' is not in the tool schema:
schema_overrides:
  PythonTool:
    stateful_python_code_exec:
      parameters:
        nonexistent_param:  # ValueError: Parameter 'nonexistent_param' not in schema
          name: "new_name"
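Both validation rules reduce to the same check: every overridden parameter must be present in the schema as it stands after hidden arguments are removed. The sketch below illustrates that idea only. `check_override_params`, the `hidden` argument, and the `session_id` parameter are hypothetical, and the error text simply mirrors the comment in the YAML above.

```python
def check_override_params(schema_properties: dict, param_overrides: dict,
                          hidden: frozenset = frozenset()) -> None:
    """Hypothetical check: raise ValueError if an override targets a missing parameter."""
    # Hidden parameters are treated as already removed from the schema.
    visible = {name for name in schema_properties if name not in hidden}
    for name in param_overrides:
        if name not in visible:
            raise ValueError(f"Parameter '{name}' not in schema")


# 'session_id' is an invented parameter used only to illustrate the hidden case.
PROPERTIES = {
    "code": {"type": "string", "description": "Code to execute"},
    "session_id": {"type": "string", "description": "Illustrative extra parameter"},
}

# A valid override passes silently.
check_override_params(PROPERTIES, {"code": {"name": "script"}})

try:
    check_override_params(PROPERTIES, {"nonexistent_param": {"name": "new_name"}})
except ValueError as exc:
    missing_error = str(exc)

try:
    check_override_params(
        PROPERTIES, {"session_id": {"name": "sid"}}, hidden=frozenset({"session_id"})
    )
except ValueError as exc:
    hidden_error = str(exc)

print(missing_error)
print(hidden_error)
```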

Summary by CodeRabbit

Release Notes

New Features
- Added schema override configuration option to customize tool schemas during inference, enabling parameter renaming and schema adjustments.
- Enhanced tool parameter handling with improved error reporting for invalid tool arguments.
Tests
- Added tests for schema override functionality and validation.


Signed-off-by: George Armstrong <georgea@nvidia.com>

coderabbitai · 2025-12-16T19:56:33Z

Walkthrough

Introduces schema customization for tool-based inference by adding a schema_overrides configuration field that propagates through the model setup pipeline. The feature transforms the tool schema shown to the model and, at execution time, remaps model-issued tool calls back to the original tool and parameter names.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Configuration layer `nemo_skills/inference/generate.py` | Added `schema_overrides: dict \| None` field to `GenerateSolutionsConfig` with default empty dict and documentation; field is passed to `get_tool_calling_model` during LLM setup. |
| Model interface `nemo_skills/inference/model/__init__.py` | Updated `get_tool_calling_model` function signature to include optional `schema_overrides: dict \| None = None` parameter and forward it to `ToolCallingWrapper`. |
| Tool execution `nemo_skills/inference/model/tool_call.py` | Enhanced `ToolCallingWrapper.__init__` to accept and store `schema_overrides` via `load_schema_overrides`; updated `generate_async` to apply overrides when listing tools and remap tool calls back to original names/arguments before execution; added error handling for tool argument JSON parsing failures. |
| MCP adapter utilities `nemo_skills/mcp/adapters.py` | Added three new utility functions: `load_schema_overrides`, `apply_schema_overrides`, `remap_tool_call`. Modified `format_tool_list_by_endpoint_type` to accept optional `schema_overrides` parameter, transform tool definitions, accumulate remapping metadata, and return tuple of `(formatted_tools, mappings)`. Added imports: `copy`, `typing` module types, `DictConfig`, `OmegaConf`. |
| Test coverage `tests/test_mcp_clients.py` | Added two new async test methods: `test_tool_manager_with_schema_overrides` (validates override integration and parameter remapping) and `test_schema_override_nonexistent_param_fails` (validates error handling for invalid overrides). |

Sequence Diagram

sequenceDiagram
    actor User
    participant Config as GenerateSolutionsConfig
    participant Model as ToolCallingWrapper
    participant Adapter as MCP Adapter
    participant LLM as Language Model
    participant Tool as Tool Execution

    User->>Config: Provide schema_overrides
    Config->>Model: Pass schema_overrides to ToolCallingWrapper
    Model->>Adapter: Call format_tool_list_by_endpoint_type(tools, schema_overrides)
    Adapter->>Adapter: Apply schema_overrides to each tool<br/>(rename params, update required fields)
    Adapter->>Adapter: Build parameter remapping metadata
    Adapter-->>Model: Return (formatted_tools, mappings)
    Model->>LLM: Send formatted_tools with renamed parameters
    LLM-->>Model: Return tool_call (with renamed param names)
    Model->>Adapter: Call remap_tool_call(tool_name, args, mappings)
    Adapter->>Adapter: Restore original tool name<br/>and parameter names
    Adapter-->>Model: Return (original_tool_name, original_args)
    Model->>Tool: Execute tool with original names/args

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring attention:

- Schema transformation logic in apply_schema_overrides: verify parameter renaming and required-field updates are correctly applied and do not lose data
- Remapping correctness in remap_tool_call: ensure bidirectional mapping between overridden and original names/arguments is accurate and handles edge cases
- Integration points across tool_call.py: confirm that load_schema_overrides and remapping are called in the correct sequence during generation and tool execution
- Type consistency: validate that OmegaConf/Hydra config structures are properly normalized to plain dicts throughout the pipeline
- Test coverage: verify test assertions thoroughly validate both the happy path and error scenarios for schema override application

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 46.15%, which is below the required threshold of 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Schema overrides for tool-calling' directly describes the main feature added in this PR: support for schema overrides in tool-calling infrastructure. |


📥 Commits

Reviewing files that changed from the base of the PR and between e09953e and 3f07eca.

📒 Files selected for processing (5)

nemo_skills/inference/generate.py (2 hunks)
nemo_skills/inference/model/__init__.py (2 hunks)
nemo_skills/inference/model/tool_call.py (5 hunks)
nemo_skills/mcp/adapters.py (3 hunks)
tests/test_mcp_clients.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (4)

tests/test_mcp_clients.py (3)
- nemo_skills/inference/model/base.py (1): EndpointType (38-41)
- nemo_skills/mcp/adapters.py (3): format_tool_list_by_endpoint_type (139-195), load_schema_overrides (54-89), apply_schema_overrides (92-128)
- nemo_skills/mcp/tool_manager.py (2): ToolManager (67-168), list_all_tools (115-155)

nemo_skills/mcp/adapters.py (2)
- nemo_skills/inference/chat_interface/core.py (2): cfg (181-182), get (136-151)
- nemo_skills/inference/model/base.py (1): EndpointType (38-41)

nemo_skills/inference/generate.py (1)
- nemo_skills/inference/chat_interface/core.py (1): cfg (181-182)

nemo_skills/inference/model/tool_call.py (1)
- nemo_skills/mcp/adapters.py (4): load_schema_overrides (54-89), remap_tool_call (131-136), get_tool_details_by_endpoint_type (233-243), format_tool_list_by_endpoint_type (139-195)

🪛 Ruff (0.14.8)

nemo_skills/mcp/adapters.py

- 72-72: Prefer TypeError exception for invalid type (TRY004)
- 72-72: Avoid specifying long messages outside the exception class (TRY003)
- 77-77: Prefer TypeError exception for invalid type (TRY004)
- 77-77: Avoid specifying long messages outside the exception class (TRY003)
- 82-82: Prefer TypeError exception for invalid type (TRY004)
- 82-82: Avoid specifying long messages outside the exception class (TRY003)
- 113-113: Avoid specifying long messages outside the exception class (TRY003)
- 115-115: Prefer TypeError exception for invalid type (TRY004)
- 115-115: Avoid specifying long messages outside the exception class (TRY003)
- 193-193: Avoid specifying long messages outside the exception class (TRY003)

🔇 Additional comments (14)

nemo_skills/mcp/adapters.py (4)

54-89: LGTM!

The function properly normalizes schema overrides with thorough validation at each level (provider, tool, and config). The handling of Hydra/OmegaConf conversion is correct.

Note: Static analysis suggests using TypeError instead of ValueError, but ValueError is the appropriate choice here for configuration validation errors.

92-128: LGTM!

The override application logic is well-structured:

- Deep copy prevents unintended mutations
- Validates that parameters exist before applying overrides (overrides for non-existent parameters are rejected)
- Correctly builds the parameter mapping for remapping tool calls
- Properly updates the required fields with renamed parameter names

131-136: LGTM!

The remapping logic is correct. It properly maps both tool names and parameter names from the model-facing schema back to the original tool schema for execution. The fallback behavior (using the original value if no mapping exists) is appropriate for tools without overrides.
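The fallback behavior described here can be sketched as a pair of dictionary lookups that default to the incoming value. The layout of `MAPPINGS` (a `tool_names` table plus per-tool `parameters` tables keyed by the model-facing tool name) is an assumption made for illustration, and `sketch_remap_tool_call` and the `search` tool are hypothetical names rather than the PR's code.

```python
from typing import Dict, Tuple

# Assumed mapping layout: model-facing name -> original name.
MAPPINGS = {
    "tool_names": {"python": "stateful_python_code_exec"},
    "parameters": {"python": {"script": "code"}},
}


def sketch_remap_tool_call(tool_name: str, args: Dict[str, object],
                           mappings: dict) -> Tuple[str, Dict[str, object]]:
    """Hypothetical helper: map model-facing names back to the original names."""
    # Fall back to the incoming name when the tool has no override.
    original_name = mappings.get("tool_names", {}).get(tool_name, tool_name)
    param_map = mappings.get("parameters", {}).get(tool_name, {})
    # Likewise, parameters without a mapping pass through unchanged.
    original_args = {param_map.get(key, key): value for key, value in args.items()}
    return original_name, original_args


remapped = sketch_remap_tool_call("python", {"script": "print(1)"}, MAPPINGS)
# 'search' stands in for any tool that has no override configured.
untouched = sketch_remap_tool_call("search", {"query": "x"}, MAPPINGS)
print(remapped)
print(untouched)
```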

139-195: LGTM!

The updated function correctly:

- Applies schema overrides to each tool before formatting
- Builds comprehensive mappings for both tool names and parameters
- Uses the transformed tools (with overrides applied) for endpoint-specific formatting
- Returns both the formatted tools and mappings needed for remapping during execution

The signature change (adding schema_overrides parameter and returning a tuple) is a breaking API change, but this is intentional and aligns with the PR objectives.

nemo_skills/inference/model/__init__.py (1)

125-125: LGTM!

Clean parameter propagation. The schema_overrides parameter is properly added to the function signature and forwarded to ToolCallingWrapper.

Also applies to: 135-135

nemo_skills/inference/generate.py (2)

180-198: Excellent documentation!

The schema_overrides field is well-documented with clear examples showing:

- The configuration structure (ProviderClassName → tool_name → overrides)
- A concrete YAML example demonstrating parameter renaming
- Hydra CLI usage instructions

This will make the feature easy to adopt.

409-409: LGTM!

The schema_overrides configuration is properly propagated through the tool-calling model setup path.

tests/test_mcp_clients.py (2)

561-588: Comprehensive integration test!

This test validates the complete schema override flow:

- Tool and parameter renaming
- Mapping generation (both tool_names and parameters)
- Integration with ToolManager and format_tool_list_by_endpoint_type

The assertions correctly verify that:

- The renamed tool appears in the formatted output
- Parameter names are transformed in the schema
- Mappings are accurate for remapping during execution

591-607: Good error-path coverage!

This test ensures that attempting to override a non-existent parameter fails early with a clear error message. The docstring helpfully notes that this also covers the case where hide_args removes a parameter before overrides are applied.

nemo_skills/inference/model/tool_call.py (5)

51-51: LGTM!

The schema_overrides parameter is properly added to the constructor and initialized correctly. The schema_mappings dictionary is set up to store mappings populated when tools are listed.

Also applies to: 63-64

74-79: Improved error handling!

Adding explicit error handling for JSON parsing failures with logging makes debugging easier when malformed tool arguments are provided.
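A defensive parse of model-supplied arguments might look like the sketch below. It only illustrates the pattern of catching the decode error and logging it. `parse_tool_arguments` is a hypothetical name, and returning `None` on failure is an assumption, since the review does not say what the wrapper does after logging.

```python
import json
import logging
from typing import Optional

LOG = logging.getLogger("tool_call_sketch")


def parse_tool_arguments(raw: str) -> Optional[dict]:
    """Hypothetical helper: parse tool-call arguments, returning None if malformed."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Log the raw payload so malformed model output is easy to spot.
        LOG.warning("Could not parse tool arguments %r: %s", raw, exc)
        return None
    if not isinstance(parsed, dict):
        LOG.warning("Tool arguments must be a JSON object, got %s", type(parsed).__name__)
        return None
    return parsed


good = parse_tool_arguments('{"script": "print(1)"}')
bad = parse_tool_arguments('{"script": "print(1)"')  # missing closing brace
print(good, bad)
```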

82-89: Correct remapping implementation!

The flow is correct:

- Parse tool name and arguments from the model's response (model-facing names)
- Remap to original tool and parameter names using remap_tool_call
- Execute the tool using the original name and remapped arguments

This ensures the model sees the customized schema while the tool receives the expected original parameter names.

121-123: LGTM!

The unpacking correctly captures both the formatted tools (with overrides applied) and the mappings needed for remapping tool calls during execution. The mappings are stored in self.schema_mappings for use in _execute_tool_call.

170-170: Good transparency!

Adding the tools to the result provides visibility into the actual schema sent to the model (with overrides applied), which is helpful for debugging and understanding what the model saw.


wedu-nvidia

Thanks, it works well for me; I tested and verified it.


gwarmstrong added 17 commits December 15, 2025 14:58

- c9c884d: schema rename wip
- b424fc7: WIP update schema overrides
- 2ccee0c: schema rename wip
- 605a180: update generate defaults
- fb860fe: FIX parameter order
- c50c832: update schema test
- 4211032: WIP shorten schema overrides
- b4f57b5: TST update test
- ee641f3: make adapter modification more concise
- 483d212: use schema mapping helpers
- 83f58a2: convert tool schema conversion to functions
- 27c6122: Small fix
- 3dbbfe6: move files to adapters.py
- 8164c99: TST minimal but sufficient tests
- 24fbc9d: maint: add back comments
- c069cd1: MAINT update schema overrides
- 3f07eca: MAINT dump tools to conversation

Each of these commits carries: Signed-off-by: George Armstrong <georgea@nvidia.com>

Merge branch 'main' into georgea/override-tool-schema

88b0c21

gwarmstrong requested a review from wedu-nvidia December 16, 2025 21:51

Merge branch 'main' into georgea/override-tool-schema

266b67f

wedu-nvidia approved these changes Dec 16, 2025

gwarmstrong merged commit b7116c4 into main Dec 16, 2025
5 checks passed

gwarmstrong deleted the georgea/override-tool-schema branch December 16, 2025 22:13

wasiahmad pushed a commit that referenced this pull request Dec 19, 2025

Schema overrides for tool-calling (#1118)

a7f1025

Signed-off-by: George Armstrong <georgea@nvidia.com>

wasiahmad pushed a commit that referenced this pull request Dec 19, 2025

Schema overrides for tool-calling (#1118)

ad51e99

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

blahblahasdf pushed a commit to blahblahasdf/Skills that referenced this pull request Jan 8, 2026

Schema overrides for tool-calling (NVIDIA-NeMo#1118)

1510216

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dlord <dlord@nvidia.com>

hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026

Schema overrides for tool-calling (#1118)

7965d51

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>

wasiahmad pushed a commit that referenced this pull request Feb 4, 2026

Schema overrides for tool-calling (#1118)

9b3c571

Signed-off-by: George Armstrong <georgea@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

Schema overrides for tool-calling (#1118)

88893bd

Signed-off-by: George Armstrong <georgea@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

Schema overrides for tool-calling (#1118)

eb31507

Signed-off-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>

gwarmstrong merged 19 commits into main from georgea/override-tool-schema
