
feat: add sagemaker_nova provider for Amazon Nova models on SageMaker#21542

Merged
4 commits merged into BerriAI:main from ryanh-ai:feat/nova-sagemaker
Mar 14, 2026

Conversation

@ryanh-ai
Contributor

@ryanh-ai ryanh-ai commented Feb 19, 2026

Relevant issues

Fixes #21541

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • 44 or fewer passing tests: unstable; be careful with your merges and assess the risk.

Type

🆕 New Feature

Changes

Adds a dedicated sagemaker_nova provider for Amazon Nova models deployed on SageMaker Inference endpoints.

Key changes:

  • New SagemakerNovaConfig class in litellm/llms/sagemaker/nova/transformation.py — extends SagemakerChatConfig with:
    • Nova-specific parameters: top_k, reasoning_effort, allowed_token_ids, truncate_prompt_tokens
    • stream: true in request body for streaming
    • Strips model field from request body (not accepted by Nova endpoints)
  • Provider registration in __init__.py, constants.py, utils.py, main.py
  • Documentation added to docs/my-website/docs/providers/aws_sagemaker.md
  • Unit tests (21 tests) in tests/litellm/llms/sagemaker/
  • Integration tests in tests/llm_translation/
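The request-body behavior listed above (force stream:true, strip the model field) can be sketched as follows. This is a hypothetical illustration of the described behavior, not the actual SagemakerNovaConfig source:

```python
from typing import Any, Dict, List


def transform_nova_request(
    model: str,
    messages: List[dict],
    optional_params: Dict[str, Any],
) -> Dict[str, Any]:
    """Sketch: build an OpenAI-style body, then apply the Nova-specific rules."""
    body: Dict[str, Any] = {"model": model, "messages": messages, **optional_params}
    if body.get("stream"):
        body["stream"] = True  # Nova requires stream:true in the request body itself
    body.pop("model", None)  # Nova endpoints reject a "model" field
    return body
```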

Usage: model="sagemaker_nova/<endpoint-name>"
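A minimal usage sketch, assuming AWS credentials are configured and using a placeholder endpoint name ("my-nova-endpoint" is illustrative, not from the PR):

```python
def nova_model(endpoint_name: str) -> str:
    # LiteLLM routes on the "sagemaker_nova/" model prefix added in this PR
    return f"sagemaker_nova/{endpoint_name}"

# With AWS credentials configured, a call would look like (not executed here):
#
#   import litellm
#   response = litellm.completion(
#       model=nova_model("my-nova-endpoint"),
#       messages=[{"role": "user", "content": "Hello, Nova!"}],
#       top_k=10,  # one of the Nova-specific params this provider adds
#   )
```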


@ryanh-ai
Copy link
Copy Markdown
Contributor Author

ryanh-ai commented Feb 19, 2026

@greptileai

@ryanh-ai ryanh-ai marked this pull request as draft February 19, 2026 06:32
@greptile-apps
Contributor

greptile-apps bot commented Feb 19, 2026

Greptile Summary

This PR adds a new sagemaker_nova provider for Amazon Nova models (Nova Micro, Nova Lite, Nova 2 Lite) deployed on SageMaker Inference real-time endpoints. The implementation is clean and well-tested: a new SagemakerNovaConfig class extends the existing SagemakerChatConfig, the provider is correctly registered across LlmProviders, constants.py, utils.py, __init__.py, and the lazy-import registry, and the routing in main.py is minimal. There are 21 unit tests and a set of skip-by-default integration tests.

Key findings:

  • Missing sagemaker_nova in exception mapping (litellm/litellm_core_utils/exception_mapping_utils.py): The elif block that translates AWS-specific error strings (e.g. "Unable to locate credentials", context-window exceeded) into typed LiteLLM exceptions only handles "sagemaker" and "sagemaker_chat". Because "sagemaker_nova" was not added to this condition, credential and context-window errors from Nova endpoints will surface as unmapped raw exceptions rather than BadRequestError / ContextWindowExceededError.
  • Unused imports in both new test files: import io and import json in tests/local_testing/test_sagemaker_nova_integration.py; import json in tests/test_litellm/llms/sagemaker/test_sagemaker_nova_transformation.py.

Confidence Score: 3/5

  • Safe to merge with minor fixes — the exception mapping gap will cause poorly-typed errors for Nova users but won't break existing functionality.
  • The provider integration is architecturally sound and the core transformation logic is correct. The score is lowered primarily because exception_mapping_utils.py was not updated to include sagemaker_nova, meaning credential and context-length errors won't be mapped to proper LiteLLM exception types for Nova callers. The unused imports are minor but indicate incomplete cleanup.
  • litellm/main.py (missing companion change to litellm/litellm_core_utils/exception_mapping_utils.py); test files have unused imports to clean up.

Important Files Changed

Filename Overview
litellm/llms/sagemaker/nova/transformation.py New SagemakerNovaConfig class extending SagemakerChatConfig; adds Nova-specific params (top_k, reasoning_effort, allowed_token_ids, truncate_prompt_tokens), strips model from request body, and sets supports_stream_param_in_request_body=True.
litellm/llms/sagemaker/chat/transformation.py Minor change: CustomStreamWrapper now receives the actual custom_llm_provider string (instead of hardcoded "sagemaker_chat"), and get_async_custom_stream_wrapper resolves the provider to a LlmProviders enum with a fallback to SAGEMAKER_CHAT. Backwards-compatible for sagemaker_chat.
litellm/main.py Routes sagemaker_nova through base_llm_http_handler alongside sagemaker_chat and correctly passes custom_llm_provider through. However, exception_mapping_utils.py was not updated — sagemaker_nova errors won't be mapped to proper LiteLLM exception types.
litellm/types/utils.py Adds SAGEMAKER_NOVA = "sagemaker_nova" to LlmProviders enum. Clean and complete.
litellm/utils.py Registers SagemakerNovaConfig in ProviderConfigManager for SAGEMAKER_NOVA. Clean addition following the existing pattern.
litellm/constants.py Adds "sagemaker_nova" to LITELLM_CHAT_PROVIDERS. Straightforward and correct.
tests/local_testing/test_sagemaker_nova_integration.py Integration tests gated behind env-var SAGEMAKER_NOVA_ENDPOINT, correctly placed in tests/local_testing/. Unused io and json imports remain.

Sequence Diagram

sequenceDiagram
    participant User
    participant main.py
    participant BaseLLMHTTPHandler
    participant ProviderConfigManager
    participant SagemakerNovaConfig
    participant SageMakerEndpoint

    User->>main.py: litellm.completion(model="sagemaker_nova/my-endpoint", ...)
    main.py->>main.py: custom_llm_provider = "sagemaker_nova"
    main.py->>BaseLLMHTTPHandler: completion(custom_llm_provider="sagemaker_nova")
    BaseLLMHTTPHandler->>ProviderConfigManager: get_provider_config("sagemaker_nova")
    ProviderConfigManager-->>BaseLLMHTTPHandler: SagemakerNovaConfig()
    BaseLLMHTTPHandler->>SagemakerNovaConfig: get_complete_url(stream=False/True)
    SagemakerNovaConfig-->>BaseLLMHTTPHandler: https://runtime.sagemaker.{region}.amazonaws.com/endpoints/{model}/invocations[-response-stream]
    BaseLLMHTTPHandler->>SagemakerNovaConfig: transform_request(model, messages, optional_params)
    Note over SagemakerNovaConfig: Calls super().transform_request()<br/>then strips "model" from body
    SagemakerNovaConfig-->>BaseLLMHTTPHandler: request_body (no "model" field)
    BaseLLMHTTPHandler->>SagemakerNovaConfig: sign_request(headers, request_data, api_base)
    SagemakerNovaConfig-->>BaseLLMHTTPHandler: signed_headers, signed_body
    BaseLLMHTTPHandler->>SageMakerEndpoint: POST /invocations (signed)
    SageMakerEndpoint-->>BaseLLMHTTPHandler: OpenAI-compatible response
    BaseLLMHTTPHandler-->>User: ModelResponse
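The endpoint URL shape in the diagram above can be sketched as a small helper. The format is assumed from the diagram, not taken from the PR's source:

```python
def invocation_url(region: str, endpoint: str, stream: bool = False) -> str:
    """Build the SageMaker runtime invocation URL (format assumed from the diagram)."""
    suffix = "/invocations-response-stream" if stream else "/invocations"
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint}{suffix}"
    )
```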

Last reviewed commit: 868f9d0

Contributor

@greptile-apps greptile-apps bot left a comment


14 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ryanh-ai ryanh-ai force-pushed the feat/nova-sagemaker branch from 2728e5c to e2d0b2a on February 19, 2026 06:36
Add support for custom/fine-tuned Amazon Nova models (Nova Micro, Nova Lite,
Nova 2 Lite) deployed on SageMaker Inference real-time endpoints.

Nova uses OpenAI-compatible request/response format with additional
Nova-specific parameters (top_k, reasoning_effort, allowed_token_ids,
truncate_prompt_tokens) and requires stream:true in the request body.
Nova endpoints also reject 'model' in the request body.

Changes:
- New provider: sagemaker_nova/<endpoint-name>
- SagemakerNovaConfig inherits from SagemakerChatConfig
- Override transform_request to strip 'model' from request body
- Override supports_stream_param_in_request_body (True for Nova)
- Extend get_supported_openai_params with Nova-specific params
- Refactored SagemakerChatConfig to use custom_llm_provider param
  instead of hardcoded strings (backwards-compatible)
- Consolidated main.py routing for sagemaker_chat and sagemaker_nova
- 22 unit tests + 9 integration tests (skip-gated)
- Documentation with SDK, streaming, multimodal, and proxy examples
- All tests verified against live SageMaker Nova endpoint
@ryanh-ai
Contributor Author

@greptileai

Contributor

@greptile-apps greptile-apps bot left a comment


12 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@ryanh-ai ryanh-ai marked this pull request as ready for review February 19, 2026 06:48
@ryanh-ai
Contributor Author

Addressed Greptile feedback; looking forward to input!

Contributor

@greptile-apps greptile-apps bot left a comment


12 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ryanh-ai
Contributor Author

@Sameerlite I think you reviewed my related PR, let me know if you see anything I should address.

@ryanh-ai
Contributor Author

@Sameerlite checking back here to see if there is any feedback that needs to be addressed. I believe the failing CI tests aren't something I can fix, since they don't seem related to my changes, but perhaps I'm incorrect about that.

@ryanh-ai
Contributor Author

ryanh-ai commented Mar 3, 2026

@krrishdholakia I see you merged the other recent Bedrock-related PR. I took another look at this to see if the new class was really necessary, and it seems it likely is (or some refactoring/branching/conditionals within the existing sagemaker_chat class would be required).

return request_body


sagemaker_nova_config = SagemakerNovaConfig()


why create this here?

The sagemaker_nova_config singleton was never imported or used — the
ProviderConfigManager creates its own instance via the lambda registered
in utils.py. Removing this leftover boilerplate.
@ryanh-ai
Contributor Author

Good catch on the module-level sagemaker_nova_config instance — it was leftover boilerplate. Nothing ever imported or used it; the ProviderConfigManager creates its own instance via the lambda registered in utils.py. Removed in cd4248b.

These tests require a live SageMaker Nova endpoint and AWS credentials.
They are skipped by default — run manually with:

pytest tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py -v --no-header -rN
Contributor


Wrong file path in docstring

The pytest command documented in this docstring references the old/incorrect path tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py, but the file now lives at tests/local_testing/test_sagemaker_nova_integration.py. Developers following these instructions will get a "file not found" error.

Suggested change
pytest tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py -v --no-header -rN
pytest tests/local_testing/test_sagemaker_nova_integration.py -v --no-header -rN

Comment on lines +37 to +44
nova_params = [
    "top_k",
    "reasoning_effort",
    "allowed_token_ids",
    "truncate_prompt_tokens",
]
for p in nova_params:
    if p not in params:
Contributor


reasoning_effort hardcoded as supported for all Nova endpoints

Per the PR description and AWS docs, reasoning_effort is only supported by Nova 2 Lite custom models — not Nova Micro or Nova Lite. However, it is unconditionally added to get_supported_openai_params for every sagemaker_nova endpoint.

This violates the project's custom instruction (rule 2605a1b1): model-specific capability flags should live in model_prices_and_context_window.json and be read via get_model_info, so that support can be updated without a LiteLLM code release.

Because SageMaker endpoint names are opaque (just a string like "my-nova-endpoint"), there is no static entry in the pricing file for these models. The recommended pattern is to rely on a capability flag (supports_reasoning) checked via get_model_info, or to document that passing reasoning_effort to a non-Nova-2-Lite endpoint will result in an API error, rather than advertising it as universally supported.

At minimum, a docstring clarifying the model restriction would prevent confusion, but the preferred fix per project policy is to make this capability opt-in or gate it on a runtime check.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)
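The capability-gated approach the reviewer suggests could look roughly like this. Helper names are assumed for illustration; the supports_reasoning flag would come from get_model_info / the pricing file rather than being hardcoded:

```python
from typing import List


def supported_nova_params(base_params: List[str], supports_reasoning: bool) -> List[str]:
    """Sketch: advertise reasoning_effort only when the model supports it."""
    params = base_params + ["top_k", "allowed_token_ids", "truncate_prompt_tokens"]
    if supports_reasoning:  # e.g. get_model_info(...)["supports_reasoning"]
        params.append("reasoning_effort")  # Nova 2 Lite only, per the review
    return params
```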

@ghost ghost merged commit 374c345 into BerriAI:main Mar 14, 2026
33 of 36 checks passed
return _model_response
response = _model_response
elif custom_llm_provider == "sagemaker_chat":
elif custom_llm_provider in ("sagemaker_chat", "sagemaker_nova"):
Contributor


sagemaker_nova not added to exception mapping

litellm/litellm_core_utils/exception_mapping_utils.py maps AWS-specific error strings (e.g. "Unable to locate credentials", context-window exceeded) to proper LiteLLM exception types, but its condition only covers "sagemaker" and "sagemaker_chat":

elif (
    custom_llm_provider == "sagemaker"
    or custom_llm_provider == "sagemaker_chat"
):

Because "sagemaker_nova" is missing, credential errors and context-window exceeded errors from a Nova endpoint will not be translated into BadRequestError / ContextWindowExceededError — they'll surface as raw, unmapped exceptions instead. The fix is to add sagemaker_nova to that condition (in exception_mapping_utils.py):

elif (
    custom_llm_provider == "sagemaker"
    or custom_llm_provider == "sagemaker_chat"
    or custom_llm_provider == "sagemaker_nova"
):

Comment on lines +16 to +17
import io
import json
Contributor


Unused imports io and json

Neither io nor json are referenced anywhere in this test file. They can be removed to keep the imports clean.

Suggested change
import io
import json
import os

Unit tests for SageMaker Nova transformation config.
"""

import json
Contributor


Unused json import

json is imported but never referenced in any test in this file. It can be safely removed.

Suggested change
import json
import pytest

Successfully merging this pull request may close these issues:

[Feature]: Add sagemaker_nova provider for Amazon Nova models on SageMaker endpoints