feat: add sagemaker_nova provider for Amazon Nova models on SageMaker (#21542)

4 commits merged into BerriAI:main
Conversation
Greptile Summary

This PR adds a new sagemaker_nova provider. Key findings:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/sagemaker/nova/transformation.py | New SagemakerNovaConfig class extending SagemakerChatConfig; adds Nova-specific params (top_k, reasoning_effort, allowed_token_ids, truncate_prompt_tokens), strips model from request body, and sets supports_stream_param_in_request_body=True. |
| litellm/llms/sagemaker/chat/transformation.py | Minor change: CustomStreamWrapper now receives the actual custom_llm_provider string (instead of hardcoded "sagemaker_chat"), and get_async_custom_stream_wrapper resolves the provider to a LlmProviders enum with a fallback to SAGEMAKER_CHAT. Backwards-compatible for sagemaker_chat. |
| litellm/main.py | Routes sagemaker_nova through base_llm_http_handler alongside sagemaker_chat and correctly passes custom_llm_provider through. However, exception_mapping_utils.py was not updated — sagemaker_nova errors won't be mapped to proper LiteLLM exception types. |
| litellm/types/utils.py | Adds SAGEMAKER_NOVA = "sagemaker_nova" to LlmProviders enum. Clean and complete. |
| litellm/utils.py | Registers SagemakerNovaConfig in ProviderConfigManager for SAGEMAKER_NOVA. Clean addition following the existing pattern. |
| litellm/constants.py | Adds "sagemaker_nova" to LITELLM_CHAT_PROVIDERS. Straightforward and correct. |
| tests/local_testing/test_sagemaker_nova_integration.py | Integration tests gated behind env-var SAGEMAKER_NOVA_ENDPOINT, correctly placed in tests/local_testing/. Unused io and json imports remain. |
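Based on the overview above, the new config's behavior can be sketched in isolation. This is a minimal sketch with a stubbed base class, not litellm's actual code; method names follow the conventions described in the table:

```python
class SagemakerChatConfigStub:
    """Stand-in for litellm's SagemakerChatConfig (stubbed for illustration)."""

    def get_supported_openai_params(self, model: str) -> list:
        return ["temperature", "max_tokens", "stream"]

    def transform_request(self, model: str, messages: list, optional_params: dict) -> dict:
        return {"model": model, "messages": messages, **optional_params}


class SagemakerNovaConfigSketch(SagemakerChatConfigStub):
    """Sketch of the described SagemakerNovaConfig behavior."""

    # Nova endpoints expect stream:true inside the request body.
    supports_stream_param_in_request_body = True

    def get_supported_openai_params(self, model: str) -> list:
        # Extend the base params with the Nova-specific ones named in the PR.
        return super().get_supported_openai_params(model) + [
            "top_k",
            "reasoning_effort",
            "allowed_token_ids",
            "truncate_prompt_tokens",
        ]

    def transform_request(self, model: str, messages: list, optional_params: dict) -> dict:
        body = super().transform_request(model, messages, optional_params)
        body.pop("model", None)  # Nova endpoints reject a "model" field in the body
        return body
```

The subclass-plus-override shape mirrors how the PR layers Nova behavior on top of the existing sagemaker_chat transformation.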
Sequence Diagram
sequenceDiagram
participant User
participant main.py
participant BaseLLMHTTPHandler
participant ProviderConfigManager
participant SagemakerNovaConfig
participant SageMakerEndpoint
User->>main.py: litellm.completion(model="sagemaker_nova/my-endpoint", ...)
main.py->>main.py: custom_llm_provider = "sagemaker_nova"
main.py->>BaseLLMHTTPHandler: completion(custom_llm_provider="sagemaker_nova")
BaseLLMHTTPHandler->>ProviderConfigManager: get_provider_config("sagemaker_nova")
ProviderConfigManager-->>BaseLLMHTTPHandler: SagemakerNovaConfig()
BaseLLMHTTPHandler->>SagemakerNovaConfig: get_complete_url(stream=False/True)
SagemakerNovaConfig-->>BaseLLMHTTPHandler: https://runtime.sagemaker.{region}.amazonaws.com/endpoints/{model}/invocations[-response-stream]
BaseLLMHTTPHandler->>SagemakerNovaConfig: transform_request(model, messages, optional_params)
Note over SagemakerNovaConfig: Calls super().transform_request()<br/>then strips "model" from body
SagemakerNovaConfig-->>BaseLLMHTTPHandler: request_body (no "model" field)
BaseLLMHTTPHandler->>SagemakerNovaConfig: sign_request(headers, request_data, api_base)
SagemakerNovaConfig-->>BaseLLMHTTPHandler: signed_headers, signed_body
BaseLLMHTTPHandler->>SageMakerEndpoint: POST /invocations (signed)
SageMakerEndpoint-->>BaseLLMHTTPHandler: OpenAI-compatible response
BaseLLMHTTPHandler-->>User: ModelResponse
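The invocation URL shown in the diagram can be sketched as a small helper. The URL shape is inferred from the diagram above, not taken from the actual get_complete_url implementation:

```python
def nova_invocation_url(region: str, endpoint_name: str, stream: bool = False) -> str:
    # Build the SageMaker runtime invocation URL from the diagram.
    # Streaming requests hit the "-response-stream" variant of the path.
    suffix = "/invocations-response-stream" if stream else "/invocations"
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}{suffix}"
    )
```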
Last reviewed commit: 868f9d0
2728e5c to e2d0b2a (force-pushed)
Add support for custom/fine-tuned Amazon Nova models (Nova Micro, Nova Lite, Nova 2 Lite) deployed on SageMaker Inference real-time endpoints. Nova uses an OpenAI-compatible request/response format with additional Nova-specific parameters (top_k, reasoning_effort, allowed_token_ids, truncate_prompt_tokens) and requires stream: true in the request body. Nova endpoints also reject 'model' in the request body.

Changes:
- New provider: sagemaker_nova/<endpoint-name>
- SagemakerNovaConfig inherits from SagemakerChatConfig
- Override transform_request to strip 'model' from the request body
- Override supports_stream_param_in_request_body (True for Nova)
- Extend get_supported_openai_params with Nova-specific params
- Refactored SagemakerChatConfig to use a custom_llm_provider param instead of hardcoded strings (backwards-compatible)
- Consolidated main.py routing for sagemaker_chat and sagemaker_nova
- 22 unit tests + 9 integration tests (skip-gated)
- Documentation with SDK, streaming, multimodal, and proxy examples
- All tests verified against a live SageMaker Nova endpoint
e2d0b2a to aceb6c4 (force-pushed)
Addressed Greptile feedback, look forward to input!

@Sameerlite I think you reviewed my related PR, let me know if you see anything I should address.

@Sameerlite Checking back here to see if there is any feedback that needs to be addressed. I believe the failing CI tests are not something I can address, as they don't seem related to my changes, but perhaps I am incorrect about that.

@krrishdholakia I see you merged the other recent Bedrock-related PR. I took another look at this to see if the new class was really necessary, and it seems it likely is (or some refactoring/branching/conditionals within the existing sagemaker_chat class would be required).
| return request_body | ||
| sagemaker_nova_config = SagemakerNovaConfig() |
The sagemaker_nova_config singleton was never imported or used — the ProviderConfigManager creates its own instance via the lambda registered in utils.py. Removing this leftover boilerplate.
Good catch on the module-level sagemaker_nova_config singleton.
| These tests require a live SageMaker Nova endpoint and AWS credentials. | ||
| They are skipped by default — run manually with: | ||
| pytest tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py -v --no-header -rN |
Wrong file path in docstring
The pytest command documented in this docstring references the old/incorrect path tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py, but the file now lives at tests/local_testing/test_sagemaker_nova_integration.py. Developers following these instructions will get a "file not found" error.
| pytest tests/test_litellm/llms/sagemaker/test_sagemaker_nova_integration.py -v --no-header -rN | |
| pytest tests/local_testing/test_sagemaker_nova_integration.py -v --no-header -rN |
| nova_params = [ | ||
| "top_k", | ||
| "reasoning_effort", | ||
| "allowed_token_ids", | ||
| "truncate_prompt_tokens", | ||
| ] | ||
| for p in nova_params: | ||
| if p not in params: |
reasoning_effort hardcoded as supported for all Nova endpoints
Per the PR description and AWS docs, reasoning_effort is only supported by Nova 2 Lite custom models — not Nova Micro or Nova Lite. However, it is unconditionally added to get_supported_openai_params for every sagemaker_nova endpoint.
This violates the project's custom instruction (rule 2605a1b1): model-specific capability flags should live in model_prices_and_context_window.json and be read via get_model_info, so that support can be updated without a LiteLLM code release.
Because SageMaker endpoint names are opaque (just a string like "my-nova-endpoint"), there is no static entry in the pricing file for these models. The recommended pattern is to rely on a capability flag (supports_reasoning) checked via get_model_info, or to document that passing reasoning_effort to a non-Nova-2-Lite endpoint will result in an API error, rather than advertising it as universally supported.
At minimum, a docstring clarifying the model restriction would prevent confusion, but the preferred fix per project policy is to make this capability opt-in or gate it on a runtime check.
Rule Used: "Do not hardcode model-specific flags in the ..."
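The suggested runtime gate can be sketched as follows. Names here are illustrative; in litellm the capability flag would come from get_model_info (e.g. a supports_reasoning entry), not a plain boolean argument:

```python
def supported_nova_params(base_params: list, supports_reasoning: bool) -> list:
    # Advertise the universally supported Nova params unconditionally, but
    # gate reasoning_effort on a capability flag so Nova Micro / Nova Lite
    # endpoints don't claim support they lack.
    params = list(base_params) + [
        "top_k",
        "allowed_token_ids",
        "truncate_prompt_tokens",
    ]
    if supports_reasoning:
        params.append("reasoning_effort")
    return params
```

This keeps the capability decision at runtime, so support can change without a code release, which is the intent of the project rule cited above.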
| return _model_response | ||
| response = _model_response | ||
| elif custom_llm_provider == "sagemaker_chat": | ||
| elif custom_llm_provider in ("sagemaker_chat", "sagemaker_nova"): |
sagemaker_nova not added to exception mapping
litellm/litellm_core_utils/exception_mapping_utils.py maps AWS-specific error strings (e.g. "Unable to locate credentials", context-window exceeded) to proper LiteLLM exception types, but its condition only covers "sagemaker" and "sagemaker_chat":
```python
elif (
    custom_llm_provider == "sagemaker"
    or custom_llm_provider == "sagemaker_chat"
):
```

Because "sagemaker_nova" is missing, credential errors and context-window-exceeded errors from a Nova endpoint will not be translated into BadRequestError / ContextWindowExceededError — they'll surface as raw, unmapped exceptions instead. The fix is to add sagemaker_nova to that condition (in exception_mapping_utils.py):
```python
elif (
    custom_llm_provider == "sagemaker"
    or custom_llm_provider == "sagemaker_chat"
    or custom_llm_provider == "sagemaker_nova"
):
```

| import io | ||
| import json |
Unused imports io and json
Neither io nor json are referenced anywhere in this test file. They can be removed to keep the imports clean.
| import io | |
| import json | |
| import os |
| Unit tests for SageMaker Nova transformation config. | ||
| """ | ||
| import json |
Unused json import
json is imported but never referenced in any test in this file. It can be safely removed.
| import json | |
| import pytest |
Relevant issues
Fixes #21541
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Added at least 1 test in the tests/litellm/ directory (a hard requirement - see details)
- Ran make test-unit
- Tagged @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link: https://github.com/BerriAI/litellm/actions/runs/22171339405
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
Changes
Adds a dedicated sagemaker_nova provider for Amazon Nova models deployed on SageMaker Inference endpoints.

Key changes:
- New SagemakerNovaConfig class in litellm/llms/sagemaker/nova/transformation.py, extending SagemakerChatConfig with:
  - Nova-specific params: top_k, reasoning_effort, allowed_token_ids, truncate_prompt_tokens
  - stream: true in the request body for streaming
  - Stripping the model field from the request body (not accepted by Nova endpoints)
- Provider registration in __init__.py, constants.py, utils.py, main.py
- Docs in docs/my-website/docs/providers/aws_sagemaker.md
- Tests in tests/litellm/llms/sagemaker/ and tests/llm_translation/

Usage:
model="sagemaker_nova/<endpoint-name>"
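A fuller usage sketch, assuming a deployed endpoint. The endpoint name comes from an env var here (mirroring the SAGEMAKER_NOVA_ENDPOINT gate the PR's integration tests use), and the live call only runs when that var is set, since it needs AWS credentials:

```python
import os


def nova_model_string(endpoint_name: str) -> str:
    # LiteLLM model strings follow the "<provider>/<model>" convention;
    # for this provider the "model" part is the SageMaker endpoint name.
    return f"sagemaker_nova/{endpoint_name}"


# Illustrative SDK call, gated on the same env var the integration tests use.
if os.environ.get("SAGEMAKER_NOVA_ENDPOINT"):
    import litellm

    response = litellm.completion(
        model=nova_model_string(os.environ["SAGEMAKER_NOVA_ENDPOINT"]),
        messages=[{"role": "user", "content": "Hello"}],
        top_k=40,  # Nova-specific parameter passed through optional params
    )
    print(response.choices[0].message.content)
```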