
feat(router): add per-model-group deployment affinity #24110

Merged
Sameerlite merged 3 commits into litellm_dev_sameer_16_march_week from Sameerlite/model-level-affinity
Mar 20, 2026

Conversation

@Sameerlite
Contributor

Summary

Enable deployment_affinity, responses_api_deployment_check, and session_affinity features to be configured per model group via router_settings.model_group_affinity_config. This allows fine-grained control of affinity behavior across model groups while maintaining backward compatibility (unconfigured groups fall back to global settings).

Relevant issues

Addresses customer request for model-level stickiness configuration (e.g., enable affinity for cross-provider deployments like Bedrock+Azure while leaving other groups free to load-balance).

Pre-Submission checklist

  • I have added tests in tests/test_litellm/ — 4 new tests covering per-group config, fallback behavior, and override scenarios
  • My PR passes all unit tests on make test-unit (all 13 affinity tests + 6 related tests pass)
  • My PR's scope is isolated — only adds per-model-group config for affinity features
  • I have requested a Greptile review (can be done after PR creation)

Changes

  • litellm/router.py: Add model_group_affinity_config parameter; pass to DeploymentAffinityCheck; handle standalone creation when only per-group config is provided
  • deployment_affinity_check.py: Add _get_effective_flags(model_group) helper; use per-group flags in async_filter_deployments and async_pre_call_deployment_hook
  • litellm/types/router.py: Add model_group_affinity_config to UpdateRouterConfig
  • Tests: 4 new tests validating per-group behavior, global fallback, and override logic

Usage

```yaml
router_settings:
  model_group_affinity_config:
    "gpt-4":
      - deployment_affinity
      - responses_api_deployment_check
    "claude-3":
      - session_affinity
```
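The fallback semantics of this config can be sketched in plain Python. The helper name `effective_flags` below is illustrative (it mirrors the `_get_effective_flags` behavior this PR describes, but is not the litellm implementation):

```python
from typing import Dict, List, Tuple

def effective_flags(
    config: Dict[str, List[str]],
    model_group: str,
    global_flags: Tuple[bool, bool, bool],
) -> Tuple[bool, bool, bool]:
    """Resolve (user_key, responses_api, session_id) affinity flags for a group,
    falling back to the global flags when the group is unconfigured."""
    group_checks = config.get(model_group)
    if group_checks is None:
        # Unconfigured groups keep the global behavior (backward compatible).
        return global_flags
    return (
        "deployment_affinity" in group_checks,
        "responses_api_deployment_check" in group_checks,
        "session_affinity" in group_checks,
    )

config = {
    "gpt-4": ["deployment_affinity", "responses_api_deployment_check"],
    "claude-3": ["session_affinity"],
}
print(effective_flags(config, "gpt-4", (False, False, False)))   # (True, True, False)
print(effective_flags(config, "gpt-4o", (True, False, False)))   # (True, False, False) - global fallback
```

Note that a configured group fully overrides the globals: listing only `session_affinity` for a group disables the other two checks for that group even if they are enabled globally.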

Enable deployment_affinity, responses_api_deployment_check, and session_affinity to be configured per model group via router_settings.model_group_affinity_config, falling back to global settings for unconfigured groups.

- Add model_group_affinity_config parameter to Router and DeploymentAffinityCheck
- Add _get_effective_flags helper to resolve flags per model group
- Update async_filter_deployments and async_pre_call_deployment_hook to use per-group config
- Add 4 comprehensive tests covering per-group config, fallback, and override scenarios

This allows fine-grained control of affinity behavior across model groups, e.g., enabling stickiness only for cross-provider deployments while leaving other groups free to load-balance.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@vercel

vercel bot commented Mar 19, 2026

The latest updates on your projects:

| Project | Status | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Mar 19, 2026 10:22am |

@greptile-apps
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds per-model-group affinity configuration (model_group_affinity_config) to the LiteLLM Router, enabling fine-grained control of deployment affinity (user-key, responses-API, and session stickiness) on a per-model-group basis rather than only globally. Groups not listed fall back to the existing global optional_pre_call_checks settings, maintaining full backward compatibility.

Key changes:

  • New model_group_affinity_config: Optional[Dict[str, List[str]]] parameter on Router.__init__ and DeploymentAffinityCheck.__init__
  • New _get_effective_flags(model_group) helper in DeploymentAffinityCheck resolves per-group flags with global fallback
  • async_filter_deployments and async_pre_call_deployment_hook now use per-group flags instead of always reading instance-level globals
  • When only model_group_affinity_config is provided (no global optional_pre_call_checks), a DeploymentAffinityCheck callback is still wired up automatically with all global flags disabled
  • Unknown flag names in per-group config are validated at init and emit a warning
  • Bug: model_group_affinity_config is added to UpdateRouterConfig but is NOT added to the _allowed_settings list inside update_settings() in router.py. Dynamic config reloads from the DB or admin API will silently drop the value, even though the type model accepts it.

Confidence Score: 3/5

  • Safe to merge for SDK-only users initializing Router directly, but proxy users who rely on dynamic config reload (DB-backed or admin API) will silently get no per-group affinity applied after a config refresh.
  • Core logic in DeploymentAffinityCheck is well-implemented, backward-compatible, and tested. The one confirmed bug — model_group_affinity_config not present in update_settings._allowed_settings — means the feature will silently fail for proxy users who update router settings dynamically. Initial startup via constructor still works correctly, so direct SDK users are unaffected.
  • litellm/router.py — specifically the update_settings method's _allowed_settings list (line 8497–8499).

Important Files Changed

- `litellm/router.py`: Adds model_group_affinity_config parameter to the Router constructor and add_optional_pre_call_checks; creates a standalone DeploymentAffinityCheck callback when only per-group config is provided. Bug: model_group_affinity_config is missing from _allowed_settings in update_settings, causing silent no-ops on dynamic config reload.
- `litellm/router_utils/pre_call_checks/deployment_affinity_check.py`: Adds the _get_effective_flags helper and model_group_affinity_config support. Per-group flag resolution is clean; early extraction of deployment_model_name in async_pre_call_deployment_hook is a justified refactor for per-group flag resolution. Warning validation of unknown flag names at init is a good addition.
- `litellm/types/router.py`: Adds model_group_affinity_config: Optional[Dict[str, List[str]]] to UpdateRouterConfig. The type definition is correct, but the corresponding update_settings implementation doesn't process it.
- `tests/test_litellm/router_utils/pre_call_checks/test_deployment_affinity_check.py`: Four new tests covering per-group config, global fallback, and override behavior. All tests use mocks (no real network calls). Test logic is sound; the integration test test_model_group_affinity_config_only_applies_to_configured_group correctly patches the HTTP handler and routing strategy.
- `docs/my-website/docs/proxy/config_settings.md`: Adds a documentation entry for model_group_affinity_config in the router settings table. Clear and accurate.
- `docs/my-website/docs/response_api.md`: Adds a new "Per-Model-Group Affinity Configuration" section with SDK and proxy YAML examples. Well-structured and accurate.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Router.__init__ called] --> B{model_group_affinity_config set?}
    B -- Yes + no global affinity checks --> C[Create DeploymentAffinityCheck\nwith all global flags=False\nand model_group_affinity_config]
    B -- Yes + global affinity checks --> D[add_optional_pre_call_checks\npasses model_group_affinity_config\nto new or existing callback]
    B -- No --> E[Normal flow: global flags only]

    C --> F[Register callback in optional_callbacks]
    D --> F

    F --> G[Incoming request: async_filter_deployments]
    G --> H[_get_effective_flags model_group]
    H --> I{Group in\nmodel_group_affinity_config?}
    I -- Yes --> J[Use per-group flags]
    I -- No --> K[Use global instance flags]

    J --> L[Apply affinity checks\nresponses_api / session_id / user_key]
    K --> L

    L --> M[Return filtered deployments]

    G --> N[async_pre_call_deployment_hook]
    N --> O[Extract deployment_model_name from metadata]
    O --> P{deployment_model_name found?}
    P -- No --> Q[debug log + return None]
    P -- Yes --> R[_get_effective_flags deployment_model_name]
    R --> S{Any affinity\nflag enabled?}
    S -- No --> Q
    S -- Yes --> T[Write affinity mapping to cache]
```

Comments Outside Diff (1)

  1. litellm/router.py, lines 8497-8499

    model_group_affinity_config missing from _allowed_settings in update_settings

    UpdateRouterConfig (in litellm/types/router.py) was correctly updated to include model_group_affinity_config, but the update_settings method's _allowed_settings allowlist was NOT updated. When the proxy refreshes its configuration from the DB (calling llm_router.update_settings(**combined_router_settings)), any model_group_affinity_config value in the config will be silently dropped — the method just logs "Setting model_group_affinity_config is not allowed" at debug level and ignores it.

    Initial startup still works because the Router constructor receives it directly, but any dynamic config reload (e.g., via the admin API or DB-backed config) will silently fail to apply per-group affinity changes.
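The silent-drop behavior described in this comment can be reproduced with a minimal stand-in for `update_settings`. The function below is illustrative only; the allowlist entries other than `model_group_affinity_config` are hypothetical, and the behavior is modeled on the review's description rather than the litellm source:

```python
from typing import Any, Dict, List

def update_settings(current: Dict[str, Any], allowed: List[str], **kwargs: Any) -> Dict[str, Any]:
    """Apply only allowlisted settings; anything else is dropped without error
    (the real router logs "Setting ... is not allowed" at debug level)."""
    for key, value in kwargs.items():
        if key in allowed:
            current[key] = value
    return current

# Allowlist missing the new key, as the review reports:
allowed = ["routing_strategy", "timeout"]
settings = update_settings(
    {}, allowed, model_group_affinity_config={"gpt-4": ["deployment_affinity"]}
)
print(settings)  # {} - the per-group config was silently dropped

# With the key added to the allowlist, the value survives a dynamic reload:
settings = update_settings(
    {}, allowed + ["model_group_affinity_config"],
    model_group_affinity_config={"gpt-4": ["deployment_affinity"]},
)
print(settings)  # {'model_group_affinity_config': {'gpt-4': ['deployment_affinity']}}
```

The fix implied by the review is correspondingly small: add `"model_group_affinity_config"` to the `_allowed_settings` list inside `update_settings`.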

Last reviewed commit: "docs: add per-model-..."

Comment on lines +442 to +443

```python
if not deployment_model_name:
    return None
```
Contributor


P2 Warning log silently removed

The original code emitted a verbose_router_logger.warning(...) when deployment_model_name was missing (with model_id for correlation), which was useful for diagnosing misconfigured setups. The refactoring replaced it with a silent return None. Consider restoring a log at debug/warning level to maintain observability, especially since this early-return now also gates the per-group flag resolution.

Suggested change

```python
if not deployment_model_name:
    verbose_router_logger.debug(
        "DeploymentAffinityCheck: deployment_model_name missing in metadata; skipping affinity cache update."
    )
    return None
```

Comment on lines +61 to +82
```python
def _get_effective_flags(
    self, model_group: str
) -> Tuple[bool, bool, bool]:
    """
    Return (enable_user_key_affinity, enable_responses_api_affinity, enable_session_id_affinity)
    for the given model group.

    If the model group has an explicit entry in model_group_affinity_config, use it.
    Otherwise fall back to the global instance flags.
    """
    group_checks = self.model_group_affinity_config.get(model_group)
    if group_checks is not None:
        return (
            "deployment_affinity" in group_checks,
            "responses_api_deployment_check" in group_checks,
            "session_affinity" in group_checks,
        )
    return (
        self.enable_user_key_affinity,
        self.enable_responses_api_affinity,
        self.enable_session_id_affinity,
    )
```
Contributor


P2 No validation of per-group flag strings

_get_effective_flags silently ignores unknown flag strings in a group's list. A user typo like "deployment_affinityy" or "responses_api_check" would result in all flags being False for that group with no warning, making debugging very difficult.

Consider adding a validation step (either at init time or here) to warn about unrecognised flag names:

```python
VALID_FLAGS = {"deployment_affinity", "responses_api_deployment_check", "session_affinity"}

def _get_effective_flags(self, model_group: str) -> Tuple[bool, bool, bool]:
    group_checks = self.model_group_affinity_config.get(model_group)
    if group_checks is not None:
        unknown = set(group_checks) - VALID_FLAGS
        if unknown:
            verbose_router_logger.warning(
                "DeploymentAffinityCheck: unknown flag(s) %s for model group '%s'; will be ignored.",
                unknown,
                model_group,
            )
        return (
            "deployment_affinity" in group_checks,
            "responses_api_deployment_check" in group_checks,
            "session_affinity" in group_checks,
        )
    ...
```
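A standalone sketch of the init-time validation this comment asks for (and which the Greptile summary says the PR ultimately implements). The function name `validate_affinity_config` and the logger setup are hypothetical; only the flag names come from the PR:

```python
import logging
from typing import Dict, List

logger = logging.getLogger("affinity_sketch")
logging.basicConfig(level=logging.WARNING)

VALID_FLAGS = {"deployment_affinity", "responses_api_deployment_check", "session_affinity"}

def validate_affinity_config(config: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Warn once at init about typo'd flag names and return only known flags,
    so a misspelling surfaces in logs instead of silently disabling affinity."""
    cleaned: Dict[str, List[str]] = {}
    for group, flags in config.items():
        unknown = set(flags) - VALID_FLAGS
        if unknown:
            logger.warning(
                "unknown flag(s) %s for model group '%s'; ignored", sorted(unknown), group
            )
        cleaned[group] = [f for f in flags if f in VALID_FLAGS]
    return cleaned

# A typo ("deployment_affinityy") is reported instead of silently zeroing all flags:
cleaned = validate_affinity_config({"gpt-4": ["deployment_affinityy", "session_affinity"]})
print(cleaned)  # {'gpt-4': ['session_affinity']}
```

Validating at init (rather than on every `_get_effective_flags` call) keeps the warning out of the hot request path.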

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 19, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing Sameerlite/model-level-affinity (a14122c) with main (e5baa22)

Open in CodSpeed

…n on unknown affinity flags

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@Sameerlite
Contributor Author

@greptile-apps re review

@Sameerlite Sameerlite changed the base branch from main to litellm_dev_sameer_16_march_week March 20, 2026 10:57
@Sameerlite Sameerlite merged commit de21715 into litellm_dev_sameer_16_march_week Mar 20, 2026
37 of 39 checks passed
@ishaan-berri ishaan-berri deleted the Sameerlite/model-level-affinity branch March 26, 2026 22:29