feat(router): add per-model-group deployment affinity #24110
Sameerlite merged 3 commits into litellm_dev_sameer_16_march_week
Conversation
Enable deployment_affinity, responses_api_deployment_check, and session_affinity to be configured per model group via router_settings.model_group_affinity_config, falling back to global settings for unconfigured groups.

- Add model_group_affinity_config parameter to Router and DeploymentAffinityCheck
- Add _get_effective_flags helper to resolve flags per model group
- Update async_filter_deployments and async_pre_call_deployment_hook to use per-group config
- Add 4 comprehensive tests covering per-group config, fallback, and override scenarios

This allows fine-grained control of affinity behavior across model groups, e.g., enabling stickiness only for cross-provider deployments while leaving other groups free to load-balance.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Greptile Summary
This PR adds per-model-group affinity configuration. Key changes:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/router.py | Adds model_group_affinity_config parameter to Router constructor and add_optional_pre_call_checks; creates a standalone DeploymentAffinityCheck callback when only per-group config is provided. Bug: model_group_affinity_config is missing from _allowed_settings in update_settings, causing silent no-ops on dynamic config reload. |
| litellm/router_utils/pre_call_checks/deployment_affinity_check.py | Adds _get_effective_flags helper and model_group_affinity_config support. Per-group flag resolution is clean; early extraction of deployment_model_name in async_pre_call_deployment_hook is a justified refactor for per-group flag resolution. Warning validation of unknown flag names at init is a good addition. |
| litellm/types/router.py | Adds model_group_affinity_config: Optional[Dict[str, List[str]]] to UpdateRouterConfig. Type definition is correct but the corresponding update_settings implementation doesn't process it. |
| tests/test_litellm/router_utils/pre_call_checks/test_deployment_affinity_check.py | Four new tests covering per-group config, global fallback, and override behavior. All tests use mocks (no real network calls). Test logic is sound; the integration test test_model_group_affinity_config_only_applies_to_configured_group correctly patches HTTP handler and routing strategy. |
| docs/my-website/docs/proxy/config_settings.md | Adds documentation entry for model_group_affinity_config in the router settings table. Clear and accurate. |
| docs/my-website/docs/response_api.md | Adds a new "Per-Model-Group Affinity Configuration" section with SDK and proxy YAML examples. Well-structured and accurate. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Router.__init__ called] --> B{model_group_affinity_config set?}
B -- Yes + no global affinity checks --> C[Create DeploymentAffinityCheck\nwith all global flags=False\nand model_group_affinity_config]
B -- Yes + global affinity checks --> D[add_optional_pre_call_checks\npasses model_group_affinity_config\nto new or existing callback]
B -- No --> E[Normal flow: global flags only]
C --> F[Register callback in optional_callbacks]
D --> F
F --> G[Incoming request: async_filter_deployments]
G --> H[_get_effective_flags model_group]
H --> I{Group in\nmodel_group_affinity_config?}
I -- Yes --> J[Use per-group flags]
I -- No --> K[Use global instance flags]
J --> L[Apply affinity checks\nresponses_api / session_id / user_key]
K --> L
L --> M[Return filtered deployments]
G --> N[async_pre_call_deployment_hook]
N --> O[Extract deployment_model_name from metadata]
O --> P{deployment_model_name found?}
P -- No --> Q[debug log + return None]
P -- Yes --> R[_get_effective_flags deployment_model_name]
R --> S{Any affinity\nflag enabled?}
S -- No --> Q
S -- Yes --> T[Write affinity mapping to cache]
Comments Outside Diff (1)
litellm/router.py, line 8497-8499: model_group_affinity_config missing from _allowed_settings in update_settings

UpdateRouterConfig (in litellm/types/router.py) was correctly updated to include model_group_affinity_config, but the update_settings method's _allowed_settings allowlist was NOT updated. When the proxy refreshes its configuration from the DB (calling llm_router.update_settings(**combined_router_settings)), any model_group_affinity_config value in the config will be silently dropped: the method just logs "Setting model_group_affinity_config is not allowed" at debug level and ignores it.

Initial startup still works because the Router constructor receives it directly, but any dynamic config reload (e.g., via the admin API or DB-backed config) will silently fail to apply per-group affinity changes.
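The failure mode described in this comment can be illustrated with a toy allowlist-based updater. This is a minimal sketch, not litellm's actual `update_settings`: the `SketchRouter` class, its constructor argument, and the `num_retries` setting are hypothetical stand-ins chosen only to show how a key missing from an allowlist is dropped without raising.

```python
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger("router_sketch")


class SketchRouter:
    """Toy stand-in for a router with an allowlisted update method (hypothetical)."""

    def __init__(self, allowed_settings: List[str]) -> None:
        self._allowed_settings = allowed_settings
        self.num_retries: int = 0
        self.model_group_affinity_config: Optional[Dict[str, List[str]]] = None

    def update_settings(self, **kwargs: Any) -> List[str]:
        """Apply allowlisted settings; return the names that were dropped."""
        dropped: List[str] = []
        for key, value in kwargs.items():
            if key in self._allowed_settings:
                setattr(self, key, value)
            else:
                # Only a debug log: the caller sees no error.
                logger.debug("Setting %s is not allowed", key)
                dropped.append(key)
        return dropped


new_config = {"cross-provider-group": ["session_affinity"]}

# Allowlist without the new key: the per-group config never reaches the router.
stale = SketchRouter(allowed_settings=["num_retries"])
stale_dropped = stale.update_settings(
    num_retries=3, model_group_affinity_config=new_config
)

# Allowlist with the new key: the reload applies it.
fixed = SketchRouter(allowed_settings=["num_retries", "model_group_affinity_config"])
fixed_dropped = fixed.update_settings(
    num_retries=3, model_group_affinity_config=new_config
)
```

Under these assumptions, the fix amounts to adding the new setting name to the allowlist so that a reload behaves the same as initial construction.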
Last reviewed commit: "docs: add per-model-..."
    if not deployment_model_name:
        return None
The original code emitted a verbose_router_logger.warning(...) when deployment_model_name was missing (with model_id for correlation), which was useful for diagnosing misconfigured setups. The refactoring replaced it with a silent return None. Consider restoring a log at debug/warning level to maintain observability, especially since this early-return now also gates the per-group flag resolution.
Suggested change:
- if not deployment_model_name:
-     return None
+ if not deployment_model_name:
+     verbose_router_logger.debug(
+         "DeploymentAffinityCheck: deployment_model_name missing in metadata; skipping affinity cache update."
+     )
+     return None
    def _get_effective_flags(
        self, model_group: str
    ) -> Tuple[bool, bool, bool]:
        """
        Return (enable_user_key_affinity, enable_responses_api_affinity, enable_session_id_affinity)
        for the given model group.

        If the model group has an explicit entry in model_group_affinity_config, use it.
        Otherwise fall back to the global instance flags.
        """
        group_checks = self.model_group_affinity_config.get(model_group)
        if group_checks is not None:
            return (
                "deployment_affinity" in group_checks,
                "responses_api_deployment_check" in group_checks,
                "session_affinity" in group_checks,
            )
        return (
            self.enable_user_key_affinity,
            self.enable_responses_api_affinity,
            self.enable_session_id_affinity,
        )
No validation of per-group flag strings
_get_effective_flags silently ignores unknown flag strings in a group's list. A user typo like "deployment_affinityy" or "responses_api_check" would leave the intended flag disabled for that group (and every flag False if the typo is the only entry) with no warning, making debugging very difficult.
Consider adding a validation step (either at init time or here) to warn about unrecognised flag names:
VALID_FLAGS = {"deployment_affinity", "responses_api_deployment_check", "session_affinity"}

def _get_effective_flags(self, model_group: str) -> Tuple[bool, bool, bool]:
    group_checks = self.model_group_affinity_config.get(model_group)
    if group_checks is not None:
        unknown = set(group_checks) - VALID_FLAGS
        if unknown:
            verbose_router_logger.warning(
                "DeploymentAffinityCheck: unknown flag(s) %s for model group '%s'; will be ignored.",
                unknown, model_group,
            )
        return (
            "deployment_affinity" in group_checks,
            "responses_api_deployment_check" in group_checks,
            "session_affinity" in group_checks,
        )
    ...

…n on unknown affinity flags Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
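The review comment also mentions validating at init time rather than on every lookup. A minimal sketch of that variant follows, assuming a standalone helper; `find_unknown_flags` and the logger name are hypothetical and are not taken from the merged code, while the three flag strings are the ones named in this PR.

```python
import logging
from typing import Dict, List, Set

logger = logging.getLogger("affinity_sketch")

VALID_FLAGS = {
    "deployment_affinity",
    "responses_api_deployment_check",
    "session_affinity",
}


def find_unknown_flags(config: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Warn once per model group about unrecognised flag names.

    Returns a mapping of model group -> unknown flag names so callers
    (or tests) can inspect what was flagged.
    """
    unknown_by_group: Dict[str, Set[str]] = {}
    for model_group, checks in config.items():
        unknown = set(checks) - VALID_FLAGS
        if unknown:
            logger.warning(
                "unknown flag(s) %s for model group '%s'; will be ignored.",
                sorted(unknown),
                model_group,
            )
            unknown_by_group[model_group] = unknown
    return unknown_by_group


# Example: one group has a typo, the other is valid.
report = find_unknown_flags(
    {
        "group-a": ["deployment_affinityy", "session_affinity"],
        "group-b": ["responses_api_deployment_check"],
    }
)
```

Running the check once at construction keeps the per-request lookup free of logging, at the cost of not catching config that is swapped in later without re-validation.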
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@greptile-apps re review

Merged commit de21715 into litellm_dev_sameer_16_march_week
Summary
Enable deployment_affinity, responses_api_deployment_check, and session_affinity features to be configured per model group via router_settings.model_group_affinity_config. This allows fine-grained control of affinity behavior across model groups while maintaining backward compatibility (unconfigured groups fall back to global settings).

Relevant issues
Addresses customer request for model-level stickiness configuration (e.g., enable affinity for cross-provider deployments like Bedrock+Azure while leaving other groups free to load-balance).

Pre-Submission checklist
- tests/test_litellm/: 4 new tests covering per-group config, fallback behavior, and override scenarios
- make test-unit (all 13 affinity tests + 6 related tests pass)

Changes
- litellm/router.py: Add model_group_affinity_config parameter; pass to DeploymentAffinityCheck; handle standalone creation when only per-group config is provided
- deployment_affinity_check.py: Add _get_effective_flags(model_group) helper; use per-group flags in async_filter_deployments and async_pre_call_deployment_hook
- litellm/types/router.py: Add model_group_affinity_config to UpdateRouterConfig

Usage
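As a rough illustration of the configuration shape (a mapping from model group name to a list of flag strings) and of the fallback rule, here is a self-contained sketch. It does not import litellm; the group names and the `GLOBAL_FLAGS` tuple are hypothetical, and `effective_flags` simply mirrors the resolution logic of `_get_effective_flags` shown in the review thread.

```python
from typing import Dict, List, Tuple

# Hypothetical model group names; the flag strings are the three named in this PR.
model_group_affinity_config: Dict[str, List[str]] = {
    "cross-provider-group": ["deployment_affinity", "session_affinity"],
    "no-affinity-group": [],  # explicit empty list: every affinity check off
}

# Stand-in for the global instance flags, in the order
# (user_key_affinity, responses_api_affinity, session_id_affinity).
GLOBAL_FLAGS: Tuple[bool, bool, bool] = (False, True, False)


def effective_flags(model_group: str) -> Tuple[bool, bool, bool]:
    """Per-group entry wins if present; otherwise use the global flags."""
    checks = model_group_affinity_config.get(model_group)
    if checks is not None:
        return (
            "deployment_affinity" in checks,
            "responses_api_deployment_check" in checks,
            "session_affinity" in checks,
        )
    return GLOBAL_FLAGS


configured = effective_flags("cross-provider-group")
disabled = effective_flags("no-affinity-group")
fallback = effective_flags("unlisted-group")
```

Note that a group listed with an empty list is treated as configured, so it overrides the global flags rather than inheriting them; only groups absent from the mapping fall back.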