[feat] Add prompt management support for responses api #23999
Conversation
- Fix async path: call `async_get_chat_completion_prompt` in `aresponses()` before executor dispatch, mirroring `acompletion()` in main.py. Discard `merged_optional_params` in the async path (sync `responses()` handles them via `local_vars`), avoiding a TypeError from duplicate kwargs in `partial()`.
- Fix provider re-resolution: replace the `"/" in model` heuristic with a `model != original_model` comparison so bare model names are handled.
- Add 3 async tests covering hook invocation, optional param propagation, and non-message item filtering in `aresponses()`.

Made-with: Cursor
…l params
- `aresponses()` now pops `prompt_id` from kwargs after the async hook runs and passes `merged_optional_params` via `_async_prompt_merged_params`. `responses()` checks for this internal kwarg first and skips the sync hook entirely when present, eliminating the double-merge of template messages.
- `merged_optional_params` from `async_get_chat_completion_prompt` is no longer discarded (`_`); it flows through to `local_vars` in `responses()`.
- Async tests now assert `get_chat_completion_prompt.assert_not_called()` to directly detect any double-execution regression.

Made-with: Cursor
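The sentinel-kwarg handoff described above can be sketched roughly as follows. This is an illustrative reconstruction, not litellm's actual code; the hook signatures and helper names (`responses_sketch`, `aresponses_sketch`, `sync_hook`, `async_hook`) are assumptions.

```python
import asyncio

SENTINEL = "_async_prompt_merged_params"  # internal kwarg name from the PR

def responses_sketch(input=None, model=None, sync_hook=None, **kwargs):
    """Sync body: skip the sync hook when the async hook already ran."""
    local_vars = {"input": input, "model": model}
    merged = kwargs.pop(SENTINEL, None)
    if merged is not None:
        local_vars.update(merged)  # async path: apply pre-merged params only
    elif kwargs.get("prompt_id") and sync_hook is not None:
        model, input, merged = sync_hook(model, input, kwargs.pop("prompt_id"))
        local_vars.update({"model": model, "input": input, **merged})
    return local_vars

async def aresponses_sketch(async_hook, input=None, model=None, **kwargs):
    """Async path: run the async hook once, then delegate to the sync body."""
    if kwargs.get("prompt_id"):
        model, input, merged = await async_hook(model, input, kwargs.pop("prompt_id"))
        kwargs[SENTINEL] = merged  # hand merged params to the sync body
    return responses_sketch(input=input, model=model, **kwargs)

async def demo():
    async def hook(model, input, prompt_id):
        return "merged-model", "merged-input", {"temperature": 0.2}
    return await aresponses_sketch(hook, input="hi", model="gpt-4o", prompt_id="p1")

result = asyncio.run(demo())
print(result)  # {'input': 'merged-input', 'model': 'merged-model', 'temperature': 0.2}
```

Because `prompt_id` is popped before the sentinel is set, the sync body never sees it and the hook can only run once per request.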
Greptile Summary

This PR adds prompt management support (…). Key changes:
Issues from the prior review — status:
Remaining concerns:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/responses/main.py | Adds prompt management blocks to both responses() and aresponses(); the double-merge and optional-param discard bugs from the prior review are resolved via the _async_prompt_merged_params sentinel kwarg; one minor inconsistency remains where local_vars["model"] is not explicitly refreshed in the _async_merged fast path (unlike the sync path), though it is benign in practice. |
| tests/test_litellm/responses/test_responses_prompt_management.py | 9 new mock-only unit tests; async tests now correctly assert get_chat_completion_prompt.assert_not_called() to catch double-merge regressions; test_optional_params_from_template_applied checks temperature propagation and relies on the internal ResponsesAPIOptionalRequestParams TypedDict being passable to .get(), which works fine. |
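The `assert_not_called()` pattern mentioned above can be illustrated with `unittest.mock`. The logging object and hook names here are stand-ins for the actual test fixtures, not litellm's real objects:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Stand-in for the logging object the tests patch.
logging_obj = MagicMock()
logging_obj.async_get_chat_completion_prompt = AsyncMock(
    return_value=("merged-model", [{"role": "user", "content": "hi"}], {"temperature": 0.2})
)
logging_obj.get_chat_completion_prompt = MagicMock()

async def fake_aresponses():
    # The async path should invoke only the async hook...
    return await logging_obj.async_get_chat_completion_prompt("gpt-4o", [], "prompt-1")

asyncio.run(fake_aresponses())

# ...so the sync hook must never fire; this is what catches a
# double-merge regression where both hooks run on one request.
logging_obj.async_get_chat_completion_prompt.assert_awaited_once()
logging_obj.get_chat_completion_prompt.assert_not_called()
```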
Sequence Diagram
```mermaid
sequenceDiagram
    participant Caller
    participant aresponses
    participant LiteLLMLoggingObj
    participant responses
    participant Handler
    Caller->>aresponses: aresponses(input, model, prompt_id, **kwargs)
    aresponses->>aresponses: get_llm_provider(model)
    aresponses->>LiteLLMLoggingObj: should_run_prompt_management_hooks(prompt_id)
    alt prompt management needed
        LiteLLMLoggingObj-->>aresponses: True
        aresponses->>LiteLLMLoggingObj: async_get_chat_completion_prompt(model, messages, prompt_id)
        LiteLLMLoggingObj-->>aresponses: (merged_model, merged_input, merged_optional_params)
        aresponses->>aresponses: kwargs.pop("prompt_id")
        aresponses->>aresponses: kwargs["_async_prompt_merged_params"] = merged_optional_params
        aresponses->>aresponses: input = merged_input, model = merged_model
    else no prompt management
        LiteLLMLoggingObj-->>aresponses: False
    end
    aresponses->>responses: partial(responses, input=input, model=model, **kwargs)
    responses->>responses: local_vars = locals()
    responses->>responses: get_llm_provider(model, custom_llm_provider)
    responses->>responses: _async_merged = kwargs.pop("_async_prompt_merged_params", None)
    alt _async_merged is not None
        responses->>responses: apply _async_merged to local_vars (skip sync hook)
    else no async merged params
        responses->>LiteLLMLoggingObj: should_run_prompt_management_hooks(prompt_id)
        alt prompt management needed
            LiteLLMLoggingObj-->>responses: True
            responses->>LiteLLMLoggingObj: get_chat_completion_prompt(model, messages, prompt_id)
            LiteLLMLoggingObj-->>responses: (merged_model, merged_input, merged_optional_params)
            responses->>responses: update local_vars with merged results
        end
    end
    responses->>responses: get_requested_response_api_optional_param(local_vars)
    responses->>Handler: response_api_handler(model, input, responses_api_request)
    Handler-->>responses: ResponsesAPIResponse
    responses-->>aresponses: ResponsesAPIResponse
    aresponses-->>Caller: ResponsesAPIResponse
```
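The `partial()` dispatch step in the diagram is where the duplicate-kwargs `TypeError` mentioned in the commit messages can arise. A minimal standalone demonstration (not litellm code; `responses` here is a toy stand-in):

```python
from functools import partial

def responses(input=None, model=None, **kwargs):
    return input, model, kwargs

# Binding arguments positionally and then passing the same name again
# at call time raises TypeError ("got multiple values for argument ...").
bound = partial(responses, "hi", "gpt-4o")
try:
    bound(input="duplicate")
except TypeError:
    print("duplicate kwarg rejected")  # this branch runs

# Keyword-bound partials, by contrast, let call-time keywords override:
safe = partial(responses, input="hi", model="gpt-4o")
print(safe(input="override")[0])  # override
```

This is why popping `prompt_id` (and any other consumed keys) out of `kwargs` before building the `partial` matters.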
Last reviewed commit: "fix: prevent double ..."
```python
_async_merged = kwargs.pop("_async_prompt_merged_params", None)
if _async_merged is not None:
    for k, v in _async_merged.items():
        local_vars[k] = v
```
local_vars["model"] not refreshed in async fast-path (inconsistency with sync path)
In the sync prompt-management block (lines 718–719) the code explicitly writes:
```python
local_vars["model"] = model
local_vars["input"] = input
```

In this `_async_merged` fast-path only the optional params from the template are applied to `local_vars`; `model` and `input` are not re-written. This is not a functional bug today (`model` is passed in as a function argument, captured correctly by `locals()`, and `local_vars["input"]` is overwritten at line 746), but the inconsistency makes the code harder to reason about and could silently break if anything upstream reads `local_vars["model"]` before line 804.
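The point about `locals()` capture can be seen in a tiny standalone example (illustrative only, not the actual function): the dict returned by `locals()` is a snapshot, so later rebinding of a local name does not update it, which is why the sync path's explicit write-backs matter.

```python
def responses_like(model="gpt-4o", input=None):
    local_vars = locals()       # snapshot of the arguments at this point
    model = "merged-model"      # rebinding the name afterwards...
    local_vars["input"] = "hi"  # ...vs. an explicit write into the dict
    # The rebinding is invisible to the snapshot; the explicit write is not.
    return local_vars["model"], local_vars["input"]

print(responses_like())  # ('gpt-4o', 'hi')
```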
Suggested change:

```python
_async_merged = kwargs.pop("_async_prompt_merged_params", None)
if _async_merged is not None:
    # Keep model / input in sync with local_vars just as the sync path does
    local_vars["model"] = model
    local_vars["input"] = input
    for k, v in _async_merged.items():
        local_vars[k] = v
```
aafe9da into BerriAI:litellm_dev_sameer_16_march_week
Relevant issues
Fixes LIT-2135
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
- I have run make test-unit
- I have run a review with @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
Changes