# fix(responses-bridge): extract list-format system content into instructions (#21192)
**New file** (`@@ -0,0 +1,43 @@`):

# Grafana Pyroscope CPU profiling

LiteLLM proxy can send continuous CPU profiles to [Grafana Pyroscope](https://grafana.com/docs/pyroscope/latest/) when enabled via environment variables. This is optional and off by default.

## Quick start

1. **Install the optional dependency** (required only when enabling Pyroscope):

   ```bash
   pip install pyroscope-io
   ```

   Or install the proxy extra:

   ```bash
   pip install "litellm[proxy]"
   ```

2. **Set environment variables** before starting the proxy:

   | Variable | Required | Description |
   |----------|----------|-------------|
   | `LITELLM_ENABLE_PYROSCOPE` | Yes (to enable) | Set to `true` to enable Pyroscope profiling. |
   | `PYROSCOPE_APP_NAME` | Yes (when enabled) | Application name shown in the Pyroscope UI. |
   | `PYROSCOPE_SERVER_ADDRESS` | Yes (when enabled) | Pyroscope server URL (e.g. `http://localhost:4040`). |
   | `PYROSCOPE_SAMPLE_RATE` | No | Sample rate (integer). If unset, the pyroscope-io library default is used. |

3. **Start the proxy**; profiling begins automatically when the proxy starts:

   ```bash
   export LITELLM_ENABLE_PYROSCOPE=true
   export PYROSCOPE_APP_NAME=litellm-proxy
   export PYROSCOPE_SERVER_ADDRESS=http://localhost:4040
   litellm --config config.yaml
   ```

4. **View profiles** in the Pyroscope (or Grafana) UI and select your `PYROSCOPE_APP_NAME`.

## Notes

- **Optional dependency**: `pyroscope-io` is an optional dependency. If it is not installed and `LITELLM_ENABLE_PYROSCOPE=true`, the proxy logs a warning and continues without profiling.
- **Platform support**: the `pyroscope-io` package uses a native extension and is not available on all platforms (e.g. Windows is excluded by the package).
- **Other settings**: see [Configuration settings](/proxy/config_settings) for all proxy environment variables.
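As a rough illustration of the startup behavior described above, here is a minimal sketch of conditional initialization. The function name `maybe_start_pyroscope` is hypothetical and not LiteLLM's actual internal API; it only assumes the documented environment variables and the `pyroscope-io` package's `pyroscope.configure(...)` entry point:

```python
import os

try:
    import pyroscope  # provided by the optional pyroscope-io package
except ImportError:
    pyroscope = None


def maybe_start_pyroscope() -> bool:
    """Start continuous profiling if enabled via env vars; return True when started."""
    if os.getenv("LITELLM_ENABLE_PYROSCOPE", "").lower() != "true":
        return False  # profiling is off by default
    if pyroscope is None:
        # Optional dependency missing: warn and continue without profiling
        print("WARNING: LITELLM_ENABLE_PYROSCOPE=true but pyroscope-io is not installed")
        return False
    kwargs = {}
    sample_rate = os.getenv("PYROSCOPE_SAMPLE_RATE")
    if sample_rate:
        kwargs["sample_rate"] = int(sample_rate)
    pyroscope.configure(
        application_name=os.environ["PYROSCOPE_APP_NAME"],
        server_address=os.environ["PYROSCOPE_SERVER_ADDRESS"],
        **kwargs,
    )
    return True
```

Because the enable flag is checked first, leaving `LITELLM_ENABLE_PYROSCOPE` unset is always a safe no-op.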
This file was deleted.
**New file** (`@@ -0,0 +1,3 @@`):

```sql
-- AlterTable
ALTER TABLE "LiteLLM_AccessGroupTable" DROP COLUMN "access_model_ids",
ADD COLUMN "access_model_names" TEXT[] DEFAULT ARRAY[]::TEXT[];
```
**Changed file** (`@@ -163,15 +163,23 @@ def convert_chat_completion_messages_to_responses_api`):

```python
                    if isinstance(content, str):
                        if instructions:
                            instructions = f"{instructions} {content}"
                        else:
                            instructions = content
                    elif isinstance(content, list):
                        # Extract text from content blocks (e.g. [{"type": "text", "text": "..."}])
                        text_parts = []
                        for block in content:
                            if isinstance(block, dict) and block.get("type") == "text":
                                text_parts.append(block.get("text", ""))
                            elif isinstance(block, str):
                                text_parts.append(block)
                        extracted = " ".join(text_parts)
                        if instructions:
                            instructions = f"{instructions} {extracted}"
                        else:
                            instructions = extracted
                    else:
                        input_items.append(
                            {
                                "type": "message",
                                "role": role,
                                "content": self._convert_content_to_responses_format(
                                    content, role  # type: ignore
                                ),
                            }
                        )
                elif role == "tool":
                    # Convert tool message to function call output format
```

> **Review comment on lines +166 to +178** (Contributor): **Silently drops non-str/list system content.** The old …
>
> Suggested change:
>
> ```python
>                     else:
>                         verbose_logger.warning(
>                             "Unexpected system message content type: %s. Skipping.",
>                             type(content),
>                         )
> ```
>
> Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
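The extraction logic in the new `elif` branch can be exercised in isolation. The sketch below pulls it into a standalone helper; the name `extract_system_text` is hypothetical, written here only to illustrate the branch's behavior (including the silent `None` for non-str/list content that the reviewer flags):

```python
from typing import Optional, Union


def extract_system_text(content: Union[str, list, None]) -> Optional[str]:
    """Mirror the diff's handling: pass strings through, join list text blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        text_parts = []
        for block in content:
            # Content blocks look like {"type": "text", "text": "..."}
            if isinstance(block, dict) and block.get("type") == "text":
                text_parts.append(block.get("text", ""))
            elif isinstance(block, str):
                text_parts.append(block)
        return " ".join(text_parts)
    return None  # non-str/list content: currently dropped without a warning


print(extract_system_text([{"type": "text", "text": "Hello"},
                           {"type": "text", "text": "World"}]))  # → Hello World
```

Non-text blocks (e.g. image parts) are skipped by the `type == "text"` check, so only textual content reaches `instructions`.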
**New file** (`@@ -0,0 +1,69 @@`):

```python
"""
Test guardrails for pipeline E2E testing.

- StrictFilter: blocks any message containing "bad" (case-insensitive)
- PermissiveFilter: always passes (simulates an advanced guardrail that is more lenient)
"""

from typing import Optional, Union

from fastapi import HTTPException

from litellm._logging import verbose_proxy_logger
from litellm.caching.caching import DualCache
from litellm.integrations.custom_guardrail import CustomGuardrail
from litellm.proxy._types import UserAPIKeyAuth
from litellm.types.utils import CallTypesLiteral


class StrictFilter(CustomGuardrail):
    """Blocks any message containing the word 'bad'."""

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: CallTypesLiteral,
    ) -> Optional[Union[Exception, str, dict]]:
        for msg in data.get("messages", []):
            content = msg.get("content", "")
            if isinstance(content, str) and "bad" in content.lower():
                verbose_proxy_logger.info("StrictFilter: BLOCKED - found 'bad'")
                raise HTTPException(
                    status_code=400,
                    detail="StrictFilter: content contains forbidden word 'bad'",
                )
        verbose_proxy_logger.info("StrictFilter: PASSED")
        return data


class PermissiveFilter(CustomGuardrail):
    """Always passes - simulates a lenient advanced guardrail."""

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: CallTypesLiteral,
    ) -> Optional[Union[Exception, str, dict]]:
        verbose_proxy_logger.info("PermissiveFilter: PASSED (always passes)")
        return data


class AlwaysBlockFilter(CustomGuardrail):
    """Always blocks - for testing full escalation->block path."""

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: CallTypesLiteral,
    ) -> Optional[Union[Exception, str, dict]]:
        verbose_proxy_logger.info("AlwaysBlockFilter: BLOCKED")
        raise HTTPException(
            status_code=400,
            detail="AlwaysBlockFilter: all content blocked",
        )
```
**New file** (`@@ -0,0 +1,64 @@`):

```yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
  - model_name: fake-blocked-endpoint
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

guardrails:
  - guardrail_name: "strict-filter"
    litellm_params:
      guardrail: pipeline_test_guardrails.StrictFilter
      mode: "pre_call"
  - guardrail_name: "permissive-filter"
    litellm_params:
      guardrail: pipeline_test_guardrails.PermissiveFilter
      mode: "pre_call"
  - guardrail_name: "always-block-filter"
    litellm_params:
      guardrail: pipeline_test_guardrails.AlwaysBlockFilter
      mode: "pre_call"

policies:
  # Pipeline: strict-filter fails -> escalate to permissive-filter
  # If strict fails but permissive passes -> allow the request
  content-safety-permissive:
    description: "Multi-tier: strict filter with permissive fallback"
    guardrails:
      add: [strict-filter, permissive-filter]
    pipeline:
      mode: "pre_call"
      steps:
        - guardrail: strict-filter
          on_fail: next   # escalate to permissive
          on_pass: allow  # clean content proceeds
        - guardrail: permissive-filter
          on_fail: block  # hard block
          on_pass: allow  # permissive says OK

  # Pipeline: strict-filter fails -> escalate to always-block
  # Both fail -> block
  content-safety-strict:
    description: "Multi-tier: strict filter with strict fallback (both block)"
    guardrails:
      add: [strict-filter, always-block-filter]
    pipeline:
      mode: "pre_call"
      steps:
        - guardrail: strict-filter
          on_fail: next
          on_pass: allow
        - guardrail: always-block-filter
          on_fail: block
          on_pass: allow

policy_attachments:
  - policy: content-safety-permissive
    models: [fake-openai-endpoint]
  - policy: content-safety-strict
    models: [fake-blocked-endpoint]
```
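The `on_pass`/`on_fail` step semantics used in the config above can be sketched as a small interpreter. This is an illustrative model only, not LiteLLM's actual pipeline engine; guardrail outcomes are passed in as precomputed booleans:

```python
def run_pipeline(steps: list, outcomes: dict) -> str:
    """Walk steps in order; 'allow'/'block' terminate, 'next' escalates."""
    for step in steps:
        passed = outcomes[step["guardrail"]]
        action = step["on_pass"] if passed else step["on_fail"]
        if action in ("allow", "block"):
            return action
        # action == "next": escalate to the following step
    return "allow"  # ran off the end of the pipeline


permissive_steps = [
    {"guardrail": "strict-filter", "on_fail": "next", "on_pass": "allow"},
    {"guardrail": "permissive-filter", "on_fail": "block", "on_pass": "allow"},
]

# Content with "bad": strict fails, permissive passes -> request is allowed
print(run_pipeline(permissive_steps,
                   {"strict-filter": False, "permissive-filter": True}))  # → allow
```

Swapping `permissive-filter` for `always-block-filter` (which always fails) reproduces the second policy, where escalation ends in `block`.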
> **Review comment: missing test coverage for new branch**
>
> This PR adds handling for list-format system content, but there are no corresponding unit tests. The existing test file (`tests/test_litellm/completion_extras/litellm_responses_transformation/test_completion_extras_litellm_responses_transformation_transformation.py`) has no tests for system message handling at all — neither for string content nor for this new list-content path.
>
> Please add a test case that verifies:
>
> - `content: [{"type": "text", "text": "Hello"}, {"type": "text", "text": "World"}]` produces `instructions = "Hello World"` and no system input items.
>
> This would help prevent regressions and satisfy the PR template requirement of "Add at least 1 test in `tests/litellm/`".
>
> Context Used: Rule from dashboard - What: Ensure that any PR claiming to fix an issue includes evidence that the issue is resolved, such... (source)