Skip to content

fix(test): update realtime guardrail test assertions for voice violation behavior#22332

Merged
jquinter merged 2 commits intomainfrom
fix/realtime-guardrail-test-assertions
Feb 28, 2026
Merged

fix(test): update realtime guardrail test assertions for voice violation behavior#22332
jquinter merged 2 commits intomainfrom
fix/realtime-guardrail-test-assertions

Conversation

@jquinter
Copy link
Contributor

Summary

  • Two realtime streaming guardrail tests were asserting that no response.create / conversation.item.create messages are sent to the backend when a guardrail blocks
  • But the implementation at realtime_streaming.py:348-363 intentionally sends these to have the LLM voice the guardrail violation message to the user in audio sessions
  • Updated test assertions to match the actual (correct) behavior:
    • test_realtime_guardrail_blocks_prompt_injection: Now verifies the full guardrail flow — response.cancel + guardrail conversation.item.create + response.create
    • test_realtime_text_input_guardrail_blocks_and_returns_error: Now filters out guardrail-injected items and only asserts the original blocked message wasn't forwarded

Test plan

  • Both previously failing tests pass locally
  • CI passes

🤖 Generated with Claude Code

…ion behavior

Tests were asserting no response.create/conversation.item.create sent to
backend when guardrail blocks, but the implementation intentionally sends
these to have the LLM voice the guardrail violation message to the user.

Updated assertions to verify the correct guardrail flow:
- response.cancel is sent to stop any in-progress response
- conversation.item.create with violation message is injected
- response.create is sent to voice the violation
- original blocked content is NOT forwarded

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Copy link

vercel bot commented Feb 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Feb 28, 2026 2:06am

Request Review

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 28, 2026

Greptile Summary

This PR fixes two previously failing realtime streaming guardrail tests by aligning their assertions with the actual implementation in realtime_streaming.py. The implementation intentionally sends response.cancel + conversation.item.create + response.create to the backend when a guardrail blocks, so the LLM can voice the violation message to the user in audio sessions. The old tests incorrectly asserted that no messages should be sent to the backend.

  • test_realtime_guardrail_blocks_prompt_injection: Now asserts the full guardrail flow (cancel → inject violation item → create response) instead of asserting zero response.create messages
  • test_realtime_text_input_guardrail_blocks_and_returns_error: Now filters out guardrail-injected items and only asserts the original blocked message wasn't forwarded
  • The docstring on the first test still says "NOT sending response.create", contradicting the updated assertions
  • The second test uses a hardcoded string match ("Say exactly the following message") to distinguish guardrail items from original items, which is fragile

Confidence Score: 4/5

  • This PR is safe to merge — it only modifies test assertions to match existing, intentional implementation behavior.
  • The changes are test-only and correctly align assertions with the actual guardrail implementation. The two minor style issues (stale docstring and brittle string filtering) are non-blocking. No production code is changed, no new network calls are introduced, and both tests should now pass.
  • No files require special attention — the only changed file is a test file with no production impact.

Important Files Changed

Filename Overview
tests/test_litellm/litellm_core_utils/test_realtime_streaming.py Updated two guardrail test assertions to align with the actual implementation behavior: guardrail blocks now voice the violation message via conversation.item.create + response.create. Minor issues: stale docstring and fragile string-based filtering.

Sequence Diagram

sequenceDiagram
    participant Backend as Backend WS (LLM)
    participant Proxy as RealTimeStreaming
    participant Client as Client WS

    Note over Backend,Client: Audio transcription guardrail flow
    Backend->>Proxy: conversation.item.input_audio_transcription.completed
    Proxy->>Proxy: run_realtime_guardrails(transcript)
    Proxy-->>Proxy: Guardrail raises exception (blocked)
    Proxy->>Backend: response.cancel
    Proxy->>Client: error (guardrail_violation)
    Proxy->>Backend: conversation.item.create (violation prompt)
    Proxy->>Backend: response.create

    Note over Backend,Client: Text input guardrail flow
    Client->>Proxy: conversation.item.create (user text)
    Proxy->>Proxy: run_realtime_guardrails(text)
    Proxy-->>Proxy: Guardrail raises exception (blocked)
    Proxy->>Backend: response.cancel
    Proxy->>Client: error (guardrail_violation)
    Proxy->>Backend: conversation.item.create (violation prompt)
    Proxy->>Backend: response.create
    Note over Proxy: Original message NOT forwarded
Loading

Last reviewed commit: 9a48c8e

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +624 to +631
# Filter out guardrail-injected items (contain "Say exactly the following message")
original_items = [
item for item in forwarded_items
if not any(
"Say exactly the following message" in c.get("text", "")
for c in item.get("item", {}).get("content", [])
if isinstance(c, dict)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fragile coupling to implementation string literal

Filtering guardrail-injected items by checking for the hardcoded substring "Say exactly the following message" is brittle — if the prompt wording in realtime_streaming.py:351 changes, this test will silently pass incorrectly (it would stop filtering guardrail items and could mask a real forwarding bug). Consider asserting on a more stable property, such as checking that no item contains the original user text ("My email is test@example.com").

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 28, 2026

Additional Comments (1)

tests/test_litellm/litellm_core_utils/test_realtime_streaming.py
Stale docstring contradicts updated assertions

The docstring still says "NOT sending response.create to the backend", but the updated assertions now correctly expect exactly one response.create (the guardrail-triggered one). Please update the docstring to reflect the actual intended behavior.

    """
    Test that when a transcription event containing prompt injection arrives from the
    backend, a registered guardrail blocks it — sending a warning to the client
    and voicing the guardrail violation message via response.cancel + conversation.item.create + response.create.
    """

Addresses Greptile review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jquinter jquinter merged commit 9b20a05 into main Feb 28, 2026
30 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant