
Generic Guardrails: Add a configurable fallback to handle generic guardrail endpoint connection failures #21245

Merged
krrishdholakia merged 3 commits into BerriAI:litellm_oss_staging_02_16_2026 from itayov:generic-guardrails-api-fallback
Feb 16, 2026

Conversation

@itayov itayov commented Feb 15, 2026

Add a configurable fallback to handle generic guardrail endpoint connection failures

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
📖 Documentation
✅ Test

Changes

  • Add unreachable_fallback config option for generic_guardrail_api (default: fail_closed; fail_open lets requests proceed on network/unreachable errors).
  • Log CRITICAL and proceed when fail_open is enabled and guardrail endpoint is unreachable.
  • Add unit tests + docs/example config updates.

With fail_closed (same behavior as before):
[screenshot]

With fail_open (logs CRITICAL, but continues the flow):
[screenshot]
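
A minimal sketch of how the option described under Changes might be resolved from a guardrail's config, assuming a config entry shaped like a LiteLLM guardrails item. Only unreachable_fallback, fail_closed, fail_open, and generic_guardrail_api come from this PR; the helper name resolve_unreachable_fallback, the guardrail name, and the URL are hypothetical.

```python
from typing import Any, Dict

VALID_FALLBACKS = ("fail_closed", "fail_open")


def resolve_unreachable_fallback(litellm_params: Dict[str, Any]) -> str:
    """Hypothetical helper: return the configured fallback, defaulting to fail_closed."""
    raw = litellm_params.get("unreachable_fallback") or "fail_closed"
    value = str(raw).strip().lower()  # tolerate input such as "Fail_Open"
    if value not in VALID_FALLBACKS:
        raise ValueError(
            f"unreachable_fallback must be one of {VALID_FALLBACKS}, got {raw!r}"
        )
    return value


# Illustrative config entry; the guardrail name and URL are placeholders.
guardrail_entry = {
    "guardrail_name": "my-generic-guardrail",
    "litellm_params": {
        "guardrail": "generic_guardrail_api",
        "mode": "pre_call",
        "api_base": "http://guardrail.internal:8080",
        "unreachable_fallback": "fail_open",
    },
}

print(resolve_unreachable_fallback(guardrail_entry["litellm_params"]))
print(resolve_unreachable_fallback({}))  # option omitted, so the default applies
```

Defaulting to fail_closed when the key is absent is what keeps existing deployments on their current behavior.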

itayov commented Feb 15, 2026

@ greptileai

greptile-apps bot commented Feb 15, 2026

Greptile Summary

This PR adds a configurable unreachable_fallback option (fail_closed / fail_open) for the Generic Guardrail API, allowing requests to proceed when the guardrail endpoint is unreachable due to network errors. The default behavior remains fail_closed (raise an error), preserving backward compatibility.

  • Adds unreachable_fallback config param to GenericGuardrailAPI, BaseLitellmParams, and GenericGuardrailAPIOptionalParams
  • Catches httpx.RequestError (transport-level: DNS, connect, timeout) separately from other exceptions; in fail_open mode, logs a CRITICAL message and allows the request to proceed unmodified
  • Adds two unit tests covering both fail_closed (default) and fail_open behaviors with mocked network errors
  • Updates docs and example config with the new option
  • Note: The unreachable_fallback field is placed on BaseLitellmParams (shared by all guardrail types), but only consumed by GenericGuardrailAPI — other guardrail types will silently ignore it
  • Note: HTTP 502/503 responses (common when a guardrail sits behind a reverse proxy) are not covered by the fail_open path since httpx.HTTPStatusError is not a subclass of httpx.RequestError
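
The gap in the last note can be sketched as follows. This is a simplified illustration, not the PR's code: the two exception classes are local stand-ins that mirror the relationship between httpx.RequestError and httpx.HTTPStatusError (the latter is not a subclass of the former), and apply_guardrail with its post callback is a hypothetical reduction of the real hook.

```python
import logging

logger = logging.getLogger("guardrail_sketch")


class RequestError(Exception):
    """Stand-in for httpx.RequestError (DNS, connect, and similar transport failures)."""


class HTTPStatusError(Exception):
    """Stand-in for httpx.HTTPStatusError; deliberately not a RequestError subclass."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def apply_guardrail(inputs: dict, post, unreachable_fallback: str = "fail_closed") -> dict:
    """Call the guardrail via post(inputs); fail open only on transport errors."""
    try:
        return post(inputs)
    except RequestError as exc:
        if unreachable_fallback == "fail_open":
            logger.critical("Guardrail unreachable, proceeding unmodified: %s", exc)
            return inputs
        raise
    # HTTPStatusError is not handled above, so a 502/503 always propagates.


def unreachable(_inputs: dict) -> dict:
    raise RequestError("connection refused")


def bad_gateway(_inputs: dict) -> dict:
    raise HTTPStatusError(503)
```

With fail_open, apply_guardrail(inputs, unreachable, "fail_open") returns the inputs unchanged, while apply_guardrail(inputs, bad_gateway, "fail_open") still raises, which is the reverse-proxy case the note describes.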

Confidence Score: 4/5

  • This PR is safe to merge — the default behavior is unchanged (fail_closed) and the new fail_open path is opt-in with proper logging.
  • The implementation is clean, well-tested, and backward-compatible. The default remains fail_closed so no existing behavior changes. Minor concerns: (1) the unreachable_fallback field leaks to all guardrail types via BaseLitellmParams, and (2) fail_open only handles transport-level errors, not HTTP 502/503 from reverse proxies. Neither is a blocker.
  • litellm/types/guardrails.py: unreachable_fallback is on the shared base class, affecting all guardrail types. litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py: fail_open scope is limited to httpx.RequestError only.

Important Files Changed

Filename | Overview
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py | Core implementation of fail-open/fail-closed fallback for httpx.RequestError. Logic is sound for connection-level errors but does not cover HTTP 502/503 errors that could indicate an unreachable endpoint behind a reverse proxy.
litellm/types/guardrails.py | Adds unreachable_fallback to BaseLitellmParams, which is the base for ALL guardrail types, not just Generic Guardrail API. The field is exposed in configs for all guardrails but only consumed by generic_guardrail_api, which may confuse users of other guardrail types.
litellm/types/proxy/guardrails/guardrail_hooks/generic_guardrail_api.py | Adds unreachable_fallback to GenericGuardrailAPIOptionalParams with proper type and documentation. Minor formatting change to litellm_trace_id.
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/__init__.py | Passes unreachable_fallback from config to GenericGuardrailAPI constructor with a safe default of fail_closed.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_generic_guardrail_api.py | Adds two well-structured unit tests covering both fail_closed (default) and fail_open behavior. All tests use mocks (no real network calls).
docs/my-website/docs/adding_provider/generic_guardrail_api.md | Documents the new unreachable_fallback config option with clear description of both modes.
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/example_config.yaml | Adds unreachable_fallback example with both options documented in a comment.

Flowchart

flowchart TD
    A[apply_guardrail called] --> B[Build request payload]
    B --> C[POST to guardrail endpoint]
    C --> D{Response received?}
    D -->|Yes| E[raise_for_status]
    E -->|2xx| F{action == BLOCKED?}
    F -->|Yes| G[Raise GuardrailRaisedException]
    F -->|No| H[Return inputs with modifications]
    E -->|4xx/5xx| I[HTTPStatusError caught by generic except]
    I --> J[Raise Exception - always]
    D -->|No - RequestError| K{unreachable_fallback?}
    K -->|fail_closed| L[Log error + Raise Exception]
    K -->|fail_open| M[Log CRITICAL + Return inputs unchanged]

Last reviewed commit: 653293c

@greptile-apps greptile-apps bot left a comment

7 files reviewed, 3 comments

itayov commented Feb 15, 2026

@greptileai

greptile-apps bot commented Feb 15, 2026

Greptile Summary

This PR adds a configurable unreachable_fallback option (fail_closed / fail_open) to the Generic Guardrail API, allowing users to choose whether to block or proceed when the guardrail endpoint is unreachable. The default remains fail_closed (raise on error), preserving backward compatibility.

  • Adds unreachable_fallback field to BaseLitellmParams, GenericGuardrailAPIOptionalParams, and the GenericGuardrailAPI constructor
  • Implements separate exception handlers for httpx.HTTPStatusError (502/503/504) and httpx.RequestError (DNS/connect failures) with fail-open passthrough
  • Extracts _fail_open_passthrough and _build_guardrail_return_inputs helper methods for cleaner error handling
  • Logs at CRITICAL level with full context (guardrail name, api_base, call/trace IDs) when fail-open is triggered
  • Adds 3 unit tests covering fail_closed default, fail_open on network error, and fail_open on HTTP 503
  • Issue found: Timeout errors are converted to litellm.Timeout by AsyncHTTPHandler.post() and bypass the httpx.RequestError catch — they will always raise even with fail_open enabled
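
The timeout issue above comes down to exception wrapping. The sketch below is an assumption-laden illustration: the classes are local stand-ins (TimeoutException for httpx.TimeoutException, WrappedTimeout for litellm.Timeout), and handler_post and call_guardrail are hypothetical reductions of AsyncHTTPHandler.post() and the guardrail hook.

```python
class RequestError(Exception):
    """Stand-in for httpx.RequestError."""


class TimeoutException(RequestError):
    """Stand-in for httpx.TimeoutException, a RequestError subclass in httpx."""


class WrappedTimeout(Exception):
    """Stand-in for litellm.Timeout; not related to RequestError."""


def handler_post(send):
    """Hypothetical HTTP handler that re-raises timeouts as a library-specific type."""
    try:
        return send()
    except TimeoutException as exc:
        raise WrappedTimeout(str(exc)) from exc


def call_guardrail(send, inputs: dict, fail_open: bool, catch_wrapped_timeout: bool) -> dict:
    """Return inputs unchanged on fail-open; the last flag toggles the fix."""
    caught = (RequestError, WrappedTimeout) if catch_wrapped_timeout else (RequestError,)
    try:
        return handler_post(send)
    except caught:
        if fail_open:
            return inputs
        raise


def slow_endpoint():
    raise TimeoutException("read timed out")
```

With catch_wrapped_timeout=False the wrapped timeout escapes even when fail_open is enabled; adding the wrapper type to the except clause closes the gap.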

Confidence Score: 3/5

  • PR is mostly safe but has a gap where timeout errors bypass the fail_open path
  • The implementation is well-structured with clean helper extraction and proper logging. However, the timeout exception gap means the feature doesn't fully work as documented — users who enable fail_open expecting resilience to endpoint unreachability will still see failures on timeouts, which is one of the most common network failure modes. The default behavior (fail_closed) is unchanged and safe.
  • Pay close attention to litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py — the exception handling chain needs to also catch litellm.Timeout for the fail_open path to be complete.

Important Files Changed

Filename | Overview
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py | Core implementation of fail_open/fail_closed logic. Well-structured with extracted helpers. Has a bug where timeout errors bypass the fail_open path because AsyncHTTPHandler.post() converts httpx.TimeoutException to litellm.Timeout (not httpx.RequestError).
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/__init__.py | Passes the new unreachable_fallback param through to GenericGuardrailAPI constructor. Clean and consistent with existing parameter forwarding pattern.
litellm/types/guardrails.py | Adds unreachable_fallback field to BaseLitellmParams with clear documentation noting it's only implemented for generic_guardrail_api. Also adds it to the normalize_lowercase validator.
litellm/types/proxy/guardrails/guardrail_hooks/generic_guardrail_api.py | Adds unreachable_fallback to GenericGuardrailAPIOptionalParams for UI/config model. Minor formatting changes to litellm_trace_id.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_generic_guardrail_api.py | Adds 3 new tests for fail_open/fail_closed behavior. All mock-based (no real network calls). Missing test for timeout scenario, which is the gap in the implementation.
docs/my-website/docs/adding_provider/generic_guardrail_api.md | Adds documentation for the new unreachable_fallback config option with clear description of behavior.
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/example_config.yaml | Adds unreachable_fallback example to the config YAML template with inline comment describing options.

Flowchart

flowchart TD
    A[apply_guardrail called] --> B[Build request payload]
    B --> C[async_handler.post to guardrail endpoint]
    C -->|Success 2xx| D[Parse response]
    D -->|action=BLOCKED| E[Raise GuardrailRaisedException]
    D -->|action=NONE/INTERVENED| F[Return inputs with modifications]
    
    C -->|httpx.HTTPStatusError| G{Status code 502/503/504?}
    G -->|Yes| H{unreachable_fallback?}
    G -->|No| I[Raise Exception]
    
    H -->|fail_open| J[Log CRITICAL & proceed]
    H -->|fail_closed| I
    
    C -->|httpx.RequestError| K{unreachable_fallback?}
    K -->|fail_open| J
    K -->|fail_closed| I
    
    C -->|litellm.Timeout| L["Falls to generic except Exception ⚠️"]
    L --> I

    style L fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style J fill:#51cf66,stroke:#2b8a3e,color:#fff

Last reviewed commit: 207b622

@greptile-apps greptile-apps bot left a comment

7 files reviewed, 1 comment

itayov commented Feb 15, 2026

@greptileai

greptile-apps bot commented Feb 15, 2026

Greptile Summary

This PR adds a configurable unreachable_fallback option (fail_closed / fail_open) to the Generic Guardrail API integration, allowing operators to choose whether requests should proceed when the guardrail endpoint is unreachable.

  • Adds unreachable_fallback config field to BaseLitellmParams, GenericGuardrailAPIOptionalParams, and the GenericGuardrailAPI constructor, defaulting to fail_closed (existing behavior)
  • Implements fail-open handling for three error categories: litellm.Timeout (wrapping httpx.TimeoutException), httpx.HTTPStatusError with status 502/503/504, and httpx.RequestError (DNS, connection refused, etc.)
  • Logs at CRITICAL level with guardrail name, API base, call/trace IDs when fail-open is triggered
  • Refactors return-value construction into _build_guardrail_return_inputs and _fail_open_passthrough helper methods
  • Adds 4 well-structured mock-only unit tests covering default fail_closed, fail_open with network errors, HTTP 503, and timeouts
  • Updates documentation and example config
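
The three error categories listed above could be classified roughly as follows. This is a sketch under stated assumptions rather than the merged code: the exception classes are local stand-ins for httpx.RequestError, litellm.Timeout, and httpx.HTTPStatusError, and is_unreachable and handle_failure are hypothetical names (the PR's own helpers are _fail_open_passthrough and _build_guardrail_return_inputs, whose bodies are not shown here).

```python
import logging
from typing import Optional

logger = logging.getLogger("guardrail_sketch")

UNREACHABLE_STATUS_CODES = frozenset({502, 503, 504})


class TransportError(Exception):
    """Stand-in for httpx.RequestError."""


class WrappedTimeout(Exception):
    """Stand-in for litellm.Timeout."""


class StatusError(Exception):
    """Stand-in for httpx.HTTPStatusError."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_unreachable(exc: Exception) -> bool:
    """True for timeouts, transport errors, and 502/503/504 status errors."""
    if isinstance(exc, (WrappedTimeout, TransportError)):
        return True
    return isinstance(exc, StatusError) and exc.status_code in UNREACHABLE_STATUS_CODES


def handle_failure(
    exc: Exception,
    inputs: dict,
    unreachable_fallback: str,
    guardrail_name: str,
    api_base: str,
    call_id: Optional[str] = None,
) -> dict:
    """Return inputs unchanged when failing open; otherwise re-raise the error."""
    if is_unreachable(exc) and unreachable_fallback == "fail_open":
        logger.critical(
            "Guardrail %r at %s unreachable (call_id=%s); fail_open, proceeding: %s",
            guardrail_name, api_base, call_id, exc,
        )
        return inputs
    raise exc
```

Keeping the status-code set explicit means a 500 or a 4xx from the guardrail still raises even in fail_open mode, since those responses indicate the endpoint was reached.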

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk — it adds a new opt-in config option that defaults to existing behavior.
  • The implementation correctly handles three categories of unreachable errors (timeout, HTTP 502/503/504, transport-level). The default behavior (fail_closed) is preserved, making this a safe additive change. Exception handling order is correct (specific before generic), and test coverage is good. The only minor concern is the response.raise_for_status() dead code on line 408, which is pre-existing.
  • litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py has a pre-existing dead raise_for_status() call that could be cleaned up.

Important Files Changed

Filename | Overview
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py | Core implementation of fail-open/fail-closed fallback behavior. Adds exception handlers for litellm.Timeout, httpx.HTTPStatusError (502/503/504), and httpx.RequestError. Refactors return-value construction into helper methods. The response.raise_for_status() on line 408 is dead code (already called by AsyncHTTPHandler.post), but pre-existing.
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/__init__.py | Passes unreachable_fallback from parsed config to GenericGuardrailAPI constructor via getattr with default.
litellm/types/guardrails.py | Adds unreachable_fallback field to BaseLitellmParams (shared base for all guardrails) with clear documentation that it only applies to generic_guardrail_api. Also adds it to the normalize_lowercase field validator list.
litellm/types/proxy/guardrails/guardrail_hooks/generic_guardrail_api.py | Adds unreachable_fallback field to GenericGuardrailAPIOptionalParams for UI/config model. Minor reformatting of litellm_trace_id field.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_generic_guardrail_api.py | Adds 4 new tests covering: default fail_closed behavior, fail_open with network errors, fail_open with HTTP 503, and fail_open with litellm.Timeout. All tests use mocks (no real network calls) and cover the key scenarios well.
docs/my-website/docs/adding_provider/generic_guardrail_api.md | Adds unreachable_fallback config option to the documentation example with clear description.
litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/example_config.yaml | Adds unreachable_fallback option to the example config YAML file.
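
For readers unfamiliar with the mock-only style the test row describes, here is a rough sketch of what such tests can look like. It does not reproduce the PR's tests: apply_guardrail is a hypothetical simplified hook, FakeConnectError stands in for a transport error such as httpx.ConnectError, and the URL is a placeholder.

```python
import asyncio
from unittest.mock import AsyncMock


class FakeConnectError(Exception):
    """Stand-in for a transport error such as httpx.ConnectError."""


async def apply_guardrail(client, inputs: dict, unreachable_fallback: str = "fail_closed") -> dict:
    """Hypothetical, simplified hook: POST inputs, fail open on transport errors."""
    try:
        return await client.post("http://guardrail.invalid/check", json=inputs)
    except FakeConnectError:
        if unreachable_fallback == "fail_open":
            return inputs
        raise


def test_fail_open_returns_inputs_unchanged() -> None:
    client = AsyncMock()
    client.post.side_effect = FakeConnectError("connection refused")  # no real network call
    inputs = {"texts": ["hello"]}
    result = asyncio.run(apply_guardrail(client, inputs, "fail_open"))
    assert result == inputs
    client.post.assert_awaited_once()


def test_fail_closed_raises_by_default() -> None:
    client = AsyncMock()
    client.post.side_effect = FakeConnectError("connection refused")
    try:
        asyncio.run(apply_guardrail(client, {"texts": ["hello"]}))
    except FakeConnectError:
        return
    raise AssertionError("expected FakeConnectError")
```

Setting side_effect on the mocked post method simulates the failure at the HTTP boundary, so the fallback branch is exercised without a live guardrail endpoint.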

Flowchart

flowchart TD
    A[apply_guardrail called] --> B[POST to guardrail endpoint]
    B --> C{Response OK?}
    C -->|Yes| D{action == BLOCKED?}
    D -->|Yes| E[Raise GuardrailRaisedException]
    D -->|No| F[Return inputs with response modifications]
    C -->|No - Exception| G{Exception Type?}
    G -->|GuardrailRaisedException| E
    G -->|litellm.Timeout| H{unreachable_fallback?}
    G -->|httpx.HTTPStatusError 502/503/504| H
    G -->|httpx.RequestError| H
    G -->|httpx.HTTPStatusError other| I[Raise Exception]
    G -->|Other Exception| I
    H -->|fail_open| J[Log CRITICAL + return original inputs]
    H -->|fail_closed| I

Last reviewed commit: ffb5aa1

@greptile-apps greptile-apps bot left a comment

7 files reviewed, 1 comment

greptile-apps bot commented Feb 15, 2026

Additional Comments (1)

litellm/proxy/guardrails/guardrail_hooks/generic_guardrail_api/generic_guardrail_api.py
Dead code: raise_for_status() already called upstream

AsyncHTTPHandler.post() (in litellm/llms/custom_httpx/http_handler.py:466) already calls response.raise_for_status() before returning. If the response had a non-2xx status code, httpx.HTTPStatusError would have been raised inside post() and would never reach this line. This call is therefore unreachable.

This is pre-existing (not introduced by this PR), but worth noting since this method is being refactored anyway.
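
Why the second call is unreachable on the error path can be seen in a small sketch. The Response and StatusError classes here are local stand-ins, and handler_post and caller are hypothetical reductions of AsyncHTTPHandler.post() and the guardrail method; the only behavior assumed from the comment above is that the handler calls raise_for_status() before returning.

```python
class StatusError(Exception):
    """Stand-in for httpx.HTTPStatusError."""


class Response:
    """Tiny stand-in for an HTTP response object."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code

    def raise_for_status(self) -> None:
        # Raise for any non-2xx status.
        if not 200 <= self.status_code < 300:
            raise StatusError(f"HTTP {self.status_code}")


def handler_post(status_code: int) -> Response:
    """Hypothetical handler that validates the status before returning."""
    response = Response(status_code)
    response.raise_for_status()  # a non-2xx response never gets past this line
    return response


def caller(status_code: int) -> str:
    response = handler_post(status_code)
    response.raise_for_status()  # redundant: only 2xx responses reach this line
    return "ok"
```

Because handler_post already raised for every non-2xx status, the check in caller can never raise, so removing it would not change behavior.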

@krrishdholakia krrishdholakia changed the base branch from main to litellm_oss_staging_02_16_2026 February 16, 2026 16:32
@krrishdholakia krrishdholakia merged commit bc2fefd into BerriAI:litellm_oss_staging_02_16_2026 Feb 16, 2026
12 of 18 checks passed