Add native_background_mode to override polling_via_cache for specific models by xianzongxie-stripe · Pull Request #19899 · BerriAI/litellm

xianzongxie-stripe · 2026-01-28T00:22:54Z

Relevant issues

This follow-up to PR #16862 allows users to specify models that should use the native provider's background mode instead of polling via cache.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible; it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Type

🆕 New Feature

Changes

This follow-up to PR #16862 allows users to specify models that should use the native provider's background mode instead of polling via cache.

Config example:

litellm_settings:
  responses:
    background_mode:
      polling_via_cache: ["openai"]
      native_background_mode: ["o4-mini-deep-research"]
      ttl: 3600

When a model is in the native_background_mode list, should_use_polling_for_request returns False, so the request falls through to the provider's native background handling even if its provider is listed in polling_via_cache.
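To illustrate the precedence described above, here is a minimal sketch of the decision. It is not the code in this PR: `should_poll` and its parameters are hypothetical stand-ins, the real should_use_polling_for_request may take different arguments, and the exact-name matching is an assumption based on the test name test_native_background_mode_exact_match_required. Provider-prefix handling and router lookup are left out.

```python
from typing import Optional, Sequence


def should_poll(
    model: str,
    background: bool,
    polling_via_cache: Sequence[str],
    native_background_mode: Optional[Sequence[str]] = None,
    provider: Optional[str] = None,
) -> bool:
    """Hypothetical stand-in for the polling decision described in this PR."""
    # Non-background requests never use the polling path.
    if not background:
        return False
    # The native override wins: an exact model-name match disables polling.
    if native_background_mode and model in native_background_mode:
        return False
    # Otherwise poll when the provider or the model itself is listed.
    return provider in polling_via_cache or model in polling_via_cache


if __name__ == "__main__":
    native = ["o4-mini-deep-research"]
    print(should_poll("gpt-5.2", True, ["openai"], native, provider="openai"))
    print(should_poll("o4-mini-deep-research", True, ["openai"], native, provider="openai"))
```

With the config from this PR, the first call models the gpt-5.2 request (polling stays on) and the second models the o4-mini-deep-research request (polling is skipped).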

Testing

Config:

litellm_settings:
  cache: true
  cache_params:
    type: "redis"
    ttl: 3600
    host: "127.0.0.1"
    port: 6379
  responses:
    background_mode:
      polling_via_cache: ["openai"]
      native_background_mode: ["o4-mini-deep-research"]
      ttl: 3600

polling_via_cache enabled for gpt-5.2:

curl http://0.0.0.0:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-5.2",
    "input": "Tell me a three sentence bedtime story about a unicorn.",
    "background": true
  }' \
  | jq

Response:

{
  "id": "litellm_poll_488c5546-7b0c-412d-955e-5792202d2201",
  "created_at": 1769579421,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": null,
  "object": "response",
  "output": [],
  "parallel_tool_calls": null,
  "temperature": null,
  "tool_choice": null,
  "tools": null,
  "top_p": null,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "queued",
  "text": null,
  "truncation": null,
  "usage": null,
  "user": null,
  "store": null
}

Poll the response:

curl http://0.0.0.0:4000/v1/responses/litellm_poll_488c5546-7b0c-412d-955e-5792202d2201 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  | jq

{
  "id": "litellm_poll_488c5546-7b0c-412d-955e-5792202d2201",
  "created_at": 1769579421,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-5.2-2025-12-11",
  "object": "response",
  "output": [
    {
      "id": "msg_0846a0de0fb14792006979a39e16188190bdb1f669aa610ea8",
      "content": [
        {
          "annotations": [],
          "text": "A gentle unicorn named Luma trotted through a moonlit meadow, leaving a soft trail of sparkling dew behind her. When she found a lost little bunny shivering under a fern, she warmed it with her glowing horn and guided it safely home. As the stars hummed quietly overhead, Luma curled up in the silver grass and everyone in the meadow drifted into peaceful dreams.",
          "type": "output_text",
          "logprobs": []
        }
      ],
      "role": "assistant",
      "status": "completed",
      "type": "message"
    }
  ],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [],
  "top_p": 0.98,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "none"
  },
  "status": "completed",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": {
    "input_tokens": 17,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 81,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 98
  },
  "user": null,
  "store": true
}
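The poll step above can also be scripted. The following is a rough client-side sketch, not part of this PR: `fetch_response` and `wait_for_response` are hypothetical helpers, the base URL and key mirror the curl examples, and the set of terminal statuses is an assumption (only queued, in_progress, and completed appear in this PR's output).

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed terminal statuses; only "completed" is shown in this PR's output.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}


def fetch_response(base_url: str, api_key: str, response_id: str) -> dict:
    """GET /v1/responses/<id> from the proxy and parse the JSON body."""
    req = urllib.request.Request(
        f"{base_url}/v1/responses/{response_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_response(
    fetch: Callable[[], dict], interval: float = 2.0, max_attempts: int = 30
) -> dict:
    """Call fetch() until the response reaches a terminal status."""
    for _ in range(max_attempts):
        body = fetch()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        time.sleep(interval)
    raise TimeoutError("response did not reach a terminal status")
```

A caller would pass something like `lambda: fetch_response("http://0.0.0.0:4000", "sk-1234", response_id)` as `fetch`. The same loop should work for both litellm_poll_ and resp_ IDs, since both are retrieved through the same endpoint in the examples here.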

When native background mode is enabled for a model, the user should get a response ID that starts with resp_:

Request:

curl http://0.0.0.0:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "o4-mini-deep-research",
    "input": "Tell me a three sentence bedtime story about a unicorn.",
    "tools": [
      {"type": "web_search"}
    ],
    "background": true
  }' \
  | jq

Response:

{
  "id": "resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU=",
  "created_at": 1769580297,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "o4-mini-deep-research-2025-06-26",
  "object": "response",
  "output": [],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium",
      "user_location": null
    }
  ],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "status": "queued",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": null,
  "user": null,
  "store": true,
  "background": true,
  "completed_at": null,
  "frequency_penalty": 0.0,
  "max_tool_calls": 225,
  "presence_penalty": 0.0,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "safety_identifier": null,
  "service_tier": "auto",
  "top_logprobs": 0
}
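The two ID shapes shown so far (litellm_poll_... for polling via cache, resp_... for native background mode) can be told apart by prefix. The sketch below is illustrative only: `describe_response_id` is a hypothetical helper, and the base64 decoding is an observation about the sample ID in this PR (its payload decodes to text beginning with litellm:custom_llm_provider:openai), not a documented contract. Clients should generally treat the ID as opaque.

```python
import base64


def describe_response_id(response_id: str) -> dict:
    """Classify a response ID by prefix; hypothetical helper for illustration."""
    if response_id.startswith("litellm_poll_"):
        return {"mode": "polling_via_cache"}
    if response_id.startswith("resp_"):
        payload = response_id[len("resp_"):]
        try:
            # Re-pad in case trailing "=" characters were stripped.
            padded = payload + "=" * (-len(payload) % 4)
            decoded = base64.b64decode(padded).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            # Not base64 text: still a native-style ID, just not decodable.
            return {"mode": "native", "decoded": None}
        return {"mode": "native", "decoded": decoded}
    return {"mode": "unknown"}


if __name__ == "__main__":
    print(describe_response_id("litellm_poll_488c5546-7b0c-412d-955e-5792202d2201"))
```

This kind of check is mainly useful when verifying that the native_background_mode override took effect for a given model.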

Retrieve the response by its ID under native background mode:
Request:

curl http://0.0.0.0:4000/v1/responses/resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU\= \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  | jq

Partial response:

{
  "id": "resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU=",
  "created_at": 1769580297,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "o4-mini-deep-research-2025-06-26",
  "object": "response",
  "output": [],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium",
      "user_location": null
    }
  ],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "status": "in_progress",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": null,
  "user": null,
  "store": true,
  "background": true,
  "completed_at": null,
  "frequency_penalty": 0.0,
  "max_tool_calls": 225,
  "presence_penalty": 0.0,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "safety_identifier": null,
  "service_tier": "auto",
  "top_logprobs": 0
}


vercel · 2026-01-28T00:22:59Z


Project	Deployment	Review	Updated (UTC)
litellm	Error		Jan 28, 2026 0:50am

Added 8 new unit tests for the native_background_mode feature:

- test_polling_disabled_when_model_in_native_background_mode
- test_polling_disabled_for_native_background_mode_with_provider_list
- test_polling_enabled_when_model_not_in_native_background_mode
- test_polling_enabled_when_native_background_mode_is_none
- test_polling_enabled_when_native_background_mode_is_empty_list
- test_native_background_mode_exact_match_required
- test_native_background_mode_with_provider_prefix_in_request
- test_native_background_mode_with_router_lookup

Committed-By-Agent: cursor

Sameerlite

LGTM

vercel bot had a problem deploying to Preview January 28, 2026 00:24 Failure

vercel bot had a problem deploying to Preview January 28, 2026 00:50 Failure

xianzongxie-stripe changed the title ~~Add native_background_mode to override polling_via_cache for specific…~~ Add native_background_mode to override polling_via_cache for specific models Jan 28, 2026

xianzongxie-stripe marked this pull request as ready for review January 28, 2026 06:12

Sameerlite approved these changes Jan 28, 2026

View reviewed changes

Sameerlite changed the base branch from main to litellm_oss_staging_01_27_2026 January 28, 2026 06:57

Sameerlite merged commit 9b44984 into BerriAI:litellm_oss_staging_01_27_2026 Jan 28, 2026
5 of 7 checks passed

Sameerlite merged 2 commits into BerriAI:litellm_oss_staging_01_27_2026 from xianzongxie-stripe:add_native_background_mode_override
