Add native_background_mode to override polling_via_cache for specific models #19899

Merged
Sameerlite merged 2 commits into BerriAI:litellm_oss_staging_01_27_2026 from xianzongxie-stripe:add_native_background_mode_override
Jan 28, 2026

Conversation

Contributor

@xianzongxie-stripe xianzongxie-stripe commented Jan 28, 2026

Relevant issues

This follow-up to PR #16862 allows users to specify models that should use the native provider's background mode instead of polling via cache.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature

Changes

This follow-up to PR #16862 allows users to specify models that should use the native provider's background mode instead of polling via cache.

Config example:
litellm_settings:
  responses:
    background_mode:
      polling_via_cache: ["openai"]
      native_background_mode: ["o4-mini-deep-research"]
      ttl: 3600

When a model is in the native_background_mode list, should_use_polling_for_request returns False, so the request falls through to native provider handling instead of being polled via cache.
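The override can be sketched in a few lines of Python. This is a minimal illustration and not LiteLLM's actual implementation: only the function name and the two config keys come from this PR, while the parameter names, the handling of a "provider/model" prefix, and the provider-matching rule are assumptions made for the example.

```python
from typing import List, Optional


def should_use_polling_for_request(
    model: str,
    custom_llm_provider: Optional[str],
    polling_via_cache: Optional[List[str]],
    native_background_mode: Optional[List[str]] = None,
    background: bool = True,
) -> bool:
    """Hypothetical sketch of the polling decision described in this PR."""
    # Polling only applies to background requests when polling is configured.
    if not background or not polling_via_cache:
        return False

    # Assumption: a request may name the model as "provider/model".
    prefix, _, bare_model = model.rpartition("/")
    provider = custom_llm_provider or prefix

    # The override: an exact match in native_background_mode disables polling,
    # so the request falls through to native provider handling.
    if native_background_mode and (
        model in native_background_mode or bare_model in native_background_mode
    ):
        return False

    # Assumption: polling_via_cache is treated as a list of provider names.
    return provider in polling_via_cache
```

With the config above, `should_use_polling_for_request("gpt-5.2", "openai", ["openai"], ["o4-mini-deep-research"])` returns True in this sketch, while the same call with `"o4-mini-deep-research"` as the model returns False.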

Testing

Config:

litellm_settings:
  cache: true
  cache_params:
    type: "redis"
    ttl: 3600
    host: "127.0.0.1"
    port: 6379
  responses:
    background_mode:
      polling_via_cache: ["openai"]
      native_background_mode: ["o4-mini-deep-research"]
      ttl: 3600

polling_via_cache enabled for gpt-5.2:

curl http://0.0.0.0:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-5.2",
    "input": "Tell me a three sentence bedtime story about a unicorn.",
    "background": true
  }' \
  | jq

Response:

{
  "id": "litellm_poll_488c5546-7b0c-412d-955e-5792202d2201",
  "created_at": 1769579421,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": null,
  "object": "response",
  "output": [],
  "parallel_tool_calls": null,
  "temperature": null,
  "tool_choice": null,
  "tools": null,
  "top_p": null,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "queued",
  "text": null,
  "truncation": null,
  "usage": null,
  "user": null,
  "store": null
}

Poll the response:

curl http://0.0.0.0:4000/v1/responses/litellm_poll_488c5546-7b0c-412d-955e-5792202d2201 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  | jq

Response:

{
  "id": "litellm_poll_488c5546-7b0c-412d-955e-5792202d2201",
  "created_at": 1769579421,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-5.2-2025-12-11",
  "object": "response",
  "output": [
    {
      "id": "msg_0846a0de0fb14792006979a39e16188190bdb1f669aa610ea8",
      "content": [
        {
          "annotations": [],
          "text": "A gentle unicorn named Luma trotted through a moonlit meadow, leaving a soft trail of sparkling dew behind her. When she found a lost little bunny shivering under a fern, she warmed it with her glowing horn and guided it safely home. As the stars hummed quietly overhead, Luma curled up in the silver grass and everyone in the meadow drifted into peaceful dreams.",
          "type": "output_text",
          "logprobs": []
        }
      ],
      "role": "assistant",
      "status": "completed",
      "type": "message"
    }
  ],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [],
  "top_p": 0.98,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "none"
  },
  "status": "completed",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": {
    "input_tokens": 17,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 81,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 98
  },
  "user": null,
  "store": true
}
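
A client would typically repeat the GET above until the status leaves "queued" or "in_progress". The following is a rough sketch of such a loop, not part of this PR: `wait_for_response` and its parameters are hypothetical, the `fetch` callable stands in for whatever HTTP client performs `GET /v1/responses/{id}`, and treating every status other than "queued" and "in_progress" as terminal is an assumption based only on the statuses shown in this transcript.

```python
import time
from typing import Any, Callable, Dict

# Assumption: any status other than these two means the response is finished.
PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_response(
    fetch: Callable[[str], Dict[str, Any]],
    response_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(response_id) until the returned status is no longer pending."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(response_id)
        if body.get("status") not in PENDING_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{response_id} is still {body.get('status')!r}")
        sleep(interval_s)
```

Passing `fetch` and `sleep` as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.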

When native background mode is enabled, the user should get a response ID that starts with resp_:

Request:

curl http://0.0.0.0:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "o4-mini-deep-research",
    "input": "Tell me a three sentence bedtime story about a unicorn.",
    "tools": [
      {"type": "web_search"}
    ],
    "background": true
  }' \
  | jq

Response:

{
  "id": "resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU=",
  "created_at": 1769580297,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "o4-mini-deep-research-2025-06-26",
  "object": "response",
  "output": [],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium",
      "user_location": null
    }
  ],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "status": "queued",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": null,
  "user": null,
  "store": true,
  "background": true,
  "completed_at": null,
  "frequency_penalty": 0.0,
  "max_tool_calls": 225,
  "presence_penalty": 0.0,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "safety_identifier": null,
  "service_tier": "auto",
  "top_logprobs": 0
}

Get the response by its response ID under native background mode:

Request:

curl http://0.0.0.0:4000/v1/responses/resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU\= \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  | jq

Partial response:

{
  "id": "resp_bGl0ZWxsbTpjdXN0b21fbGxtX3Byb3ZpZGVyOm9wZW5haTttb2RlbF9pZDpmOGYxOWJiNzQ1NGM1OTlmMzI2NzhiZjQ2NzViZjg4ZDAzZTVjZmY4YzAwNWZlYmI3NmE2NWIzNDVkYWVmMmE5O3Jlc3BvbnNlX2lkOnJlc3BfMDExMWE4NzE5MWI5ZDM4NTAwNjk3OWE3MDk5ZmM0ODE5NmJiZjUxOTkwODMxNzUyZGU=",
  "created_at": 1769580297,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "o4-mini-deep-research-2025-06-26",
  "object": "response",
  "output": [],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium",
      "user_location": null
    }
  ],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "status": "in_progress",
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "truncation": "disabled",
  "usage": null,
  "user": null,
  "store": true,
  "background": true,
  "completed_at": null,
  "frequency_penalty": 0.0,
  "max_tool_calls": 225,
  "presence_penalty": 0.0,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "safety_identifier": null,
  "service_tier": "auto",
  "top_logprobs": 0
}
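
The two paths can be told apart by the ID prefix seen in the outputs above: "litellm_poll_" for polling via cache and "resp_" for native background mode. A small helper along these lines could be used in a test script to check which path a request took. It is only a sketch: `background_mode_for_id` is a hypothetical name, and the prefixes are taken from the example responses in this PR rather than from a documented contract.

```python
def background_mode_for_id(response_id: str) -> str:
    """Guess which path produced a response ID from the prefixes observed above."""
    if response_id.startswith("litellm_poll_"):
        return "polling_via_cache"
    if response_id.startswith("resp_"):
        return "native"
    return "unknown"
```

For example, `background_mode_for_id("litellm_poll_488c5546-7b0c-412d-955e-5792202d2201")` returns "polling_via_cache".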

… models

This follow-up to PR BerriAI#16862 allows users to specify models that should use
the native provider's background mode instead of polling via cache.

Config example:
  litellm_settings:
    responses:
      background_mode:
        polling_via_cache: ["openai"]
        native_background_mode: ["o4-mini-deep-research"]
        ttl: 3600

When a model is in native_background_mode list, should_use_polling_for_request
returns False, allowing the request to fall through to native provider handling.

Committed-By-Agent: cursor

Added 8 new unit tests for the native_background_mode feature:
- test_polling_disabled_when_model_in_native_background_mode
- test_polling_disabled_for_native_background_mode_with_provider_list
- test_polling_enabled_when_model_not_in_native_background_mode
- test_polling_enabled_when_native_background_mode_is_none
- test_polling_enabled_when_native_background_mode_is_empty_list
- test_native_background_mode_exact_match_required
- test_native_background_mode_with_provider_prefix_in_request
- test_native_background_mode_with_router_lookup

Committed-By-Agent: cursor
@xianzongxie-stripe changed the title from "Add native_background_mode to override polling_via_cache for specific…" to "Add native_background_mode to override polling_via_cache for specific models" Jan 28, 2026
@xianzongxie-stripe xianzongxie-stripe marked this pull request as ready for review January 28, 2026 06:12
Collaborator

@Sameerlite Sameerlite left a comment

LGTM

@Sameerlite Sameerlite changed the base branch from main to litellm_oss_staging_01_27_2026 January 28, 2026 06:57
@Sameerlite Sameerlite merged commit 9b44984 into BerriAI:litellm_oss_staging_01_27_2026 Jan 28, 2026
5 of 7 checks passed