
Fix: Vertex AI Batch Output File Download Fails with 500 #23718

Merged
Sameerlite merged 1 commit into main from litellm_fix_vertex_ai_batch
Mar 16, 2026

Conversation

@Sameerlite
Contributor

Relevant issues

When using LiteLLM managed batches with a Vertex AI Gemini model, the batch job completes successfully and the proxy correctly updates the batch status to completed and populates output_file_id. However, calling client.files.content(output_file_id) to download the results returns a 500 error.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix
🧹 Refactoring

Changes

Primary root cause (empty target_model_names)

In the managed-files post-processing hook (_PROXY_LiteLLMManagedFiles.async_post_call_success_hook), LiteLLM built unified output file IDs from model_name.
On the Vertex batch-retrieve path, the response often carried model_id but no model_name in _hidden_params.
The unified ID builder (get_unified_output_file_id) fell back to model_name or "", so it wrote an empty field:
target_model_names, (empty)
That's why the decoded output_file_id had blank target model names.
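For illustration, the symptom can be reproduced with a minimal sketch. The unified-ID layout used here (semicolon-separated fields, base64-encoded) is an assumption for demonstration, not LiteLLM's exact encoding:

```python
import base64


def build_unified_output_file_id(output_file_id, model_id, model_name):
    # Mirrors the bug: `model_name or ""` silently writes an empty
    # target_model_names field when the response lacks model_name.
    raw = (
        f"litellm_proxy;output_file_id,{output_file_id};"
        f"model_id,{model_id};target_model_names,{model_name or ''}"
    )
    return base64.urlsafe_b64encode(raw.encode()).decode()


# Vertex batch-retrieve responses carry model_id but no model_name:
encoded = build_unified_output_file_id(
    "gs://my-bucket/output/predictions.jsonl", "abc123", None
)
decoded = base64.urlsafe_b64decode(encoded.encode()).decode()
print(decoded)  # ends with "target_model_names," -- the blank field
```

With the blank field, the proxy has no target model to route the later files.content call through, which surfaces as the 500.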


@codspeed-hq
Contributor

codspeed-hq bot commented Mar 16, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_fix_vertex_ai_batch (22b333c) with main (cd37ee1)


@greptile-apps
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR fixes a 500 error when downloading Vertex AI batch output files through the LiteLLM proxy. Two root causes are addressed: (1) the unified output file ID was built with a blank target_model_names field because Vertex batch-retrieve responses don't populate _hidden_params.model_name, and (2) the output GCS path was being constructed incorrectly, potentially yielding just /predictions.jsonl when gcsOutputDirectory was empty.

Key changes:

  • managed_files.py: When model_name is absent from the response, the code now recovers target_model_names by decoding the managed input file ID and extracts the model list from it, ensuring the unified output file ID carries correct routing metadata.
  • batches/transformation.py: Rearranges the /predictions.jsonl suffix logic to check for an empty gcsOutputDirectory before appending, and similarly normalises the outputUriPrefix fallback path.
  • Two new test files cover the regression scenario (missing model_name) and the corrected GCS path construction.
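A simplified sketch of the corrected ordering in the transformation (paraphrased from the behaviour the summary describes; the helper name and exact structure of the real `_get_output_file_id_from_vertex_ai_batch_response` may differ):

```python
def get_output_file_id(response: dict):
    """Sketch: resolve the GCS output object path for a Vertex batch response."""
    # Primary path: outputInfo.gcsOutputDirectory
    gcs_dir = response.get("outputInfo", {}).get("gcsOutputDirectory", "")
    if gcs_dir:  # check for an empty directory BEFORE appending the suffix
        return gcs_dir.rstrip("/") + "/predictions.jsonl"
    # Fallback path: outputConfig.gcsDestination.outputUriPrefix
    prefix = (
        response.get("outputConfig", {})
        .get("gcsDestination", {})
        .get("outputUriPrefix", "")
    )
    if not prefix:
        return None
    if prefix.endswith("/predictions.jsonl"):
        return prefix  # already a full object path; don't double-append
    return prefix.rstrip("/") + "/predictions.jsonl"
```

The key change is that emptiness is tested before the suffix is appended, so an empty gcsOutputDirectory can no longer yield the bogus path /predictions.jsonl.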

Remaining concern: The else branch in async_post_call_success_hook that calls litellm.afile_retrieve still references the original model_name variable (possibly None) rather than resolved_model_name. While this path is only reached when _llm_router is None (an uncommon setup), it means the Vertex AI fix is incomplete in that configuration.

Confidence Score: 3/5

  • Mostly safe to merge — the primary fix is correct, but an incomplete use of model_name instead of resolved_model_name in the file-retrieve fallback branch should be addressed first.
  • The core logic for recovering target_model_names from the managed input file ID is sound and well-tested. The GCS path normalisation is also correct. However, the else branch for afile_retrieve (line 957) still uses the unresolved model_name, leaving the Vertex AI case partially broken when _llm_router is None. This is a real, reachable bug path, albeit less common.
  • Pay close attention to enterprise/litellm_enterprise/proxy/hooks/managed_files.py — specifically the else branch of the afile_retrieve call at line 957 that uses model_name instead of resolved_model_name.

Important Files Changed

Filename Overview
enterprise/litellm_enterprise/proxy/hooks/managed_files.py Adds model-name recovery fallback from managed input file ID when model_name is absent in _hidden_params. The fix correctly passes resolved_model_name to get_unified_output_file_id, but the parallel else branch for litellm.afile_retrieve still uses the original model_name variable, leaving a partial fix for the Vertex AI case when _llm_router is unavailable.
litellm/llms/vertex_ai/batches/transformation.py Fixes _get_output_file_id_from_vertex_ai_batch_response to avoid pre-emptively appending /predictions.jsonl before checking for an empty GCS directory, and consistently appends the suffix in both the primary and fallback code paths. Logic is sound; the residual != "/predictions.jsonl" guard is technically correct but slightly confusing after the refactor.
tests/enterprise/litellm_enterprise/proxy/hooks/test_managed_files.py Adds a regression test that verifies target_model_names are preserved in the unified output file ID when model_name is absent from _hidden_params. Uses mocks correctly; no real network calls.
tests/test_litellm/llms/vertex_ai/test_vertex_ai_batch_transformation.py New unit tests covering the primary (outputInfo.gcsOutputDirectory) and fallback (outputConfig.gcsDestination.outputUriPrefix) paths for output file ID generation. Missing a test for the outputUriPrefix that already ends with /predictions.jsonl.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy as LiteLLM Proxy
    participant Hook as _PROXY_LiteLLMManagedFiles
    participant Vertex as Vertex AI
    participant Cache as Unified File Cache

    Client->>Proxy: POST /v1/batches (with managed file ID)
    Proxy->>Vertex: Submit batch job
    Vertex-->>Proxy: batch response (no model_name in hidden_params)

    Proxy->>Hook: async_post_call_success_hook(response)

    Note over Hook: model_name is None from _hidden_params
    Hook->>Hook: Decode unified_file_id (base64)
    Hook->>Hook: get_models_from_unified_file_id()<br/>→ ["gemini-2.5-pro"]
    Hook->>Hook: resolved_model_name = "gemini-2.5-pro"

    Hook->>Hook: get_unified_output_file_id(<br/>output_file_id, model_id,<br/>resolved_model_name)
    Note over Hook: Unified ID now contains<br/>target_model_names,gemini-2.5-pro

    Hook->>Vertex: afile_retrieve(original_file_id, **creds)
    Vertex-->>Hook: file object
    Hook->>Cache: store_unified_file_id(unified_file_id, file_object)

    Proxy-->>Client: batch response with correct unified output_file_id

    Client->>Proxy: GET /v1/files/{unified_output_file_id}/content
    Proxy->>Cache: lookup unified_output_file_id
    Cache-->>Proxy: file object with routing metadata
    Proxy-->>Client: 200 OK (file content)

Comments Outside Diff (1)

  1. enterprise/litellm_enterprise/proxy/hooks/managed_files.py, lines 956-959 (link)

    Incomplete fix: model_name still used in fallback branch

    This else branch (reached when _llm_router is None) still references model_name — the original, potentially-None variable — instead of resolved_model_name, which was introduced by this PR to recover the provider from the managed input file ID.

    When model_name is None (exactly the Vertex AI case this PR fixes) and _llm_router happens to be None, custom_llm_provider will silently fall back to "openai", causing a failed file-retrieve attempt on a GCS-backed file.
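A hedged sketch of the remaining fix: `pick_model_for_retrieve` is a hypothetical helper (the real change would simply pass resolved_model_name in the else branch of async_post_call_success_hook), but it captures the intended precedence:

```python
def pick_model_for_retrieve(model_name, resolved_model_name):
    # Prefer the model name recovered from the managed input file ID so the
    # _llm_router-is-None fallback also routes Vertex AI files correctly,
    # rather than silently defaulting custom_llm_provider to "openai".
    effective = resolved_model_name or model_name
    return effective if effective is not None else "openai"
```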

Last reviewed commit: 22b333c

Comment on lines +148 to 151
if output_file_id:
    output_file_id = output_file_id.rstrip("/") + "/predictions.jsonl"
if output_file_id and output_file_id != "/predictions.jsonl":
    return output_file_id

Redundant guard after rstrip logic

After the if output_file_id: block runs, output_file_id is always stripped_dir + "/predictions.jsonl" where stripped_dir is the non-empty gcsOutputDirectory with trailing slashes removed. It can only equal "/predictions.jsonl" if the original directory value was literally "/" or a sequence of slashes — an essentially invalid GCS path.

The condition is technically harmless (and does handle that theoretical edge case), but it reads confusingly after the refactor. Consider a brief comment to clarify its purpose:

Suggested change
if output_file_id:
    output_file_id = output_file_id.rstrip("/") + "/predictions.jsonl"
if output_file_id and output_file_id != "/predictions.jsonl":
    return output_file_id
if output_file_id:
    output_file_id = output_file_id.rstrip("/") + "/predictions.jsonl"
# Guard against a bare "/" directory producing "/predictions.jsonl"
if output_file_id and output_file_id != "/predictions.jsonl":
    return output_file_id


Comment on lines +1 to +38
from litellm.llms.vertex_ai.batches.transformation import VertexAIBatchTransformation


def test_output_file_id_uses_predictions_jsonl_with_output_info():
    response = {
        "outputInfo": {
            "gcsOutputDirectory": "gs://test-bucket/litellm-vertex-files/publishers/google/models/gemini-2.5-pro/prediction-model-123"
        }
    }

    output_file_id = VertexAIBatchTransformation._get_output_file_id_from_vertex_ai_batch_response(
        response
    )

    assert (
        output_file_id
        == "gs://test-bucket/litellm-vertex-files/publishers/google/models/gemini-2.5-pro/prediction-model-123/predictions.jsonl"
    )


def test_output_file_id_falls_back_to_output_uri_prefix_with_predictions_jsonl():
    response = {
        "outputInfo": {},
        "outputConfig": {
            "gcsDestination": {
                "outputUriPrefix": "gs://test-bucket/litellm-vertex-files/publishers/google/models/gemini-2.5-pro/prediction-model-456"
            }
        },
    }

    output_file_id = VertexAIBatchTransformation._get_output_file_id_from_vertex_ai_batch_response(
        response
    )

    assert (
        output_file_id
        == "gs://test-bucket/litellm-vertex-files/publishers/google/models/gemini-2.5-pro/prediction-model-456/predictions.jsonl"
    )

Missing test for outputUriPrefix that already ends with /predictions.jsonl

The new guard if output_uri_prefix.endswith("/predictions.jsonl"): return output_uri_prefix is not exercised by any test. Without a test, a future refactor could accidentally double-append /predictions.jsonl and go undetected.

Consider adding:

def test_output_file_id_does_not_double_append_predictions_jsonl():
    response = {
        "outputInfo": {},
        "outputConfig": {
            "gcsDestination": {
                "outputUriPrefix": "gs://test-bucket/litellm-vertex-files/prediction-model-789/predictions.jsonl"
            }
        },
    }

    output_file_id = VertexAIBatchTransformation._get_output_file_id_from_vertex_ai_batch_response(
        response
    )

    assert output_file_id == "gs://test-bucket/litellm-vertex-files/prediction-model-789/predictions.jsonl"

@Sameerlite Sameerlite merged commit ab377f3 into main Mar 16, 2026
93 of 99 checks passed