
fix(test): mock vertexai module in GPT-OSS tests to prevent authentication #21276

Merged

jquinter merged 1 commit into main from fix/vertex-gpt-oss-test-mock on Feb 15, 2026

Conversation

@jquinter (Contributor)

Summary

Fixes 401 authentication errors in test_vertex_ai_gpt_oss_simple_request and test_vertex_ai_gpt_oss_reasoning_effort tests.

Root Cause

The tests were mocking VertexLLM._ensure_access_token but not the vertexai module import. When the code executed import vertexai in vertex_ai_partner_models/main.py, it triggered authentication attempts even with mocked tokens. This caused the tests to make real API calls with the fake token, resulting in 401 errors in CI.

Changes

  • Added patch.dict('sys.modules', {'vertexai': mock_vertexai, 'vertexai.preview': mock_vertexai.preview}) to both async tests
  • This mocks the entire vertexai module, preventing import-time authentication
  • Works in combination with the existing autouse fixture (PR fix(test): add environment cleanup for Vertex AI GPT-OSS tests #21272) that clears environment variables
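In isolation, the pattern looks roughly like this — a minimal standalone sketch of the fix described above, not the repository's actual test code:

```python
import sys
from unittest.mock import MagicMock, patch

# Stub the vertexai package (and its `preview` submodule) in sys.modules
# so that any `import vertexai` executed by the code under test resolves
# to a MagicMock instead of the real SDK, which attempts authentication
# at import time.
mock_vertexai = MagicMock()

with patch.dict(sys.modules, {
    "vertexai": mock_vertexai,
    "vertexai.preview": mock_vertexai.preview,
}):
    import vertexai  # resolves to the mock; no network or auth calls

    assert vertexai is mock_vertexai
    # Code under test can now call into the SDK safely; the mock
    # records the call instead of contacting Google Cloud.
    vertexai.init(project="fake-project", location="us-central1")
    mock_vertexai.init.assert_called_once()
```

Because `patch.dict` restores `sys.modules` on exit, the stub never leaks into other tests regardless of execution order.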

Testing

  • All 4 tests in the file pass: `poetry run pytest tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py -v`
  • Tests are now fully isolated and won't attempt real API calls

Related Issues

This is not related to PR #21217 (which only modifies Anthropic tests). This is a pre-existing issue in the Vertex AI GPT-OSS tests that manifests in CI due to the vertexai module being installed.

🤖 Generated with Claude Code

…ation

The test_vertex_ai_gpt_oss_simple_request and test_vertex_ai_gpt_oss_reasoning_effort
tests were failing in CI with 401 authentication errors. This was because the
vertexai module import was triggering authentication attempts even though the
_ensure_access_token method was mocked.

Added patch.dict('sys.modules', ...) to mock the vertexai module entirely,
preventing it from trying to authenticate when imported. This ensures tests
are fully isolated and don't attempt real API calls regardless of environment
variables or test execution order.

This follows the same pattern used in other Vertex AI tests and works in
combination with the autouse fixture that clears environment variables.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@vercel

vercel bot commented Feb 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Feb 15, 2026 10:46pm |


@jquinter jquinter merged commit 682e8c0 into main Feb 15, 2026
17 of 23 checks passed
jquinter added a commit that referenced this pull request Feb 15, 2026
…ution

Implements three key improvements to reduce test flakiness from parallel execution:

1. **Split Vertex AI tests into separate group** (workers: 1)
   - Vertex AI tests often have environment variable pollution issues
   - Running serially prevents cross-test interference with GOOGLE_APPLICATION_CREDENTIALS
   - Isolates authentication-related test failures

2. **Reduce workers for other LLM tests** (4 -> 2)
   - Decreases chance of race conditions and state conflicts
   - Still parallel but with less contention

3. **Add --dist=loadscope to pytest-xdist**
   - Keeps tests from the same file together on one worker
   - Reduces interference between unrelated test modules
   - Data shows 70% pass rate WITH loadscope vs 40% WITHOUT
   - Better test isolation while maintaining parallelism

Note: loadscope exposes one tokenizer cache issue in core-utils which will be
fixed in a separate PR. The tradeoff is worth it (7/10 pass vs 4/10 without).
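The three changes above could be sketched as pytest-xdist invocations like the following. These are hypothetical commands: the test paths and the split into two invocations are assumptions for illustration, not the repository's actual CI wiring.

```shell
# 1. Vertex AI tests in their own group, run serially (workers: 1) to
#    avoid GOOGLE_APPLICATION_CREDENTIALS cross-test pollution
pytest tests/test_litellm/llms/vertex_ai -n 1

# 2 & 3. Remaining LLM tests: two workers instead of four, with
#    --dist=loadscope so all tests from one module stay on one worker
pytest tests/test_litellm/llms --ignore=tests/test_litellm/llms/vertex_ai \
    -n 2 --dist=loadscope
```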

These changes address the root causes of intermittent test failures in:
PRs #21268, #21271, #21272, #21273, #21275, #21276:
- Environment variable pollution (GOOGLE_APPLICATION_CREDENTIALS, VERTEXAI_PROJECT)
- Global state conflicts (litellm.known_tokenizer_config)
- Async mock timing issues with parallel execution

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
jquinter added a commit that referenced this pull request Feb 18, 2026