Skip to content

feat: add governance plugin E2E tests and doc updates#1318

Merged
akshaydeo merged 1 commit intomainfrom
01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates
Jan 13, 2026
Merged

feat: add governance plugin E2E tests and doc updates#1318
akshaydeo merged 1 commit intomainfrom
01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates

Conversation

@Pratham-Mishra04
Copy link
Copy Markdown
Collaborator

Summary

Added end-to-end testing infrastructure for the governance plugin, enabling comprehensive testing of governance features in a real Bifrost environment.

Changes

  • Created a dedicated E2E test script for governance plugin that builds Bifrost, starts it with test config, and runs tests
  • Moved governance tests from plugins/governance to tests/governance directory for better organization
  • Added test configuration file for governance E2E tests
  • Updated the release script to run governance tests during plugin release
  • Added Makefile target for running governance tests with support for debugging
  • Enhanced test utilities with more robust waiting and retry mechanisms
  • Fixed header parsing test location by moving it to core/schemas

Type of change

  • Feature
  • Refactor
  • Chore/CI

Affected areas

  • Core (Go)
  • Plugins
  • CI/CD

How to test

Run the governance E2E tests:

# Run all governance tests
make test-governance

# Run a specific test
make test-governance TESTCASE=TestVKBudgetExceeded

# Run tests matching a pattern
make test-governance PATTERN=Budget

# Run with debugger
make test-governance DEBUG=1

Related issues

Improves test coverage for governance plugin and ensures reliability of governance features.

Checklist

  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 13, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Automated governance end-to-end test workflow and public governance test target
    • Provider routing guide covering governance-based routing and adaptive load balancing
    • Virtual Key header format validation tests (multiple auth header formats)
  • Tests

    • Improved stability: polling-based synchronization replaces fixed sleeps
    • New test utilities and HTTP helpers; governance test orchestration and reporting
  • Documentation

    • Reorganized MCP navigation and expanded governance routing docs with model validation guidance

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds a governance E2E test runner and CI integration, a Makefile test-governance target, comprehensive governance test utilities and config, refactors plugin test fixtures, hardens tests with polling/timeouts, introduces rate-limit test payloads, and updates documentation/navigation for provider routing and MCP links.

Changes

Cohort / File(s) Summary
CI / Release Scripts
​.github/workflows/scripts/release-single-plugin.sh, ​.github/workflows/scripts/run-governance-e2e-tests.sh, ​.github/workflows/release-pipeline.yml
Add run-governance-e2e-tests.sh; update release script to run governance unit tests with coverage, optional Codecov upload, and invoke E2E runner; add Node.js setup step in release pipeline.
Build / Test Targets
Makefile
New public test-governance target with TESTCASE/PATTERN mutual-exclusion, .env loading, optional delve debug, gotestsum/junit output, per-case reports, and help text update.
Governance E2E Test Framework
tests/governance/* (e.g., test_utils.go, go.mod, config.json, many _test.go)
Introduce comprehensive test utilities (API helpers, polling WaitFor*, GlobalTestData, cost model, ParseDuration), new module and config JSON, convert many tests to polling/longer timeouts, update ExtractIDFromResponse signature, add VK header-format test, and add rate-limit payload types.
Plugin Test Refactor
plugins/governance/fixtures_test.go (deleted), plugins/governance/test_utils.go
Remove large fixtures file; add focused MockLogger, in-memory builders, and assertion helpers in test_utils.go.
Core Package Fix
core/schemas/plugin_test.go
Change package declaration from governanceschemas.
Docs & Navigation
docs/... (many), docs/docs.json, docs/providers/provider-routing.mdx
Add provider-routing guide (large), update MCP/governance link paths and redirects, minor wording/link fixes across quickstart and architecture pages.
Contributor Guide
docs/contributing/setting-up-repo.mdx
Add examples for make test-governance.

Sequence Diagram(s)

sequenceDiagram
    actor CI
    participant ReleaseScript as release-single-plugin.sh
    participant E2EScript as run-governance-e2e-tests.sh
    participant Make as Makefile
    participant Build as build (make)
    participant Server as Bifrost Server
    participant Health as /health
    participant Tests as go test (E2E)
    participant Codecov as Codecov (optional)

    CI->>ReleaseScript: trigger plugin release (governance)
    ReleaseScript->>Make: run unit tests with coverage
    Make-->>ReleaseScript: tests result + coverage file
    ReleaseScript->>Codecov: upload coverage (if token)
    ReleaseScript->>E2EScript: invoke E2E runner
    E2EScript->>Build: make build LOCAL=1
    Build-->>E2EScript: build success
    E2EScript->>Server: start Bifrost (background)
    E2EScript->>Health: poll /health until ready
    Health-->>E2EScript: ready
    E2EScript->>Tests: run go E2E tests
    Tests->>Server: exercise API endpoints
    Server-->>Tests: responses
    Tests-->>E2EScript: results
    E2EScript->>Server: stop, collect logs, cleanup
    E2EScript-->>ReleaseScript: exit status
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

🐰
I hopped through scripts and tests tonight,
Polling hearts until they beat just right,
Keys and routes now neatly spun,
Background servers hum, the checks are done,
CI carrots gleam — a rabbit’s delight!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding governance plugin E2E tests and documentation updates. It is concise, specific, and directly related to the primary objectives of the PR.
Description check ✅ Passed The PR description covers all major required sections: Summary, Changes, Type of change, Affected areas, How to test, and Checklist. It provides clear context about the E2E testing infrastructure, file reorganization, and testing commands.
Docstring Coverage ✅ Passed Docstring coverage is 82.80% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1318

Copy link
Copy Markdown
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @tests/governance/usagetracking_test.go:
- Around line 327-356: The condition currently only checks that rate_limit_id is
non-empty; update the WaitForCondition callback so after extracting rateLimitID
from vkData it fetches the rate limit's in-memory record (either by requesting
the rate-limit endpoint with rateLimitID or reading vkData["rate_limit"] if
present) and asserts that the rate limit's "token_current_usage" field is a
number > 0 (or greater than a previously captured baseline) before returning
true; use the existing MakeRequest call pattern and the same
virtualKeysMap/vkData handling to locate rateLimitID and read
token_current_usage.
🧹 Nitpick comments (6)
tests/governance/test_utils.go (2)

197-206: Deprecated rand.Seed usage.

rand.Seed is deprecated since Go 1.20. The global random source is now automatically seeded, so you can simply remove the rand.Seed call. If you need deterministic behavior for specific tests, use a local random source instead.

♻️ Suggested fix
 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

301-322: Unused keyPath parameter in ExtractIDFromResponse.

The keyPath parameter is declared but never used - the function always searches through a hardcoded list of keys ("virtual_key", "team", "customer"). Either remove the parameter or use it to make the function more flexible.

♻️ Option 1: Remove unused parameter
 // ExtractIDFromResponse extracts the ID from a creation response
-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {
♻️ Option 2: Use the parameter
 // ExtractIDFromResponse extracts the ID from a creation response
 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}

-	// Navigate through the response to find the ID
-	data := resp.Body
-	parts := []string{"virtual_key", "team", "customer"}
-	for _, part := range parts {
-		if val, ok := data[part]; ok {
-			if nested, ok := val.(map[string]interface{}); ok {
-				if id, ok := nested["id"].(string); ok {
-					return id
-				}
+	// Use keyPath if provided, otherwise search common keys
+	keysToCheck := []string{keyPath}
+	if keyPath == "" {
+		keysToCheck = []string{"virtual_key", "team", "customer"}
+	}
+	
+	for _, key := range keysToCheck {
+		if val, ok := resp.Body[key]; ok {
+			if nested, ok := val.(map[string]interface{}); ok {
+				if id, ok := nested["id"].(string); ok {
+					return id
+				}
 			}
 		}
 	}
.github/workflows/scripts/run-governance-e2e-tests.sh (2)

113-152: Consider checking for port availability before starting Bifrost.

The health check polling is robust, but if port 8080 is already in use (e.g., from a previous failed run or another process), the server will fail to bind. Consider adding a pre-check or making the port configurable.

Optional: Add port availability check
 # Step 3: Start Bifrost in background
 echo ""
 echo -e "${CYAN}🚀 Step 3: Starting Bifrost server...${NC}"
 
+# Check if port is already in use
+if lsof -i ":${BIFROST_PORT}" > /dev/null 2>&1 || nc -z "${BIFROST_HOST}" "${BIFROST_PORT}" 2>/dev/null; then
+  echo -e "${RED}❌ Port ${BIFROST_PORT} is already in use${NC}"
+  exit 1
+fi
+
 # Ensure data directory exists for SQLite database
 mkdir -p data

56-65: Log file persists after successful runs.

The log file at /tmp/bifrost-governance-test.log is retained after runs (line 58-59 only logs its location). In CI environments with ephemeral runners this is fine, but for local development it may accumulate. Consider adding an option to clean up logs on success.

tests/governance/inmemorysync_test.go (1)

506-533: Resource availability check only validates status codes, not actual resource presence.

The polling condition only checks that the endpoints return 200, but doesn't verify that the specific created VK, team, and customer are actually present in the response maps. This could pass even if the resources haven't synced yet.

♻️ Suggested fix to verify actual resource presence
 	// Wait for all resources to be available in in-memory store
+	vkData := createVKResp.Body["virtual_key"].(map[string]interface{})
+	vkValue := vkData["value"].(string)
+
 	allResourcesReady := WaitForCondition(t, func() bool {
 		getVKResp := MakeRequest(t, APIRequest{
 			Method: "GET",
 			Path:   "/api/governance/virtual-keys?from_memory=true",
 		})
 		if getVKResp.StatusCode != 200 {
 			return false
 		}
+		vksMap, ok := getVKResp.Body["virtual_keys"].(map[string]interface{})
+		if !ok {
+			return false
+		}
+		if _, exists := vksMap[vkValue]; !exists {
+			return false
+		}
 
 		getTeamsResp := MakeRequest(t, APIRequest{
 			Method: "GET",
 			Path:   "/api/governance/teams?from_memory=true",
 		})
 		if getTeamsResp.StatusCode != 200 {
 			return false
 		}
+		teamsMap, ok := getTeamsResp.Body["teams"].(map[string]interface{})
+		if !ok {
+			return false
+		}
+		if _, exists := teamsMap[teamID]; !exists {
+			return false
+		}
 
 		getCustomersResp := MakeRequest(t, APIRequest{
 			Method: "GET",
 			Path:   "/api/governance/customers?from_memory=true",
 		})
-		return getCustomersResp.StatusCode == 200
+		if getCustomersResp.StatusCode != 200 {
+			return false
+		}
+		customersMap, ok := getCustomersResp.Body["customers"].(map[string]interface{})
+		if !ok {
+			return false
+		}
+		_, exists := customersMap[customerID]
+		return exists
 	}, 3*time.Second, "all resources available in in-memory store")
tests/governance/e2e_test.go (1)

1276-1300: Avoid logging inside polling condition function.

The t.Logf calls on lines 1284 and 1290-1291 inside the WaitForCondition check function will execute on every polling attempt, leading to excessive log output. Move diagnostic logging outside the condition or only log on the final failure.

♻️ Remove logging from inside the polling condition
 	// Wait for in-memory store to sync (poll with timeout instead of fixed sleep)
 	vkRemoved := WaitForCondition(t, func() bool {
 		getDataResp2 := MakeRequest(t, APIRequest{
 			Method: "GET",
 			Path:   "/api/governance/virtual-keys?from_memory=true",
 		})
 
 		if getDataResp2.StatusCode != 200 {
-			t.Logf("Failed to get VK data: status %d", getDataResp2.StatusCode)
 			return false
 		}
 
 		virtualKeysMap2, ok := getDataResp2.Body["virtual_keys"].(map[string]interface{})
 		if !ok {
-			t.Logf("Invalid response structure for virtual_keys")
 			return false
 		}
 
 		_, exists := virtualKeysMap2[vkValue]
 		return !exists // Return true when VK is NOT found (successfully removed)
 	}, 5*time.Second, "VK removed from in-memory store")
 
 	if !vkRemoved {
-		t.Fatalf("VK still exists in in-memory store after deletion (timeout after 5s)")
+		// Log diagnostic info on failure
+		getDataResp := MakeRequest(t, APIRequest{
+			Method: "GET",
+			Path:   "/api/governance/virtual-keys?from_memory=true",
+		})
+		t.Fatalf("VK still exists in in-memory store after deletion (timeout after 5s), final response status: %d", getDataResp.StatusCode)
 	}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a6654b and 43667dd.

⛔ Files ignored due to path filters (1)
  • tests/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (22)
  • .github/workflows/scripts/release-single-plugin.sh
  • .github/workflows/scripts/run-governance-e2e-tests.sh
  • Makefile
  • core/schemas/plugin_test.go
  • docs/contributing/setting-up-repo.mdx
  • plugins/governance/fixtures_test.go
  • plugins/governance/test_utils.go
  • tests/governance/advancedscenarios_test.go
  • tests/governance/config.json
  • tests/governance/configupdatesync_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/edgecases_test.go
  • tests/governance/go.mod
  • tests/governance/inmemorysync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/ratelimit_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/vkbudget_test.go
💤 Files with no reviewable changes (1)
  • plugins/governance/fixtures_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/governance/ratelimit_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/config.json
  • tests/governance/go.mod
  • Makefile
  • tests/governance/inmemorysync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/ratelimitenforcement_test.go
  • docs/contributing/setting-up-repo.mdx
  • core/schemas/plugin_test.go
  • plugins/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/test_utils.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • tests/governance/ratelimit_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/inmemorysync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/ratelimitenforcement_test.go
  • core/schemas/plugin_test.go
  • plugins/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • tests/governance/ratelimit_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/inmemorysync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/ratelimitenforcement_test.go
  • core/schemas/plugin_test.go
  • plugins/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/contributing/setting-up-repo.mdx
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/test_utils.go
🧬 Code graph analysis (6)
tests/governance/inmemorysync_test.go (1)
tests/governance/test_utils.go (3)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
tests/governance/e2e_test.go (1)
tests/governance/test_utils.go (10)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • NewGlobalTestData (361-367)
  • CreateVirtualKeyRequest (209-218)
  • BudgetRequest (231-234)
  • ExtractIDFromResponse (302-322)
  • MakeRequestWithCustomHeaders (145-195)
  • ChatCompletionRequest (287-293)
  • ChatMessage (296-299)
tests/governance/ratelimitenforcement_test.go (1)
tests/governance/test_utils.go (3)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
tests/governance/usagetracking_test.go (2)
tests/governance/test_utils.go (5)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (209-218)
  • CreateRateLimitRequest (256-261)
  • WaitForCondition (450-473)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
tests/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (107-127)
🔇 Additional comments (31)
core/schemas/plugin_test.go (1)

1-1: LGTM - Package relocation is appropriate.

Moving this test from the governance package to the schemas package is the right call since caseInsensitiveLookup is a core header utility that belongs with the schemas module. The comprehensive test cases cover important edge cases including nil maps, empty keys, and various header format variations.

tests/governance/config.json (1)

1-66: LGTM - Test configuration is well-structured.

The configuration properly enables governance features (enable_governance, enforce_governance_header) while disabling direct keys to ensure governance controls are exercised. Using environment variables for API keys (env.OPENAI_API_KEY, etc.) is a good security practice.

Makefile (1)

591-732: Well-implemented test target following established patterns.

The test-governance target correctly mirrors the structure of test-core, including:

  • Mutual exclusivity check for PATTERN/TESTCASE
  • Environment loading from .env
  • Debug mode support with delve
  • Consistent JUnit XML and HTML report generation

One observation: test-governance is not included in the test-all target (line 734). This appears intentional since governance tests may require specific infrastructure (running Bifrost server with governance config), but please confirm this is the expected behavior.

plugins/governance/test_utils.go (2)

15-67: LGTM - Thread-safe mock logger implementation.

The MockLogger correctly uses mutex protection for concurrent access. The logging methods appropriately capture only the format string for test verification purposes - args are ignored since the mock doesn't need to format actual messages.


69-194: Well-structured test data builders.

The builder functions follow a consistent pattern and properly link related entities (e.g., setting both Budget and BudgetID on virtual keys). The default duration of "1m" in buildRateLimit provides consistent test behavior.

tests/governance/test_utils.go (3)

90-141: LGTM - Clean HTTP request helper.

The MakeRequest function properly handles JSON marshaling, sets appropriate headers, and gracefully handles non-JSON responses by storing them in a "raw" key. The hardcoded localhost:8080 is appropriate for E2E tests.


384-446: LGTM - Robust cleanup with retry logic.

The deleteWithRetry function and Cleanup method implement sensible retry logic with progressive backoff (100ms, 200ms, etc.). Treating 404 as success during cleanup is the right approach since the resource may have already been deleted.


448-503: LGTM - Well-designed polling utilities.

WaitForCondition and WaitForAPICondition implement progressive backoff with sensible bounds (50-500ms and 100-500ms respectively). Returning the last response on timeout in WaitForAPICondition is helpful for debugging test failures.

docs/contributing/setting-up-repo.mdx (1)

170-172: LGTM - Documentation accurately reflects the new Makefile target.

The added governance test commands are correctly documented with appropriate examples matching the Makefile implementation.

tests/governance/go.mod (1)

1-5: LGTM - Module configuration looks appropriate.

The module path, Go version, and framework dependency are correctly configured for the governance tests. The Go version 1.25.5 is the latest stable release (as of January 2026) and is consistent with the Docker images used in the Makefile (golang:1.25.5-alpine3.22).

tests/governance/ratelimit_test.go (2)

759-760: LGTM - Extended wait for async PostHook.

The increased wait time (500ms → 2s) with a clarified comment appropriately accounts for the async PostHook goroutine completing usage updates in E2E tests.


914-915: LGTM - Consistent with other PostHook waits.

This change mirrors the update in TestRateLimitUsageTrackedInMemory, maintaining consistency across tests that wait for async usage tracking.

tests/governance/configupdatesync_test.go (5)

94-95: LGTM - Appropriate wait time for PostHook usage update.

The extended wait correctly distinguishes between async PostHook completion (2s) and config sync propagation (500ms used elsewhere in this test).


319-320: LGTM - Consistent PostHook wait pattern.


487-488: LGTM - Consistent PostHook wait pattern.


888-889: LGTM - Consistent PostHook wait pattern.


1052-1053: LGTM - Consistent PostHook wait pattern.

tests/governance/ratelimitenforcement_test.go (1)

590-618: Good improvement - Polling replaces fixed sleep.

The polling-based approach with WaitForCondition is more robust than a fixed sleep, and the guarded type assertions prevent potential panics.

Minor note: The check on lines 620-624 is somewhat redundant since rateLimitID is guaranteed to be non-empty when usageUpdated is true (due to the condition on line 613). However, the defensive check doesn't harm and provides a clearer log message.

.github/workflows/scripts/run-governance-e2e-tests.sh (1)

1-77: Well-structured E2E test orchestration script.

The cleanup trap, process management, and error handling are well implemented. The script correctly captures the exit code at the start of cleanup to preserve the original exit status.

.github/workflows/scripts/release-single-plugin.sh (1)

79-111: LGTM - Governance E2E test integration.

The flow correctly sequences unit tests with coverage → Codecov upload → E2E tests. Directory navigation is correct (cd ../.. from plugins/governance to repo root, then back via cd "$PLUGIN_DIR").

Good defensive practices:

  • Checking E2E script existence before execution
  • chmod +x as a fallback
  • Proper cleanup of coverage files in both branches
tests/governance/inmemorysync_test.go (2)

386-408: Good use of polling-based verification for eventual consistency.

The polling approach with WaitForCondition is a solid improvement over fixed sleeps, properly handling the async nature of in-memory sync. The type assertion checks before map access prevent panics.


424-446: Polling for VK removal is correctly implemented.

The logic correctly returns true when the VK is absent (!exists), and the timeout message clearly indicates the failure reason.

tests/governance/e2e_test.go (5)

439-440: Fixed sleep for async PostHook processing.

A 2-second wait is reasonable for async processing, but consider using polling with WaitForCondition for more deterministic behavior if this becomes flaky in CI.


478-479: Consistent async wait pattern.

Same pattern as above for PostHook completion.


942-943: Async wait for budget update.

Consistent with the pattern used elsewhere in this file.


1238-1260: Good polling-based VK existence verification.

The pattern is consistent with other tests and correctly handles type assertions.


1564-1681: Well-structured table-driven test for VK header formats.

The test covers all documented header formats (x-bf-vk, Authorization Bearer, x-api-key, x-goog-api-key) with clear descriptions. Using t.Run for subtests enables parallel execution and individual failure reporting.

One minor note: all test cases have expectedPass: true, so the else branch (lines 1670-1675) is currently dead code. Consider removing it or adding a negative test case for an unsupported header format if that's a valid scenario.

tests/governance/usagetracking_test.go (4)

212-213: Appropriate wait for async PostHook completion.

The 2-second wait is consistent with the pattern used in other tests for async budget updates.


263-275: Rate limit configuration added to enable usage tracking.

Good addition - the test now properly creates a VK with rate limiting to enable verification of in-memory usage tracking.


432-434: Explicit wait with logging for async updates.

The added logging helps with debugging test timing issues. The 3-second wait is reasonable given multiple requests need to complete their PostHook processing.


462-464: Increased wait time accounts for ticker interval and processing buffer.

The comment explains the reasoning well: 10s ticker interval + 30s budget reset duration + processing buffer = ~40s total wait. This is a pragmatic adjustment for test reliability.

Comment thread tests/governance/usagetracking_test.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates branch 3 times, most recently from 3c2e67f to 2329d85 Compare January 13, 2026 13:22
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @docs/quickstart/gateway/tool-calling.mdx:
- Line 130: In the sentence "Read more about MCP connections and advanced end to
end tool execution in the [MCP Features](../../mcp/overview) section.",
hyphenate the compound adjective by replacing "end to end tool execution" with
"end-to-end tool execution" so it reads "...advanced end-to-end tool
execution...".

In @tests/governance/test_utils.go:
- Around line 301-322: The ExtractIDFromResponse function currently ignores the
keyPath parameter; update it to use keyPath to locate the id: if keyPath is
non-empty, split it by "." and iterate through resp.Body maps following each
segment to reach a nested map and return its "id" string; if keyPath is empty,
fall back to the existing parts list {"virtual_key","team","customer"} behavior;
ensure type assertions guard against non-map values and return t.Fatalf when id
cannot be found.
🧹 Nitpick comments (6)
plugins/governance/test_utils.go (1)

39-67: Logging methods discard format arguments.

The logging methods store only the format string and ignore the variadic args. This means logged messages won't contain interpolated values, reducing debug utility.

Proposed fix to format messages
 func (ml *MockLogger) Error(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Warn(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.warnings = append(ml.warnings, format)
+	ml.warnings = append(ml.warnings, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Info(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.infos = append(ml.infos, format)
+	ml.infos = append(ml.infos, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Debug(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.debugs = append(ml.debugs, format)
+	ml.debugs = append(ml.debugs, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Fatal(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

This requires adding "fmt" to the imports.

tests/governance/test_utils.go (3)

91-141: Consider adding a timeout to the HTTP client.

The http.Client{} is created without a timeout. If the Bifrost server becomes unresponsive during tests, the test will hang indefinitely rather than failing with a clear timeout error.

Proposed fix
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)

Apply the same change to MakeRequestWithCustomHeaders.


197-206: Remove deprecated rand.Seed call.

rand.Seed is deprecated since Go 1.20. The global random generator is automatically seeded. Additionally, calling Seed on every invocation with time.Now().UnixNano() in concurrent tests can cause collisions if called within the same nanosecond.

Proposed fix
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

505-541: Consider consolidating ParseDuration or document the intentional separation.

The function is duplicated identically in both framework/configstore/tables/utils.go and tests/governance/test_utils.go. While importing from the framework package would reduce duplication, the tests module is structured as an independent module (github.com/maximhq/bifrost/tests/governance) with no framework dependencies, suggesting test isolation may be intentional. If consolidation is desired, either extract to a shared utility module or document why the duplication is necessary for the test suite's independence.

tests/governance/e2e_test.go (1)

476-477: Consider polling for budget update confirmation.

While the 2-second fixed sleep works, this could be more robust with WaitForCondition polling for the budget usage to increase, similar to the pattern used in other tests. However, this is acceptable for now.

Makefile (1)

591-732: The test-governance target follows established patterns.

The implementation mirrors test-core closely, which ensures a consistent user experience across test targets. While there's notable code duplication, Makefile's limited abstraction capabilities make this acceptable. The target properly handles:

  • Mutual exclusivity of PATTERN/TESTCASE
  • Debug mode with delve
  • JUnit XML and HTML report generation
  • Colored failure summaries

Consider extracting the common test execution and reporting logic into a shared shell script helper in the future to reduce duplication across test targets.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43667dd and 2329d85.

⛔ Files ignored due to path filters (1)
  • tests/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (40)
  • .github/workflows/scripts/release-single-plugin.sh
  • .github/workflows/scripts/run-governance-e2e-tests.sh
  • Makefile
  • core/schemas/plugin_test.go
  • docs/architecture/core/mcp.mdx
  • docs/contributing/setting-up-repo.mdx
  • docs/docs.json
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/features/governance/mcp-tools.mdx
  • docs/features/governance/routing.mdx
  • docs/mcp/agent-mode.mdx
  • docs/mcp/code-mode.mdx
  • docs/mcp/connecting-to-servers.mdx
  • docs/mcp/filtering.mdx
  • docs/mcp/gateway-url.mdx
  • docs/mcp/overview.mdx
  • docs/mcp/tool-execution.mdx
  • docs/mcp/tool-hosting.mdx
  • docs/providers/provider-routing.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/quickstart/gateway/setting-up.mdx
  • docs/quickstart/gateway/tool-calling.mdx
  • docs/quickstart/go-sdk/tool-calling.mdx
  • plugins/governance/fixtures_test.go
  • plugins/governance/test_utils.go
  • tests/governance/advancedscenarios_test.go
  • tests/governance/config.json
  • tests/governance/configupdatesync_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/edgecases_test.go
  • tests/governance/go.mod
  • tests/governance/inmemorysync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/ratelimit_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/vkbudget_test.go
💤 Files with no reviewable changes (1)
  • plugins/governance/fixtures_test.go
✅ Files skipped from review due to trivial changes (3)
  • docs/quickstart/go-sdk/tool-calling.mdx
  • docs/providers/provider-routing.mdx
  • docs/mcp/agent-mode.mdx
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/governance/inmemorysync_test.go
  • tests/governance/ratelimit_test.go
  • tests/governance/go.mod
  • tests/governance/configupdatesync_test.go
  • docs/contributing/setting-up-repo.mdx
  • core/schemas/plugin_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • docs/features/governance/mcp-tools.mdx
  • docs/mcp/filtering.mdx
  • docs/architecture/core/mcp.mdx
  • docs/mcp/code-mode.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • Makefile
  • tests/governance/ratelimitenforcement_test.go
  • docs/features/governance/routing.mdx
  • docs/mcp/gateway-url.mdx
  • tests/governance/usagetracking_test.go
  • tests/governance/e2e_test.go
  • docs/enterprise/adaptive-load-balancing.mdx
  • plugins/governance/test_utils.go
  • docs/docs.json
  • docs/quickstart/gateway/setting-up.mdx
  • tests/governance/test_utils.go
  • tests/governance/config.json
  • docs/quickstart/gateway/tool-calling.mdx
🧠 Learnings (4)
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/features/governance/mcp-tools.mdx
  • docs/mcp/filtering.mdx
  • docs/architecture/core/mcp.mdx
  • docs/mcp/code-mode.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/features/governance/routing.mdx
  • docs/mcp/gateway-url.mdx
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/quickstart/gateway/setting-up.mdx
  • docs/quickstart/gateway/tool-calling.mdx
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/usagetracking_test.go
  • tests/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/usagetracking_test.go
  • tests/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/test_utils.go
🧬 Code graph analysis (4)
tests/governance/usagetracking_test.go (2)
tests/governance/test_utils.go (5)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (209-218)
  • CreateRateLimitRequest (256-261)
  • WaitForCondition (450-473)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
tests/governance/e2e_test.go (1)
tests/governance/test_utils.go (10)
  • ParseDuration (507-541)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (209-218)
  • BudgetRequest (231-234)
  • ExtractIDFromResponse (302-322)
  • MakeRequestWithCustomHeaders (145-195)
  • ChatCompletionRequest (287-293)
  • ChatMessage (296-299)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
tests/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (107-127)
🪛 LanguageTool
docs/quickstart/gateway/tool-calling.mdx

[grammar] ~130-~130: Use a hyphen to join words.
Context: ...e about MCP connections and advanced end to end tool execution in the [MCP Featur...

(QB_NEW_EN_HYPHEN)


[grammar] ~130-~130: Use a hyphen to join words.
Context: ...bout MCP connections and advanced end to end tool execution in the [MCP Features]...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (48)
docs/mcp/code-mode.mdx (1)

8-10: LGTM!

The simplified note clearly communicates the version requirement without unnecessary instructions.

docs/features/governance/mcp-tools.mdx (1)

11-11: LGTM!

The updated link path correctly resolves to the new MCP overview location.

docs/quickstart/gateway/setting-up.mdx (1)

206-208: LGTM!

The updated links now point to specific landing pages rather than section roots, improving navigation clarity for users.

docs/quickstart/gateway/cli-agents.mdx (4)

356-356: LGTM!

Link correctly updated to the new MCP overview path.


367-369: LGTM!

The simplified note is consistent with similar changes throughout the MCP documentation in this PR.


399-399: LGTM!

Link correctly updated to the new MCP gateway-url path.


404-404: LGTM!

Link correctly updated to point to the specific virtual-keys page.

docs/mcp/filtering.mdx (2)

256-256: LGTM!

Link path correctly updated to navigate from /mcp to /features/governance/mcp-tools.


441-443: LGTM!

Card href correctly updated to match the inline link path above.

docs/quickstart/gateway/tool-calling.mdx (1)

163-165: LGTM!

The updated link paths correctly reflect the MCP documentation reorganization from features/mcp/* to mcp/*.

docs/enterprise/adaptive-load-balancing.mdx (1)

7-17: LGTM!

The new Info block effectively directs readers to the comprehensive Provider Routing Guide, and the Overview update clearly establishes the two-level architecture (provider + key selection). The cross-reference to /providers/provider-routing aligns with the new navigation entry in docs.json.

docs/architecture/core/mcp.mdx (1)

140-140: LGTM!

All MCP-related link updates correctly reflect the path reorganization from features/mcp/* to mcp/*. The relative paths from architecture/core/ are properly structured.

Also applies to: 275-275, 921-921, 977-977

tests/governance/config.json (2)

53-65: LGTM!

The client configuration is appropriate for E2E testing:

  • enable_governance: true and enforce_governance_header: true ensure governance features are tested
  • allow_direct_keys: false enforces governance routing
  • allowed_origins: ["*"] is acceptable in a test environment

43-49: Ensure database directory is created and cleaned up between test runs.

The SQLite database path ./data/governance-test.db uses a relative directory that must exist at runtime. While GORM will create the database file, it will not create the parent data/ directory if missing, causing tests to fail. Additionally, GlobalTestData.Cleanup() only removes API resources but not the database file itself, leaving test data persistent across runs and contributing to the SQLite lock contention already noted in the codebase.

Consider:

  • Creating the data/ directory during test initialization (or using a temporary directory)
  • Cleaning up the database file between test runs or using an in-memory SQLite database (:memory:)
  • Adding explicit database cleanup to the test teardown
docs/docs.json (3)

114-114: LGTM!

The new providers/provider-routing navigation entry is correctly placed within the "Providers & Guides" section, making it discoverable alongside other provider documentation.


165-172: LGTM!

The MCP Gateway page paths have been correctly updated from features/mcp/* to mcp/*, creating a cleaner top-level navigation structure for MCP documentation.


481-513: LGTM!

The redirects comprehensively map all old /features/mcp/* paths to the new /mcp/* paths, ensuring backward compatibility for existing links and bookmarks.

docs/mcp/gateway-url.mdx (2)

9-9: LGTM!

The simplified version note is cleaner and maintains the essential version requirement information.


219-219: LGTM!

The link updates correctly point to ../features/governance/mcp-tools - the governance documentation remains under features/governance/ while MCP docs moved to the top-level mcp/ directory.

Also applies to: 361-361

docs/features/governance/routing.mdx (4)

7-13: LGTM!

The Info block effectively directs readers to the comprehensive Provider Routing Guide while clarifying this page's focus on Virtual Key-specific governance routing.


34-42: LGTM!

The Model Validation section clearly explains the behavior for explicit vs. empty allowed_models, and the cross-provider routing note is a critical clarification that will prevent user confusion about automatic model routing between providers.


83-104: LGTM!

The empty allowed_models example effectively demonstrates Model Catalog-based routing behavior and reinforces that cross-provider routing doesn't happen automatically.


313-331: LGTM!

The troubleshooting section for Model Catalog sync failures provides actionable guidance with a realistic log example, helping users diagnose and resolve configuration issues.

plugins/governance/test_utils.go (2)

69-194: LGTM!

The test data builders are well-structured, follow consistent patterns, and correctly initialize the configstore table types.


196-222: LGTM!

The assertion helpers appropriately use t.Helper() for better stack traces and provide clean wrappers around testify assertions.

tests/governance/test_utils.go (4)

15-73: LGTM!

The model cost definitions and calculation function are well-structured for test purposes.


208-299: LGTM!

The request payload types are well-structured with appropriate JSON tags and align with the API contracts.


353-446: LGTM!

The GlobalTestData cleanup implementation is well-designed with appropriate retry logic and correct deletion order (VKs before teams/customers to respect referential integrity).


448-503: LGTM!

The polling utilities WaitForCondition and WaitForAPICondition are well-implemented with progressive backoff and appropriate logging.

tests/governance/ratelimitenforcement_test.go (1)

590-624: LGTM!

The replacement of fixed sleep with WaitForCondition is a good improvement. The polling logic correctly validates the response structure with guarded type assertions before checking for the rate limit ID.

tests/governance/e2e_test.go (4)

437-438: Acceptable use of fixed sleep.

For verifying that a failed request does not consume budget, a fixed sleep is reasonable since there's no positive condition to poll for. The 2-second wait provides adequate time for any erroneous async processing to complete.


1009-1033: LGTM!

Good use of WaitForAPICondition to poll for budget reset verification instead of a fixed sleep. The condition correctly checks for timestamp change.


1236-1298: LGTM!

Excellent improvement replacing fixed sleeps with WaitForCondition polling for both VK creation and deletion verification. The logic correctly handles the inverted condition for deletion (returns true when VK is not found).


1562-1679: LGTM!

Well-structured table-driven test for VK header format validation. The test covers all documented header formats with clear descriptions and uses subtests appropriately.

tests/governance/usagetracking_test.go (4)

212-213: LGTM!

Appropriate wait time for async budget update propagation before verification.


263-276: LGTM!

The VK creation correctly uses the RateLimit field with appropriate values for testing in-memory usage tracking.


327-396: LGTM!

Excellent conversion from fixed sleep to polling-based verification. The logic correctly validates the full chain: VK exists → rate limit ID present → rate limit data accessible → token usage > 0.


464-496: LGTM!

The sleep adjustments are well-documented. The 40-second wait properly accounts for the ticker interval (10s) and reset duration (30s) with buffer for processing.

.github/workflows/scripts/run-governance-e2e-tests.sh (4)

1-2: LGTM on script setup.

The set -euo pipefail is the correct approach for robust shell scripting, ensuring early failure on errors, unset variables, and pipeline failures.


35-74: Cleanup function is well-structured.

The pattern of capturing exit_code at entry, graceful kill with fallback to SIGKILL after a delay, and trap setup for multiple signals is correct. The cleanup preserves logs for debugging while removing the PID file and test database.


128-152: Health check polling is robust.

The implementation correctly handles both timeout scenarios and process death during startup, with helpful diagnostics (log tail) on failure. The 30-second timeout is reasonable for server startup.


156-175: Test execution approach is correct.

Using GOWORK=off ensures the module is tested independently without workspace interference. The 10-minute timeout is appropriate for E2E tests that involve network calls. Capturing the exit code rather than failing immediately allows proper result reporting.

.github/workflows/scripts/release-single-plugin.sh (2)

79-111: Governance plugin testing integration looks correct.

The flow properly:

  1. Runs unit tests with coverage
  2. Conditionally uploads to Codecov
  3. Navigates to repo root for E2E tests
  4. Returns to plugin directory after E2E completion

The existence check for the E2E script (line 101-104) and the chmod +x (line 105) are good defensive practices.


84-94: Codecov upload handling is clean.

Both the upload and skip paths properly clean up temporary files (codecov binary and coverage.txt). The flag naming convention plugin-${PLUGIN_NAME} maintains consistency with other plugins.

Makefile (4)

47-53: Help documentation updated correctly.

The DEBUG and TESTCASE documentation now includes test-governance, maintaining consistency with other test targets.


591-598: Good defensive checks at target entry.

The mutual exclusivity check for PATTERN and TESTCASE (lines 594-598) prevents confusing behavior. The directory existence check (lines 599-602) provides a clear error message if the governance tests directory is missing.


669-697: "Run all governance tests" branch is correctly implemented.

The default path (no TESTCASE or PATTERN) runs all tests with proper report naming (governance-all.xml). The implementation maintains consistency with the other test targets.


734-734: Consider whether test-governance should be included in test-all.

The test-all target runs test-core test-plugins test but excludes test-governance. This is likely intentional since governance E2E tests require OPENAI_API_KEY and a running Bifrost server, making them unsuitable for quick local test runs. However, this exclusion should be documented in the help text or PR description to avoid confusion.

Comment thread tests/governance/test_utils.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates branch from 2329d85 to 2373fc5 Compare January 13, 2026 13:39
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @docs/quickstart/gateway/tool-calling.mdx:
- Line 130: Change the compound adjective "end to end" to the hyphenated form
"end-to-end" in the sentence that reads "Read more about MCP connections and
advanced end to end tool execution..." so it becomes "advanced end-to-end tool
execution"; update the text containing that phrase in the docs where the [MCP
Features] reference appears.
🧹 Nitpick comments (11)
plugins/governance/test_utils.go (1)

39-67: Logged messages discard format arguments, losing debugging context.

The logging methods (Error, Warn, Info, Debug, Fatal) store only the format string and ignore the args. When tests fail, the stored logs won't contain the actual interpolated values, making debugging harder.

♻️ Proposed fix to format the message before storing
 func (ml *MockLogger) Error(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Warn(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.warnings = append(ml.warnings, format)
+	ml.warnings = append(ml.warnings, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Info(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.infos = append(ml.infos, format)
+	ml.infos = append(ml.infos, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Debug(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.debugs = append(ml.debugs, format)
+	ml.debugs = append(ml.debugs, fmt.Sprintf(format, args...))
 }

 func (ml *MockLogger) Fatal(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

Note: You'll need to add "fmt" to the imports.

tests/governance/test_utils.go (2)

197-206: rand.Seed is deprecated since Go 1.20.

The global rand.Seed function is deprecated. In Go 1.20+, the global random number generator is automatically seeded. Calling rand.Seed in a loop (each time generateRandomID is called) can also cause poor randomness if called rapidly.

♻️ Proposed fix to remove deprecated seeding
 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

505-541: ParseDuration duplicated from framework code.

The comment notes this is copied from framework/configstore/tables/utils.go. Consider importing the original function to avoid drift between implementations over time. If there's a reason to keep a copy (e.g., avoiding import cycles), adding a test to ensure parity would be valuable.

tests/governance/e2e_test.go (1)

573-574: Consider replacing fixed 500ms sleeps with polling for consistency.

While some fixed sleeps were replaced with WaitForCondition, a few 500ms sleeps remain for "in-memory store update". These could cause flaky tests in slow CI environments. Consider using the same polling pattern used elsewhere in this file.

Example for line 573:

WaitForCondition(t, func() bool {
    // Check if VK is now inactive in memory store
    resp := MakeRequest(t, APIRequest{
        Method: "GET",
        Path:   "/api/governance/virtual-keys?from_memory=true",
    })
    if resp.StatusCode != 200 {
        return false
    }
    vks := resp.Body["virtual_keys"].(map[string]interface{})
    if vkData, ok := vks[vkValue].(map[string]interface{}); ok {
        if active, ok := vkData["is_active"].(bool); ok {
            return !active
        }
    }
    return false
}, 2*time.Second, "VK deactivated in memory")

Also applies to: 618-619, 1161-1162

tests/governance/configupdatesync_test.go (5)

94-95: Consider using WaitForCondition instead of fixed sleep for consistency.

The test uses WaitForCondition in other places (e.g., TestTeamBudgetUpdateSyncToMemory at lines 689-703) which provides better test reliability. Fixed sleeps can lead to flaky tests if the async operation takes longer than expected, or waste time if it completes faster.

That said, the 2-second wait is reasonable for E2E tests and this is a minor consistency suggestion.

♻️ Optional: Replace fixed sleep with polling
-	// Wait for async PostHook goroutine to complete usage update
-	time.Sleep(2 * time.Second)
+	// Wait for usage to be updated in memory
+	usageUpdated := WaitForCondition(t, func() bool {
+		getRateLimitsResp := MakeRequest(t, APIRequest{
+			Method: "GET",
+			Path:   "/api/governance/rate-limits?from_memory=true",
+		})
+		rateLimitsMap := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
+		if rl, ok := rateLimitsMap[rateLimitID1].(map[string]interface{}); ok {
+			if usage, ok := rl["token_current_usage"].(float64); ok && usage > 0 {
+				return true
+			}
+		}
+		return false
+	}, 3*time.Second, "rate limit usage > 0")
+	if !usageUpdated {
+		t.Skip("Rate limit usage did not update in time")
+	}

319-320: Same suggestion applies: consider WaitForCondition for consistency.

This is another fixed sleep that could benefit from the polling pattern used elsewhere in this file.


487-488: Same suggestion applies: consider WaitForCondition for consistency.

This is another fixed sleep that could benefit from the polling pattern.


888-889: Same suggestion applies: consider WaitForCondition for consistency.

This is another fixed sleep that could benefit from the polling pattern.


1052-1053: Same suggestion applies: consider WaitForCondition for consistency.

This is the last fixed sleep that could benefit from the polling pattern used in TestTeamBudgetUpdateSyncToMemory.

tests/governance/advancedscenarios_test.go (1)

150-151: Consider using WaitForCondition for consistency.

Several tests in this file use fixed time.Sleep(500 * time.Millisecond) for waiting on in-memory updates. While this works, the WaitForCondition polling pattern used in ratelimitenforcement_test.go would be more robust against timing variations.

This is a minor improvement suggestion and not blocking.

.github/workflows/scripts/run-governance-e2e-tests.sh (1)

167-173: Consider adding -p flag for parallel test execution.

The governance tests use t.Parallel() internally. Adding -p flag to control parallelism could help manage resource usage in CI environments.

-if ! GOWORK=off go test -v -timeout 10m ./...; then
+if ! GOWORK=off go test -v -timeout 10m -p 4 ./...; then

This is optional and depends on CI resource constraints.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2329d85 and 2373fc5.

⛔ Files ignored due to path filters (1)
  • tests/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (40)
  • .github/workflows/scripts/release-single-plugin.sh
  • .github/workflows/scripts/run-governance-e2e-tests.sh
  • Makefile
  • core/schemas/plugin_test.go
  • docs/architecture/core/mcp.mdx
  • docs/contributing/setting-up-repo.mdx
  • docs/docs.json
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/features/governance/mcp-tools.mdx
  • docs/features/governance/routing.mdx
  • docs/mcp/agent-mode.mdx
  • docs/mcp/code-mode.mdx
  • docs/mcp/connecting-to-servers.mdx
  • docs/mcp/filtering.mdx
  • docs/mcp/gateway-url.mdx
  • docs/mcp/overview.mdx
  • docs/mcp/tool-execution.mdx
  • docs/mcp/tool-hosting.mdx
  • docs/providers/provider-routing.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/quickstart/gateway/setting-up.mdx
  • docs/quickstart/gateway/tool-calling.mdx
  • docs/quickstart/go-sdk/tool-calling.mdx
  • plugins/governance/fixtures_test.go
  • plugins/governance/test_utils.go
  • tests/governance/advancedscenarios_test.go
  • tests/governance/config.json
  • tests/governance/configupdatesync_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/edgecases_test.go
  • tests/governance/go.mod
  • tests/governance/inmemorysync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/ratelimit_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/vkbudget_test.go
💤 Files with no reviewable changes (1)
  • plugins/governance/fixtures_test.go
🚧 Files skipped from review as they are similar to previous changes (11)
  • tests/governance/go.mod
  • docs/mcp/gateway-url.mdx
  • tests/governance/ratelimit_test.go
  • docs/quickstart/go-sdk/tool-calling.mdx
  • tests/governance/config.json
  • docs/architecture/core/mcp.mdx
  • docs/contributing/setting-up-repo.mdx
  • tests/governance/inmemorysync_test.go
  • docs/mcp/code-mode.mdx
  • docs/mcp/agent-mode.mdx
  • docs/mcp/filtering.mdx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/governance/customerbudget_test.go
  • tests/governance/vkbudget_test.go
  • tests/governance/teambudget_test.go
  • docs/quickstart/gateway/setting-up.mdx
  • docs/providers/provider-routing.mdx
  • Makefile
  • tests/governance/ratelimitenforcement_test.go
  • plugins/governance/test_utils.go
  • tests/governance/edgecases_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/features/governance/routing.mdx
  • docs/features/governance/mcp-tools.mdx
  • tests/governance/usagetracking_test.go
  • core/schemas/plugin_test.go
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/docs.json
  • tests/governance/advancedscenarios_test.go
  • docs/quickstart/gateway/tool-calling.mdx
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • tests/governance/customerbudget_test.go
  • tests/governance/vkbudget_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/ratelimitenforcement_test.go
  • plugins/governance/test_utils.go
  • tests/governance/edgecases_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • core/schemas/plugin_test.go
  • tests/governance/advancedscenarios_test.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • tests/governance/customerbudget_test.go
  • tests/governance/vkbudget_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/ratelimitenforcement_test.go
  • plugins/governance/test_utils.go
  • tests/governance/edgecases_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • core/schemas/plugin_test.go
  • tests/governance/advancedscenarios_test.go
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/quickstart/gateway/setting-up.mdx
  • docs/providers/provider-routing.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/features/governance/routing.mdx
  • docs/features/governance/mcp-tools.mdx
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/quickstart/gateway/tool-calling.mdx
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/test_utils.go
🧬 Code graph analysis (10)
tests/governance/customerbudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/vkbudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/teambudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/ratelimitenforcement_test.go (1)
tests/governance/test_utils.go (4)
  • ExtractIDFromResponse (302-322)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
tests/governance/edgecases_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/configupdatesync_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/providerbudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/test_utils.go (1)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
tests/governance/usagetracking_test.go (2)
tests/governance/test_utils.go (6)
  • ExtractIDFromResponse (302-322)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (209-218)
  • CreateRateLimitRequest (256-261)
  • WaitForCondition (450-473)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
tests/governance/advancedscenarios_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
🪛 LanguageTool
docs/quickstart/gateway/tool-calling.mdx

[grammar] ~130-~130: Use a hyphen to join words.
Context: ...e about MCP connections and advanced end to end tool execution in the [MCP Featur...

(QB_NEW_EN_HYPHEN)


[grammar] ~130-~130: Use a hyphen to join words.
Context: ...bout MCP connections and advanced end to end tool execution in the [MCP Features]...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (58)
docs/quickstart/gateway/setting-up.mdx (1)

206-208: All link updates are valid and reference existing documentation.

The path changes have been verified: docs/mcp/overview.mdx, docs/features/governance/virtual-keys.mdx, and docs/deployment-guides/k8s.mdx all exist and contain substantive content. These are appropriate entry points for quickstart users seeking specific functionality details.

docs/features/governance/mcp-tools.mdx (1)

11-11: LGTM!

The MCP client reference link is correctly updated to point to the new ../../mcp/overview path, consistent with the broader MCP documentation reorganization in this PR.

core/schemas/plugin_test.go (2)

9-153: Well-structured table-driven tests.

The test cases comprehensively cover edge cases (nil map, empty key, key not found) and various case sensitivity scenarios for security-critical headers like authorization, x-bf-vk, and x-api-key.


1-1: LGTM!

The package declaration change from governance to schemas is correct—the caseInsensitiveLookup function is properly defined in core/schemas/plugin.go and the test is appropriately co-located. The test correctly covers case-insensitive lookups for various header and query parameters.

docs/quickstart/gateway/tool-calling.mdx (1)

163-165: LGTM!

The updated links correctly point to the new documentation paths, consistent with the broader documentation reorganization in this PR.

docs/enterprise/adaptive-load-balancing.mdx (1)

7-17: LGTM!

The new Info block provides clear navigation to the comprehensive Provider Routing Guide, and the updated Overview accurately describes the two-level architecture (provider selection + key selection). This improves documentation discoverability.

docs/quickstart/gateway/cli-agents.mdx (4)

356-356: LGTM!

MCP Integration link correctly updated to the new path.


368-368: LGTM!

The simplified version note is cleaner and more maintainable.


399-399: LGTM!

MCP Gateway URL link correctly updated to the new path structure.


404-404: LGTM!

The Governance link now points to the more specific virtual-keys page, which is contextually appropriate for the agent access control discussion in this section.

docs/features/governance/routing.mdx (4)

7-13: Good addition of cross-reference to the comprehensive Provider Routing Guide.

The Info block helps users navigate to more detailed documentation when needed, which improves the user experience.


34-42: Clear documentation of Model Validation behavior.

The explanation of explicit allowed_models vs empty allowed_models (Model Catalog) behavior is well-structured and helps users understand the validation flow.


83-104: Helpful example demonstrating empty allowed_models with Model Catalog.

The JSON example and outcome descriptions clearly illustrate how the Model Catalog determines routing when no explicit models are specified.


313-331: Useful troubleshooting section for Model Catalog Sync Failures.

The diagnostic steps and remediation guidance will help operators debug configuration issues in production.

plugins/governance/test_utils.go (2)

71-194: Well-structured test data builders.

The builder functions follow a clean pattern with good composability (e.g., buildVirtualKeyWithBudget building on buildVirtualKey). The use of pointer fields for optional budget/rate-limit IDs is appropriate.


198-222: Good use of t.Helper() in assertion functions.

This ensures test failure messages point to the actual test line rather than the helper function, improving debuggability.

tests/governance/test_utils.go (2)

448-503: Well-designed polling utilities with progressive backoff.

WaitForCondition and WaitForAPICondition implement sensible progressive backoff (50ms → 500ms and 100ms → 500ms respectively) which balances responsiveness with resource usage. The logging on timeout is helpful for debugging flaky tests.


384-423: Robust retry logic in deleteWithRetry with good edge case handling.

The function correctly handles 200, 204 (success), and 404 (resource already gone) status codes. The progressive backoff and informative logging will help diagnose cleanup issues.

tests/governance/e2e_test.go (3)

1236-1294: Good use of WaitForCondition for verifying VK lifecycle in memory store.

The polling-based approach for verifying VK existence and removal is more robust than fixed sleeps. The 5-second timeout with clear error messages will help diagnose issues.


1562-1679: Comprehensive test for all Virtual Key header formats.

The TestVirtualKeyHeaderFormats test validates all documented header formats (x-bf-vk, Authorization Bearer, x-api-key, x-goog-api-key) which ensures backward compatibility and cross-SDK support. The table-driven approach with subtests is clean and extensible.


43-43: Consistent update to simplified ExtractIDFromResponse signature.

All calls have been updated from ExtractIDFromResponse(t, resp, "id") to ExtractIDFromResponse(t, resp), aligning with the new implementation in tests/governance/test_utils.go that automatically navigates to find the ID.

Also applies to: 63-63, 84-84, 213-213, 235-235, 270-270, 383-383, 526-526, 675-675, 791-791, 888-888, 1098-1098, 1118-1118, 1228-1228, 1367-1367, 1501-1501, 1597-1597

tests/governance/usagetracking_test.go (4)

327-387: Excellent use of WaitForCondition for async usage verification.

The polling-based approach correctly handles the asynchronous nature of PostHook goroutine updates. The 3-second timeout with detailed condition checking (rate_limit_id existence → rate_limits fetch → token_current_usage > 0) provides a robust verification chain.


212-213: Appropriate increase in PostHook wait time.

Increasing from 500ms to 2s provides more margin for async budget updates to complete, which should reduce flakiness in slower CI environments.


464-496: Good adjustment to reset timing with clear documentation.

The comment explains the timing logic (reset ticker runs every 10s, budget resets at 30s). Adding a 40s wait instead of 35s provides buffer for processing delays.


35-35: Consistent update to ExtractIDFromResponse signature across all test functions.

All usages are aligned with the new simplified API.

Also applies to: 157-157, 283-283, 432-432, 558-558

tests/governance/customerbudget_test.go (5)

34-34: LGTM!

The updated ExtractIDFromResponse(t, createCustomerResp) call correctly uses the new simplified signature. The helper will automatically find the ID under the "customer" key in the response body.


57-57: LGTM!

Correctly updated to use the new ExtractIDFromResponse signature. The helper will find the ID under the "virtual_key" key.


188-188: LGTM!

Consistent with the signature update across the test suite.


211-211: LGTM!

The helper will correctly extract the ID from the "team" key in the response.


232-232: LGTM!

Consistent signature update for VK ID extraction.

tests/governance/providerbudget_test.go (1)

49-49: LGTM!

The updated ExtractIDFromResponse(t, createVKResp) call correctly uses the simplified signature. The helper will automatically locate the ID under the "virtual_key" key in the response body.

tests/governance/teambudget_test.go (2)

34-34: LGTM!

Correctly updated to use the new ExtractIDFromResponse signature. The helper will find the ID under the "team" key.


57-57: LGTM!

Consistent signature update for VK ID extraction.

tests/governance/edgecases_test.go (3)

35-35: LGTM!

Correctly updated to use the new ExtractIDFromResponse signature for customer ID extraction.


57-57: LGTM!

Consistent signature update for team ID extraction.


89-89: LGTM!

Consistent signature update for VK ID extraction.

tests/governance/configupdatesync_test.go (9)

40-40: LGTM!

Correctly updated to use the new ExtractIDFromResponse signature.


265-265: LGTM!

Consistent signature update for VK ID extraction.


430-430: LGTM!

Consistent signature update.


614-614: LGTM!

Consistent signature update for team ID extraction.


632-632: LGTM!

Consistent signature update for VK ID extraction.


798-798: LGTM!

Consistent signature update for customer ID extraction.


816-816: LGTM!

Consistent signature update for team ID extraction.


833-833: LGTM!

Consistent signature update for VK ID extraction.


995-995: LGTM!

Consistent signature update for VK ID extraction.

tests/governance/ratelimitenforcement_test.go (2)

37-37: LGTM!

The ExtractIDFromResponse call correctly uses the new two-argument signature, consistent with the updated utility function in test_utils.go.


590-618: Good improvement: polling replaces fixed sleep.

The WaitForCondition polling pattern is more robust than the previous fixed 1-second sleep. The implementation properly:

  • Validates response status codes
  • Guards against nil maps before accessing nested keys
  • Uses a reasonable 3-second timeout with a clear failure message

This approach handles variable in-memory propagation times gracefully.

tests/governance/advancedscenarios_test.go (1)

38-38: LGTM!

All ExtractIDFromResponse calls throughout this file correctly use the updated two-argument signature.

tests/governance/vkbudget_test.go (1)

33-34: LGTM!

The ExtractIDFromResponse call correctly uses the new signature, and the test logic for budget enforcement is sound.

.github/workflows/scripts/run-governance-e2e-tests.sh (2)

1-77: Well-structured E2E test orchestration script.

The script properly:

  • Uses strict bash options (set -euo pipefail)
  • Implements trap-based cleanup for reliable resource management
  • Preserves exit codes through cleanup
  • Handles force-kill fallback after graceful shutdown attempt

Good practice for CI/CD scripting.


132-152: Robust health check with process monitoring.

The polling loop correctly:

  • Checks both service availability and process liveness
  • Provides clear error diagnostics on failure
  • Uses reasonable timeout (30 seconds)
.github/workflows/scripts/release-single-plugin.sh (1)

79-111: LGTM!

The governance plugin release workflow correctly:

  1. Runs unit tests with coverage
  2. Handles optional Codecov upload
  3. Navigates to repo root for E2E test execution
  4. Returns to plugin directory after tests
  5. Provides clear error handling for missing scripts and test failures
docs/providers/provider-routing.mdx (1)

1-640: Comprehensive and well-structured documentation.

This new provider routing documentation effectively:

  • Explains the two routing methods (governance vs. adaptive load balancing)
  • Details the Model Catalog's role and syncing behavior
  • Provides clear request flow visualizations with Mermaid diagrams
  • Covers multiple scenarios with practical examples
  • Links to related documentation pages

The two-level architecture explanation (provider selection + key selection) is particularly helpful for understanding how governance and load balancing interact.

Makefile (2)

591-732: Well-implemented test-governance target.

The new Makefile target:

  • Follows existing patterns from test-core for consistency
  • Properly validates prerequisites (directory existence, mutual exclusivity)
  • Supports all expected modes (TESTCASE, PATTERN, DEBUG, all tests)
  • Generates JUnit XML reports with optional HTML conversion
  • Provides clear failure summaries

The implementation is comprehensive and aligns with the documented usage in PR objectives.


734-734: Consider including test-governance in test-all.

The test-all target currently runs test-core test-plugins test but doesn't include test-governance. If governance tests should be part of the full test suite, consider adding it:

-test-all: test-core test-plugins test ## Run all tests
+test-all: test-core test-plugins test test-governance ## Run all tests

Alternatively, if governance tests require a running Bifrost server and shouldn't be part of unit test runs, this separation is intentional.

docs/docs.json (3)

481-513: LGTM! Comprehensive redirects for MCP path migration.

All 8 MCP pages have corresponding redirects from the old /features/mcp/... paths to the new /mcp/... paths. This ensures existing links and bookmarks continue to work after the reorganization.


114-114: Verify that the docs/providers/provider-routing.mdx file exists and that all navigation entries reference valid documentation files.

The navigation entry "providers/provider-routing" should be confirmed to have a corresponding documentation file. Additionally, ensure all other related documentation files referenced in the docs.json structure exist at their specified paths, especially the MCP-related pages mentioned in the configuration.


165-172: LGTM! MCP documentation path reorganization.

The MCP Gateway pages have been properly moved from features/mcp/... to mcp/..., promoting MCP to a top-level documentation section. This is a clean structural improvement.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates branch from 2373fc5 to 7e64ec7 Compare January 13, 2026 14:10
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/governance/usagetracking_test.go (1)

112-128: Missing assertion for rate limit reset verification.

The test waits 35 seconds for the rate limit to reset but only logs a success message without actually verifying the reset occurred. Consider fetching the rate limit data and asserting that token_current_usage is reset to 0 (similar to how TestUsageTrackingBudgetReset verifies usageAfterReset).

Suggested verification
 	// Wait for more than 30 seconds for the rate limit to reset
 	t.Logf("Waiting 35 seconds for rate limit ticker to reset...")
 	time.Sleep(35 * time.Second)

 	// Get rate limit data after reset
-	getDataResp3 := MakeRequest(t, APIRequest{
+	getRateLimitsResp := MakeRequest(t, APIRequest{
 		Method: "GET",
-		Path:   "/api/governance/virtual-keys?from_memory=true",
+		Path:   "/api/governance/rate-limits?from_memory=true",
 	})

-	if getDataResp3.StatusCode != 200 {
-		t.Fatalf("Failed to get governance data after reset wait: status %d", getDataResp3.StatusCode)
+	if getRateLimitsResp.StatusCode != 200 {
+		t.Fatalf("Failed to get rate limits after reset wait: status %d", getRateLimitsResp.StatusCode)
 	}

-	// Verify rate limit has been reset (usage should be 0 or close to it)
-	t.Logf("Rate limit reset should have occurred after 30s timeout ✓")
+	rateLimitsMap := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
+	rateLimitData := rateLimitsMap[rateLimitID].(map[string]interface{})
+	usageAfterReset, _ := rateLimitData["token_current_usage"].(float64)
+
+	if usageAfterReset > 0 {
+		t.Fatalf("Rate limit not reset after 30s timeout: usage is %d tokens (should be 0)", int(usageAfterReset))
+	}
+
+	t.Logf("Rate limit reset correctly after 30s timeout ✓")
🤖 Fix all issues with AI agents
In @docs/quickstart/gateway/tool-calling.mdx:
- Line 130: Update the phrase "end to end" to the hyphenated adjective form
"end-to-end" in the sentence referencing MCP connections and advanced end-to-end
tool execution (the line containing "Read more about MCP connections and
advanced end to end tool execution in the [MCP Features](../../mcp/overview)
section.").
🧹 Nitpick comments (11)
.github/workflows/scripts/run-governance-e2e-tests.sh (1)

79-94: Consider validating OPENAI_API_KEY format.

The prerequisites check validates that OPENAI_API_KEY is set but doesn't validate its format. A malformed key would cause cryptic test failures later.

💡 Optional: Add basic format validation
 # Check required environment variables
 if [ -z "${OPENAI_API_KEY:-}" ]; then
   echo -e "${RED}❌ OPENAI_API_KEY environment variable is required${NC}"
   echo -e "${YELLOW}Set it with: export OPENAI_API_KEY='sk-...'${NC}"
   exit 1
 fi
+
+# Basic format validation (OpenAI keys typically start with sk-)
+if [[ ! "$OPENAI_API_KEY" =~ ^sk- ]]; then
+  echo -e "${YELLOW}⚠️ Warning: OPENAI_API_KEY doesn't match expected format (sk-...)${NC}"
+fi
.github/workflows/scripts/release-single-plugin.sh (1)

96-111: Directory navigation could be fragile if script fails mid-execution.

After cd ../.. on line 99, if the E2E script fails or errors occur, the script exits but we're no longer in $PLUGIN_DIR. While the set -e ensures the script exits on failure, consider using a subshell or pushd/popd for safer directory handling.

💡 Optional: Use pushd/popd for safer navigation
       # Run E2E tests for governance plugin
       echo ""
       echo "🛡️ Running governance E2E tests..."
-      cd ../..
+      pushd ../.. > /dev/null
       E2E_SCRIPT=".github/workflows/scripts/run-governance-e2e-tests.sh"
       if [ ! -f "$E2E_SCRIPT" ]; then
         echo "❌ Governance E2E test script not found: $E2E_SCRIPT"
         exit 1
       fi
       chmod +x "$E2E_SCRIPT" || true
       if ! bash "$E2E_SCRIPT"; then
         echo "❌ Governance E2E tests failed"
         exit 1
       fi
       echo "✅ Governance E2E tests passed"
-      cd "$PLUGIN_DIR"
+      popd > /dev/null
tests/governance/ratelimit_test.go (2)

759-760: Consider using WaitForCondition for more robust synchronization.

The fixed 2-second sleep works but could be flaky under load or slow CI environments. The WaitForCondition polling pattern (already used in ratelimitenforcement_test.go) provides more robust synchronization with progressive backoff.

♻️ Optional: Replace fixed sleep with polling
-	// Wait for async PostHook goroutine to complete usage update
-	time.Sleep(2 * time.Second)
+	// Wait for async PostHook goroutine to complete usage update
+	WaitForCondition(t, func() bool {
+		getDataResp := MakeRequest(t, APIRequest{
+			Method: "GET",
+			Path:   "/api/governance/rate-limits?from_memory=true",
+		})
+		if getDataResp.StatusCode != 200 {
+			return false
+		}
+		rateLimitsMap, ok := getDataResp.Body["rate_limits"].(map[string]interface{})
+		if !ok {
+			return false
+		}
+		rateLimit, ok := rateLimitsMap[rateLimitID1].(map[string]interface{})
+		if !ok {
+			return false
+		}
+		tokenUsage, _ := rateLimit["token_current_usage"].(float64)
+		return tokenUsage > initialTokenUsage
+	}, 5*time.Second, "token usage updated in memory")

914-915: Same suggestion as above for consistency.

This is another location where WaitForCondition could replace the fixed sleep for more robust async synchronization.

tests/governance/inmemorysync_test.go (2)

506-533: Polling check only verifies HTTP status, not actual resource presence.

The WaitForCondition block checks that the endpoints return 200 status codes but doesn't verify that the specific created resources (vkID, teamID, customerID) are actually present in the response maps. This could lead to false positives if the endpoints return empty maps.

Consider enhancing the check to verify the specific resources exist:

♻️ Suggested enhancement
 allResourcesReady := WaitForCondition(t, func() bool {
   getVKResp := MakeRequest(t, APIRequest{
     Method: "GET",
     Path:   "/api/governance/virtual-keys?from_memory=true",
   })
   if getVKResp.StatusCode != 200 {
     return false
   }
+  vkMap, ok := getVKResp.Body["virtual_keys"].(map[string]interface{})
+  if !ok {
+    return false
+  }

   getTeamsResp := MakeRequest(t, APIRequest{
     Method: "GET",
     Path:   "/api/governance/teams?from_memory=true",
   })
   if getTeamsResp.StatusCode != 200 {
     return false
   }
+  teamsMap, ok := getTeamsResp.Body["teams"].(map[string]interface{})
+  if !ok || teamsMap[teamID] == nil {
+    return false
+  }

   getCustomersResp := MakeRequest(t, APIRequest{
     Method: "GET",
     Path:   "/api/governance/customers?from_memory=true",
   })
-  return getCustomersResp.StatusCode == 200
+  if getCustomersResp.StatusCode != 200 {
+    return false
+  }
+  customersMap, ok := getCustomersResp.Body["customers"].(map[string]interface{})
+  return ok && customersMap[customerID] != nil
 }, 3*time.Second, "all resources available in in-memory store")

51-63: Type assertions without safety checks can panic.

Multiple type assertions (e.g., lines 51, 59, 97, 104, 172, 284, 563-565) use direct casting without the , ok idiom. If the response structure differs unexpectedly, the test will panic rather than fail gracefully with a descriptive message.

This is less critical for tests but could make debugging harder when responses change. Consider using safe assertions for better failure diagnostics.

♻️ Example safe assertion pattern
-virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})
+virtualKeysMap, ok := getDataResp.Body["virtual_keys"].(map[string]interface{})
+if !ok {
+  t.Fatalf("Expected 'virtual_keys' to be a map, got: %T", getDataResp.Body["virtual_keys"])
+}
docs/features/governance/routing.mdx (1)

318-320: Consider using a generic timestamp in example log.

The example log entry uses a specific date (2026-01-13T14:18:53+05:30). While not incorrect, using a placeholder like 2025-XX-XXTXX:XX:XX+XX:XX or a clearly fictional date may age better.

plugins/governance/test_utils.go (1)

39-48: Log messages don't include formatted arguments.

The logging methods store only the format string, not the formatted result. For example, ml.errors = append(ml.errors, format) ignores the args. This may make test assertions harder if you need to verify specific log content.

♻️ Consider formatting the message
 func (ml *MockLogger) Error(format string, args ...interface{}) {
   ml.mu.Lock()
   defer ml.mu.Unlock()
-  ml.errors = append(ml.errors, format)
+  ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

Note: This would require adding "fmt" to imports.

tests/governance/test_utils.go (2)

197-206: rand.Seed is deprecated since Go 1.20.

The global rand.Seed function is deprecated. In Go 1.20+, the global random number generator is automatically seeded. You can simply remove this line, or use a local *rand.Rand instance if deterministic seeding is needed.

Suggested fix
 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

91-195: Consider reducing duplication between MakeRequest and MakeRequestWithCustomHeaders.

These two functions share nearly identical code. MakeRequest could delegate to MakeRequestWithCustomHeaders to eliminate duplication.

Suggested refactor
 // MakeRequest makes an HTTP request to the Bifrost API
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
-	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
-
-	var body io.Reader
-	if req.Body != nil {
-		bodyBytes, err := json.Marshal(req.Body)
-		if err != nil {
-			t.Fatalf("Failed to marshal request body: %v", err)
-		}
-		body = bytes.NewReader(bodyBytes)
-	}
-
-	httpReq, err := http.NewRequest(req.Method, url, body)
-	if err != nil {
-		t.Fatalf("Failed to create HTTP request: %v", err)
-	}
-
-	httpReq.Header.Set("Content-Type", "application/json")
-
-	// Add virtual key header if provided
-	if req.VKHeader != nil {
-		httpReq.Header.Set("x-bf-vk", *req.VKHeader)
-	}
-
-	resp, err := client.Do(httpReq)
-	if err != nil {
-		t.Fatalf("Failed to execute HTTP request: %v", err)
-	}
-	defer resp.Body.Close()
-
-	rawBody, err := io.ReadAll(resp.Body)
-	if err != nil {
-		t.Fatalf("Failed to read response body: %v", err)
-	}
-
-	var responseBody map[string]interface{}
-	if len(rawBody) > 0 {
-		err = json.Unmarshal(rawBody, &responseBody)
-		if err != nil {
-			// If unmarshaling fails, store the raw response
-			responseBody = map[string]interface{}{"raw": string(rawBody)}
-		}
-	}
-
-	return &APIResponse{
-		StatusCode: resp.StatusCode,
-		Body:       responseBody,
-		RawBody:    rawBody,
-	}
+	headers := make(map[string]string)
+	if req.VKHeader != nil {
+		headers["x-bf-vk"] = *req.VKHeader
+	}
+	return MakeRequestWithCustomHeaders(t, req, headers)
 }
tests/governance/usagetracking_test.go (1)

212-213: Consider using WaitForCondition for async update verification.

The fixed 2-second sleep works but could be flaky under load. Given that TestInMemoryUsageUpdateOnRequest was refactored to use WaitForCondition for similar async propagation, consider applying the same pattern here for consistency and reliability.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2373fc5 and 7e64ec7.

⛔ Files ignored due to path filters (1)
  • tests/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (41)
  • .github/workflows/release-pipeline.yml
  • .github/workflows/scripts/release-single-plugin.sh
  • .github/workflows/scripts/run-governance-e2e-tests.sh
  • Makefile
  • core/schemas/plugin_test.go
  • docs/architecture/core/mcp.mdx
  • docs/contributing/setting-up-repo.mdx
  • docs/docs.json
  • docs/enterprise/adaptive-load-balancing.mdx
  • docs/features/governance/mcp-tools.mdx
  • docs/features/governance/routing.mdx
  • docs/mcp/agent-mode.mdx
  • docs/mcp/code-mode.mdx
  • docs/mcp/connecting-to-servers.mdx
  • docs/mcp/filtering.mdx
  • docs/mcp/gateway-url.mdx
  • docs/mcp/overview.mdx
  • docs/mcp/tool-execution.mdx
  • docs/mcp/tool-hosting.mdx
  • docs/providers/provider-routing.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/quickstart/gateway/setting-up.mdx
  • docs/quickstart/gateway/tool-calling.mdx
  • docs/quickstart/go-sdk/tool-calling.mdx
  • plugins/governance/fixtures_test.go
  • plugins/governance/test_utils.go
  • tests/governance/advancedscenarios_test.go
  • tests/governance/config.json
  • tests/governance/configupdatesync_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/e2e_test.go
  • tests/governance/edgecases_test.go
  • tests/governance/go.mod
  • tests/governance/inmemorysync_test.go
  • tests/governance/providerbudget_test.go
  • tests/governance/ratelimit_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/test_utils.go
  • tests/governance/usagetracking_test.go
  • tests/governance/vkbudget_test.go
💤 Files with no reviewable changes (1)
  • plugins/governance/fixtures_test.go
🚧 Files skipped from review as they are similar to previous changes (12)
  • docs/architecture/core/mcp.mdx
  • tests/governance/providerbudget_test.go
  • docs/quickstart/go-sdk/tool-calling.mdx
  • docs/quickstart/gateway/setting-up.mdx
  • tests/governance/advancedscenarios_test.go
  • Makefile
  • docs/enterprise/adaptive-load-balancing.mdx
  • tests/governance/go.mod
  • tests/governance/config.json
  • docs/contributing/setting-up-repo.mdx
  • docs/mcp/agent-mode.mdx
  • tests/governance/edgecases_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/plugin_test.go
  • docs/features/governance/mcp-tools.mdx
  • tests/governance/inmemorysync_test.go
  • docs/features/governance/routing.mdx
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/usagetracking_test.go
  • tests/governance/ratelimit_test.go
  • docs/mcp/filtering.mdx
  • plugins/governance/test_utils.go
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/providers/provider-routing.mdx
  • tests/governance/vkbudget_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
  • docs/mcp/gateway-url.mdx
  • docs/mcp/code-mode.mdx
  • docs/docs.json
  • docs/quickstart/gateway/tool-calling.mdx
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/schemas/plugin_test.go
  • tests/governance/inmemorysync_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/usagetracking_test.go
  • tests/governance/ratelimit_test.go
  • plugins/governance/test_utils.go
  • tests/governance/vkbudget_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/schemas/plugin_test.go
  • tests/governance/inmemorysync_test.go
  • tests/governance/ratelimitenforcement_test.go
  • tests/governance/teambudget_test.go
  • tests/governance/usagetracking_test.go
  • tests/governance/ratelimit_test.go
  • plugins/governance/test_utils.go
  • tests/governance/vkbudget_test.go
  • tests/governance/customerbudget_test.go
  • tests/governance/configupdatesync_test.go
  • tests/governance/e2e_test.go
  • tests/governance/test_utils.go
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/features/governance/mcp-tools.mdx
  • docs/features/governance/routing.mdx
  • docs/mcp/filtering.mdx
  • docs/quickstart/gateway/cli-agents.mdx
  • docs/providers/provider-routing.mdx
  • docs/mcp/gateway-url.mdx
  • docs/mcp/code-mode.mdx
  • docs/quickstart/gateway/tool-calling.mdx
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/test_utils.go
🧬 Code graph analysis (10)
tests/governance/inmemorysync_test.go (1)
tests/governance/test_utils.go (4)
  • ExtractIDFromResponse (302-322)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
tests/governance/ratelimitenforcement_test.go (1)
tests/governance/test_utils.go (4)
  • ExtractIDFromResponse (302-322)
  • WaitForCondition (450-473)
  • MakeRequest (91-141)
  • APIRequest (76-81)
tests/governance/teambudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/usagetracking_test.go (2)
tests/governance/test_utils.go (6)
  • ExtractIDFromResponse (302-322)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (209-218)
  • CreateRateLimitRequest (256-261)
  • WaitForCondition (450-473)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
tests/governance/ratelimit_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
tests/governance/vkbudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/customerbudget_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/configupdatesync_test.go (1)
tests/governance/test_utils.go (1)
  • ExtractIDFromResponse (302-322)
tests/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (107-127)
🪛 LanguageTool
docs/quickstart/gateway/tool-calling.mdx

[grammar] ~130-~130: Use a hyphen to join words.
Context: ...e about MCP connections and advanced end to end tool execution in the [MCP Featur...

(QB_NEW_EN_HYPHEN)


[grammar] ~130-~130: Use a hyphen to join words.
Context: ...bout MCP connections and advanced end to end tool execution in the [MCP Features]...

(QB_NEW_EN_HYPHEN)

🔇 Additional comments (58)
docs/mcp/code-mode.mdx (1)

8-10: LGTM!

The changelog link removal is a reasonable documentation cleanup. The version availability information (v1.4.0-prerelease1 and above) is preserved, which is the essential detail users need. This aligns with the broader documentation cleanup happening across MCP-related docs in this PR.

.github/workflows/scripts/run-governance-e2e-tests.sh (7)

1-32: Script setup and configuration look solid.

The script uses strict mode (set -euo pipefail), has clear configuration variables, and properly resolves the repository root. The color codes and timeouts are well-defined.


34-74: Cleanup function is well-implemented with graceful shutdown.

The cleanup logic handles the PID file properly, attempts graceful termination before force-killing, and preserves logs for debugging. The exit code preservation via local exit_code=$? at the start is correct.


76-77: Trap setup is correct.

Trapping EXIT INT TERM ensures cleanup runs on normal exit, Ctrl+C, and termination signals.


96-112: Build step is well-structured.

Using LOCAL=1 make build to test local plugin code and validating the binary exists are good practices.


114-153: Robust server startup with proper health polling.

The startup logic correctly:

  • Creates the data directory
  • Captures the PID and saves it
  • Polls /health with timeout
  • Checks if the process died during startup
  • Shows logs on failure

168-174: Test exit code handling has a subtle issue.

When set -e is active and a command in an if condition fails, the exit code is captured but set -e won't trigger script exit. However, the current pattern works correctly because the failure is handled in the if block. The issue is that TEST_EXIT_CODE=0 initialization followed by the if ! pattern is slightly redundant.

This works correctly as-is, so just a minor observation.


157-176: Test execution section is clean.

Disabling the Go workspace with GOWORK=off and using a 10-minute timeout are appropriate for E2E tests.

.github/workflows/scripts/release-single-plugin.sh (3)

79-94: Unit test coverage handling for governance plugin looks good.

The coverage file is properly cleaned up in both the Codecov upload path and the skip path.


100-108: E2E script invocation is well-guarded.

The script checks for existence, sets execute permissions, and properly handles failure. Using bash "$E2E_SCRIPT" explicitly (rather than just "$E2E_SCRIPT") is a safe choice.


112-127: Non-governance plugin test path is consistent.

The else branch maintains the original behavior with proper coverage handling.

docs/features/governance/mcp-tools.mdx (1)

11-11: LGTM!

The link path update to ../../mcp/overview is consistent with the broader MCP documentation restructuring across this PR.

core/schemas/plugin_test.go (1)

1-154: LGTM!

Well-structured table-driven test with comprehensive coverage for case-insensitive header lookup. The test cases appropriately cover:

  • Edge cases (nil map, empty key, key not found)
  • Exact matches and various case combinations
  • Real-world header variations (x-bf-vk, authorization, x-api-key, x-goog-api-key)

The package declaration correctly matches the directory structure.

docs/mcp/filtering.mdx (2)

256-256: LGTM!

Link path update correctly points to the new governance mcp-tools location.


441-441: LGTM!

Card href update is consistent with the link update on line 256.

docs/quickstart/gateway/tool-calling.mdx (1)

163-165: LGTM!

Link path updates are consistent with the documentation restructuring across the PR.

docs/mcp/gateway-url.mdx (3)

9-9: LGTM!

Simplifying the note by removing the changelog reference is appropriate.


219-219: LGTM!

Link update is consistent with the documentation restructuring.


361-361: LGTM!

Card href update aligns with the link update on line 219.

tests/governance/ratelimit_test.go (1)

35-36: LGTM!

The updated ExtractIDFromResponse signature aligns with the refactored helper in test_utils.go that auto-discovers the ID from nested response structures.

tests/governance/ratelimitenforcement_test.go (2)

37-38: LGTM!

Signature update aligns with the refactored helper.


590-618: LGTM! Good use of polling pattern.

Replacing the fixed sleep with WaitForCondition provides more robust synchronization:

  • Progressive backoff reduces unnecessary waits
  • Proper nil checks guard against race conditions
  • Clear timeout and failure message aid debugging

This pattern is preferable to fixed sleeps for async operations.

tests/governance/inmemorysync_test.go (3)

33-33: LGTM! Consistent use of simplified ExtractIDFromResponse signature.

The migration to the two-argument form aligns with the updated ExtractIDFromResponse utility that automatically navigates through virtual_key, team, or customer nested fields.

Also applies to: 157-157, 269-269, 380-380, 471-471, 487-487, 503-503


386-408: Good improvement: polling-based verification for VK creation.

Replacing fixed sleeps with WaitForCondition polling improves test reliability by waiting only as long as needed while still respecting a timeout. The status code check and type assertion guard inside the polling function correctly handle transient failures.


424-446: Good improvement: polling-based verification for VK deletion.

The polling approach for deletion verification is well-structured—returning true when the VK is no longer found correctly captures the expected state.

docs/features/governance/routing.mdx (4)

7-13: Good addition: cross-reference to comprehensive Provider Routing Guide.

The Info block clearly directs users to the comprehensive guide while establishing the scope of this page. This improves navigation and reduces content duplication.


34-42: Clear explanation of Model Validation behavior.

The documentation clearly explains the three scenarios (explicit allowed_models, empty allowed_models with Model Catalog, and the warning about sync failures). The note about cross-provider routing is important to prevent user confusion.


83-104: Helpful example demonstrating Model Catalog behavior.

The JSON example with empty allowed_models arrays and the subsequent behavior explanation clarifies how the Model Catalog determines routing eligibility per provider.


313-331: Well-structured troubleshooting section.

The troubleshooting guidance includes the warning message format, explains what it means, and provides actionable remediation steps. This will help users diagnose Model Catalog sync issues effectively.

plugins/governance/test_utils.go (4)

15-67: Well-implemented MockLogger with thread-safe logging.

The MockLogger correctly uses a mutex to protect concurrent access to the log slices, which is important for parallel tests. The implementation satisfies the schemas.Logger interface.


63-67: MockLogger.Fatal doesn't terminate execution.

The Fatal method appends to the errors slice but doesn't panic or terminate, which differs from typical logger Fatal behavior that calls os.Exit(1). This is likely intentional for testability, but tests relying on Fatal to stop execution may continue unexpectedly.

If this is intentional, consider adding a comment to clarify the behavior.


69-194: Clean builder pattern for test data.

The builder functions (buildVirtualKey, buildBudget, buildRateLimit, buildTeam, buildCustomer, buildProviderConfig) provide a clear and maintainable way to construct test fixtures. The variations with suffixes (e.g., WithBudget, WithRateLimit, WithUsage) follow a sensible naming convention.


196-222: Good use of testify wrappers with t.Helper().

The assertion helpers properly call t.Helper() to ensure stack traces point to the calling test rather than the helper function. Wrapping require and assert provides consistent test assertion patterns.

docs/docs.json (3)

114-114: Good addition: provider-routing navigation entry.

The new providers/provider-routing entry in the navigation aligns with the cross-reference added in docs/features/governance/routing.mdx.


165-172: Clean reorganization of MCP documentation paths.

Moving MCP pages from features/mcp/* to mcp/* elevates MCP to a top-level documentation section, which reflects its importance as a feature.


481-513: Complete redirect coverage for MCP path migration.

All 8 MCP pages have corresponding redirects from the old features/mcp/* paths to the new mcp/* paths, ensuring existing links and bookmarks continue to work.

docs/quickstart/gateway/cli-agents.mdx (4)

356-356: Link correctly updated to new MCP path.

The MCP Integration link now points to ../../mcp/overview, consistent with the MCP documentation reorganization.


368-369: Simplified version note.

The note now states the version requirement without the changelog link, making it cleaner while still informing users of the minimum version.


399-399: MCP Gateway URL link updated correctly.

The link points to the new ../../mcp/gateway-url path, consistent with the documentation reorganization.


404-404: Governance link now points to specific virtual-keys page.

The link to ../../features/governance/virtual-keys is more specific than a general governance link, providing users with more relevant content for agent configuration.

docs/providers/provider-routing.mdx (1)

1-641: Comprehensive and well-structured documentation.

The provider routing documentation is thorough, covering governance-based routing, adaptive load balancing, and the model catalog with clear explanations, diagrams, and practical examples. The two-level architecture explanation with mermaid diagrams effectively illustrates the interaction between governance and load balancing.

tests/governance/test_utils.go (1)

448-503: Well-designed polling utilities with progressive backoff.

WaitForCondition and WaitForAPICondition implement sensible progressive backoff (capped at 500ms) for polling async state changes. This is a good pattern for E2E tests dealing with eventual consistency.

tests/governance/configupdatesync_test.go (2)

40-41: Correctly updated to new ExtractIDFromResponse signature.

The call aligns with the updated helper function that auto-discovers the ID from known response structures (virtual_key, team, customer).


94-96: Appropriate wait time for async PostHook completion.

The 2-second wait allows the async PostHook goroutine to complete usage updates. This is a pragmatic approach for E2E tests. For more robust testing, consider using WaitForCondition (as done in TestTeamBudgetUpdateSyncToMemory) to poll until the expected state is reached.

tests/governance/e2e_test.go (2)

1562-1679: Excellent addition of VK header format tests.

TestVirtualKeyHeaderFormats systematically verifies all documented header formats (x-bf-vk, Authorization: Bearer, x-api-key, x-goog-api-key). The table-driven test structure with subtests is clean and maintainable.


1236-1298: Good use of polling for async state verification.

Using WaitForCondition to verify VK existence and removal from the in-memory store (instead of fixed sleeps) makes this test more reliable and avoids flakiness from timing variations.

tests/governance/customerbudget_test.go (1)

34-35: Correctly updated ExtractIDFromResponse usage.

The updated call aligns with the new helper signature that auto-discovers the ID from the response structure.

tests/governance/vkbudget_test.go (1)

33-34: Correctly updated ExtractIDFromResponse usage.

The call aligns with the updated helper function signature.

tests/governance/teambudget_test.go (1)

34-35: Correctly updated ExtractIDFromResponse usage.

Both calls in this file (lines 34 and 57) align with the updated helper function signature.

tests/governance/usagetracking_test.go (8)

35-35: LGTM!

The ExtractIDFromResponse call correctly uses the two-argument signature matching the helper in test_utils.go.


157-157: LGTM!

Correct usage of the updated ExtractIDFromResponse signature.


263-276: Good improvement: Adding rate limit for proper usage tracking.

The test now creates a VK with an explicit rate limit configuration, which is necessary to verify token usage tracking. The 100k token limit and 1-hour reset duration provide ample headroom for testing without hitting limits.


327-396: Excellent refactor: Polling replaces fixed sleep.

The WaitForCondition pattern is a significant reliability improvement over fixed sleeps. The implementation correctly:

  • Handles all type assertions defensively
  • Verifies the rate limit ID exists before checking usage
  • Confirms token_current_usage > 0 as the success condition
  • Uses a reasonable 3-second timeout

432-432: LGTM!

Correct usage of the updated ExtractIDFromResponse signature.


464-466: Fixed sleep for async updates.

Similar to the earlier comment, this could benefit from WaitForCondition for consistency with TestInMemoryUsageUpdateOnRequest. However, given the subsequent budget verification logic and the reset timing test nature, the fixed 3-second wait is acceptable here.


494-496: Good documentation of timing rationale.

The increased wait time from 35s to 40s with the comment explaining the reset ticker cadence (10s interval) and budget reset timing (30s) provides helpful context for future maintainers.


558-558: LGTM!

Correct usage of the updated ExtractIDFromResponse signature.

.github/workflows/release-pipeline.yml (1)

228-232: Remove the unnecessary Node.js setup from plugins-release job.

The Set up Node.js step (lines 228-232) is not needed for the plugins-release job. Unlike the bifrost-http-release job which runs npx @maximhq/bifrost, the plugins-release job only executes release-all-plugins.sh, which uses Go, Docker, jq, and curl—not Node.js. The governance E2E tests also run purely Go tests without any Node.js tooling.

Likely an incorrect or invalid review comment.

Comment thread docs/quickstart/gateway/tool-calling.mdx
Copy link
Copy Markdown
Contributor

akshaydeo commented Jan 13, 2026

Merge activity

  • Jan 13, 2:47 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jan 13, 2:47 PM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo merged commit 201ad5d into main Jan 13, 2026
10 checks passed
@akshaydeo akshaydeo deleted the 01-13-feat_governance_e2e_tests_added_on_ci_and_doc_updates branch January 13, 2026 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants