test: add async job TTL integration tests#2861
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a comprehensive end-to-end async inference test suite (README, go.mod, helpers, fixtures, and many test files) covering submission, polling, VK auth scoping, integration-route async flow, TTL/expiry, validation, and lifecycle across endpoints; and adjusts router/logstore error paths to distinguish internal vs not-found failures. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Router as Router
participant Exec as AsyncExecutor
participant Store as LogStore
Client->>Router: GET /v1/async/.../{id}
Router->>Exec: RetrieveJob(id)
Exec->>Store: FindAsyncJobByID(id)
alt Store returns job
Store-->>Exec: job
Exec-->>Router: job (nil error)
Router-->>Client: 200 OK + job
else Store returns ErrNotFound
Store-->>Exec: ErrNotFound
Exec-->>Router: ErrNotFound
Router-->>Client: 404 Not Found
else Store returns internal error
Store-->>Exec: internal error
Exec-->>Router: wraps as ErrJobInternal
Router-->>Client: 500 Internal Server Error
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Comment |
Confidence Score: 5/5Safe to merge; all findings are minor style/consistency suggestions with no blocking issues. Both production code changes are minimal and correct. The test suite is comprehensive and prior review feedback has been fully addressed (UUID segment-length validation, shortTTL increased to 10s, relative path replacing runtime.Caller, pollIntegration sending transports/bifrost-http/integrations/utils.go line 420 (zero-TTL guard inconsistency) Important Files Changed
Reviews (5): Last reviewed commit: "tests: async inference e2e tests added" | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (3)
tests/async/fixtures_test.go (1)
119-121: Cache generated/read fixture bytes to reduce repeated work across tests.
sampleAudio()andsamplePNG()are re-executed everyendpointCases()call; caching once per process will speed up and stabilize heavy test runs.♻️ Suggested refactor
import ( "bytes" "image" "image/color" "image/png" "os" "path/filepath" "runtime" + "sync" ) + +var ( + sampleAudioOnce sync.Once + sampleAudioData []byte + sampleAudioErr error + + samplePNGOnce sync.Once + samplePNGData []byte + samplePNGErr error +) func sampleAudio() []byte { - _, thisFile, _, _ := runtime.Caller(0) - mediaPath := filepath.Join(filepath.Dir(thisFile), "..", "..", "core", "internal", "llmtests", "scenarios", "media", "sample.mp3") - data, err := os.ReadFile(mediaPath) - if err != nil { - panic("sampleAudio: cannot read " + mediaPath + ": " + err.Error()) - } - return data + sampleAudioOnce.Do(func() { + _, thisFile, _, _ := runtime.Caller(0) + mediaPath := filepath.Join(filepath.Dir(thisFile), "..", "..", "core", "internal", "llmtests", "scenarios", "media", "sample.mp3") + sampleAudioData, sampleAudioErr = os.ReadFile(mediaPath) + }) + if sampleAudioErr != nil { + panic("sampleAudio: " + sampleAudioErr.Error()) + } + return sampleAudioData } func samplePNG() []byte { - img := image.NewRGBA(image.Rect(0, 0, 256, 256)) - white := color.RGBA{R: 255, G: 255, B: 255, A: 255} - for y := range 256 { - for x := range 256 { - img.Set(x, y, white) - } - } - var buf bytes.Buffer - if err := png.Encode(&buf, img); err != nil { - panic("samplePNG: encode failed: " + err.Error()) - } - return buf.Bytes() + samplePNGOnce.Do(func() { + img := image.NewRGBA(image.Rect(0, 0, 256, 256)) + white := color.RGBA{R: 255, G: 255, B: 255, A: 255} + for y := range 256 { + for x := range 256 { + img.Set(x, y, white) + } + } + var buf bytes.Buffer + samplePNGErr = png.Encode(&buf, img) + samplePNGData = buf.Bytes() + }) + if samplePNGErr != nil { + panic("samplePNG: encode failed: " + samplePNGErr.Error()) + } + return samplePNGData }Also applies to: 146-147, 161-162, 196-221
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/async/fixtures_test.go` around lines 119 - 121, The tests repeatedly call sampleAudio() and samplePNG() within endpointCases(), causing redundant work; change sampleAudio() and samplePNG() to populate and return cached byte slices (e.g., package-level vars like cachedSampleAudio, cachedSamplePNG) using sync.Once or an init helper so the heavy decoding/read happens once per process, then update endpointCases() to use these cached variables instead of calling the generator functions repeatedly; ensure function names sampleAudio, samplePNG and endpointCases are updated accordingly and keep thread-safety with sync.Once if tests run in parallel.tests/async/lifecycle_test.go (1)
43-55: Add a default assertion for unexpected status values.Right now, a non-
completed/failedstatus can slip through without an explicit failure in this switch.Suggested patch
switch job.Status { case "completed": if len(job.Result) == 0 || string(job.Result) == "null" { t.Error("completed job must have a non-null result") } case "failed": if len(job.Error) == 0 || string(job.Error) == "null" { t.Error("failed job must have a non-null error") } if job.StatusCode == 0 { t.Error("failed job must carry a non-zero status_code") } + default: + t.Errorf("unexpected terminal status %q", job.Status) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/async/lifecycle_test.go` around lines 43 - 55, The switch handling job.Status currently only checks "completed" and "failed" and can silently ignore unexpected statuses; add a default case in the switch that calls t.Errorf (or t.Fatalf) to fail the test for any unrecognized job.Status and include the unexpected job.Status (and optionally job.ID) in the error message so tests fail loudly when new/invalid statuses appear.tests/async/validation_test.go (1)
188-188: Avoid live external URL in missing-model validation input.Line 188 introduces an unnecessary external dependency for a test that only asserts
400on missing model.Suggested patch
- "image_url": envOr("ASYNC_OCR_IMAGE_URL", "https://pestworldcdn-dcf2a8gbggazaghf.z01.azurefd.net/media/561791/carpenter-ant4.jpg"), + "image_url": envOr("ASYNC_OCR_IMAGE_URL", "https://example.com/test-image.jpg"),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/async/validation_test.go` at line 188, The test injects a live external URL via envOr for the "image_url" field which creates an unnecessary external dependency; change the value in tests/async/validation_test.go to a stable local or neutral placeholder (e.g., a local test asset path like "file://testdata/fixture.jpg" or a non-network placeholder like "https://example.com/image.jpg") instead of envOr("ASYNC_OCR_IMAGE_URL", "..."); update the JSON input where "image_url" is set (refer to the "image_url" key and the envOr call) so the missing-model validation still triggers a 400 without making network requests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/async/auth_test.go`:
- Around line 39-41: The current test polling checks (e.g., the conditional that
tests `if code == http.StatusNotFound`) treat any non-404 response as success
and miss error statuses like 500; create a small test helper named
`assertPollSuccess(t *testing.T, code int, body string)` and replace each `if
code == http.StatusNotFound` (and the similar checks at the other locations)
with calls to `assertPollSuccess` that assert the status code is one of the
expected success codes (http.StatusOK or http.StatusAccepted) and fail
otherwise, including the response body in the error message so regressions are
caught reliably.
In `@tests/async/fixtures_test.go`:
- Around line 185-188: The default OCR test input currently uses an external CDN
URL (the "image_url" value in the "document" map) which causes flaky tests;
change the envOr fallback to point to a local bundled test fixture instead of
the third‑party URL. Replace the string default in the envOr call used for
"image_url" with a path or file:// URL to a repository test asset (or a data URI
/ embedded fixture) and ensure the test harness loads local files accordingly;
update the usage around the "document" map and the envOr helper so tests no
longer rely on the external CDN.
In `@tests/async/helpers_test.go`:
- Around line 99-100: The code currently discards the error returned by
http.NewRequest when building requests (e.g., the call that assigns to req, _),
which masks problems like malformed cfg.BaseURL; change these to capture the
error (req, err := http.NewRequest(...)) and immediately handle it (e.g., fail
the test with t.Fatalf("http.NewRequest failed: %v", err) or return the error)
before calling req.Header.Set; apply the same fix to the other instances
mentioned (the calls around lines creating requests at the other sites
referenced) so all http.NewRequest uses (variables req and err, using path, raw,
cfg.BaseURL) check and handle err instead of discarding it.
- Around line 224-229: Replace uses of http.DefaultClient.Do and http.Get with
an explicit http.Client that sets a reasonable Timeout (e.g., create client :=
&http.Client{Timeout: <duration>}) and call client.Do(req) so requests can’t
hang; also ensure every response body is closed after use (close resp.Body in
TestMain’s health-check and any helper that calls io.ReadAll on resp.Body)
rather than leaking it—look for the resp variable and the TestMain and helper
functions in tests/async/helpers_test.go to apply these changes.
In `@tests/async/integration_route_test.go`:
- Around line 124-126: Replace the loose "if code < 400" checks with a focused
helper that asserts the response is a 4xx client error: add a test helper
function assert4xx(t *testing.T, code int, body string, context string) that
calls t.Helper() and fails unless 400 <= code && code < 500, including the
response body in the failure message, then update the three places currently
using "if code < 400" (the checks around lines where the integration route
non-existent-job assertions occur) to call assert4xx(t, code, body,
"non-existent job on integration route") (or an appropriate context string) so
tests fail on 5xx responses instead of treating them as acceptable.
In `@tests/async/README.md`:
- Line 45: The table row for ASYNC_OCR_IMAGE_URL has three pipe-separated cells
but the model-overrides table expects two columns; update the row in
tests/async/README.md by combining the current second and third cells into a
single description cell (or remove the extra cell) so the row has exactly two
columns—identify and edit the row containing the `ASYNC_OCR_IMAGE_URL` entry to
correct the pipe count and ensure the description includes the CDN URL and
purpose in one cell.
In `@tests/async/submit_test.go`:
- Around line 60-63: The test should assert the submit response status is 202
before proceeding to polling: after calling submitCase(t, ec, mode.headers) and
before checking submitted.ID, add a check that submitted indicates HTTP 202 (use
submitted.StatusCode == http.StatusAccepted or submitted.Status == "202
Accepted"/"202" depending on the response type) and call t.Fatalf with a clear
message if it is not 202; keep the existing ID check afterwards. This targets
the submitCase result named submitted to fail fast on submit errors.
In `@tests/async/ttl_test.go`:
- Around line 121-127: Replace the fixed Sleep with deadline-based polling: loop
until a deadline (e.g., start := time.Now(); timeout :=
time.Duration(shortTTL+10)*time.Second or a fixed extra buffer) and repeatedly
call pollOnce(t, pollPath, mode.headers) at short intervals (e.g., 100-200ms)
until it returns http.StatusNotFound or the deadline passes; if the deadline
passes without a 404, fail the test with t.Fatalf/t.Errorf. Use the existing
symbols pollOnce, pollPath, mode.headers, shortTTL and http.StatusNotFound to
implement the timeout-based poll and remove the hard sleep.
In `@tests/async/validation_test.go`:
- Around line 294-296: The test currently ignores the error from http.NewRequest
and calls http.DefaultClient.Do with no timeout, risking CI hangs; update the
request to use a context with timeout (context.WithTimeout) and either
http.NewRequestWithContext or attach the context to the request, check and
handle the error returned by http.NewRequest, defer cancel the context, and use
the context-aware request when calling the HTTP client (or set a client timeout)
so the test fails fast on transport stalls; reference the existing
http.NewRequest/http.DefaultClient.Do calls and cfg.BaseURL in your changes.
---
Nitpick comments:
In `@tests/async/fixtures_test.go`:
- Around line 119-121: The tests repeatedly call sampleAudio() and samplePNG()
within endpointCases(), causing redundant work; change sampleAudio() and
samplePNG() to populate and return cached byte slices (e.g., package-level vars
like cachedSampleAudio, cachedSamplePNG) using sync.Once or an init helper so
the heavy decoding/read happens once per process, then update endpointCases() to
use these cached variables instead of calling the generator functions
repeatedly; ensure function names sampleAudio, samplePNG and endpointCases are
updated accordingly and keep thread-safety with sync.Once if tests run in
parallel.
In `@tests/async/lifecycle_test.go`:
- Around line 43-55: The switch handling job.Status currently only checks
"completed" and "failed" and can silently ignore unexpected statuses; add a
default case in the switch that calls t.Errorf (or t.Fatalf) to fail the test
for any unrecognized job.Status and include the unexpected job.Status (and
optionally job.ID) in the error message so tests fail loudly when new/invalid
statuses appear.
In `@tests/async/validation_test.go`:
- Line 188: The test injects a live external URL via envOr for the "image_url"
field which creates an unnecessary external dependency; change the value in
tests/async/validation_test.go to a stable local or neutral placeholder (e.g., a
local test asset path like "file://testdata/fixture.jpg" or a non-network
placeholder like "https://example.com/image.jpg") instead of
envOr("ASYNC_OCR_IMAGE_URL", "..."); update the JSON input where "image_url" is
set (refer to the "image_url" key and the envOr call) so the missing-model
validation still triggers a 400 without making network requests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a0aa133d-eb34-4cee-9e37-6077869acc08
📒 Files selected for processing (10)
tests/async/README.mdtests/async/auth_test.gotests/async/fixtures_test.gotests/async/go.modtests/async/helpers_test.gotests/async/integration_route_test.gotests/async/lifecycle_test.gotests/async/submit_test.gotests/async/ttl_test.gotests/async/validation_test.go
7f7228a to
75d28ed
Compare
f311b40 to
8b96adb
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/async/helpers_test.go (1)
46-55: Treat non-positive poll durations as invalid env input.
0or negative values can cause immediate timeout or aggressive polling loops. Falling back to defaults improves resilience for misconfigured CI env vars.♻️ Proposed hardening
func parseDuration(s string, fallback time.Duration) time.Duration { if s == "" { return fallback } d, err := time.ParseDuration(s) - if err != nil { + if err != nil || d <= 0 { return fallback } return d }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/async/helpers_test.go` around lines 46 - 55, The parseDuration function currently accepts zero or negative durations as valid; change its validation so that parseDuration(s string, fallback time.Duration) treats parsed durations <= 0 as invalid and returns fallback instead. Update the logic in parseDuration to parse the string with time.ParseDuration, then if err != nil OR d <= 0 return fallback; otherwise return d, referencing the parseDuration function to locate the change and keep the existing string-empty guard.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/async/validation_test.go`:
- Around line 303-305: The test currently treats any non-404/405 as passing by
only failing on 200/202, which masks server errors; change the assertion in
tests/async/validation_test.go around the POST poll path check to explicitly
assert that resp.StatusCode is either http.StatusNotFound (404) or
http.StatusMethodNotAllowed (405) and fail the test for any other status,
updating the t.Errorf message to report the unexpected status code (use
resp.StatusCode) so only the expected router outcomes pass; locate the check
using resp.StatusCode and t.Errorf to make the change.
---
Nitpick comments:
In `@tests/async/helpers_test.go`:
- Around line 46-55: The parseDuration function currently accepts zero or
negative durations as valid; change its validation so that parseDuration(s
string, fallback time.Duration) treats parsed durations <= 0 as invalid and
returns fallback instead. Update the logic in parseDuration to parse the string
with time.ParseDuration, then if err != nil OR d <= 0 return fallback; otherwise
return d, referencing the parseDuration function to locate the change and keep
the existing string-empty guard.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a50cb4c4-10c5-418f-87fe-37b28acfe523
📒 Files selected for processing (10)
tests/async/README.mdtests/async/auth_test.gotests/async/fixtures_test.gotests/async/go.modtests/async/helpers_test.gotests/async/integration_route_test.gotests/async/lifecycle_test.gotests/async/submit_test.gotests/async/ttl_test.gotests/async/validation_test.go
✅ Files skipped from review due to trivial changes (4)
- tests/async/README.md
- tests/async/go.mod
- tests/async/integration_route_test.go
- tests/async/lifecycle_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/async/fixtures_test.go
- tests/async/submit_test.go
- tests/async/auth_test.go
- tests/async/ttl_test.go
|
|
8b96adb to
b24af34
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/async/helpers_test.go`:
- Line 147: The multipart writer Close() error is ignored at the end of the
test; capture and handle it to avoid silent malformed bodies by checking the
return value of w.Close() and failing the test on error (e.g. replace the bare
w.Close() with if err := w.Close(); err != nil { t.Fatalf("closing multipart
writer: %v", err) }). Ensure you reference the multipart writer variable w and
the test handler t so any failure aborts the test.
- Line 244: The test currently ignores errors from io.ReadAll(resp.Body) which
can hide transport/truncation issues; update the read to capture the error
(body, err := io.ReadAll(resp.Body)) and fail fast on error (e.g. if err != nil
{ t.Fatalf("read response body: %v", err) } or use your test helper/assertion
library) so any body-read failure in the test using resp is reported
immediately; locate the read usage of io.ReadAll(resp.Body) in
tests/async/helpers_test.go to apply this change.
In `@transports/bifrost-http/integrations/router.go`:
- Around line 1539-1543: The current handler always maps any error from
executor.RetrieveJob to 404; change the logic in the router where
executor.RetrieveJob(bifrostCtx, jobID, vkValue, config.GetHTTPRequestType(ctx))
is called so it examines the returned error and only creates a NotFound bifrost
error when the error indicates "job not found or expired" (or matches the
executor's not-found sentinel/message); for any other error produce an internal
server error (use newBifrostErrorWithCode(..., "failed to retrieve async job",
fasthttp.StatusInternalServerError)) and pass that to g.sendError instead.
Ensure you reference the same symbols: executor.RetrieveJob, g.sendError,
newBifrostErrorWithCode.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bffb444d-5002-4f0d-a0c4-5255819c7eb0
📒 Files selected for processing (12)
tests/async/README.mdtests/async/auth_test.gotests/async/fixtures_test.gotests/async/go.modtests/async/helpers_test.gotests/async/integration_route_test.gotests/async/lifecycle_test.gotests/async/submit_test.gotests/async/ttl_test.gotests/async/validation_test.gotransports/bifrost-http/integrations/router.gotransports/bifrost-http/integrations/utils.go
✅ Files skipped from review due to trivial changes (4)
- tests/async/go.mod
- tests/async/ttl_test.go
- tests/async/README.md
- tests/async/auth_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/async/submit_test.go
- tests/async/fixtures_test.go
- tests/async/lifecycle_test.go
- tests/async/integration_route_test.go
b24af34 to
f4e589d
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/async/fixtures_test.go (1)
184-186:⚠️ Potential issue | 🟠 MajorAvoid a third-party URL as the default OCR fixture.
This makes the async suite flaky for reasons unrelated to Bifrost: CDN outages, network policy differences, and remote content drift. Please switch the default to a bundled/local asset or require
ASYNC_OCR_IMAGE_URLexplicitly. The same fallback is duplicated intests/async/validation_test.go, so centralizing it in one helper would remove both copies.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/async/fixtures_test.go` around lines 184 - 186, The default OCR fixture uses an external CDN URL via envOr("ASYNC_OCR_IMAGE_URL", "...") which makes tests flaky; replace the fallback with a bundled/local test asset (e.g., a file under tests/testdata or fixtures) or require ASYNC_OCR_IMAGE_URL to be set (no remote default), and remove the duplicate fallback in tests/async/validation_test.go by centralizing the logic into a single helper (e.g., create a new helper function GetAsyncOCRImageURL or Update envOr to accept a sentinel to force missing) and update fixtures_test.go and validation_test.go to call that helper instead of inlining the same envOr(...) default.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@transports/bifrost-http/integrations/router.go`:
- Around line 1542-1547: The 404 branch is leaking the original error (which may
reveal VK/type mismatch details) by passing err into newBifrostErrorWithCode;
update the else branch handling the RetrieveJob result so it does not forward
the original err to sendError/newBifrostErrorWithCode — instead pass a sanitized
error (e.g., a freshly created error with the message "job not found or expired"
or nil) so callers only see the generic 404 message; modify the else branch
where sendError is called (involving RetrieveJob, sendError,
newBifrostErrorWithCode, and logstore.ErrJobInternal) to use the masked error.
---
Duplicate comments:
In `@tests/async/fixtures_test.go`:
- Around line 184-186: The default OCR fixture uses an external CDN URL via
envOr("ASYNC_OCR_IMAGE_URL", "...") which makes tests flaky; replace the
fallback with a bundled/local test asset (e.g., a file under tests/testdata or
fixtures) or require ASYNC_OCR_IMAGE_URL to be set (no remote default), and
remove the duplicate fallback in tests/async/validation_test.go by centralizing
the logic into a single helper (e.g., create a new helper function
GetAsyncOCRImageURL or Update envOr to accept a sentinel to force missing) and
update fixtures_test.go and validation_test.go to call that helper instead of
inlining the same envOr(...) default.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 742ae991-5b0d-4730-88ab-0821fbaf48c8
📒 Files selected for processing (14)
framework/logstore/asyncjob.goframework/logstore/errors.gotests/async/README.mdtests/async/auth_test.gotests/async/fixtures_test.gotests/async/go.modtests/async/helpers_test.gotests/async/integration_route_test.gotests/async/lifecycle_test.gotests/async/submit_test.gotests/async/ttl_test.gotests/async/validation_test.gotransports/bifrost-http/integrations/router.gotransports/bifrost-http/integrations/utils.go
✅ Files skipped from review due to trivial changes (6)
- transports/bifrost-http/integrations/utils.go
- tests/async/go.mod
- framework/logstore/errors.go
- tests/async/README.md
- tests/async/submit_test.go
- tests/async/auth_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/async/lifecycle_test.go
- tests/async/ttl_test.go
75d28ed to
bbec714
Compare
f4e589d to
6d6ca5e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
framework/logstore/asyncjob.go (1)
57-64:⚠️ Potential issue | 🟠 MajorPropagate
ErrJobInternalto the regular async poll handler too.This new wrapper only fixes callers that explicitly check
errors.Is(err, logstore.ErrJobInternal).transports/bifrost-http/handlers/asyncinference.go:508-512still turns everyRetrieveJoberror into a404and sendserr.Error()to the client, so store failures on the regular/v1/async/*/{id}path remain misclassified and now expose the wrapped backend message.Suggested follow-up in the async handler
+import "errors" + job, err := h.executor.RetrieveJob(bifrostCtx, jobID, getVirtualKeyFromContext(bifrostCtx), operationType) if err != nil { - SendError(ctx, fasthttp.StatusNotFound, err.Error()) + if errors.Is(err, logstore.ErrJobInternal) { + SendError(ctx, fasthttp.StatusInternalServerError, "failed to retrieve async job") + } else { + SendError(ctx, fasthttp.StatusNotFound, "job not found or expired") + } return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@framework/logstore/asyncjob.go` around lines 57 - 64, RetrieveJob currently wraps store errors as ErrJobInternal but the async HTTP handler that calls RetrieveJob still maps every error to 404 and returns err.Error() to clients; update that handler to check errors.Is(err, logstore.ErrJobInternal) (or ErrJobInternal) and return a 500/appropriate server error instead of 404, and avoid echoing the wrapped backend error to the client (return a generic "internal server error retrieving job" message). Concretely: in the async inference HTTP handler where RetrieveJob is called, replace the unconditional 404 branch with an errors.Is check for ErrJobInternal to produce a 500, keep the not-found branch (errors.Is(err, ErrNotFound)) as 404, and change the response body to a safe, non-sensitive error string rather than err.Error().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/async/helpers_test.go`:
- Around line 46-55: The parseDuration helper currently returns parsed durations
even if they are zero or negative; update parseDuration(s string, fallback
time.Duration) so that after calling time.ParseDuration it also checks if d <= 0
and returns fallback in that case (in addition to returning fallback for empty
string and parse errors), ensuring parseDuration always returns a positive
duration for use by pollUntilTerminal and timeout logic.
---
Outside diff comments:
In `@framework/logstore/asyncjob.go`:
- Around line 57-64: RetrieveJob currently wraps store errors as ErrJobInternal
but the async HTTP handler that calls RetrieveJob still maps every error to 404
and returns err.Error() to clients; update that handler to check errors.Is(err,
logstore.ErrJobInternal) (or ErrJobInternal) and return a 500/appropriate server
error instead of 404, and avoid echoing the wrapped backend error to the client
(return a generic "internal server error retrieving job" message). Concretely:
in the async inference HTTP handler where RetrieveJob is called, replace the
unconditional 404 branch with an errors.Is check for ErrJobInternal to produce a
500, keep the not-found branch (errors.Is(err, ErrNotFound)) as 404, and change
the response body to a safe, non-sensitive error string rather than err.Error().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 97b45962-1269-4460-a87b-dbb81b9901fb
📒 Files selected for processing (14)
framework/logstore/asyncjob.goframework/logstore/errors.gotests/async/README.mdtests/async/auth_test.gotests/async/fixtures_test.gotests/async/go.modtests/async/helpers_test.gotests/async/integration_route_test.gotests/async/lifecycle_test.gotests/async/submit_test.gotests/async/ttl_test.gotests/async/validation_test.gotransports/bifrost-http/integrations/router.gotransports/bifrost-http/integrations/utils.go
✅ Files skipped from review due to trivial changes (4)
- transports/bifrost-http/integrations/utils.go
- tests/async/go.mod
- tests/async/README.md
- tests/async/fixtures_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- framework/logstore/errors.go
- transports/bifrost-http/integrations/router.go
- tests/async/submit_test.go
- tests/async/ttl_test.go
- tests/async/lifecycle_test.go
Merge activity
|
The base branch was changed.
## Summary Adds a dedicated end-to-end test suite for Bifrost's async inference feature, covering all `/v1/async/*` endpoints and the integration route header mechanism (`x-bf-async` / `x-bf-async-id`). ## Changes - Added `tests/async/` as a standalone Go module with six test files: - `submit_test.go` — verifies all 11 async endpoints return 202 with a well-formed job envelope and that immediate polling returns the correct status code - `lifecycle_test.go` — polls jobs to terminal state, validates response shape for completed jobs, confirms 404 for non-existent jobs and wrong-endpoint-type polling - `auth_test.go` — covers VK scoping (same key succeeds, different key returns 404, missing key on VK-scoped job returns 404, public jobs are accessible without a VK), and all three auth header formats (`Authorization: Bearer`, `x-api-key`, `x-goog-api-key`) - `ttl_test.go` — validates default TTL (~3600s), custom TTL via `x-bf-async-job-result-ttl`, fallback to default for invalid/zero/negative values, and 404 after TTL expiry - `validation_test.go` — confirms rejection of `stream=true`, `stream_format=sse` on speech, malformed JSON, missing required fields, missing multipart files, and wrong HTTP method - `integration_route_test.go` — tests `x-bf-async` submit and `x-bf-async-id` retrieve on the `/openai/v1/responses` integration route, including stream rejection and VK isolation - Added `fixtures_test.go` with endpoint fixtures for all 11 async endpoints, model overrides via `ASYNC_*_MODEL` environment variables, and generated test media (sample PNG, sample MP3 reference) - Added `helpers_test.go` with shared HTTP helpers, config sourced from environment variables, and a `TestMain` health-check gate that skips the suite when the gateway is unreachable - Added `README.md` documenting how to run the tests, all environment variables, model override keys, and per-file coverage descriptions - Tests run in two modes — global (no VK) and `with_vk` — wherever VK presence is not the focus of the test ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # From the repo root — requires a running Bifrost gateway on localhost:8080 cd tests/async go test ./... -timeout 300s # With virtual keys to enable VK-scoped auth and cross-VK isolation tests BIFROST_VK=sk-bf-... BIFROST_ALT_VK=sk-bf-... go test ./... -timeout 300s # Override the gateway URL or individual models BIFROST_BASE_URL=http://my-gateway:8080 \ ASYNC_CHAT_COMPLETION_MODEL=openai/gpt-4o \ go test ./... -timeout 300s ``` **Environment variables:** | Variable | Default | Description | |---|---|---| | `BIFROST_BASE_URL` | `http://localhost:8080` | Bifrost gateway URL | | `BIFROST_VK` | — | Primary virtual key; enables VK-mode tests | | `BIFROST_ALT_VK` | — | Second virtual key; enables cross-VK isolation tests | | `BIFROST_POLL_TIMEOUT` | `30s` | Max time to wait for a job to reach terminal state | | `BIFROST_POLL_INTERVAL` | `500ms` | Polling cadence | | `BIFROST_INTEGRATION_PATH` | `/openai/v1/responses` | Override integration route path | | `BIFROST_INTEGRATION_MODEL` | `openai/gpt-4o-mini` | Override model for integration route tests | | `ASYNC_*_MODEL` | see README | Per-endpoint model overrides | If the gateway is unreachable at startup, the entire suite exits cleanly with a skip message rather than failing. ## Screenshots/Recordings N/A ## Breaking changes - [x] No ## Related issues N/A ## Security considerations Virtual keys passed via `BIFROST_VK` and `BIFROST_ALT_VK` are read from environment variables and never logged or written to disk. Cross-VK isolation tests confirm that job IDs cannot be accessed across VK boundaries. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

Summary
Adds a dedicated end-to-end test suite for Bifrost's async inference feature, covering all
/v1/async/*endpoints and the integration route header mechanism (x-bf-async/x-bf-async-id).Changes
tests/async/as a standalone Go module with six test files:submit_test.go— verifies all 11 async endpoints return 202 with a well-formed job envelope and that immediate polling returns the correct status codelifecycle_test.go— polls jobs to terminal state, validates response shape for completed jobs, confirms 404 for non-existent jobs and wrong-endpoint-type pollingauth_test.go— covers VK scoping (same key succeeds, different key returns 404, missing key on VK-scoped job returns 404, public jobs are accessible without a VK), and all three auth header formats (Authorization: Bearer,x-api-key,x-goog-api-key)ttl_test.go— validates default TTL (~3600s), custom TTL viax-bf-async-job-result-ttl, fallback to default for invalid/zero/negative values, and 404 after TTL expiryvalidation_test.go— confirms rejection ofstream=true,stream_format=sseon speech, malformed JSON, missing required fields, missing multipart files, and wrong HTTP methodintegration_route_test.go— testsx-bf-asyncsubmit andx-bf-async-idretrieve on the/openai/v1/responsesintegration route, including stream rejection and VK isolationfixtures_test.gowith endpoint fixtures for all 11 async endpoints, model overrides viaASYNC_*_MODELenvironment variables, and generated test media (sample PNG, sample MP3 reference)helpers_test.gowith shared HTTP helpers, config sourced from environment variables, and aTestMainhealth-check gate that skips the suite when the gateway is unreachableREADME.mddocumenting how to run the tests, all environment variables, model override keys, and per-file coverage descriptionswith_vk— wherever VK presence is not the focus of the testType of change
Affected areas
How to test
Environment variables:
BIFROST_BASE_URLhttp://localhost:8080BIFROST_VKBIFROST_ALT_VKBIFROST_POLL_TIMEOUT30sBIFROST_POLL_INTERVAL500msBIFROST_INTEGRATION_PATH/openai/v1/responsesBIFROST_INTEGRATION_MODELopenai/gpt-4o-miniASYNC_*_MODELIf the gateway is unreachable at startup, the entire suite exits cleanly with a skip message rather than failing.
Screenshots/Recordings
N/A
Breaking changes
Related issues
N/A
Security considerations
Virtual keys passed via
BIFROST_VKandBIFROST_ALT_VKare read from environment variables and never logged or written to disk. Cross-VK isolation tests confirm that job IDs cannot be accessed across VK boundaries.Checklist
docs/contributing/README.mdand followed the guidelines