log previous assisted-chat-eval-test container's logs #140
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: giladravid16. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Walkthrough
CI test script now selects pods in the specified namespace and logs previous container logs on restart. Prow entrypoint extracts a raw OAuth token and re-enables eval.py invocation. Template configuration switches the llama-stack inference provider to Vertex AI and adds three Vertex AI Gemini model entries.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant CI as scripts/ci_test.sh
participant OC as oc / OpenShift
participant Pod as test pod
CI->>OC: List pods (-n NAMESPACE, label=assisted-chat-eval-test)
OC-->>CI: Pod list
loop monitor pod
CI->>OC: Check pod status / restarts
alt restarts > 0
Note right of CI #e6f7ff: Print restart start/end markers
CI->>OC: oc logs -p -n NAMESPACE POD_NAME
OC-->>CI: Previous container logs
CI->>OC: Get container exit code
OC-->>CI: Exit code
else no restarts
CI->>OC: Get container exit code
OC-->>CI: Exit code
end
end
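In shell terms, the restart branch in the diagram above boils down to roughly the following sketch (hedged: the label selector, variable names, and jsonpath expressions are illustrative rather than the literal contents of scripts/ci_test.sh):
# Illustrative sketch of the restart-handling flow, not the actual script
POD_NAME=$(oc get pods -n "$NAMESPACE" -l app=assisted-chat-eval-test -o jsonpath='{.items[0].metadata.name}')
RESTARTS=$(oc get pod "$POD_NAME" -n "$NAMESPACE" -o jsonpath='{.status.containerStatuses[0].restartCount}')
if [ "${RESTARTS:-0}" -gt 0 ]; then
  echo "---------------- Previous container logs ----------------"
  oc logs -p -n "$NAMESPACE" "$POD_NAME" || true
  echo "---------------- End of previous container logs ----------"
fi
# Exit-code retrieval differs for restarted vs. non-restarted containers (lastState vs. state)
EXIT_CODE=$(oc get pod "$POD_NAME" -n "$NAMESPACE" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')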
sequenceDiagram
autonumber
participant EP as test/prow/entrypoint.sh
participant Auth as OAuth server
participant Eval as eval.py
EP->>Auth: Request token (JSON)
Auth-->>EP: { "access_token": "..." }
Note right of EP #e6f7ff: Use `jq -r` to extract raw access_token
EP->>Eval: Run `eval.py --agent_endpoint ... --endpoint_type ...`
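The jq -r detail in the note above is the whole point of the fix: without -r, jq prints the JSON string with its surrounding quotes, which then end up inside the token variable. A quick illustration in any shell with jq installed:
echo '{"access_token":"abc123"}' | jq '.access_token'     # prints "abc123" (quoted, breaks auth headers)
echo '{"access_token":"abc123"}' | jq -r '.access_token'  # prints abc123 (raw string, usable as a bearer token)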
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Hi @giladravid16. Thanks for your PR. I'm waiting for a rh-ecosystem-edge member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Actionable comments posted: 2
🧹 Nitpick comments (1)
scripts/ci_test.sh (1)
27-29: Nice: previous container logs are now explicitly captured with clear markers
Including oc logs -p in the restart path matches the PR goal to fetch prior-run logs and will greatly help post-mortems. To maximize context, consider printing both previous and current logs when available:
oc logs -p -n "$NAMESPACE" "$POD_NAME" || true
echo "-------------------------- Current logs ---------------------------"
oc logs -n "$NAMESPACE" "$POD_NAME" || true
(Use || true to avoid masking exit-code retrieval if logs are temporarily unavailable.)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
scripts/ci_test.sh (2 hunks)
test/prow/entrypoint.sh (1 hunks)
cd test/evals
-#python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
💡 Verification agent
🧩 Analysis chain
Re-enable eval: prefer python3/-u and ensure token is raw (jq -r) to avoid auth breakage
- Using python may fail on images that only ship python3. Also, -u gives unbuffered logs in CI.
- Critical: the token currently saved via jq '.access_token' includes quotes. Many clients expect a raw token; enabling eval.py now will likely read a quoted token and fail auth.
Apply this diff for the eval invocation:
-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
Additionally, update the token extraction earlier in the script to write the raw access token (no quotes):
Verification checklist:
- Confirm AGENT_URL and AGENT_PORT are set in the job env (set -o nounset will fail fast if missing).
- Confirm python3 exists in the image; otherwise switch to /usr/bin/env python3.
🏁 Script executed:
# Replace the existing assignment with the raw-output variant
OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials" \
-d "client_id=$CLIENT_ID" \
-d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')Length of output: 258
🏁 Script executed:
#!/bin/bash
set -e
echo "🔍 Checking OCM_TOKEN assignment in test/prow/entrypoint.sh..."
rg -n --context 3 "OCM_TOKEN" test/prow/entrypoint.sh || true
echo "🔍 Checking python invocation in test/prow/entrypoint.sh..."
rg -n --context 3 "python " test/prow/entrypoint.sh || true
echo "🔍 Checking for 'set -o nounset' in test/prow/entrypoint.sh..."
rg -n --context 3 "set -o nounset" test/prow/entrypoint.sh || trueLength of output: 1149
🏁 Script executed:
#!/bin/bash
set -e
echo "🔍 Locating test/evals/eval.py..."
if [ -f test/evals/eval.py ]; then
echo "Found test/evals/eval.py, showing first 5 lines:"
head -n 5 test/evals/eval.py
else
echo "eval.py not found at test/evals/eval.py. Searching for any eval.py:"
find test -type f -name eval.py
fi
Length of output: 327
Re-enable eval with unbuffered Python3 and raw token parsing
The current entrypoint.sh at lines 8–12 and 18 will break CI evaluation and auth:
- The OCM_TOKEN assignment uses jq '.access_token', which preserves surrounding quotes. Downstream clients will see "TOKEN" (with quotes) instead of TOKEN, causing authentication failures.
- The python eval.py … invocation uses the default python binary and buffered output. Many minimal CI images only include python3, and without -u, log streaming is delayed.
Suggested fixes:
- Update the token extraction to strip quotes:
-OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
-  -H "Content-Type: application/x-www-form-urlencoded" \
-  -d "grant_type=client_credentials" \
-  -d "client_id=$CLIENT_ID" \
-  -d "client_secret=$CLIENT_SECRET" | jq '.access_token')
+OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
+  -H "Content-Type: application/x-www-form-urlencoded" \
+  -d "grant_type=client_credentials" \
+  -d "client_id=$CLIENT_ID" \
+  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
- Change the eval invocation to use unbuffered Python3, falling back to /usr/bin/env if necessary:
-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
# Or, if python3 isn't guaranteed on PATH:
# /usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
- Ensure safety with strict shell options (set -o nounset is already present at line 3) and confirm that AGENT_URL and AGENT_PORT are exported in the CI environment so that nounset errors fail fast.
These changes are critical to avoid CI log buffering issues and authentication failures when re-enabling eval.py.
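As a small illustration of the fail-fast behavior mentioned above (the variable names are the script's; the ${VAR:?message} expansion is just one optional way to make the failure message explicit, not something the PR adds):
set -o nounset
# Any reference to an unset variable aborts the script immediately:
echo "agent endpoint: ${AGENT_URL}:${AGENT_PORT}"
# Optional: same effect with a clearer error message
: "${AGENT_URL:?AGENT_URL must be exported by the CI job}"
: "${AGENT_PORT:?AGENT_PORT must be exported by the CI job}"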
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
#!/bin/bash
set -o nounset
# Acquire OCM token
OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
# … other CI bootstrap steps …
# Run evaluation with unbuffered Python3 output
python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
# Fallback if python3 isn't on PATH:
# /usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
🤖 Prompt for AI Agents
In test/prow/entrypoint.sh around line 18, the script currently extracts
OCM_TOKEN with jq that preserves surrounding quotes and invokes Python via the
default buffered "python" binary; this breaks CI auth and log streaming. Update
the token extraction to strip quotes (use jq -r or otherwise trim surrounding
quotes) so OCM_TOKEN is the raw token string, change the eval invocation to run
unbuffered Python3 (invoke python3 -u, preferably via /usr/bin/env python3 -u
for portability), and verify AGENT_URL and AGENT_PORT are exported in the CI
environment so set -o nounset will fail fast on missing values.
/hold - I just want the logs from the eval-test for now
/ok-to-test |
/test eval-test |
1 similar comment
/test eval-test |
Actionable comments posted: 2
♻️ Duplicate comments (1)
test/prow/entrypoint.sh (1)
18-18: Use unbuffered Python 3 for timely CI logs and broader compatibility
Using python may fail on images that only include python3, and buffering delays logs (you're explicitly trying to capture eval-test logs).
Apply:
-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query
Optional portability:
/usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query
To verify the runtime has python3 and that eval.py accepts --endpoint_type:
#!/bin/bash
set -euo pipefail
command -v python3
python3 - <<'PY'
import argparse
p=argparse.ArgumentParser()
p.add_argument('--endpoint_type')
print("argparse OK; --endpoint_type likely supported if eval.py uses argparse")
PY
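The heredoc above only proves that argparse itself is importable; a more direct (hedged) check against the actual script, assuming eval.py's own imports are installed in the environment, would be to inspect its --help output:
python3 test/evals/eval.py --help 2>&1 | grep -- '--endpoint_type' \
  && echo "--endpoint_type appears to be supported" \
  || echo "--endpoint_type not found in eval.py --help output"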
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
scripts/ci_test.sh (2 hunks)
test/prow/entrypoint.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/ci_test.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
| -d "grant_type=client_credentials" \ | ||
| -d "client_id=$CLIENT_ID" \ | ||
| -d "client_secret=$CLIENT_SECRET" | jq '.access_token') | ||
| -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token') |
🛠️ Refactor suggestion
Good fix: jq -r for raw token; add guard and curl hardening to avoid “null” tokens sneaking through
jq -r is correct. However, if the token request returns 4xx/5xx (or a JSON error without access_token), the pipeline can still succeed and you’ll proceed with OCM_TOKEN="null" or empty, causing confusing auth failures later.
Recommend: make curl fail on HTTP errors and validate the token right after assignment.
Add immediately after the assignment:
# Fail early if token is empty or 'null'
if [ -z "${OCM_TOKEN:-}" ] || [ "$OCM_TOKEN" = "null" ]; then
echo "Failed to obtain OCM access token" >&2
exit 1
fi
Optionally harden the curl call:
# Add --fail-with-body, retries, and timeout for robustness
OCM_TOKEN=$(curl -sS --fail-with-body --retry 3 --retry-delay 2 --max-time 15 \
-X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials" \
-d "client_id=$CLIENT_ID" \
-d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')🤖 Prompt for AI Agents
In test/prow/entrypoint.sh around line 12, the script currently parses the
access token with jq -r but doesn't make curl fail on HTTP errors nor validate
the token afterward; update the curl invocation to harden it (e.g. use
--fail-with-body, add --retry, --retry-delay and --max-time or equivalent flags)
so HTTP error responses cause failure, then immediately validate OCM_TOKEN after
assignment and exit non‑zero with an error message if it's empty or the literal
"null". Ensure the validation uses a POSIX-safe test like [ -z "${OCM_TOKEN:-}"
] || [ "$OCM_TOKEN" = "null" ] and prints to stderr before exiting.
/test eval-test |
Actionable comments posted: 0
🧹 Nitpick comments (2)
test/evals/eval.py (2)
7-35: Make the httpx monkey‑patch idempotent, safer, and more diagnosable
Good direction. A few issues to tighten up:
- Repeated calls will wrap the property again (risk of layered wrappers). Add an idempotency guard and keep a handle to the original property.
- The inner except Exception: is too broad and silently swallows decode/transport issues. Prefer catching httpx.StreamError and log a warning; keep a last‑resort catch if you must.
- Preserve the original property's docstring.
- Optional: allow opting out via an env var to reduce blast radius if needed in CI. Note that calling read() will buffer the entire stream into memory; that's acceptable here but should be explicit.
Apply this diff:
 def patch_httpx_response_error():
-    """Patch httpx Response to handle streaming response text access safely."""
-    try:
-        import httpx
-
-        # Store original text property
-        original_text = httpx.Response.text
-
-        def safe_text(self):
-            """Safely access response text, handling streaming responses."""
-            try:
-                return original_text.fget(self)
-            except httpx.ResponseNotRead:
-                # If it's a streaming response that hasn't been read, read it first
-                try:
-                    self.read()
-                    return original_text.fget(self)
-                except Exception:
-                    # If we still can't read it, return a safe fallback
-                    return f"<Streaming response - status {self.status_code}>"
-
-        # Replace the text property with our safe version
-        httpx.Response.text = property(safe_text)
-
-    except ImportError:
-        # httpx not available, skip patching
-        pass
+    """Patch httpx.Response.text to be resilient for streaming responses.
+
+    Idempotent and logs a warning on fallback.
+    Controlled by env EVAL_PATCH_HTTPX: set to '0' or 'false' to disable.
+    """
+    try:
+        import os as _os
+        import httpx
+        # Allow opting out via env to reduce blast radius if needed
+        if _os.getenv("EVAL_PATCH_HTTPX", "1").lower() in ("0", "false", "no"):
+            return
+        # Idempotency: don't patch twice
+        if getattr(httpx.Response, "_orig_text_property", None) is not None:
+            return
+        original_text = httpx.Response.text
+
+        def safe_text(self):
+            """Safely access response text, handling streaming responses."""
+            try:
+                return original_text.fget(self)
+            except httpx.ResponseNotRead:
+                # For unread streaming responses, read the body first (buffers into memory).
+                try:
+                    self.read()
+                    return original_text.fget(self)
+                except httpx.StreamError as exc:
+                    logging.warning(
+                        "httpx streaming read failed for %s %s: %s; returning fallback.",
+                        getattr(getattr(self, "request", None), "method", "?"),
+                        getattr(getattr(self, "request", None), "url", "?"),
+                        exc,
+                    )
+                    return f"<Streaming response - status {self.status_code}>"
+                except Exception as exc:
+                    logging.warning("Unexpected error reading streaming response: %s", exc)
+                    return f"<Streaming response - status {self.status_code}>"
+        # Preserve original for potential unpatching and keep docstring
+        httpx.Response._orig_text_property = original_text
+        httpx.Response.text = property(safe_text, doc=getattr(original_text, "__doc__", None))
+    except ImportError:
+        logging.debug("httpx not installed; skipping Response.text patch.")
36-38: Guard the patch application to keep the runner resilient
Wrap the call so a patch failure won't break the runner, and add a low‑noise log for visibility.
-# Apply the patch before importing lsc_agent_eval
-patch_httpx_response_error()
+# Apply the patch before importing lsc_agent_eval; ignore failures so tests still run
+try:
+    patch_httpx_response_error()
+except Exception as exc:
+    logging.debug("Failed to apply httpx.Response.text patch: %s", exc)
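If the EVAL_PATCH_HTTPX opt-out from the suggested diff were adopted (it exists only in that suggestion, not in the current eval.py), disabling the patch for a single CI run could look like:
EVAL_PATCH_HTTPX=0 python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query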
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
test/evals/eval.py (1 hunks)
test/prow/entrypoint.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/prow/entrypoint.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
/test eval-test |
Actionable comments posted: 1
🧹 Nitpick comments (1)
template.yaml (1)
361-376: Provider choice vs. Vertex usage — avoid dead paths.
Your providers.inference config declares a remote::gemini provider with a dummy API key, while deployment mounts Vertex credentials and comments say you "use Vertex and not Gemini directly". Registering only gemini models may route inference through a provider that cannot authenticate. If inference is supposed to go through Vertex, consider:
- registering a remote::vertex provider and using that as default_provider, or
- keep gemini as default_provider only if the runtime actually proxies to Vertex using the mounted Google credentials.
Otherwise you’ll get runtime auth failures the moment these models are selected.
Would you like a follow-up patch that adds a vertex provider stanza and re-keys these model entries under provider_id: vertex?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
template.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (1)
template.yaml (1)
361-376: Ensure ASSISTED_CHAT_DEFAULT_MODEL aligns with registered model_id
It looks like you've set:
- inference.default_provider: gemini (template.yaml:171)
- ASSISTED_CHAT_DEFAULT_MODEL: gemini-2.0-flash (template.yaml:110)
But your model registrations all use fully-qualified IDs (model_id: gemini/gemini-2.0-flash, etc. at template.yaml:361–376). If your runtime requires an exact model_id match, the unqualified default (gemini-2.0-flash) won't resolve. Please confirm whether:
- the runtime automatically prefixes default_provider to default_model, or
- it expects a fully-qualified string.
If it’s the latter, update the parameter definition to:
- name: ASSISTED_CHAT_DEFAULT_MODEL
-  value: gemini-2.0-flash
+ name: ASSISTED_CHAT_DEFAULT_MODEL
+  value: gemini/gemini-2.0-flash
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
template.yaml (1)
170-173: Default provider mismatch — update to vertex_ai
The default_provider: gemini under the inference block no longer matches any entry in the providers.inference list (which now only includes provider_id: vertex_ai), causing runtime failures when selecting the default model.
Locations to fix:
- template.yaml, around lines 170–172
Apply this minimal diff:
 inference:
   default_model: ${ASSISTED_CHAT_DEFAULT_MODEL}
-  default_provider: gemini
+  default_provider: vertex_ai
Additional context from your local checks:
- No remaining provider_id: gemini in the file
- ASSISTED_CHAT_DEFAULT_MODEL is set to gemini-2.0-flash (provider_model_id for vertex_ai)
  → if your runtime requires fully-qualified model IDs, switch the default_model to vertex_ai/gemini-2.0-flash instead.
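A quick way to re-run the local checks referenced above (plain grep over the template; the exact patterns are a sketch, what matters is the line numbers in the output):
grep -nE 'provider_id:|default_provider:|default_model:' template.yaml
grep -nE 'ASSISTED_CHAT_DEFAULT_MODEL|model_id:' template.yaml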
🧹 Nitpick comments (1)
template.yaml (1)
362-377: Models block looks consistent with Vertex AI provider; provider_model_id is correctly provider-native
- model_id uses the vertex_ai/<model> convention.
- provider_id is vertex_ai.
- provider_model_id is the provider-native name (no redundant
vertex_ai/ or gemini/ prefix).
Optional: add display_name or additional metadata (e.g., context window, modality) in metadata to help operators distinguish these at runtime, if supported by your stack.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
template.yaml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (1)
template.yaml (1)
428-481: Confirm secret key name for the mounted Vertex service account JSON
The mount uses subPath: service_account, so the Secret ${VERTEX_API_SECRET_NAME} must contain a data key literally named service_account. Mismatches here will result in a missing file at GOOGLE_APPLICATION_CREDENTIALS.
Please confirm the Secret's data key. If it differs (e.g., key.json), update subPath accordingly. Example adjustment:
-  - name: google-vertex-service-account
-    mountPath: /app-root/google-vertex-service-account.json
-    subPath: service_account
+  - name: google-vertex-service-account
+    mountPath: /app-root/google-vertex-service-account.json
+    subPath: key.json
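One hedged way to confirm the Secret's data keys from a cluster session (assumes oc access to the namespace holding ${VERTEX_API_SECRET_NAME}; the namespace variable is illustrative):
oc get secret "${VERTEX_API_SECRET_NAME}" -n "${NAMESPACE}" -o json | jq -r '.data | keys[]'
# Expect service_account in the output; otherwise point subPath at whichever key is listed.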
- provider_id: vertex_ai
  provider_type: remote::vertexai
  config:
    api_key: dummy-to-stop-llama-stack-from-complaining-even-though-we-use-vertex-and-not-gemini-directly
    project: ""
    location: us-central1
vector_io: []
Empty Vertex AI project config will likely break inference provider initialization
The remote::vertexai provider typically requires a non-empty GCP project. Leaving project: "" will cause provider init or first call to fail. Make project a required parameter and wire it here. Consider parameterizing location as well.
Apply this diff within this block:
inference:
providers:
- - provider_id: vertex_ai
- provider_type: remote::vertexai
- config:
- project: ""
- location: us-central1
+ - provider_id: vertex_ai
+ provider_type: remote::vertexai
+ config:
+ project: ${VERTEX_PROJECT_ID}
+ location: ${VERTEX_LOCATION}And add template parameters (outside this hunk):
parameters:
+ - name: VERTEX_PROJECT_ID
+ value: ""
+ required: true
+ description: "GCP Project ID used by Vertex AI (from the service account's project)"
+ - name: VERTEX_LOCATION
+ value: "us-central1"
+ description: "Vertex AI location/region for model inference"If you prefer to keep location hard-coded, only add VERTEX_PROJECT_ID and leave location: us-central1.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
inference:
  providers:
-   - provider_id: vertex_ai
-     provider_type: remote::vertexai
-     config:
-       project: ""
    - provider_id: vertex_ai
      provider_type: remote::vertexai
      config:
        project: ${VERTEX_PROJECT_ID}
        location: ${VERTEX_LOCATION}
  vector_io: []
parameters:
  - name: VERTEX_PROJECT_ID
    value: ""
    required: true
    description: "GCP Project ID used by Vertex AI (from the service account's project)"
  - name: VERTEX_LOCATION
    value: "us-central1"
    description: "Vertex AI location/region for model inference"
🤖 Prompt for AI Agents
In template.yaml around lines 297-302, the Vertex AI provider block uses an
empty project string which will break initialization; update the block to
reference a template parameter for the GCP project (e.g. VERTEX_PROJECT_ID)
instead of "" and wire location to a parameter if you want it configurable (or
leave as us-central1 if you prefer). Also add corresponding template parameters
near the top-level parameters section (outside this hunk): add VERTEX_PROJECT_ID
as required (string) and optionally VERTEX_LOCATION with a default of
us-central1 if you want it configurable. Ensure the provider config uses the
parameter names exactly and validate the template parameter types so deployment
fails early if VERTEX_PROJECT_ID is not supplied.
@giladravid16: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Need the logs for debugging the issue
Summary by CodeRabbit
New Features
Bug Fixes
Chores