
Conversation


@giladravid16 giladravid16 commented Aug 21, 2025

Need the logs for debugging the issue

Summary by CodeRabbit

  • New Features

    • Re-enabled automated evaluation runs via the configured agent endpoint.
    • Added support for three Vertex AI Gemini LLM models in deployment configuration.
  • Bug Fixes

    • Corrected CI pod selection to target the intended resources.
    • Fixed token parsing to extract a raw access token and prevent auth issues.
  • Chores

    • Improved CI logging on pod restarts by marking and retrieving prior logs for easier debugging.
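For context, the token-parsing fix comes down to jq's raw-output flag. A minimal sketch of the difference (hedged; TOKEN_URL is an illustrative placeholder, and the variable names follow the entrypoint script shown later in this thread):

# With plain jq the variable keeps the JSON quotes: "eyJhbGciOi..."
OCM_TOKEN=$(curl -sS "$TOKEN_URL" -d grant_type=client_credentials \
  -d "client_id=$CLIENT_ID" -d "client_secret=$CLIENT_SECRET" | jq '.access_token')

# With jq -r the variable holds the raw token, safe to place in an Authorization header
OCM_TOKEN=$(curl -sS "$TOKEN_URL" -d grant_type=client_credentials \
  -d "client_id=$CLIENT_ID" -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')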

@openshift-ci openshift-ci bot requested review from jhernand and maorfr August 21, 2025 09:38

openshift-ci bot commented Aug 21, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: giladravid16
Once this PR has been reviewed and has the lgtm label, please assign omertuc for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


coderabbitai bot commented Aug 21, 2025

Walkthrough

The CI test script now selects pods in the specified namespace and captures previous container logs on restart. The Prow entrypoint extracts a raw OAuth token and re-enables the eval.py invocation. The template configuration switches the llama-stack inference provider to Vertex AI and adds three Vertex AI Gemini model entries.

Changes

Cohort / File(s) and summary of changes:

  • CI pod selection & restart logging (scripts/ci_test.sh): Query pods with -n "$NAMESPACE" and the corrected label assisted-chat-eval-test. When restarts > 0, print start/end markers and fetch previous logs via `oc logs -p -n "$NAMESPACE" "$POD_NAME"`.
  • Prow entrypoint token & eval execution (test/prow/entrypoint.sh): Extract the OAuth token with jq -r to obtain a raw access_token (no JSON quoting). Uncomment and invoke eval.py with --agent_endpoint and --endpoint_type.
  • LLM inference provider & models (template.yaml): Switch the llama-stack client inference provider from Gemini (remote::gemini) to Vertex AI (remote::vertexai), replacing api_key with project and location: us-central1. Replace models: [] with three Vertex AI Gemini model entries (vertex_ai/gemini-2.0-flash, vertex_ai/gemini-2.5-pro, vertex_ai/gemini-2.5-flash) using provider_id: vertex_ai and the corresponding provider_model_id values.
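Taken together, the restart-logging change amounts to something like the following sketch (illustrative only, not the exact script; the label key app= and the marker text are assumptions):

POD_NAME=$(oc get pods -n "$NAMESPACE" -l app=assisted-chat-eval-test \
  -o jsonpath='{.items[0].metadata.name}')
RESTARTS=$(oc get pod -n "$NAMESPACE" "$POD_NAME" \
  -o jsonpath='{.status.containerStatuses[0].restartCount}')
if [ "${RESTARTS:-0}" -gt 0 ]; then
  echo "------------------- Previous container logs (start) -------------------"
  oc logs -p -n "$NAMESPACE" "$POD_NAME" || true
  echo "-------------------- Previous container logs (end) --------------------"
fi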

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CI as scripts/ci_test.sh
  participant OC as oc / OpenShift
  participant Pod as test pod

  CI->>OC: List pods (-n NAMESPACE, label=assisted-chat-eval-test)
  OC-->>CI: Pod list
  loop monitor pod
    CI->>OC: Check pod status / restarts
    alt restarts > 0
      Note right of CI #e6f7ff: Print restart start/end markers
      CI->>OC: oc logs -p -n NAMESPACE POD_NAME
      OC-->>CI: Previous container logs
      CI->>OC: Get container exit code
      OC-->>CI: Exit code
    else no restarts
      CI->>OC: Get container exit code
      OC-->>CI: Exit code
    end
  end
sequenceDiagram
  autonumber
  participant EP as test/prow/entrypoint.sh
  participant Auth as OAuth server
  participant Eval as eval.py

  EP->>Auth: Request token (JSON)
  Auth-->>EP: { "access_token": "..." }
  Note right of EP #e6f7ff: Use `jq -r` to extract raw access_token
  EP->>Eval: Run `eval.py --agent_endpoint ... --endpoint_type ...`

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • jhernand



openshift-ci bot commented Aug 21, 2025

Hi @giladravid16. Thanks for your PR.

I'm waiting for a rh-ecosystem-edge member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
scripts/ci_test.sh (1)

27-29: Nice: previous container logs are now explicitly captured with clear markers

Including oc logs -p in the restart path matches the PR goal to fetch prior-run logs and will greatly help post-mortems.

To maximize context, consider printing both previous and current logs when available:

oc logs -p -n "$NAMESPACE" "$POD_NAME" || true
echo "-------------------------- Current logs ---------------------------"
oc logs -n "$NAMESPACE" "$POD_NAME" || true

(Use || true to avoid masking exit-code retrieval if logs are temporarily unavailable.)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c515669 and 1a91e8e.

📒 Files selected for processing (2)
  • scripts/ci_test.sh (2 hunks)
  • test/prow/entrypoint.sh (1 hunks)

 cd test/evals

-#python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"

💡 Verification agent

🧩 Analysis chain

Re-enable eval: prefer python3/-u and ensure token is raw (jq -r) to avoid auth breakage

  • Using python may fail on images that only ship python3. Also, -u gives unbuffered logs in CI.
  • Critical: the token currently saved via jq '.access_token' includes quotes. Many clients expect a raw token; enabling eval.py now will likely read a quoted token and fail auth.

Apply this diff for the eval invocation:

-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"

Additionally, update the token extraction earlier in the script to write the raw access token (no quotes); the jq -r variant is shown under "Script executed" below.

Verification checklist:

  • Confirm AGENT_URL and AGENT_PORT are set in the job env (set -o nounset will fail fast if missing).
  • Confirm python3 exists in the image; otherwise switch to /usr/bin/env python3.
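For the python3 concern, one hedged fallback pattern (PYTHON_BIN is an illustrative variable, not part of the current script):

# Prefer python3; fall back to python only if that is all the image ships
PYTHON_BIN=$(command -v python3 || command -v python) || { echo "No python interpreter found" >&2; exit 1; }
"$PYTHON_BIN" -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"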

🏁 Script executed:

# Replace the existing assignment with the raw-output variant
OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')

Length of output: 258


🏁 Script executed:

#!/bin/bash
set -e

echo "🔍 Checking OCM_TOKEN assignment in test/prow/entrypoint.sh..."
rg -n --context 3 "OCM_TOKEN" test/prow/entrypoint.sh || true

echo "🔍 Checking python invocation in test/prow/entrypoint.sh..."
rg -n --context 3 "python " test/prow/entrypoint.sh || true

echo "🔍 Checking for 'set -o nounset' in test/prow/entrypoint.sh..."
rg -n --context 3 "set -o nounset" test/prow/entrypoint.sh || true

Length of output: 1149


🏁 Script executed:

#!/bin/bash
set -e

echo "🔍 Locating test/evals/eval.py..."
if [ -f test/evals/eval.py ]; then
  echo "Found test/evals/eval.py, showing first 5 lines:"
  head -n 5 test/evals/eval.py
else
  echo "eval.py not found at test/evals/eval.py. Searching for any eval.py:"
  find test -type f -name eval.py
fi

Length of output: 327


Re-enable eval with unbuffered Python3 and raw token parsing

The current entrypoint.sh at lines 8–12 and 18 will break CI evaluation and auth:

  • The OCM_TOKEN assignment uses jq '.access_token', which preserves surrounding quotes. Downstream clients will see '"TOKEN"' instead of TOKEN, causing authentication failures.
  • The python eval.py … invocation uses the default python binary and buffered output. Many minimal CI images only include python3, and without -u, log streaming is delayed.

Suggested fixes:

  1. Update the token extraction to strip quotes:

    -OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
    -  -H "Content-Type: application/x-www-form-urlencoded" \
    -  -d "grant_type=client_credentials" \
    -  -d "client_id=$CLIENT_ID" \
    -  -d "client_secret=$CLIENT_SECRET" | jq '.access_token')
    +OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
    +  -H "Content-Type: application/x-www-form-urlencoded" \
    +  -d "grant_type=client_credentials" \
    +  -d "client_id=$CLIENT_ID" \
    +  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
  2. Change the eval invocation to use unbuffered Python3, falling back to /usr/bin/env if necessary:

    -python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
    +python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
    # Or, if python3 isn’t guaranteed on PATH:
    +# /usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
  3. Ensure safety with strict shell options (set -o nounset is already present at line 3) and confirm that AGENT_URL and AGENT_PORT are exported in the CI environment so that nounset errors fail fast.

These changes are critical to avoid CI log buffering issues and authentication failures when re-enabling eval.py.
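As a small extra guard for item 3, the endpoint variables can also be validated explicitly before invoking eval.py (plain POSIX parameter expansion, shown as a sketch):

# Fail fast with a clear message if either variable is unset or empty
: "${AGENT_URL:?AGENT_URL must be set for eval.py}"
: "${AGENT_PORT:?AGENT_PORT must be set for eval.py}"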

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+#!/bin/bash
+set -o nounset
+# Acquire OCM token
+OCM_TOKEN=$(curl -sS -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
+  -H "Content-Type: application/x-www-form-urlencoded" \
+  -d "grant_type=client_credentials" \
+  -d "client_id=$CLIENT_ID" \
+  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
+# … other CI bootstrap steps …
+# Run evaluation with unbuffered Python3 output
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
+# Fallback if python3 isn’t on PATH:
+# /usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}"
🤖 Prompt for AI Agents
In test/prow/entrypoint.sh around line 18, the script currently extracts
OCM_TOKEN with jq that preserves surrounding quotes and invokes Python via the
default buffered "python" binary; this breaks CI auth and log streaming. Update
the token extraction to strip quotes (use jq -r or otherwise trim surrounding
quotes) so OCM_TOKEN is the raw token string, change the eval invocation to run
unbuffered Python3 (invoke python3 -u, preferably via /usr/bin/env python3 -u
for portability), and verify AGENT_URL and AGENT_PORT are exported in the CI
environment so set -o nounset will fail fast on missing values.

@giladravid16
Author

/hold - I just want the logs from the eval-test for now

@omertuc
Member

omertuc commented Aug 21, 2025

/ok-to-test

@giladravid16
Author

/test eval-test

1 similar comment
@giladravid16
Author

/test eval-test


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
test/prow/entrypoint.sh (1)

18-18: Use unbuffered Python 3 for timely CI logs and broader compatibility
Using python may fail on images that only include python3, and buffering delays logs (you’re explicitly trying to capture eval-test logs).

Apply:

-python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query
+python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query

Optional portability:

/usr/bin/env python3 -u eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --endpoint_type query

To verify the runtime has python3 and that eval.py accepts --endpoint_type:

#!/bin/bash
set -euo pipefail
command -v python3
python3 - <<'PY'
import argparse
p=argparse.ArgumentParser()
p.add_argument('--endpoint_type')
print("argparse OK; --endpoint_type likely supported if eval.py uses argparse")
PY
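A more direct check is to ask eval.py itself (hedged; assumes eval.py is argparse-based, supports --help, and that its imports are available in the environment):

python3 test/evals/eval.py --help 2>&1 | grep -q -- '--endpoint_type' \
  && echo "--endpoint_type supported" \
  || echo "--endpoint_type not found in --help output"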
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between b09ceaa and 9b262b6.

📒 Files selected for processing (2)
  • scripts/ci_test.sh (2 hunks)
  • test/prow/entrypoint.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/ci_test.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request

-d "grant_type=client_credentials" \
-d "client_id=$CLIENT_ID" \
-d "client_secret=$CLIENT_SECRET" | jq '.access_token')
-d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')

🛠️ Refactor suggestion

Good fix: jq -r for raw token; add guard and curl hardening to avoid “null” tokens sneaking through

jq -r is correct. However, if the token request returns 4xx/5xx (or a JSON error without access_token), the pipeline can still succeed and you’ll proceed with OCM_TOKEN="null" or empty, causing confusing auth failures later.

Recommend: make curl fail on HTTP errors and validate the token right after assignment.

Add immediately after the assignment:

# Fail early if token is empty or 'null'
if [ -z "${OCM_TOKEN:-}" ] || [ "$OCM_TOKEN" = "null" ]; then
  echo "Failed to obtain OCM access token" >&2
  exit 1
fi

Optionally harden the curl call:

# Add --fail-with-body, retries, and timeout for robustness
OCM_TOKEN=$(curl -sS --fail-with-body --retry 3 --retry-delay 2 --max-time 15 \
  -X POST https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
🤖 Prompt for AI Agents
In test/prow/entrypoint.sh around line 12, the script currently parses the
access token with jq -r but doesn't make curl fail on HTTP errors nor validate
the token afterward; update the curl invocation to harden it (e.g. use
--fail-with-body, add --retry, --retry-delay and --max-time or equivalent flags)
so HTTP error responses cause failure, then immediately validate OCM_TOKEN after
assignment and exit non‑zero with an error message if it's empty or the literal
"null". Ensure the validation uses a POSIX-safe test like [ -z "${OCM_TOKEN:-}"
] || [ "$OCM_TOKEN" = "null" ] and prints to stderr before exiting.

@giladravid16
Author

/test eval-test


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
test/evals/eval.py (2)

7-35: Make the httpx monkey‑patch idempotent, safer, and more diagnosable

Good direction. A few issues to tighten up:

  • Repeated calls will wrap the property again (risk of layered wrappers). Add an idempotency guard and keep a handle to the original property.
  • The inner except Exception: is too broad and silently swallows decode/transport issues. Prefer catching httpx.StreamError and log a warning; keep a last‑resort catch if you must.
  • Preserve the original property’s docstring.
  • Optional: allow opting out via an env var to reduce blast radius if needed in CI. Note that calling read() will buffer the entire stream into memory; that’s acceptable here but should be explicit.

Apply this diff:

 def patch_httpx_response_error():
-    """Patch httpx Response to handle streaming response text access safely."""
-    try:
-        import httpx
-        
-        # Store original text property
-        original_text = httpx.Response.text
-        
-        def safe_text(self):
-            """Safely access response text, handling streaming responses."""
-            try:
-                return original_text.fget(self)
-            except httpx.ResponseNotRead:
-                # If it's a streaming response that hasn't been read, read it first
-                try:
-                    self.read()
-                    return original_text.fget(self)
-                except Exception:
-                    # If we still can't read it, return a safe fallback
-                    return f"<Streaming response - status {self.status_code}>"
-        
-        # Replace the text property with our safe version
-        httpx.Response.text = property(safe_text)
-        
-    except ImportError:
-        # httpx not available, skip patching
-        pass
+    """Patch httpx.Response.text to be resilient for streaming responses.
+
+    Idempotent and logs a warning on fallback.
+    Controlled by env EVAL_PATCH_HTTPX: set to '0' or 'false' to disable.
+    """
+    try:
+        import os as _os
+        import httpx
+        # Allow opting out via env to reduce blast radius if needed
+        if _os.getenv("EVAL_PATCH_HTTPX", "1").lower() in ("0", "false", "no"):
+            return
+        # Idempotency: don't patch twice
+        if getattr(httpx.Response, "_orig_text_property", None) is not None:
+            return
+        original_text = httpx.Response.text
+
+        def safe_text(self):
+            """Safely access response text, handling streaming responses."""
+            try:
+                return original_text.fget(self)
+            except httpx.ResponseNotRead:
+                # For unread streaming responses, read the body first (buffers into memory).
+                try:
+                    self.read()
+                    return original_text.fget(self)
+                except httpx.StreamError as exc:
+                    logging.warning(
+                        "httpx streaming read failed for %s %s: %s; returning fallback.",
+                        getattr(getattr(self, "request", None), "method", "?"),
+                        getattr(getattr(self, "request", None), "url", "?"),
+                        exc,
+                    )
+                    return f"<Streaming response - status {self.status_code}>"
+                except Exception as exc:
+                    logging.warning("Unexpected error reading streaming response: %s", exc)
+                    return f"<Streaming response - status {self.status_code}>"
+        # Preserve original for potential unpatching and keep docstring
+        httpx.Response._orig_text_property = original_text
+        httpx.Response.text = property(safe_text, doc=getattr(original_text, "__doc__", None))
+    except ImportError:
+        logging.debug("httpx not installed; skipping Response.text patch.")

36-38: Guard the patch application to keep the runner resilient

Wrap the call so a patch failure won’t break the runner, and add a low‑noise log for visibility.

-# Apply the patch before importing lsc_agent_eval
-patch_httpx_response_error()
+# Apply the patch before importing lsc_agent_eval; ignore failures so tests still run
+try:
+    patch_httpx_response_error()
+except Exception as exc:
+    logging.debug("Failed to apply httpx.Response.text patch: %s", exc)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 9b262b6 and 3ebf256.

📒 Files selected for processing (2)
  • test/evals/eval.py (1 hunks)
  • test/prow/entrypoint.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/prow/entrypoint.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request

@giladravid16
Author

/test eval-test


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
template.yaml (1)

361-376: Provider choice vs. Vertex usage — avoid dead paths.

Your providers.inference config declares a remote::gemini provider with a dummy API key, while deployment mounts Vertex credentials and comments say you “use Vertex and not Gemini directly”. Registering only gemini models may route inference through a provider that cannot authenticate. If inference is supposed to go through Vertex, consider:

  • registering a remote::vertex provider and using that as default_provider, or
  • keep gemini as default_provider only if the runtime actually proxies to Vertex using the mounted Google credentials.

Otherwise you’ll get runtime auth failures the moment these models are selected.

Would you like a follow-up patch that adds a vertex provider stanza and re-keys these model entries under provider_id: vertex?
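As a quick, hedged cross-check before deploying, the provider and model wiring can be eyeballed with plain grep (no assumptions about exact YAML paths):

grep -nE 'provider_id:|provider_type:|default_provider:' template.yaml
grep -nE 'model_id:|provider_model_id:' template.yaml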

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 7fc3502 and 8d89a8f.

📒 Files selected for processing (1)
  • template.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (1)
template.yaml (1)

361-376: Ensure ASSISTED_CHAT_DEFAULT_MODEL aligns with registered model_id

It looks like you’ve set

  • inference.default_provider: gemini (template.yaml :171)
  • ASSISTED_CHAT_DEFAULT_MODEL: gemini-2.0-flash (template.yaml :110)

But your model registrations all use fully-qualified IDs (model_id: gemini/gemini-2.0-flash, etc. at template.yaml :361–376). If your runtime requires an exact model_id match, the unqualified default (gemini-2.0-flash) won’t resolve. Please confirm whether:

  • the runtime automatically prefixes default_provider to default_model, or
  • it expects a fully-qualified string.

If it’s the latter, update the parameter definition to:

 - name: ASSISTED_CHAT_DEFAULT_MODEL
-  value: gemini-2.0-flash
+  value: gemini/gemini-2.0-flash


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
template.yaml (1)

170-173: Default provider mismatch—update to vertex_ai

The default_provider: gemini under the inference block no longer matches any entry in the providers.inference list (which now only includes provider_id: vertex_ai), causing runtime failures when selecting the default model.

Locations to fix:

  • template.yaml, around lines 170–172

Apply this minimal diff:

     inference:
       default_model: ${ASSISTED_CHAT_DEFAULT_MODEL}
-      default_provider: gemini
+      default_provider: vertex_ai

Additional context from your local checks:

  • No remaining provider_id: gemini in the file
  • ASSISTED_CHAT_DEFAULT_MODEL is set to gemini-2.0-flash (provider_model_id for vertex_ai)
    → if your runtime requires fully-qualified model IDs, switch the default_model to vertex_ai/gemini-2.0-flash instead.
🧹 Nitpick comments (1)
template.yaml (1)

362-377: Models block looks consistent with Vertex AI provider; provider_model_id is correctly provider-native

  • model_id uses the vertex_ai/<model> convention.
  • provider_id is vertex_ai.
  • provider_model_id is the provider-native name (no redundant vertex_ai/ or gemini/ prefix).

Optional: add display_name or additional metadata (e.g., context window, modality) in metadata to help operators distinguish these at runtime, if supported by your stack.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between a3201ad and 4e31882.

📒 Files selected for processing (1)
  • template.yaml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (1)
template.yaml (1)

428-481: Confirm secret key name for the mounted Vertex service account JSON

The mount uses subPath: service_account, so the Secret ${VERTEX_API_SECRET_NAME} must contain a data key literally named service_account. Mismatches here will result in a missing file at GOOGLE_APPLICATION_CREDENTIALS.

Please confirm the Secret’s data key. If it differs (e.g., key.json), update subPath accordingly. Example adjustment:

-          - name: google-vertex-service-account
-            mountPath: /app-root/google-vertex-service-account.json
-            subPath: service_account
+          - name: google-vertex-service-account
+            mountPath: /app-root/google-vertex-service-account.json
+            subPath: key.json
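One hedged way to confirm the Secret's data key before relying on the subPath (assumes access to the deployment namespace, here called $NAMESPACE, and that jq is available):

# List the data keys of the Vertex service-account Secret; expect "service_account", otherwise adjust subPath
oc get secret "${VERTEX_API_SECRET_NAME}" -n "$NAMESPACE" -o json | jq -r '.data | keys[]'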

Comment on lines +297 to 302
        - provider_id: vertex_ai
          provider_type: remote::vertexai
          config:
            api_key: dummy-to-stop-llama-stack-from-complaining-even-though-we-use-vertex-and-not-gemini-directly
            project: ""
            location: us-central1
      vector_io: []

⚠️ Potential issue

Empty Vertex AI project config will likely break inference provider initialization

The remote::vertexai provider typically requires a non-empty GCP project. Leaving project: "" will cause provider init or first call to fail. Make project a required parameter and wire it here. Consider parameterizing location as well.

Apply this diff within this block:

         inference:
           providers:
-        - provider_id: vertex_ai
-          provider_type: remote::vertexai
-          config:
-            project: ""
-            location: us-central1
+        - provider_id: vertex_ai
+          provider_type: remote::vertexai
+          config:
+            project: ${VERTEX_PROJECT_ID}
+            location: ${VERTEX_LOCATION}

And add template parameters (outside this hunk):

 parameters:
+ - name: VERTEX_PROJECT_ID
+   value: ""
+   required: true
+   description: "GCP Project ID used by Vertex AI (from the service account's project)"
+ - name: VERTEX_LOCATION
+   value: "us-central1"
+   description: "Vertex AI location/region for model inference"

If you prefer to keep location hard-coded, only add VERTEX_PROJECT_ID and leave location: us-central1.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (inference provider config):

         inference:
           providers:
-        - provider_id: vertex_ai
-          provider_type: remote::vertexai
-          config:
-            api_key: dummy-to-stop-llama-stack-from-complaining-even-though-we-use-vertex-and-not-gemini-directly
-            project: ""
-            location: us-central1
+        - provider_id: vertex_ai
+          provider_type: remote::vertexai
+          config:
+            project: ${VERTEX_PROJECT_ID}
+            location: ${VERTEX_LOCATION}
       vector_io: []

Suggested change (template parameters):

 parameters:
+ - name: VERTEX_PROJECT_ID
+   value: ""
+   required: true
+   description: "GCP Project ID used by Vertex AI (from the service account's project)"
+ - name: VERTEX_LOCATION
+   value: "us-central1"
+   description: "Vertex AI location/region for model inference"
🤖 Prompt for AI Agents
In template.yaml around lines 297-302, the Vertex AI provider block uses an
empty project string which will break initialization; update the block to
reference a template parameter for the GCP project (e.g. VERTEX_PROJECT_ID)
instead of "" and wire location to a parameter if you want it configurable (or
leave as us-central1 if you prefer). Also add corresponding template parameters
near the top-level parameters section (outside this hunk): add VERTEX_PROJECT_ID
as required (string) and optionally VERTEX_LOCATION with a default of
us-central1 if you want it configurable. Ensure the provider config uses the
parameter names exactly and validate the template parameter types so deployment
fails early if VERTEX_PROJECT_ID is not supplied.


openshift-ci bot commented Aug 26, 2025

@giladravid16: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • Test name: ci/prow/eval-test
  • Commit: 4e31882
  • Details: link
  • Required: true
  • Rerun command: /test eval-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
