LCORE-378: Lightspeed core needs to fully support Red Hat AI Inference server LLM provider #652
Conversation
Walkthrough

Adds a scheduled/manual GitHub Actions workflow to run end-to-end tests against an RHAIIS (vLLM) endpoint, introduces a run configuration for RHAIIS, updates docker-compose to pass RHAIIS credentials, and documents RHAIIS as a supported provider.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User as Trigger
    participant GH as GitHub Actions
    participant Repo as Repo (checkout)
    participant RHAIIS as RHAIIS API
    participant Compose as Docker Compose
    participant E2E as E2E runner
    User->>GH: dispatch / scheduled run (env=rhaiis)
    GH->>Repo: checkout
    GH->>GH: generate lightspeed config & select run-rhaiis.yaml
    GH->>RHAIIS: GET /v1/models (use RHAIIS_API_KEY)
    alt reachable (200)
        GH->>Compose: docker compose up (mount run-rhaiis.yaml)
        GH->>Compose: wait for health
        GH->>RHAIIS: quick connectivity test
        GH->>E2E: install deps & run tests
        alt tests pass
            E2E-->>GH: success
        else tests fail
            GH->>Compose: collect logs (llama-stack / lightspeed-stack)
            E2E-->>GH: failure
        end
    else unreachable / auth fail
        GH-->>User: fail early and log diagnostics
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
TODO: add RHAIIS_URL and RHAIIS_API_KEY in secrets
tisnik left a comment
LGTM, just one nit
Actionable comments posted: 5
🧹 Nitpick comments (8)
docs/providers.md (1)
60-65: Clarify dependency and link to provider details
- Please confirm that the required dependency is the OpenAI Python client used by llama‑stack for remote::vllm. If so, consider naming it explicitly as openai>=1.0.0 and linking to upstream docs.
- Optional: add a short note that RHAIIS exposes an OpenAI‑compatible API consumed via llama‑stack provider_type remote::vllm for consistency with README.
README.md (2)
25-25: Fix list indentation (markdownlint MD007)

Adjust bullet indentation to 2 spaces to match the linter's expectation. Apply:

```diff
-    * [Supported providers](#supported-providers)
+  * [Supported providers](#supported-providers)
```
161-164: Use descriptive link text (markdownlint MD059)

Replace "here" with descriptive text.

```diff
-For a comprehensive list of supported providers, take a look [here](docs/providers.md).
+See the full list in [Supported Providers documentation](docs/providers.md).
```

tests/e2e/configs/run-rhaiis.yaml (1)
20-27: Avoid "ollama" in sqlite paths for RHAIIS config

Use a neutral or provider-specific directory to prevent confusion during debugging and artifact collection.

```diff
-    db_path: .llama/distributions/ollama/inference_store.db
+    db_path: .llama/distributions/rhaiis/inference_store.db
 ...
-    db_path: .llama/distributions/ollama/registry.db
+    db_path: .llama/distributions/rhaiis/registry.db
```

.github/workflows/e2e_tests_rhaiis.yaml (4)
4-8: Harden workflow: least-privilege and concurrency

Add explicit permissions and prevent overlapping scheduled runs.

```diff
 on:
   schedule:
     - cron: "0 0 * * *" # Runs once a day at midnight UTC
   workflow_dispatch:
+
+permissions:
+  contents: read
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref || github.run_id }}
+  cancel-in-progress: true
```
112-121: Fix comment and path substitution

- Comment says "Set OpenAI API key" but this replaces RHAIIS variables.
- The tmp path substitution targets "tmp/" (relative), but your config uses "/tmp" (absolute) and ".llama/...". The second sed likely never matches. Consider removing it.

```diff
-          # Set OpenAI API key for container
+          # Resolve RHAIIS URL and API key for container
           sed -i "s|\${env\.RHAIIS_URL}|$RHAIIS_URL|g" run.yaml
           sed -i "s|\${env\.RHAIIS_API_KEY}|$RHAIIS_API_KEY|g" run.yaml
-
-          sed -i 's|db_path: tmp/|db_path: /app-root/.llama/distributions/|g' run.yaml
```
148-152: Replace fixed sleep with health/wait loop

Avoid flakes; poll the endpoint with retries/backoff.

```diff
-          sleep 20 # adjust depending on boot time
+          for i in {1..30}; do
+            if curl -sf http://localhost:8080/v1/models >/dev/null; then
+              echo "Services are ready"; break
+            fi
+            echo "Waiting for services... ($i)"; sleep 5
+          done
```
139-146: More robust container failure detection

docker compose ps parsing via grep can be brittle. Prefer status inspection.

```diff
-          if docker compose ps | grep -E 'Exit|exited|stopped'; then
+          if docker compose ps --format json | jq -e '.[] | select(.State|test("exited|dead|removing|paused|restarting"))' >/dev/null; then
             echo "Some services failed to start - showing logs:"
             docker compose logs
             exit 1
           else
             echo "All services started successfully"
           fi
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- README.md (3 hunks)
- docs/providers.md (1 hunks)
- tests/e2e/configs/run-rhaiis.yaml (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
README.md
25-25: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
163-163: Link text should be descriptive
(MD059, descriptive-link-text)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (1)
README.md (1)
127-127: RHAIIS row looks good; ensure provider_type matches llama-stack config

LGTM. Please verify remote::vllm is the exact llama-stack provider_type used in your run-rhaiis.yaml to keep docs and configs aligned.
```yaml
  inference:
  - provider_id: sentence-transformers # Can be any embedding provider
    provider_type: inline::sentence-transformers
    config: {}
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      url: http://${env.RHAIIS_URL}:8000/v1/
      api_token: ${env.RHAIIS_API_KEY}
      tls_verify: false
      max_tokens: 2048
  post_training:
```
🧩 Analysis chain
Verify vLLM provider config keys and URL scheme
- Key name: Is api_token the correct field for llama‑stack remote::vllm? Many providers expect api_key. Please confirm and align.
- URL: You embed http://${env.RHAIIS_URL}:8000/v1/. If RHAIIS_URL ever includes a scheme, this will double‑prefix. Consider a single env like RHAIIS_BASE_URL that includes scheme and port.
- tls_verify: With http scheme, TLS verification is irrelevant. Either switch to https and set tls_verify appropriately, or omit.
Suggested safer structure:

```diff
-      url: http://${env.RHAIIS_URL}:8000/v1/
-      api_token: ${env.RHAIIS_API_KEY}
-      tls_verify: false
+      base_url: ${env.RHAIIS_BASE_URL}  # e.g., https://host:8000/v1/
+      api_key: ${env.RHAIIS_API_KEY}
+      tls_verify: true  # if using https with valid certs
```

(Based on learnings)
🏁 Script executed:
```bash
#!/bin/bash
set -e
echo "---- remote::vllm references ----"
rg -n "remote::vllm"
echo "---- Provider registration or class definitions mentioning VLLM ----"
rg -nP "class\s+.*VLLM" || true
echo "---- Config key occurrences: api_token ----"
rg -n "api_token"
echo "---- Config key occurrences: api_key ----"
rg -n "api_key"
echo "---- tls_verify occurrences ----"
rg -n "tls_verify"
echo "---- base_url occurrences ----"
rg -n "base_url"
```

Length of output: 8563
🏁 Script executed:
```bash
#!/bin/bash
set -e
echo "---- LlamaStackConfiguration class ----"
rg -nP "class\s+LlamaStackConfiguration" -C5 src/models/config.py
echo "---- Fields around url, api_key, tls_verify ----"
sed -n '170,260p' src/models/config.py
echo "---- LlamaStackClient definition ----"
rg -nP "class\s+LlamaStackClient" -C5 src/client.py
echo "---- Client init parameters ----"
rg -n "base_url" -C3 src/client.py
rg -n "tls" -C3 src/client.py
```

Length of output: 3880
🏁 Script executed:
```bash
#!/bin/bash
set -e
echo "---- src/client.py (first 200 lines) ----"
sed -n '1,200p' src/client.py
```

Length of output: 2102
Align vLLM provider config with code schema
- Use api_key (not api_token) – LlamaStackConfiguration defines api_key.
- Set url: ${env.RHAIIS_BASE_URL} (e.g. https://host:8000/v1/) – maps to url in config.
- Remove tls_verify – AsyncLlamaStackClient doesn't support it; ensure HTTPS is used.
🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhaiis.yaml around lines 71 to 82, the vLLM provider
config uses the wrong key names and unsupported fields: replace api_token with
api_key to match LlamaStackConfiguration, change the URL variable from
${env.RHAIIS_URL} to ${env.RHAIIS_BASE_URL} ensuring it contains the full scheme
and path (e.g. https://host:8000/v1/), and remove the tls_verify field
(AsyncLlamaStackClient does not support it); keep other settings like max_tokens
as-is and ensure the URL uses HTTPS.
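For orientation, the corrected provider entry this prompt describes would look roughly as follows. This is a sketch, not the committed config: RHAIIS_BASE_URL is a proposed variable that does not yet exist in the PR's secrets.

```yaml
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      # RHAIIS_BASE_URL is assumed to carry scheme, host, port, and path,
      # e.g. https://host:8000/v1/ (proposed variable, not yet defined
      # in the repository secrets)
      url: ${env.RHAIIS_BASE_URL}
      api_key: ${env.RHAIIS_API_KEY}
      max_tokens: 2048
```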
```yaml
  scoring:
  - config: {}
    provider_id: basic
    provider_type: inline::basic
  - config: {}
    provider_id: llm-as-judge
    provider_type: inline::llm-as-judge
  - config:
      openai_api_key: '********'
    provider_id: braintrust
    provider_type: inline::braintrust
```
Braintrust scoring configured with placeholder secret
This likely fails provider initialization during startup. Either supply a real secret via env substitution or remove the provider from tests.
```diff
-  - config:
-      openai_api_key: '********'
-    provider_id: braintrust
-    provider_type: inline::braintrust
+  # Braintrust disabled in E2E until a key is provided
+  # - config:
+  #     openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY}
+  #   provider_id: braintrust
+  #   provider_type: inline::braintrust
```
🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhaiis.yaml around lines 95 to 105 the Braintrust
scoring provider is configured with a placeholder API key ('********') which
will cause provider initialization to fail; either replace the literal
placeholder with an environment-variable substitution (e.g. read from an env var
like ${BRAINTRUST_API_KEY} or the repo's config interpolation mechanism) so a
real secret is injected at runtime, or remove the entire Braintrust provider
block from the scoring list in this YAML so tests don't attempt to initialize
it.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
```yaml
      - name: Test RHAIIS connectivity
        env:
          RHAIIS_URL: ${{ secrets.RHAIIS_URL }}
          RHAIIS_API_KEY: ${{ secrets.RHAIIS_API_KEY }}
        run: |
          curl ${RHAIIS_URL}:8000/v1/models -H "Authorization: Bearer ${RHAIIS_API_KEY}"
```
Make the connectivity probe fail fast on HTTP errors.
Right now curl ${RHAIIS_URL}:8000/v1/models will exit 0 even if the endpoint returns a 401/500, so the workflow keeps going with a broken configuration. Add -f (or check the status code) and quote the URL to ensure the step actually fails on bad responses.
```diff
-          curl ${RHAIIS_URL}:8000/v1/models -H "Authorization: Bearer ${RHAIIS_API_KEY}"
+          curl -f "${RHAIIS_URL}:8000/v1/models" \
+            -H "Authorization: Bearer ${RHAIIS_API_KEY}"
```
🤖 Prompt for AI Agents
.github/workflows/e2e_tests_rhaiis.yaml around lines 117 to 123: the curl
connectivity probe currently won’t fail on HTTP errors and can return success on
4xx/5xx; update the run command to use curl -f and quote the URL (e.g. curl -f
"${RHAIIS_URL}:8000/v1/models" -H "Authorization: Bearer ${RHAIIS_API_KEY}") so
the step fails on non-2xx responses; alternatively check and exit non-zero based
on the HTTP status code if you prefer explicit handling.
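If explicit status handling is preferred, here is a sketch using standard curl flags (-s, -o, -w "%{http_code}"); the 2xx range check is illustrative:

```bash
# Capture only the HTTP status code and fail the step on anything non-2xx.
status=$(curl -s -o /dev/null -w "%{http_code}" \
  "${RHAIIS_URL}:8000/v1/models" \
  -H "Authorization: Bearer ${RHAIIS_API_KEY}")
if [ "$status" -lt 200 ] || [ "$status" -ge 300 ]; then
  echo "RHAIIS connectivity check failed with HTTP $status"
  exit 1
fi
```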
```yaml
          pip install uv
          uv sync
```
Fix the uv installation step.
uv isn’t distributed via PyPI, so pip install uv will fail and the workflow never reaches the tests. Please switch to the official installer (or whatever method you use elsewhere) and invoke the resulting binary explicitly.
```diff
-          pip install uv
-          uv sync
+          curl -LsSf https://astral.sh/uv/install.sh | sh
+          ~/.local/bin/uv sync
```
🤖 Prompt for AI Agents
.github/workflows/e2e_tests_rhaiis.yaml lines 158-159: the workflow attempts to
install `uv` via `pip install uv` which fails because `uv` is not on PyPI;
replace that step with the official installer used elsewhere in this repo (e.g.,
download or run the project's provided install script/release binary), ensure
the installer places the `uv` executable in PATH (or reference the downloaded
binary path), and then invoke the explicit `uv` binary (e.g., `uv sync` using
the installed binary path) so the subsequent test steps can run.
Force-pushed from b93388c to 46a68f5.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/features/conversations.feature (1)
11-26: Provider-specific assertion conflicts with provider-agnostic query.

Line 16 now sends a provider-agnostic query, but line 25 still validates against hardcoded "gpt-4-turbo" and "openai". If RHAIIS (or any other provider) handles this request, the test will fail because the actual last_used_model and last_used_provider won't match the expected values.

Consider one of these approaches:

- Make the assertion provider-agnostic (preferred for reusable tests):

```diff
-      """
-      {"last_used_model": "gpt-4-turbo", "last_used_provider": "openai", "message_count": 1}
-      """
+      """
+      {"message_count": 1}
+      """
```

  Then add a separate step to verify last_used_model and last_used_provider are non-empty strings without hardcoding values (see the sketch after this list).

- Keep the query provider-specific (if testing a specific provider):

```diff
-      {"query": "Say hello"}
+      {"query": "Say hello", "model": "gpt-4-turbo", "provider": "openai"}
```

The same issue affects scenarios on lines 28-46, 47-92, 94-111, 119-131, 133-150, and 163-175.
♻️ Duplicate comments (3)
tests/e2e/configs/run-rhaiis.yaml (3)
71-82: Unresolved: Verify vLLM provider config keys and URL scheme.

This issue was flagged in a previous review and remains unaddressed. The configuration uses incorrect field names and includes unsupported options:

- Line 79: Uses api_token instead of the expected api_key field
- Line 78: Hardcodes the http:// scheme and :8000 port, risking a double prefix if RHAIIS_URL includes a scheme
- Line 80: Sets tls_verify: false, which is not supported by AsyncLlamaStackClient and is irrelevant with the http:// scheme

Based on learnings from the previous review, apply this diff:

```diff
-      url: http://${env.RHAIIS_URL}:8000/v1/
-      api_token: ${env.RHAIIS_API_KEY}
-      tls_verify: false
+      url: ${env.RHAIIS_BASE_URL}  # e.g., https://host:8000/v1/
+      api_key: ${env.RHAIIS_API_KEY}
```

Ensure the environment variable contains the complete base URL with scheme and path.
83-90: Unresolved: Provider mismatch between huggingface-gpu and cpu device.

This issue was flagged in a previous review and remains unaddressed. The provider_type specifies GPU but device is set to cpu, causing a configuration mismatch.

Apply this diff to align provider and device:

```diff
-    provider_type: inline::huggingface-gpu
+    provider_type: inline::huggingface-cpu
```

Or if GPU usage is intended:

```diff
-      device: cpu
+      device: cuda
```
102-105: Unresolved: Braintrust scoring uses placeholder secret.

This issue was flagged in a previous review and remains unaddressed. The hardcoded placeholder '********' will cause provider initialization to fail.

Either provide a real secret via environment variable or remove this provider:

```diff
-  - config:
-      openai_api_key: '********'
-    provider_id: braintrust
-    provider_type: inline::braintrust
+  # Braintrust disabled until a valid API key is provided
+  # - config:
+  #     openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY}
+  #   provider_id: braintrust
+  #   provider_type: inline::braintrust
```
🧹 Nitpick comments (2)
docker-compose.yaml (1)
17-18: Consider making RHAIIS credentials optional with defaults.

Unlike BRAVE_SEARCH_API_KEY and TAVILY_SEARCH_API_KEY (lines 15-16), these variables don't use the :- syntax for optional defaults. If RHAIIS_URL or RHAIIS_API_KEY are unset, docker-compose will fail to start.

Apply this diff to allow running without RHAIIS credentials:

```diff
-      - RHAIIS_URL=${RHAIIS_URL}
-      - RHAIIS_API_KEY=${RHAIIS_API_KEY}
+      - RHAIIS_URL=${RHAIIS_URL:-}
+      - RHAIIS_API_KEY=${RHAIIS_API_KEY:-}
```

Alternatively, if RHAIIS is now a required provider, document this requirement in the README or add a note in the PR description.
README.md (1)
161-163: Improve link text for better accessibility.

The link text "here" is not descriptive and triggers a markdown lint warning (MD059). Screen readers and SEO benefit from descriptive link text.

Apply this diff:

```diff
 ## Supported providers

-For a comprehensive list of supported providers, take a look [here](docs/providers.md).
+For a comprehensive list of supported providers, see the [providers documentation](docs/providers.md).
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- README.md (3 hunks)
- docker-compose.yaml (1 hunks)
- docs/providers.md (1 hunks)
- tests/e2e/configs/run-rhaiis.yaml (1 hunks)
- tests/e2e/features/conversations.feature (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/e2e_tests_rhaiis.yaml
🧰 Additional context used
📓 Path-based instructions (1)
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Write E2E tests as Gherkin feature files for behave
Files:
tests/e2e/features/conversations.feature
🪛 markdownlint-cli2 (0.18.1)
README.md
25-25: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
163-163: Link text should be descriptive
(MD059, descriptive-link-text)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (2)
docs/providers.md (1)
60-65: LGTM! Clear RHAIIS provider documentation.

The Red Hat-specific provider table follows the established format, specifies the tested version and platform, and correctly identifies openai as the pip dependency for vLLM's OpenAI-compatible API.

tests/e2e/configs/run-rhaiis.yaml (1)
134-137: LGTM! Provider model ID is now correctly set.

The previously reported issue with provider_model_id: null has been addressed. The model is now properly configured with the correct identifier.
radofuchs left a comment
LGTM
Description
Added E2E tests for RHAIIS and updated the docs.
The E2E tests are scheduled to run every day at midnight UTC.
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
The E2E tests for RHAIIS show that inference works, confirming the provider is supported.
Summary by CodeRabbit
Documentation
Tests
Chores