
@are-ces (Contributor) commented Aug 20, 2025

…nfigurations

Description

Added configuration examples (llama-stack) for inference (OpenAI, Llama 3.1 - RHEL AI vLLM, Granite - vLLM, Gemini, Qwen - vLLM) and vector stores (FAISS, pgvector).

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue # LCORE-169

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Documentation

    • Added a RAG Configuration guide with FAISS and new pgvector examples, expanded provider/tooling notes (vLLM, OpenAI, VertexAI, Ollama, Azure) and cross-references.
    • README: inserted RAG Configuration section (duplicated) and references docs/rag_guide.md (underscore).
  • Examples

    • Added runnable example configs for OpenAI+FAISS, OpenAI+pgvector, Gemini, vLLM (Granite, Llama+FAISS, Qwen3), and VertexAI with env-based creds and SQLite/Postgres stores.

coderabbitai bot (Contributor) commented Aug 20, 2025

Walkthrough

Adds duplicated RAG Configuration entries to README, expands docs/rag_guide.md with FAISS, pgvector, provider/tooling notes and examples, and adds multiple new example YAMLs demonstrating OpenAI, pgvector, FAISS, vLLM (Granite, Llama, Qwen3) and Gemini setups. All edits are documentation and examples only.

Changes

Changes by cohort / file(s):

  • README RAG section (README.md): Inserts two identical “RAG Configuration” blocks linking to docs/rag_guide.md and listing example configs; both blocks reference docs/rag_guide.md (underscore).
  • RAG guide updates (docs/rag_guide.md): Adds a “### FAISS Example” header, a new pgvector example (installation, provider/vector_db config), vLLM (Llama 3.1) and OpenAI examples, and provider/tooling notes for Azure OpenAI/Ollama/vLLM Mistral; inserts cross-links to full config examples.
  • OpenAI RAG examples (examples/openai-faiss-run.yaml, examples/openai-pgvector-run.yaml): New example YAMLs: OpenAI inference with FAISS and with pgvector; embedding model setup, providers, rag-runtime tooling, vector_db registrations, and persistence/provider configurations.
  • vLLM RAG examples (examples/vllm-llama-faiss-run.yaml, examples/vllm-qwen3-run.yaml): New vLLM examples: Llama 3.1 + FAISS and Qwen3 via remote vLLM, with provider/auth details, SQLite/Postgres-backed stores, rag-runtime tooling, and environment-driven settings.
  • vLLM Granite, no RAG (examples/vllm-granite-run.yaml): New example configuring Granite via remote vLLM: extensive providers (safety, telemetry, eval, datasetio, scoring, files, post_training), sqlite stores, tool groups; explicitly notes no RAG.
  • Gemini example (examples/gemini-run.yaml): New Gemini example: Gemini as remote inference provider, Postgres-backed persistence, telemetry, scoring, rag-runtime/tooling, MCP integration, and server config.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Agent
  participant RAG_Tool as RAG Tool Runtime
  participant VecIO as Vector I/O
  participant Emb as Embedding Model
  participant LLM as Inference Provider

  User->>Agent: Query
  Agent->>RAG_Tool: Invoke RAG
  RAG_Tool->>Emb: Embed(query)
  RAG_Tool->>VecIO: Search(embedding, top_k)
  VecIO-->>RAG_Tool: Retrieved chunks
  RAG_Tool->>Agent: Contextualized prompt
  Agent->>LLM: Prompt + Context
  LLM-->>Agent: Completion
  Agent-->>User: Answer
  note over VecIO,Emb: VecIO provider: FAISS / pgvector
  note over LLM: Provider: vLLM / OpenAI / Gemini
sequenceDiagram
  autonumber
  actor Operator
  participant Stack as Llama Stack
  participant Remote as Remote Provider

  Operator->>Stack: Deploy/Configure model (url, token, env)
  Stack->>Remote: /v1/chat/completions (auth)
  Remote-->>Stack: Response
  Stack-->>Operator: Served completion
  note over Stack,Remote: Applies to Granite, Llama, Qwen3, Gemini examples

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • TamiTakamiya
  • tisnik
  • jrobertboos

Poem

I thump my paw—new configs bloom,
FAISS and pgvector light the room.
vLLM hums, Gemini gives a wink,
OpenAI maps the burrow I think.
Hop, commit—may merges link! 🐇✨


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 5

🧹 Nitpick comments (34)
examples/openai-faiss-run.yaml (5)

2-6: Clean up trailing spaces to satisfy YAML linters and CI.

Multiple lines have trailing spaces flagged by yamllint. Remove them to keep CI green.

-# 
+#
-# 
+#
-# - You will need an OpenAI API key␠
+# - You will need an OpenAI API key
-# - You can generate the vector index with the rag-content tool (https://github.com/lightspeed-core/rag-content)␠
+# - You can generate the vector index with the rag-content tool (https://github.com/lightspeed-core/rag-content)
- - model_id: gpt-test ␠
+ - model_id: gpt-test
-  provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2 ␠
+  provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2
-  - provider_id: sentence-transformers ␠
+  - provider_id: sentence-transformers
-  - provider_id: ocp-docs ␠
+  - provider_id: ocp-docs
-  - provider_id: rag-runtime ␠
+  - provider_id: rag-runtime
-  embedding_model: sentence-transformers/all-mpnet-base-v2 ␠
+  embedding_model: sentence-transformers/all-mpnet-base-v2

Also applies to: 18-18, 28-29, 32-32, 36-36, 59-59, 68-68, 81-81


72-78: Drop null-valued fields in tool_groups for brevity.

args and mcp_endpoint set to null are optional noise; removing them keeps the example minimal.

 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null

23-29: Parameterize user-specific absolute paths.

Hard-coded /home/USER paths make the example brittle. Prefer env vars with sensible defaults.

-  provider_id: sentence-transformers # This ID is a reference to 'providers.inference'
-  provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2 
+  provider_id: sentence-transformers # This ID is a reference to 'providers.inference'
+  provider_model_id: ${env.EMBEDDING_MODEL_PATH:=/opt/lightspeed/embedding_models/all-mpnet-base-v2}
@@
-        db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
+        db_path: ${env.VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/ocp_docs/faiss_store.db}

Also applies to: 62-65


83-83: Ensure newline at end of file.

POSIX newline at EOF avoids tooling diffs and linter errors.

-  vector_db_id: openshift-index  # This ID was defined during index generation
+  vector_db_id: openshift-index  # This ID was defined during index generation
+

1-16: Provide runnable verification steps for this example.

Given PR objectives, please include steps to build the FAISS index and validate the flow end-to-end.

Proposed outline:

  • Prereqs: export OPENAI_API_KEY; install rag-content per https://github.com/lightspeed-core/rag-content.
  • Build FAISS index: rag-content build --input ./docs --embedding-model sentence-transformers/all-mpnet-base-v2 --out ${VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/ocp_docs}/openshift-index
  • Start server: llamastack serve --config examples/openai-faiss-run.yaml
  • Sanity query: use the builtin RAG toolgroup via your agents API to ask a doc-grounded question and attach the vector_db_id openshift-index. Capture status 200, model output, and top-k doc IDs as artifacts.
    Do you want me to add a short TESTING.md section and wire it from README?
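For reference, the wiring those steps exercise boils down to this registration, shown here as a sketch assembled from the example's own values (the embedding_dimension field is an assumption; 768 is the output size of all-mpnet-base-v2):

vector_dbs:
- vector_db_id: openshift-index                             # must match the ID used at index-generation time
  provider_id: ocp-docs                                     # references the FAISS provider in providers.vector_io
  embedding_model: sentence-transformers/all-mpnet-base-v2  # must match the model used to build the index
  embedding_dimension: 768                                  # output size of all-mpnet-base-v2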
examples/vllm-qwen3-run.yaml (6)

3-5: Remove trailing spaces in comments.

Tiny lint cleanups on comments.

-# 
+#
-# 
+#

56-58: Avoid zero-width space as default service name.

Using \u200B yields an invisible string that’s hard to spot in logs. Prefer empty string or a descriptive default.

-      service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:=vllm-qwen3}"

27-31: Externalize the API token.

Replace placeholder with an environment variable to prevent accidental commits of secrets and ease deployment.

-      api_token: <add your api key>
+      api_token: ${env.QWEN_API_TOKEN}

39-45: Caution: example Postgres defaults are insecure.

Defaults include user/password “user/password”. Fine for local testing, but please add a warning comment to avoid using in production.

-        user: ${env.POSTGRES_USER:=user}
-        password: ${env.POSTGRES_PASSWORD:=password}
+        user: ${env.POSTGRES_USER:=user}       # Do not use default creds in production
+        password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env
@@
-  user: ${env.POSTGRES_USER:=user}
-  password: ${env.POSTGRES_PASSWORD:=password}
+  user: ${env.POSTGRES_USER:=user}       # Do not use default creds in production
+  password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env
@@
-  user: ${env.POSTGRES_USER:=user}
-  password: ${env.POSTGRES_PASSWORD:=password}
+  user: ${env.POSTGRES_USER:=user}       # Do not use default creds in production
+  password: ${env.POSTGRES_PASSWORD:=password}  # Replace via env

Also applies to: 46-52, 76-83, 84-90


108-108: Ensure newline at end of file.

Adds POSIX newline to silence linter.

 server:
   port: 8321
+

1-11: Add minimal verification steps for this config.

Given PR asks for testing artifacts, include a snippet on how to hit vLLM and confirm a basic completion path.

Suggested steps:

  • Set POSTGRES_* env vars and QWEN_API_TOKEN.
  • Start server: llamastack serve --config examples/vllm-qwen3-run.yaml
  • Smoke test: call the inference chat/completions endpoint with model_id qwen3-32b-maas; capture response time and a 200 result.
    I can contribute a short README section with curl examples if helpful.
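For orientation, a sketch of the model registration that smoke test exercises (the provider_id is assumed to match this file's remote::vllm provider):

models:
- model_id: qwen3-32b-maas   # the ID used in the smoke-test request
  provider_id: vllm          # assumption: the remote::vllm provider in this file
  provider_model_id: null    # or set explicitly to the exact model name served by vLLM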
examples/vllm-llama-faiss-run.yaml (4)

2-2: Strip trailing spaces.

Minor lint cleanups.

-# 
+#
-
+

Also applies to: 27-27


76-80: Add newline at EOF.

Silences linter and prevents spurious diffs.

   provider_id: rhel-db
   vector_db_id: rhel-docs
+

65-75: Optional: drop nulls in tool_groups.

Keeps the example concise without losing meaning.

 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null

1-8: Add quick verification guidance.

Please include minimal steps to verify RAG wiring with FAISS on vLLM.

Proposed:

  • Start vLLM at http://localhost:8000/v1 with Llama 3.1 Instruct.
  • Export VLLM_API_TOKEN (if enforced) and VECTOR_DB_PATH; build FAISS index matching vector_db_id rhel-docs.
  • llamastack serve --config examples/vllm-llama-faiss-run.yaml
  • Query via agents API using builtin::rag; collect a 200 result and top-k references as artifacts.
examples/gemini-run.yaml (2)

2-2: Trim trailing spaces in comments.

Minor lint-only fix.

-# 
+#
-# 
+#

Also applies to: 5-5


23-29: Provide minimal TESTING steps inline with this example.

Given the PR checklist request, please add a short “Testing” comment block or README pointer.

Proposed steps:

  • export GEMINI_API_KEY and Postgres env vars (LLAMA_STACK_POSTGRES_*).
  • Start server: llamastack serve --config examples/gemini-run.yaml
  • Send a simple prompt with one of the configured model_ids; attach artifacts (HTTP 200, response JSON snippet).
    I can draft the commands if you confirm the intended REST paths.

Also applies to: 37-43, 44-50, 77-83

examples/openai-pgvector-run.yaml (5)

2-2: Remove trailing spaces to satisfy yamllint.

Multiple trailing-space instances flagged; remove to keep CI clean.

-# 
+#
-# 
+#
-
+
-  - provider_id: pgvector-example ␠
+  - provider_id: pgvector-example
-  provider_id: pgvector-example ␠
+  provider_id: pgvector-example

Also applies to: 7-7, 29-29, 58-58, 84-84


23-29: Parameterize embedding model path.

Avoid user-specific absolute paths to improve portability.

-  provider_model_id: /home/USER/lightspeed-stack/embedding_models/all-mpnet-base-v2
+  provider_model_id: ${env.EMBEDDING_MODEL_PATH:=/opt/lightspeed/embedding_models/all-mpnet-base-v2}

75-80: Drop nulls in tool_groups to simplify.

Optional but keeps example cleaner.

 tool_groups:
 - provider_id: rag-runtime
   toolgroup_id: builtin::rag
-  args: null
-  mcp_endpoint: null

81-87: Add newline at EOF and keep vector_dbs block consistent.

Also, slight consistency tweak with env-parameterized connection earlier.

   provider_id: pgvector-example 
   # A unique ID that becomes the PostgreSQL table name, prefixed with 'vector_store_'.
   # e.g., 'rhdocs' will create the table 'vector_store_rhdocs'.
   vector_db_id: rhdocs
+

1-17: Request verification: end-to-end pgvector smoke test steps.

Please include in the docs a brief test plan and artifacts.

Suggested:

  • Ensure PostgreSQL with pgvector extension: CREATE EXTENSION IF NOT EXISTS vector;
  • Export PGVECTOR_* envs and OPENAI_API_KEY.
  • Create table via the stack (vector_db_id rhdocs => table vector_store_rhdocs).
  • Generate embeddings and insert rows using the rag tool, referencing embedding_model sentence-transformers/all-mpnet-base-v2.
  • Start: llamastack serve --config examples/openai-pgvector-run.yaml
  • Query via agents API using builtin::rag, expect 200 and top-k matches. Attach logs and a sample response.
    I can supply the exact SQL schema expected by llama-stack if you want it included in rag_guide.md.

Also applies to: 57-69, 81-87
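For orientation, a sketch of the registration those steps target (values taken from this example; embedding_dimension: 768 is an assumption matching all-mpnet-base-v2):

vector_dbs:
- vector_db_id: rhdocs                                      # becomes PostgreSQL table vector_store_rhdocs
  provider_id: pgvector-example                             # references the pgvector provider in providers.vector_io
  embedding_model: sentence-transformers/all-mpnet-base-v2  # must match the model used for inserted embeddings
  embedding_dimension: 768                                  # width of the vector(n) column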

examples/vllm-granite-run.yaml (7)

3-5: Fix YAML lint issues: trailing spaces and missing EOF newline.

Remove trailing spaces (Lines 3, 5) and add a newline at end of file to satisfy yamllint.

-# 
+#

-# 
+#
-server:
-  port: 8321
+server:
+  port: 8321
+

Also applies to: 148-148


117-120: Remove unusable model-context-protocol provider or supply required config.

remote::model-context-protocol has an empty config and is not referenced by any tool_group. This will likely error at runtime if loaded. Remove it or provide the required endpoint configuration.

Option A — remove unused provider:

-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}

Option B — keep it and add a URL (example):

-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}
+  - provider_id: model-context-protocol
+    provider_type: remote::model-context-protocol
+    config:
+      url: ${env.MCP_ENDPOINT_URL}
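With Option B, a tool group would also need to reference the provider for it to be usable. A hedged sketch (the toolgroup_id is illustrative, and the mcp_endpoint/uri shape follows the tool_groups fields seen elsewhere in these examples):

tool_groups:
- toolgroup_id: mcp::example            # illustrative name
  provider_id: model-context-protocol
  mcp_endpoint:
    uri: ${env.MCP_ENDPOINT_URL}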

28-30: Confirm vLLM URL points to the OpenAI-compatible /v1 endpoint and token behavior.

Ensure ${env.VLLM_URL} includes the /v1 prefix (e.g., http://host:8000/v1/) expected by remote::vllm, and verify whether your deployment actually requires api_token. If not, consider omitting it to avoid confusion.

Would you like me to add a brief comment in-file clarifying that VLLM_URL must target the OpenAI-compatible /v1 endpoint?
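Something like the following would make the expectation explicit (the env default shown is illustrative):

config:
  # VLLM_URL must point at vLLM's OpenAI-compatible API, i.e. include the /v1 prefix.
  url: ${env.VLLM_URL:=http://localhost:8000/v1}
  # Omit api_token if your vLLM deployment does not enforce authentication.
  api_token: ${env.VLLM_API_TOKEN}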


131-134: Verify model registration: provider_model_id is null.

Some stacks expect provider_model_id to match the served model name (e.g., meta-llama/Llama-3.1-8B-Instruct) returned by the vLLM /v1/models endpoint. If null is acceptable with your version, OK; otherwise set it explicitly.

 models:
 - metadata: {}
   model_id: ${env.INFERENCE_MODEL}
   provider_id: granite
-  provider_model_id: null
+  provider_model_id: ${env.INFERENCE_MODEL}

40-47: Use provider-appropriate default storage paths for clarity.

Default paths reference “ollama” directories in an example meant for vLLM/Granite. Consider renaming to avoid confusion.

-        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/agents_store.db
+        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/agents_store.db
-        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/responses_store.db
+        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/responses_store.db
-      storage_dir: ${env.FILES_STORAGE_DIR:~/.llama/distributions/ollama/files}
+      storage_dir: ${env.FILES_STORAGE_DIR:~/.llama/distributions/vllm/files}
-        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/ollama}/files_metadata.db
+        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/vllm}/files_metadata.db

Also applies to: 92-95


114-116: If this is truly “no RAG”, drop rag runtime and tool group.

The config description says “no RAG”, but rag-runtime and a builtin::rag tool group are included. Remove both to keep the example minimal and avoid unused providers.

Remove rag runtime provider:

-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}

Remove rag tool group:

- - toolgroup_id: builtin::rag
-   provider_id: rag-runtime

Also applies to: 141-146


1-22: Provide quickstart test steps and required env vars for this example.

Add a brief, runnable checklist so users can validate the config quickly.

Proposed snippet to add as a comment block near the top:

# Quickstart (example)
# export VLLM_URL="http://localhost:8000/v1/"
# export INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct"  # or your served model
# optional: export VLLM_API_TOKEN="..."
# llama stack run examples/vllm-granite-run.yaml
# curl -s ${VLLM_URL}/models | jq .

Would you like me to submit a commit adding this quickstart block?

README.md (1)

462-462: Tighten wording: “configuration examples.”

Minor copy edit for readability and consistency.

-The following configurations are llama-stack config examples from production deployments:
+The following are llama-stack configuration examples from production deployments:
docs/rag_guide.md (4)

52-54: Fix capitalization typo (“LLama” → “Llama”).

-LLama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
+Llama Stack will then download the model to the `~/.cache/huggingface/hub` folder.

118-120: Unify list style to pass markdownlint (MD004).

Switch dashes to asterisks to match the rest of the document.

-- `id` (`text`): UUID identifier of the chunk
-- `document` (`jsonb`): json containing content and metadata associated with the embedding  
-- `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and must match the model's output size (e.g. 768 for `all-mpnet-base-v2`) 
+* `id` (`text`): UUID identifier of the chunk
+* `document` (`jsonb`): json containing content and metadata associated with the embedding
+* `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and must match the model's output size (e.g. 768 for `all-mpnet-base-v2`)

158-176: Clarify vLLM authentication expectations.

vllm-openai often runs without an auth token by default. Confirm whether your setup requires api_token and note it accordingly to prevent confusion.

I can add a one-line note: “If your vLLM endpoint does not require authentication, omit api_token.”

Also applies to: 204-206


257-257: Fix grammar: “were” not “where”.

-The RAG tool calls where not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
+The RAG tool calls were not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 45eb299 and 1b72b94.

📒 Files selected for processing (8)
  • README.md (2 hunks)
  • docs/rag_guide.md (3 hunks)
  • examples/gemini-run.yaml (1 hunks)
  • examples/openai-faiss-run.yaml (1 hunks)
  • examples/openai-pgvector-run.yaml (1 hunks)
  • examples/vllm-granite-run.yaml (1 hunks)
  • examples/vllm-llama-faiss-run.yaml (1 hunks)
  • examples/vllm-qwen3-run.yaml (1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
examples/vllm-granite-run.yaml

[error] 3-3: trailing spaces (trailing-spaces)
[error] 5-5: trailing spaces (trailing-spaces)
[error] 148-148: no new line character at the end of file (new-line-at-end-of-file)

examples/vllm-qwen3-run.yaml

[error] 3-3: trailing spaces (trailing-spaces)
[error] 5-5: trailing spaces (trailing-spaces)
[error] 108-108: no new line character at the end of file (new-line-at-end-of-file)

examples/openai-pgvector-run.yaml

[error] 2-2: trailing spaces (trailing-spaces)
[error] 7-7: trailing spaces (trailing-spaces)
[error] 29-29: trailing spaces (trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2 (indentation)
[warning] 38-38: wrong indentation: expected 8 but found 6 (indentation)
[warning] 44-44: wrong indentation: expected 8 but found 6 (indentation)
[warning] 45-45: wrong indentation: expected 10 but found 8 (indentation)
[warning] 48-48: wrong indentation: expected 10 but found 8 (indentation)
[warning] 55-55: wrong indentation: expected 8 but found 6 (indentation)
[error] 58-58: trailing spaces (trailing-spaces)
[warning] 61-61: wrong indentation: expected 8 but found 6 (indentation)
[warning] 67-67: wrong indentation: expected 10 but found 8 (indentation)
[error] 84-84: trailing spaces (trailing-spaces)
[error] 87-87: no new line character at the end of file (new-line-at-end-of-file)

examples/vllm-llama-faiss-run.yaml

[error] 2-2: trailing spaces (trailing-spaces)
[error] 27-27: trailing spaces (trailing-spaces)
[warning] 29-29: wrong indentation: expected 4 but found 2 (indentation)
[warning] 36-36: wrong indentation: expected 8 but found 6 (indentation)
[warning] 43-43: wrong indentation: expected 8 but found 6 (indentation)
[warning] 44-44: wrong indentation: expected 10 but found 8 (indentation)
[warning] 47-47: wrong indentation: expected 10 but found 8 (indentation)
[warning] 54-54: wrong indentation: expected 8 but found 6 (indentation)
[warning] 60-60: wrong indentation: expected 8 but found 6 (indentation)
[warning] 61-61: wrong indentation: expected 10 but found 8 (indentation)
[error] 80-80: no new line character at the end of file (new-line-at-end-of-file)

examples/gemini-run.yaml

[error] 2-2: trailing spaces (trailing-spaces)
[error] 5-5: trailing spaces (trailing-spaces)

examples/openai-faiss-run.yaml

[error] 2-2: trailing spaces (trailing-spaces)
[error] 6-6: trailing spaces (trailing-spaces)
[error] 18-18: trailing spaces (trailing-spaces)
[error] 28-28: trailing spaces (trailing-spaces)
[error] 29-29: trailing spaces (trailing-spaces)
[warning] 31-31: wrong indentation: expected 4 but found 2 (indentation)
[error] 32-32: trailing spaces (trailing-spaces)
[error] 36-36: trailing spaces (trailing-spaces)
[warning] 39-39: wrong indentation: expected 8 but found 6 (indentation)
[warning] 45-45: wrong indentation: expected 8 but found 6 (indentation)
[warning] 46-46: wrong indentation: expected 10 but found 8 (indentation)
[warning] 49-49: wrong indentation: expected 10 but found 8 (indentation)
[warning] 56-56: wrong indentation: expected 8 but found 6 (indentation)
[error] 59-59: trailing spaces (trailing-spaces)
[warning] 62-62: wrong indentation: expected 8 but found 6 (indentation)
[warning] 63-63: wrong indentation: expected 10 but found 8 (indentation)
[error] 68-68: trailing spaces (trailing-spaces)
[error] 81-81: trailing spaces (trailing-spaces)
[error] 83-83: no new line character at the end of file (new-line-at-end-of-file)

🪛 LanguageTool
README.md

[grammar] ~464-~464: There might be a mistake here. Context: "...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe..." (QB_NEW_EN)
[grammar] ~465-~465: There might be a mistake here. Context: "...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml..." (QB_NEW_EN)
[grammar] ~468-~468: There might be a mistake here. Context: "...le](examples/gemini-run.yaml) > [!NOTE] > RAG functionality is not tested fo..." (QB_NEW_EN)

docs/rag_guide.md

[grammar] ~122-~122: There might be a mistake here. Context: "...768 for all-mpnet-base-v2) > [!NOTE] > The vector_db_id (e.g. rhdocs) is used..." (QB_NEW_EN)
[grammar] ~178-~178: There might be a mistake here. Context: "...ion options, see the vLLM documentation: > [vLLM: Tool Calling](https://docs.vllm..." (QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/rag_guide.md

118-118: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)
119-119: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)
120-120: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (3)
examples/gemini-run.yaml (1)

25-29: Standardize on ${env.VAR} for all environment-driven YAML values

Other examples in this PR consistently use the ${env.*} syntax. Mixing in bare ${VAR} may result in literal strings if your loader doesn’t support that form. Please verify whether your config loader handles both styles; if not, switch all bare references to ${env.VAR}.

This applies to:

  • examples/gemini-run.yaml
    • Lines 25–29, 54–56, 75–76, 84–98, 105–112

Example change:

-  provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
+  provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}

If bare ${VAR} is intentionally supported, please add a comment clarifying that behavior.

README.md (2)

462-467: All referenced example files are present

Verified that the following files exist in the examples/ directory, so the README links won’t 404:

  • examples/vllm-granite-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/gemini-run.yaml

No further action required.


38-38: No duplicate “RAG Configuration” entries found

Verification confirms exactly one # RAG Configuration header (line 455) and one TOC entry (line 38). No further action needed.

Comment on lines +103 to +104
See the full working [config example](examples/openai-faiss-run.yaml) for more details.


⚠️ Potential issue

Fix relative links to examples from docs/ to examples/.

From docs/, the correct relative path is ../examples/...

Update FAISS example reference:

-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.

Update pgvector example reference:

-See the full working [config example](examples/openai-pgvector-run.yaml) for more details.
+See the full working [config example](../examples/openai-pgvector-run.yaml) for more details.

Update vLLM Llama example reference:

-See the full working [config example](examples/vllm-llama-faiss-run.yaml) for more details.
+See the full working [config example](../examples/vllm-llama-faiss-run.yaml) for more details.

Update OpenAI example reference:

-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.

Also applies to: 152-153, 208-209, 241-242

🤖 Prompt for AI Agents
docs/rag_guide.md lines 103-104 (and also update similar occurrences at 152-153,
208-209, 241-242): the example links currently point to examples/... which are
incorrect when referenced from docs/; change each link to use the correct
relative path prefix ../examples/ (e.g., ../examples/openai-faiss-run.yaml) for
the FAISS, pgvector, vLLM Llama, and OpenAI example references so they resolve
correctly from the docs directory.

Comment on lines +31 to +41
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}

  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY}

  agents:

🛠️ Refactor suggestion

Fix indentation to 4 spaces per level under providers and nested maps.

YAML lint expects 4-space indentation. Current blocks under providers/config/stores are under-indented by 2 spaces.

-providers:
-  inference:
-  - provider_id: sentence-transformers 
-    provider_type: inline::sentence-transformers
-    config: {}
-
-  - provider_id: openai 
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
-
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
-
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
-
-  vector_io:
-  - provider_id: ocp-docs 
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
-        namespace: null
-
-  tool_runtime:
-  - provider_id: rag-runtime 
-    provider_type: inline::rag-runtime
-    config: {}
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+
+        - provider_id: openai
+          provider_type: remote::openai
+          config:
+              api_key: ${env.OPENAI_API_KEY}
+
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db
+
+    safety:
+        - provider_id: llama-guard
+          provider_type: inline::llama-guard
+          config:
+              excluded_categories: []
+
+    vector_io:
+        - provider_id: ocp-docs
+          provider_type: inline::faiss
+          config:
+              kvstore:
+                  type: sqlite
+                  db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
+                  namespace: null
+
+    tool_runtime:
+        - provider_id: rag-runtime
+          provider_type: inline::rag-runtime
+          config: {}

Also applies to: 45-51, 56-66, 67-71

🧰 Tools
🪛 YAMLlint (1.37.1)

[warning] 31-31: wrong indentation: expected 4 but found 2 (indentation)
[error] 32-32: trailing spaces (trailing-spaces)
[error] 36-36: trailing spaces (trailing-spaces)
[warning] 39-39: wrong indentation: expected 8 but found 6 (indentation)

🤖 Prompt for AI Agents
In examples/openai-faiss-run.yaml around lines 31 to 41 (and also apply to
45-51, 56-66, 67-71), the YAML blocks under "inference", provider entries,
"config", and other nested maps use 2-space indentation; update these to use 4
spaces per indentation level consistently (each nested key/value should be
indented by 4 spaces from its parent) so that provider entries, their
provider_type and config keys and values, and any stores/nested maps align with
YAML lint expectations.

Comment on lines +31 to +39
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}
  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY}


🛠️ Refactor suggestion

Indentation: switch to 4 spaces per nesting level.

Aligns with yamllint expectations and your other examples.

-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
-
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
-
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
-
-  vector_io:
-  - provider_id: pgvector-example 
-    provider_type: remote::pgvector
-    config:
-      host: localhost
-      port: 5432
-      db: pgvector_example # PostgreSQL database (psql -d pgvector_example)
-      user: lightspeed # PostgreSQL user
-      password: empty
-      kvstore:
-        type: sqlite
-        db_path: .llama/distributions/pgvector/pgvector_registry.db
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+        - provider_id: openai
+          provider_type: remote::openai
+          config:
+              api_key: ${env.OPENAI_API_KEY}
+
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db
+
+    safety:
+        - provider_id: llama-guard
+          provider_type: inline::llama-guard
+          config:
+              excluded_categories: []
+
+    vector_io:
+        - provider_id: pgvector-example
+          provider_type: remote::pgvector
+          config:
+              host: ${env.PGVECTOR_HOST:=localhost}
+              port: ${env.PGVECTOR_PORT:=5432}
+              db: ${env.PGVECTOR_DB:=pgvector_example}  # PostgreSQL database (psql -d pgvector_example)
+              user: ${env.PGVECTOR_USER:=lightspeed}    # PostgreSQL user
+              password: ${env.PGVECTOR_PASSWORD:=empty}
+              kvstore:
+                  type: sqlite
+                  db_path: .llama/distributions/pgvector/pgvector_registry.db

Also applies to: 44-50, 51-56, 61-69

🧰 Tools
🪛 YAMLlint (1.37.1)

[warning] 31-31: wrong indentation: expected 4 but found 2 (indentation)
[warning] 38-38: wrong indentation: expected 8 but found 6 (indentation)

🤖 Prompt for AI Agents
In examples/openai-pgvector-run.yaml around lines 31 to 39 (and similarly for
44-50, 51-56, 61-69), the YAML uses 2-space or inconsistent indentation; update
the file to use 4 spaces per nesting level consistently (e.g., under "inference"
indent each list item and their nested keys by 4 spaces, and ensure nested
mapping entries like "config:" and "api_key:" are indented an additional 4
spaces relative to their parent) so it conforms to yamllint expectations and
matches other examples.

Comment on lines +29 to +37
  inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      url: http://localhost:8000/v1/
      api_token: key

🛠️ Refactor suggestion

Normalize indentation to 4 spaces and tweak vLLM URL/token.

  • Align to 4-space indent per yamllint.
  • Prefer URL without trailing slash to avoid double-slash issues with some clients.
  • Externalize api_token.
-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: vllm
-    provider_type: remote::vllm
-    config:
-      url: http://localhost:8000/v1/
-      api_token: key
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+        - provider_id: vllm
+          provider_type: remote::vllm
+          config:
+              url: http://localhost:8000/v1
+              api_token: ${env.VLLM_API_TOKEN}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db
@@
-  vector_io:
-  - provider_id: rhel-db
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/vector_dbs/rhel_index/faiss_store.db
-        namespace: null
+    vector_io:
+        - provider_id: rhel-db
+          provider_type: inline::faiss
+          config:
+              kvstore:
+                  type: sqlite
+                  db_path: ${env.VECTOR_DB_PATH:=/opt/lightspeed/vector_dbs/rhel_index/faiss_store.db}
+                  namespace: null

Also applies to: 43-49, 56-64

🧰 Tools
🪛 YAMLlint (1.37.1)

[warning] 29-29: wrong indentation: expected 4 but found 2 (indentation)
[warning] 36-36: wrong indentation: expected 8 but found 6 (indentation)

🤖 Prompt for AI Agents
In examples/vllm-llama-faiss-run.yaml around lines 29-37 (and likewise adjust
blocks at 43-49 and 56-64), normalize indentation to 4 spaces for all YAML
entries, remove the trailing slash from the vllm url (use http://localhost:8000)
to avoid double-slash issues, and externalize the api_token (replace the inline
token with a reference to an environment variable or secret placeholder such as
${VLLM_API_TOKEN} or a secrets key) so no secret is stored directly in the file.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
README.md (1)

455-458: Fixed: RAG guide link now points to docs/rag_guide.md

The previously broken link has been corrected to use the underscore path.

🧹 Nitpick comments (2)
README.md (2)

459-466: Align list style and capitalization; minor polish

  • Use asterisks for unordered lists here to satisfy MD004 and keep style consistent with the rest of the README.
  • Capitalize “Llama Stack” for consistency with prior usage in the document.

Apply this diff:

-## Example configurations for inference
-
-The following configurations are llama-stack config examples from production deployments:
-
-- [Granite on vLLM example](examples/vllm-granite-run.yaml)
-- [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
-- [Gemini example](examples/gemini-run.yaml)
+## Example configurations for inference
+
+The following configurations are Llama Stack config examples from production deployments:
+
+* [Granite on vLLM example](examples/vllm-granite-run.yaml)
+* [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
+* [Gemini example](examples/gemini-run.yaml)

467-468: Optional wording tweak

Current wording is fine; if you want to be slightly more formal, consider “validated” instead of “tested.”

-> RAG functionality is **not tested** for these configurations.
+> RAG functionality is **not validated** for these configurations.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1b72b94 and 1d84fcf.

📒 Files selected for processing (1)
  • README.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[grammar] ~463-~463: There might be a mistake here. Context: "...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe..." (QB_NEW_EN)
[grammar] ~464-~464: There might be a mistake here. Context: "...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml..." (QB_NEW_EN)
[grammar] ~467-~467: There might be a mistake here. Context: "...le](examples/gemini-run.yaml) > [!NOTE] > RAG functionality is not tested fo..." (QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
README.md

463-463: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)
464-464: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)
465-465: Unordered list style: Expected: asterisk; Actual: dash (MD004, ul-style)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (2)
README.md (2)

38-38: TOC entry for RAG looks correct

Anchor matches the section header and placement in TOC is consistent.


455-466: Verified README references – all linked files exist

I’ve confirmed that:

  • docs/rag_guide.md
  • examples/vllm-granite-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/gemini-run.yaml

are all present in the repository. No broken links found.

@tisnik (Contributor) left a comment

LGTM thank you very much

@matysek (Contributor) left a comment

/lgtm

@Akrog left a comment

PR looks good in general; my only concern is that you are using the env var EXPORTED_PORT but then assuming afterwards that it has been set to 8000, which doesn't feel quite right.
The rest of the comments are just nits.

…nfigurations

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Added VertexAI configuration example
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/rag_guide.md (1)

1-357: Add CI Checks for Documentation and Example Config Validity

The new CI script now verifies that all example links in docs/rag_guide.md exist, that each example YAML parses correctly, and that every declared API has a corresponding provider. It successfully ran but flagged missing provider definitions in two example files. Please address these before merging:

• examples/vertexai-run.yaml
– Declares apis: ['safety'] but has no providers.safety entry. Add a safety provider config or remove the API declaration.
• examples/vllm-granite-run.yaml
– Declares apis: ['vector_io'] but has no providers.vector_io entry. Add a vector_io provider config or remove the API declaration.

Once these are fixed, the CI step will pass and prevent future regressions.
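For the vertexai-run.yaml case, a minimal sketch of the missing safety provider, mirroring the llama-guard block the other examples already use (whether VertexAI deployments need different settings is not verified here):

providers:
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config:
      excluded_categories: []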

♻️ Duplicate comments (7)
examples/vllm-llama-faiss-run.yaml (2)

36-38: URL normalization: remove trailing slash to avoid double-slash issues when clients join paths.

Change http://localhost:8000/v1/ → http://localhost:8000/v1 as shown in the diff above.


29-36: Reindent to 4 spaces per level under providers; current indentation violates yamllint and harms readability.

Also remove the trailing slash in vLLM URL and externalize api_token. This mirrors prior feedback.

Apply:

-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: vllm
-    provider_type: remote::vllm
-    config:
-      url: http://localhost:8000/v1/
-      api_token: key
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+        - provider_id: vllm
+          provider_type: remote::vllm
+          config:
+              url: ${env.VLLM_URL:=http://localhost:8000/v1}
+              api_token: ${env.VLLM_API_TOKEN}

-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db

-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+        - provider_id: llama-guard
+          provider_type: inline::llama-guard
+          config:
+              excluded_categories: []

-  vector_io:
-  - provider_id: rhel-db
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/vector_dbs/rhel_index/faiss_store.db
-        namespace: null
+    vector_io:
+        - provider_id: rhel-db
+          provider_type: inline::faiss
+          config:
+              kvstore:
+                  type: sqlite
+                  db_path: ${env.VECTOR_DB_PATH:=/home/USER/vector_dbs/rhel_index/faiss_store.db}
+                  namespace: null

Also applies to: 39-49, 50-55, 56-64, 76-80

examples/openai-pgvector-run.yaml (1)

31-39: Normalize indentation to 4 spaces and parameterize pgvector connection.

This mirrors earlier feedback on inconsistent indentation. Also remove trailing space in provider_id and externalize pgvector credentials.

-providers:
-  inference:
-  - provider_id: sentence-transformers
-    provider_type: inline::sentence-transformers
-    config: {}
-  - provider_id: openai
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+        - provider_id: openai
+          provider_type: remote::openai
+          config:
+              api_key: ${env.OPENAI_API_KEY}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db
@@
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+        - provider_id: llama-guard
+          provider_type: inline::llama-guard
+          config:
+              excluded_categories: []
@@
-  vector_io:
-  - provider_id: pgvector-example 
-    provider_type: remote::pgvector
-    config:
-      host: localhost
-      port: 5432
-      db: pgvector_example # PostgreSQL database (psql -d pgvector_example)
-      user: lightspeed # PostgreSQL user
-      password: empty
-      kvstore:
-        type: sqlite
-        db_path: .llama/distributions/pgvector/pgvector_registry.db
+    vector_io:
+        - provider_id: pgvector-example
+          provider_type: remote::pgvector
+          config:
+              host: ${env.PGVECTOR_HOST:=localhost}
+              port: ${env.PGVECTOR_PORT:=5432}
+              db: ${env.PGVECTOR_DB:=pgvector_example}   # PostgreSQL database
+              user: ${env.PGVECTOR_USER:=lightspeed}     # PostgreSQL user
+              password: ${env.PGVECTOR_PASSWORD:=empty}
+              kvstore:
+                  type: sqlite
+                  db_path: .llama/distributions/pgvector/pgvector_registry.db
@@
-  provider_id: pgvector-example 
+  provider_id: pgvector-example

Also applies to: 40-50, 51-56, 57-69, 81-87

examples/openai-faiss-run.yaml (1)

31-41: Reindent to 4 spaces per level and remove inline style issues.

Matches earlier feedback.

Apply:

-providers:
-  inference:
-  - provider_id: sentence-transformers 
-    provider_type: inline::sentence-transformers
-    config: {}
-
-  - provider_id: openai 
-    provider_type: remote::openai
-    config:
-      api_key: ${env.OPENAI_API_KEY}
+providers:
+    inference:
+        - provider_id: sentence-transformers
+          provider_type: inline::sentence-transformers
+          config: {}
+        - provider_id: openai
+          provider_type: remote::openai
+          config:
+              api_key: ${env.OPENAI_API_KEY}
@@
-  agents:
-  - provider_id: meta-reference
-    provider_type: inline::meta-reference
-    config:
-      persistence_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/agents_store.db
-      responses_store:
-        type: sqlite
-        db_path: .llama/distributions/ollama/responses_store.db
+    agents:
+        - provider_id: meta-reference
+          provider_type: inline::meta-reference
+          config:
+              persistence_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/agents_store.db
+              responses_store:
+                  type: sqlite
+                  db_path: .llama/distributions/ollama/responses_store.db
@@
-  safety:
-  - provider_id: llama-guard
-    provider_type: inline::llama-guard
-    config:
-      excluded_categories: []
+    safety:
+        - provider_id: llama-guard
+          provider_type: inline::llama-guard
+          config:
+              excluded_categories: []
@@
-  vector_io:
-  - provider_id: ocp-docs 
-    provider_type: inline::faiss
-    config:
-      kvstore:
-        type: sqlite
-        db_path: /home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db
-        namespace: null
+    vector_io:
+        - provider_id: ocp-docs
+          provider_type: inline::faiss
+          config:
+              kvstore:
+                  type: sqlite
+                  db_path: ${env.VECTOR_DB_PATH:=/home/USER/lightspeed-stack/vector_dbs/ocp_docs/faiss_store.db}
+                  namespace: null
@@
-  tool_runtime:
-  - provider_id: rag-runtime 
-    provider_type: inline::rag-runtime
-    config: {}
+    tool_runtime:
+        - provider_id: rag-runtime
+          provider_type: inline::rag-runtime
+          config: {}

Also applies to: 45-51, 52-57, 58-66, 67-71

README.md (1)

457-457: Broken-link fix from earlier review is correctly applied.

docs/rag_guide.md is now referenced with underscore; previous rag-guide typo resolved.

docs/rag_guide.md (2)

103-104: Fix relative links to examples/ from docs/.

Links must go up one level from docs/ to examples/. This mirrors the earlier review; applying here keeps the doc navigable.

-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.
@@
-See the full working [config example](examples/openai-pgvector-run.yaml) for more details.
+See the full working [config example](../examples/openai-pgvector-run.yaml) for more details.
@@
-See the full working [config example](examples/vllm-llama-faiss-run.yaml) for more details.
+See the full working [config example](../examples/vllm-llama-faiss-run.yaml) for more details.
@@
-See the full working [config example](examples/openai-faiss-run.yaml) for more details.
+See the full working [config example](../examples/openai-faiss-run.yaml) for more details.

Also applies to: 153-154, 206-207, 239-240


239-240: Duplicate relative link fix repeated here; will update with the others.

Covered in the earlier multi-hunk diff.

🧹 Nitpick comments (28)
examples/vllm-qwen3-run.yaml (4)

56-59: Suspicious zero‑width character in service_name default; set a safe explicit default.

The current default contains a zero‑width char (\u200B), which is easy to miss and can complicate debugging. Prefer a readable default.

-      service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:=vllm-qwen3}"

90-95: Explicit provider_model_id recommended for vLLM to avoid ambiguity.

Leaving provider_model_id null relies on the vLLM server’s default. Recommend explicitly setting it to the served model name on your vLLM instance.

-  provider_model_id: null
+  # Use the exact model name as loaded by your vLLM server (example shown):
+  provider_model_id: ${env.QWEN3_VLLM_MODEL:=Qwen3-32B-Instruct}

If the server enforces a single model and ignores this, keep it but confirm behavior.


52-59: Ensure telemetry SQLite path exists at runtime.

sqlite_db_path points under /tmp/.llama/...; create the parent directory on startup, or allow overriding via env.

-      sqlite_db_path: ${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/trace_store.db
+      sqlite_db_path: ${env.SQLITE_STORE_DIR:=/tmp/.llama/distributions/starter}/trace_store.db
+      # Ensure ${env.SQLITE_STORE_DIR} exists before booting the stack.

23-31: Consistency: normalize empty provider lists and toolgroups formatting.

  • Prefer explicit empty lists ([]) where shown, but keep indentation consistent across examples.
  • tool_groups already set; consider aligning order/format with other example files.

No functional bug; purely readability and consistency.

Also applies to: 62-74, 100-107
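
A sketch of the normalized shape, purely illustrative (keys mirror those used in the other example configs in this PR, not a prescribed schema):

shields: []
scoring_fns: []
benchmarks: []
tool_groups:
- toolgroup_id: builtin::rag
  provider_id: rag-runtime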

examples/vllm-llama-faiss-run.yaml (3)

2-2: Remove trailing spaces to satisfy yamllint.

Lines 2 and 27 contain trailing spaces. This tends to fail strict linters.

Also applies to: 27-27


16-26: Model/provider mapping check.

  • For vLLM, consider explicitly setting provider_model_id to the actual served model (even if server defaults it). Keeps examples copy/paste‑safe.
-  provider_model_id: null
+  provider_model_id: ${env.VLLM_MODEL:=meta-llama/Llama-3.1-8B-Instruct}

80-80: Add newline at EOF.

Some linters enforce this.

examples/gemini-run.yaml (5)

2-2: Remove trailing spaces to satisfy yamllint.

Lines 2 and 5 have trailing spaces after the comment marker.

Also applies to: 5-5


24-29: Environment placeholder style: standardize on ${env.VAR} for consistency across examples.

This file mixes ${LLAMA_STACK_*}-style and ${env.*}-style placeholders. If the config loader expects ${env.*} (as used in other examples), align these:

-  - provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
+  - provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
@@
-  model_id: ${LLAMA_STACK_2_0_FLASH_MODEL}
-  provider_id: ${LLAMA_STACK_INFERENCE_PROVIDER}
-  provider_model_id: ${LLAMA_STACK_2_0_FLASH_MODEL}
+  model_id: ${env.LLAMA_STACK_2_0_FLASH_MODEL}
+  provider_id: ${env.LLAMA_STACK_INFERENCE_PROVIDER}
+  provider_model_id: ${env.LLAMA_STACK_2_0_FLASH_MODEL}

Repeat similarly for 2_5_PRO and 2_5_FLASH.

If both syntaxes are supported, consider adding a comment stating so.

Also applies to: 84-98


50-57: Telemetry defaults: provide safe fallbacks and ensure path exists.

If LLAMA_STACK_TELEMETRY_SINKS is unset, the loader may error on the empty value. Provide defaults and ensure the SQLite directory exists.

-      service_name: "${LLAMA_STACK_OTEL_SERVICE_NAME}"
-      sinks: ${LLAMA_STACK_TELEMETRY_SINKS}
-      sqlite_db_path: ${STORAGE_MOUNT_PATH}/sqlite/trace_store.db
+      service_name: "${env.LLAMA_STACK_OTEL_SERVICE_NAME:=gemini}"
+      sinks: ${env.LLAMA_STACK_TELEMETRY_SINKS:=console,sqlite}
+      sqlite_db_path: ${env.STORAGE_MOUNT_PATH:=/var/lib/lightspeed}/sqlite/trace_store.db
+      # Ensure ${env.STORAGE_MOUNT_PATH}/sqlite exists before boot.

73-76: Consider Postgres for metadata_store for parity with inference_store.

Using sqlite for metadata_store while inference_store is Postgres is fine, but mixing stores can surprise operators. Consider making both Postgres or add a comment explaining the choice.
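
If both were moved to Postgres, the metadata_store block would look roughly like this (a sketch reusing the env vars already defined for the inference_store; field names assumed to match the Postgres store config used elsewhere in these examples):

metadata_store:
  type: postgres
  host: ${env.POSTGRES_HOST:=localhost}
  port: ${env.POSTGRES_PORT:=5432}
  db: ${env.POSTGRES_DB:=llamastack}
  user: ${env.POSTGRES_USER:=llamastack}
  password: ${env.POSTGRES_PASSWORD}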


111-112: Server port placeholder should have a default to ease local runs.

-  port: ${LLAMA_STACK_SERVER_PORT}
+  port: ${env.LLAMA_STACK_SERVER_PORT:=8321}
examples/openai-faiss-run.yaml (1)

72-83: Vector DB block looks consistent; minor nits only.

  • Keep the comment explaining provider_id reference — helpful.
  • Consider moving the explanatory comment above vector_db_id to avoid very long lines.

No functional changes needed.

examples/vertexai-run.yaml (4)

71-75: Remove unused module: null to avoid schema validation noise.

module: null under tool_runtime is unnecessary and can be rejected by strict validators.

   - provider_id: model-context-protocol
     provider_type: remote::model-context-protocol
     config: {}
-    module: null

19-31: Tame very chatty logging categories for production-like examples.

Global DEBUG plus uvicorn/fastapi DEBUG will spam logs and slow I/O. Recommend INFO for server frameworks, keep component-level DEBUG if needed.

   level: DEBUG  # Set root logger to DEBUG
   category_levels:
@@
-    uvicorn: DEBUG
-    uvicorn.access: INFO  # Keep HTTP requests at INFO to reduce noise
-    fastapi: DEBUG
+    uvicorn: INFO
+    uvicorn.access: INFO  # Keep HTTP requests at INFO to reduce noise
+    fastapi: INFO

2-2: Clean up trailing spaces and add newline at EOF.

YAMLlint flagged trailing spaces and missing newline at EOF.

-# 
+#
@@
-# 
+#
@@
-  db_path: /tmp/inference_store.db
\ No newline at end of file
+  db_path: /tmp/inference_store.db
+

Also applies to: 5-5, 91-91


64-69: Call out required Vertex AI credentials and auth flow.

project has no default and requires GCP Application Default Credentials (ADC) or a service account. Add a brief comment to prevent confusion when testing.

     - provider_id: google-vertex
       provider_type: remote::vertexai
       config:
-        project: ${env.VERTEXAI_PROJECT}
+        project: ${env.VERTEXAI_PROJECT}  # Required. Use a valid GCP project id.
         region: ${env.VERTEXAI_REGION:=us-east5}
+        # Authentication uses Google ADC (Application Default Credentials).
+        # Ensure 'gcloud auth application-default login' or set GOOGLE_APPLICATION_CREDENTIALS.

Would you like me to add a short “How to auth with Vertex AI” snippet to docs/rag_guide.md?

examples/vllm-granite-run.yaml (4)

3-3: Fix trailing spaces and add newline at EOF.

These are minor but keep the repo lint-clean.

-# 
+#
@@
-# 
+#
@@
-  port: 8321
\ No newline at end of file
+  port: 8321
+

Also applies to: 5-5, 148-148


9-23: Description says “no RAG” but apis include vector_io and tool_runtime (used for RAG).

This is confusing for users. Either (a) remove RAG-related APIs/providers, or (b) tweak wording to “RAG not configured by default.”

Option A — remove RAG bits:

 apis:
 - agents
 - datasetio
 - eval
 - files
 - inference
 - post_training
 - safety
 - scoring
 - telemetry
-- tool_runtime
-- vector_io
+ # no RAG in this example
+ # - tool_runtime
+ # - vector_io

And below, comment out rag-runtime and tool_groups:

   tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
-  - provider_id: model-context-protocol
+  - provider_id: model-context-protocol
     provider_type: remote::model-context-protocol
     config: {}
@@
-tool_groups:
- - toolgroup_id: builtin::websearch
-   provider_id: tavily-search
- - toolgroup_id: builtin::rag
-   provider_id: rag-runtime
- - toolgroup_id: builtin::wolfram_alpha
-   provider_id: wolfram-alpha
+tool_groups: []

Option B — keep config as-is but clarify:

-# Example llama-stack configuration for IBM Granite using vLLM (no RAG)
+# Example llama-stack configuration for IBM Granite using vLLM (RAG not configured by default)

28-31: Consider adding tls_verify with secure default for remote vLLM.

Mirror the Vertex example and promote safer defaults.

       url: ${env.VLLM_URL}
       api_token: ${env.VLLM_API_TOKEN:fake}
       max_tokens: 10000
+      tls_verify: ${env.VLLM_TLS_VERIFY:true}

51-53: Avoid zero‑width space as default service_name; use an explicit string or omit.

"${env.OTEL_SERVICE_NAME:\u200B}" is non-obvious. Prefer a readable default or leave unset.

-      service_name: "${env.OTEL_SERVICE_NAME:\u200B}"
+      service_name: "${env.OTEL_SERVICE_NAME:llama-stack}"
README.md (2)

463-466: Fix markdown list style to satisfy markdownlint (MD004).

Earlier lists use asterisks. Switch dashes to asterisks for consistency.

-- [Granite on vLLM example](examples/vllm-granite-run.yaml)
-- [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
-- [Gemini example](examples/gemini-run.yaml)
-- [VertexAI example](examples/vertexai-run.yaml)
+* [Granite on vLLM example](examples/vllm-granite-run.yaml)
+* [Qwen3 on vLLM example](examples/vllm-qwen3-run.yaml)
+* [Gemini example](examples/gemini-run.yaml)
+* [Vertex AI example](examples/vertexai-run.yaml)

466-466: Nit: use “Vertex AI” brand spelling.

Included in the diff above.

docs/rag_guide.md (5)

118-121: Match list style with repo convention to satisfy markdownlint (MD004).

Use asterisks like other lists.

-- `id` (`text`): UUID identifier of the chunk
-- `document` (`jsonb`): json containing content and metadata associated with the embedding  
-- `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and will match the model's output size (e.g. 768 for `all-mpnet-base-v2`) 
+* `id` (`text`): UUID identifier of the chunk
+* `document` (`jsonb`): json containing content and metadata associated with the embedding
+* `embedding` (`vector(n)`): the embedding vector, where `n` is the embedding dimension and matches the model's output size (e.g., 768 for `all-mpnet-base-v2`)

53-54: Fix capitalization: “Llama Stack”, not “LLama Stack”.

-LLama Stack will then download the model to the `~/.cache/huggingface/hub` folder.
+Llama Stack will then download the model to the `~/.cache/huggingface/hub` folder.

255-256: Grammar: “were not” instead of “where not”.

-The RAG tool calls where not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.
+The RAG tool calls were not working properly when experimenting with `mistralai/Mistral-7B-Instruct-v0.3` on vLLM.

202-204: URL note: parameterize host/port once and reuse.

You already use ${env.EXPORTED_PORT:=8000}. Consider also surfacing host as ${env.VLLM_HOST:=localhost} for clarity.

-      url: http://localhost:${env.EXPORTED_PORT:=8000}/v1/ # Replace localhost with the url of the vLLM instance
+      url: http://${env.VLLM_HOST:=localhost}:${env.EXPORTED_PORT:=8000}/v1/ # Replace host with the URL of the vLLM instance

122-124: Tighten the pgvector notes for clarity and consistency with earlier reviewer nits.

Earlier comments suggested wording tweaks; incorporating them improves readability.

-> [!NOTE]
-> The `vector_db_id` (e.g. `rhdocs`) is used to point to the table named `vector_store_rhdocs` in the specified database, which stores the vector embeddings.
+> [!NOTE]
+> The `vector_db_id` (e.g., `rhdocs`) maps to a PostgreSQL table named `vector_store_rhdocs` in the selected database, which stores the embeddings.
@@
-  # If the table was already created, this value must match the ID used at creation.
+  # If the table was already created, this value will match the ID used at creation.

Also applies to: 147-151
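
For context, the registration the note refers to is shaped roughly like this (a sketch; values are illustrative, not copied from the guide):

vector_dbs:
- vector_db_id: rhdocs  # pgvector stores the embeddings in table vector_store_rhdocs
  provider_id: pgvector
  embedding_model: all-mpnet-base-v2
  embedding_dimension: 768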

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6771d6e and d14b373.

📒 Files selected for processing (9)
  • README.md (2 hunks)
  • docs/rag_guide.md (3 hunks)
  • examples/gemini-run.yaml (1 hunks)
  • examples/openai-faiss-run.yaml (1 hunks)
  • examples/openai-pgvector-run.yaml (1 hunks)
  • examples/vertexai-run.yaml (1 hunks)
  • examples/vllm-granite-run.yaml (1 hunks)
  • examples/vllm-llama-faiss-run.yaml (1 hunks)
  • examples/vllm-qwen3-run.yaml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/rag_guide.md

[grammar] ~122-~122: There might be a mistake here.
Context: ...768 for all-mpnet-base-v2) > [!NOTE] > The vector_db_id (e.g. rhdocs) is ...

(QB_NEW_EN)


[grammar] ~161-~161: There might be a mistake here.
Context: ...n RHEL AI (Llama 3.1) example > [!NOTE] > The following example assumes that pod...

(QB_NEW_EN)


[grammar] ~182-~182: There might be a mistake here.
Context: ...ion options, see the vLLM documentation: > [vLLM: Tool Calling](https://docs.vllm...

(QB_NEW_EN)

README.md

[grammar] ~463-~463: There might be a mistake here.
Context: ...deployments: - Granite on vLLM example - [Qwen3 on vLLM example](examples/vllm-qwe...

(QB_NEW_EN)


[grammar] ~464-~464: There might be a mistake here.
Context: ...anite-run.yaml) - Qwen3 on vLLM example - [Gemini example](examples/gemini-run.yaml...

(QB_NEW_EN)


[grammar] ~465-~465: There might be a mistake here.
Context: ...s/vllm-qwen3-run.yaml) - Gemini example - [VertexAI example](examples/vertexai-run....

(QB_NEW_EN)


[grammar] ~468-~468: There might be a mistake here.
Context: ...](examples/vertexai-run.yaml) > [!NOTE] > RAG functionality is not tested fo...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/rag_guide.md

118-118: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


119-119: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


120-120: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

README.md

463-463: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


464-464: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


465-465: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


466-466: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

🪛 YAMLlint (1.37.1)
examples/gemini-run.yaml

[error] 2-2: trailing spaces

(trailing-spaces)


[error] 5-5: trailing spaces

(trailing-spaces)

examples/openai-faiss-run.yaml

[error] 2-2: trailing spaces

(trailing-spaces)


[error] 6-6: trailing spaces

(trailing-spaces)


[error] 18-18: trailing spaces

(trailing-spaces)


[error] 28-28: trailing spaces

(trailing-spaces)


[error] 29-29: trailing spaces

(trailing-spaces)


[warning] 31-31: wrong indentation: expected 4 but found 2

(indentation)


[error] 32-32: trailing spaces

(trailing-spaces)


[error] 36-36: trailing spaces

(trailing-spaces)


[warning] 39-39: wrong indentation: expected 8 but found 6

(indentation)


[warning] 45-45: wrong indentation: expected 8 but found 6

(indentation)


[warning] 46-46: wrong indentation: expected 10 but found 8

(indentation)


[warning] 49-49: wrong indentation: expected 10 but found 8

(indentation)


[warning] 56-56: wrong indentation: expected 8 but found 6

(indentation)


[error] 59-59: trailing spaces

(trailing-spaces)


[warning] 62-62: wrong indentation: expected 8 but found 6

(indentation)


[warning] 63-63: wrong indentation: expected 10 but found 8

(indentation)


[error] 68-68: trailing spaces

(trailing-spaces)


[error] 81-81: trailing spaces

(trailing-spaces)


[error] 83-83: no new line character at the end of file

(new-line-at-end-of-file)

examples/openai-pgvector-run.yaml

[error] 2-2: trailing spaces

(trailing-spaces)


[error] 7-7: trailing spaces

(trailing-spaces)


[error] 29-29: trailing spaces

(trailing-spaces)


[warning] 31-31: wrong indentation: expected 4 but found 2

(indentation)


[warning] 38-38: wrong indentation: expected 8 but found 6

(indentation)


[warning] 44-44: wrong indentation: expected 8 but found 6

(indentation)


[warning] 45-45: wrong indentation: expected 10 but found 8

(indentation)


[warning] 48-48: wrong indentation: expected 10 but found 8

(indentation)


[warning] 55-55: wrong indentation: expected 8 but found 6

(indentation)


[error] 58-58: trailing spaces

(trailing-spaces)


[warning] 61-61: wrong indentation: expected 8 but found 6

(indentation)


[warning] 67-67: wrong indentation: expected 10 but found 8

(indentation)


[error] 84-84: trailing spaces

(trailing-spaces)


[error] 87-87: no new line character at the end of file

(new-line-at-end-of-file)

examples/vertexai-run.yaml

[error] 2-2: trailing spaces

(trailing-spaces)


[error] 5-5: trailing spaces

(trailing-spaces)


[warning] 43-43: wrong indentation: expected 4 but found 2

(indentation)


[error] 91-91: no new line character at the end of file

(new-line-at-end-of-file)

examples/vllm-granite-run.yaml

[error] 3-3: trailing spaces

(trailing-spaces)


[error] 5-5: trailing spaces

(trailing-spaces)


[error] 148-148: no new line character at the end of file

(new-line-at-end-of-file)

examples/vllm-llama-faiss-run.yaml

[error] 2-2: trailing spaces

(trailing-spaces)


[error] 27-27: trailing spaces

(trailing-spaces)


[warning] 29-29: wrong indentation: expected 4 but found 2

(indentation)


[warning] 36-36: wrong indentation: expected 8 but found 6

(indentation)


[warning] 43-43: wrong indentation: expected 8 but found 6

(indentation)


[warning] 44-44: wrong indentation: expected 10 but found 8

(indentation)


[warning] 47-47: wrong indentation: expected 10 but found 8

(indentation)


[warning] 54-54: wrong indentation: expected 8 but found 6

(indentation)


[warning] 60-60: wrong indentation: expected 8 but found 6

(indentation)


[warning] 61-61: wrong indentation: expected 10 but found 8

(indentation)


[error] 80-80: no new line character at the end of file

(new-line-at-end-of-file)

examples/vllm-qwen3-run.yaml

[error] 3-3: trailing spaces

(trailing-spaces)


[error] 5-5: trailing spaces

(trailing-spaces)


[error] 108-108: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr
🔇 Additional comments (10)
examples/vllm-qwen3-run.yaml (2)

1-108: One-shot validation script for all example YAMLs in this PR

Run this from the repo root to:

  • Lint all examples
  • Detect trailing slashes in remote::vllm URLs
  • Ensure no inline secrets remain
  • Verify EOF newline
#!/usr/bin/env bash
set -euo pipefail

echo "== yamllint examples =="
find examples -type f \( -name '*.yml' -o -name '*.yaml' \) -print0 \
  | xargs -0 yamllint

echo "== check trailing slashes in vLLM URLs =="
grep -RPn 'provider_type:\s*remote::vllm' examples \
  | cut -d: -f1 \
  | sort -u \
  | xargs -r -I{} grep -Pn 'url:\s*.+/v1/$' {} || true  # empty output means no offenders; '|| true' keeps 'set -e' from aborting

echo "== check inline tokens/keys =="
grep -RPn 'api_token:\s*(<add your api key>|[A-Za-z0-9._-]+)|api_key:\s*(sk-[A-Za-z0-9]+|key)' examples || true

echo "== check EOF newline =="
find examples -type f \( -name '*.yml' -o -name '*.yaml' \) -print0 \
  | xargs -0 -n1 bash -c '
      file="$0"
      tail -c1 "$file" | od -An -t x1 | grep -q "0a" || echo "missing newline at EOF: $file"
  '

Flagged issues to address:

  • examples/vllm-llama-faiss-run.yaml (line 36): URL ends with /v1/ – remove trailing slash so it ends with /v1
  • examples/vllm-qwen3-run.yaml (line 29): placeholder <add your api key> – replace with documentation on where/how to inject a real token
  • examples/vllm-llama-faiss-run.yaml (line 37): api_token: key – remove this default or replace with a placeholder
  • Ensure every example YAML ends with a newline at EOF (script may report any missing files)

After fixing the above, re-run the script to confirm no failures.


27-31: Externalize VLLM configuration via environment variables

In examples/vllm-qwen3-run.yaml (lines 27–31), replace hard-coded values with env-vars—keeping tls_verify: true:

-      url: https://qwen3.rosa.openshiftapps.com/v1
-      max_tokens: 32768
-      api_token: <add your api key>
+      url: ${env.VLLM_URL:=https://qwen3.rosa.openshiftapps.com/v1}
+      max_tokens: ${env.VLLM_MAX_TOKENS:=32768}
+      api_token: ${env.VLLM_API_TOKEN}
       tls_verify: true

• This matches how other VLLM examples externalize VLLM_URL, VLLM_MAX_TOKENS and VLLM_API_TOKEN.
• No default for api_token ensures you must explicitly set it in your environment.

examples/openai-pgvector-run.yaml (1)

18-28: Embedding model dimension matches (768) — good.

The embedding metadata aligns with all-mpnet-base-v2’s 768-d vectors and the vector_dbs block. No changes needed.
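
For reference, the block under review is shaped roughly like this (a sketch, not the exact file contents; the layout follows the other examples in this PR):

models:
- model_id: all-mpnet-base-v2
  provider_id: sentence-transformers
  model_type: embedding
  metadata:
    embedding_dimension: 768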

examples/vertexai-run.yaml (1)

9-9: Confirm config schema version; likely should be '2' for current Llama Stack distributions.

Other examples use version: '2'. If this should be '3', please cite the schema change. Otherwise, align to '2' to avoid runtime schema errors.

-version: '3'
+version: '2'

Would you like me to run a quick schema check script across all new YAMLs to ensure version consistency?

examples/vllm-granite-run.yaml (2)

29-29: Normalize environment interpolation syntax.

This file uses ${env.VAR:default} while others use ${env.VAR:=default}. Please standardize across examples to reduce confusion; confirm which form llama-stack officially supports in 0.2.x.

I can submit a follow-up commit to normalize once you confirm the preferred syntax.

Also applies to: 51-51, 53-53, 61-61, 69-69, 76-76, 87-87, 95-95, 101-101, 107-108, 112-113, 118-119, 123-123, 126-129
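
Side by side, the two forms in question (the keys here are hypothetical; only the separator differs):

token_colon_form: ${env.VLLM_API_TOKEN:fake}          # colon form used in this file
token_colon_equals_form: ${env.VLLM_API_TOKEN:=fake}  # colon-equals form used in the other examples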


131-147: Model registration sanity: provider_model_id null may be fine for remote::vllm, but verify.

Some distributions accept null and use model_id only; others expect provider_model_id. Ensure runtime registration succeeds.

If needed, set provider_model_id to the served model (or remove it entirely if schema allows).

-  provider_model_id: null
+  provider_model_id: ${env.INFERENCE_MODEL}
README.md (2)

38-38: TOC entry added — looks good and consistent with section header.


455-470: RAG section content LGTM; ensure the section is present only once.

The AI summary mentions a second identical “RAG Configuration” section near the bottom. Please dedupe to avoid reader confusion and keep a single TOC anchor.

If duplication exists, remove the duplicate block; I can provide a cleanup diff once you confirm its exact range.

docs/rag_guide.md (2)

168-179: Verify podman flags for GPU; use CDI guidance consistently.

The command mixes --device and --gpus. For Podman, GPU enablement typically relies on CDI; some environments don’t support --gpus. Please confirm and document the recommended approach for RHEL AI.

If needed, add a short preface with the exact podman GPU flags and CDI setup steps (link already provided).


64-96: FAISS example is clear; nice job calling out embedding_dimension and vector_db_id.

@@ -0,0 +1,83 @@
# Example llama-stack configuration for OpenAI inference + FAISS (RAG)
#

⚠️ Potential issue

Trim trailing spaces and add newline at EOF (yamllint/CI).

Multiple trailing spaces; missing newline at EOF. These are typical CI blockers.

Also applies to: 6-6, 18-18, 28-29, 32-32, 36-36, 59-59, 68-68, 81-81, 83-83

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 2-2: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In examples/openai-faiss-run.yaml around lines 2 (and additionally lines 6, 18,
28-29, 32, 36, 59, 68, 81, 83), remove any trailing spaces at the ends of those
lines and ensure the file ends with a single newline character; update each
affected line to have no trailing whitespace and save the file with a final
newline (LF) so yamllint/CI passes.

@@ -0,0 +1,87 @@
# Example llama-stack configuration for OpenAI inference + PSQL (pgvector) vector index (RAG)
#

⚠️ Potential issue

Whitespace and EOF fixes (yamllint blockers).

  • Trailing spaces at Lines 2, 7, 29, 58, 84.
  • Missing newline at EOF (Line 87).

Apply trims and add a final newline.

Also applies to: 7-7, 29-29, 58-58, 84-84, 87-87

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 2-2: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In examples/openai-pgvector-run.yaml around lines 2, 7, 29, 58, 84 and EOF at
87: remove trailing whitespace on lines 2, 7, 29, 58 and 84, and ensure the file
ends with a single newline character (add missing final newline at EOF). Trim
each indicated line so there are no trailing spaces and save the file with a
terminating newline.

Comment on lines +12 to +17
- agents
- inference
- safety
- telemetry
- tool_runtime
- vector_io

⚠️ Potential issue

Safety API enabled but no safety provider configured — add one or drop the API.

apis includes safety, but providers has no safety block. Llama Stack will fail to start or the API will be unavailable. Add a minimal inline::llama-guard provider or remove safety from apis.

Apply one of the following diffs.

Option A — add a safety provider:

 providers:
@@
   telemetry:
@@
       provider_type: inline::meta-reference
+
+  safety:
+    - provider_id: llama-guard
+      provider_type: inline::llama-guard
+      config:
+        excluded_categories: []

Option B — remove the API:

 apis:
   - agents
   - inference
-  - safety
   - telemetry
   - tool_runtime
   - vector_io

Also applies to: 76-83

🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 12 to 17 (and also apply the same fix
at lines 76 to 83), apis lists "safety" but providers contains no safety
configuration; either add a minimal safety provider or remove "safety" from
apis. To fix, either (A) add a providers.safety entry with an
inline::llama-guard provider and any required minimal config (e.g.,
enable/threshold settings) so the safety API is backed by a provider, or (B)
remove the "safety" item from the apis list in both locations if you do not
intend to enable the safety API; ensure the file remains valid YAML after the
change.

Comment on lines +43 to +51
  - config:
      persistence_store:
        db_path: /tmp/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: /tmp/responses_store.db
        type: sqlite
    provider_id: meta-reference

🛠️ Refactor suggestion

Fix indentation under providers.agents to satisfy YAML parsers and linters.

The list item under agents is indented with 2 spaces; expected 4. YAMLlint flagged this. Incorrect indentation can change structure silently.

   agents:
-  - config:
-      persistence_store:
-        db_path: /tmp/agents_store.db
-        namespace: null
-        type: sqlite
-      responses_store:
-        db_path: /tmp/responses_store.db
-        type: sqlite
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
+    - config:
+        persistence_store:
+          db_path: /tmp/agents_store.db
+          namespace: null
+          type: sqlite
+        responses_store:
+          db_path: /tmp/responses_store.db
+          type: sqlite
+      provider_id: meta-reference
+      provider_type: inline::meta-reference
🧰 Tools
🪛 YAMLlint (1.37.1)

[warning] 43-43: wrong indentation: expected 4 but found 2

(indentation)

🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 43 to 51, the list item under
providers.agents is indented 2 spaces but must be indented 4 to satisfy YAML
parsers; adjust the indentation so the entire "config:" block (and its nested
keys like persistence_store and responses_store) is shifted two additional
spaces to align as a proper list element under agents, ensuring consistent
4-space indentation for list items and their children.

Comment on lines +59 to +63
        url: ${env.VLLM_URL:=http://localhost:8000/v1}
        max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
        api_token: ${env.VLLM_API_TOKEN:=fake}
        tls_verify: ${env.VLLM_TLS_VERIFY:=false}


🛠️ Refactor suggestion

Defaulting tls_verify to false is insecure; default to true and allow override.

Disable TLS verification only when required. Safer defaults help avoid accidental MITM exposure when pointing to remote vLLM.

         api_token: ${env.VLLM_API_TOKEN:=fake}
-        tls_verify: ${env.VLLM_TLS_VERIFY:=false}
+        tls_verify: ${env.VLLM_TLS_VERIFY:=true}
🤖 Prompt for AI Agents
In examples/vertexai-run.yaml around lines 59 to 63, tls_verify is defaulted to
false which is insecure; change the default to true and allow callers to
override via the VLLM_TLS_VERIFY environment variable. Update the YAML to set
tls_verify: ${env.VLLM_TLS_VERIFY:=true} (and if your system treats env vars as
strings, ensure downstream code parses the value as a boolean so "false"/"0"/""
properly disable TLS verification while any other value enables it).

Comment on lines +3 to +5
#
# Contributed by @eranco74 (2025-08).
#

⚠️ Potential issue

Fix trailing spaces and add missing newline at EOF to satisfy yamllint/CI.

  • Trailing spaces at Line 3 and Line 5.
  • Missing newline at end of file (Line 108).

Apply:

-# 
+#

-# 
+#

-  port: 8321
+  port: 8321
+

Also applies to: 108-108

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 3-3: trailing spaces

(trailing-spaces)


[error] 5-5: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In examples/vllm-qwen3-run.yaml around lines 3-5 and at EOF line 108, there are
trailing spaces on lines 3 and 5 and the file lacks a final newline; remove the
trailing spaces from those lines and ensure the file ends with a single newline
character (add a newline at EOF) so it passes yamllint/CI.

@tisnik tisnik merged commit fa8a413 into lightspeed-core:main Aug 21, 2025
18 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Aug 26, 2025
15 tasks
