This repository contains the configuration needed to build a Docker container image for `ansible-chatbot-stack`.
`ansible-chatbot-stack` builds on top of `lightspeed-stack`, which wraps Meta's `llama-stack` AI framework.
`ansible-chatbot-stack` includes various customisations for:
- A remote vLLM inference provider (RHOSAI vLLM compatible)
- The inline sentence transformers (Meta)
- AAP RAG database files and configuration
- Lightspeed external providers
- System Prompt injection
Build/Run overview:
```mermaid
flowchart TB
%% Nodes
    LLAMA_STACK([fa:fa-layer-group llama-stack:x.y.z])
    LIGHTSPEED_STACK([fa:fa-layer-group lightspeed-stack:x.y.z])
    LIGHTSPEED_RUN_CONFIG{{fa:fa-wrench lightspeed-stack.yaml}}
    ANSIBLE_CHATBOT_STACK([fa:fa-layer-group ansible-chatbot-stack:x.y.z])
    ANSIBLE_CHATBOT_RUN_CONFIG{{fa:fa-wrench ansible-chatbot-run.yaml}}
    ANSIBLE_CHATBOT_DOCKERFILE{{fa:fa-wrench Containerfile}}
    ANSIBLE_LIGHTSPEED([fa:fa-layer-group ansible-ai-connect-service:x.y.z])
    LIGHTSPEED_PROVIDERS("fa:fa-code-branch lightspeed-providers:x.y.z")
    PYPI("fa:fa-database PyPI")
%% Edge connections between nodes
    ANSIBLE_LIGHTSPEED -- Uses --> ANSIBLE_CHATBOT_STACK
    ANSIBLE_CHATBOT_STACK -- Consumes --> PYPI
    LIGHTSPEED_PROVIDERS -- Publishes --> PYPI
    ANSIBLE_CHATBOT_STACK -- Built from --> ANSIBLE_CHATBOT_DOCKERFILE
    ANSIBLE_CHATBOT_STACK -- Inherits from --> LIGHTSPEED_STACK
    ANSIBLE_CHATBOT_STACK -- Includes --> LIGHTSPEED_RUN_CONFIG
    ANSIBLE_CHATBOT_STACK -- Includes --> ANSIBLE_CHATBOT_RUN_CONFIG
    LIGHTSPEED_STACK -- Embeds --> LLAMA_STACK
    LIGHTSPEED_STACK -- Uses --> LIGHTSPEED_RUN_CONFIG
    LLAMA_STACK -- Uses --> ANSIBLE_CHATBOT_RUN_CONFIG
```
- External Providers YAML manifests must be present in `providers.d/` of your host's `llama-stack` directory.
- The Vector Database is copied from the latest `aap-rag-content` image to `./vector_db`.
- The embeddings model files are copied from the latest `aap-rag-content` image to `./embeddings_model`.
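A quick sanity check of these prerequisites before building (a sketch; the paths follow the defaults used elsewhere in this README):

```shell
# Verify the artifacts expected by the build are in place.
ls llama-stack/providers.d/   # external provider YAML manifests
ls vector_db/                 # AAP RAG database files
ls embeddings_model/          # embeddings model files
```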
```shell
make setup
```

Builds the image `ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION`.
Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.
```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1

make build
```

The resulting image contains the application sources under the following layout:

```
└── app-root/
    ├── .venv/
    └── src/
        ├── <lightspeed-stack files>
        └── lightspeed_stack.py
```
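To confirm this layout after a build, you can list the sources baked into the image (a sketch assuming a Docker-compatible CLI and that overriding the entrypoint is permitted):

```shell
# Inspect the application sources inside the freshly built image.
docker run --rm --entrypoint /bin/ls \
  ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION -R /app-root/src
```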
These are stored in a `PersistentVolumeClaim` for resilience:
```
└── .llama/
    └── data/
        └── distributions/
            └── ansible-chatbot/
                ├── aap_faiss_store.db
                ├── agents_store.db
                ├── responses_store.db
                ├── localfs_datasetio.db
                ├── trace_store.db
                └── embeddings_model/
```
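When running on Kubernetes, the persisted files can be checked inside the pod (a sketch; the deployment name and mount point are assumptions to adapt to your cluster):

```shell
# List the databases stored on the PersistentVolumeClaim-backed path.
kubectl exec deploy/ansible-chatbot-stack -- \
  ls -l /.llama/data/distributions/ansible-chatbot
```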
Configuration files are laid out as follows:

```
└── .llama/
    ├── distributions/
    │   ├── llama-stack/
    │   │   └── config/
    │   │       └── ansible-chatbot-run.yaml
    │   └── ansible-chatbot/
    │       ├── ansible-chatbot-version-info.json
    │       ├── config/
    │       │   └── lightspeed-stack.yaml
    │       └── system-prompts/
    │           └── default.txt
    └── providers.d/
        └── <llama-stack external providers>
```
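Because the default system prompt ships as `system-prompts/default.txt`, a custom prompt can be bind-mounted over it at run time (a sketch assuming a Docker-compatible CLI; the in-container path mirrors the layout above):

```shell
# Override the baked-in default system prompt with a local file.
docker run --rm \
  -v ./my-system-prompt.txt:/.llama/distributions/ansible-chatbot/system-prompts/default.txt:ro \
  ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION
```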
Runs the image `ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION` as a local container.
Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.

```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1
export ANSIBLE_CHATBOT_VLLM_URL=<YOUR_MODEL_SERVING_URL>
export ANSIBLE_CHATBOT_VLLM_API_TOKEN=<YOUR_MODEL_SERVING_API_TOKEN>
export ANSIBLE_CHATBOT_INFERENCE_MODEL=<YOUR_INFERENCE_MODEL>
export ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER=<YOUR_INFERENCE_MODEL_TOOLS_FILTERING>

make run
```
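With the container up, the service can be exercised directly; a sketch of a `v1/query` request, assuming the API is published on `localhost:8080` (fields such as `model` and `provider`, as in the examples later in this README, can be added to the payload):

```shell
# Send a minimal query to the locally running stack.
curl -s -X POST http://localhost:8080/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "hello", "system_prompt": "You are a helpful assistant."}'
```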
Runs basic tests against the local container.
Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.
```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1
export ANSIBLE_CHATBOT_VLLM_URL=<YOUR_MODEL_SERVING_URL>
export ANSIBLE_CHATBOT_VLLM_API_TOKEN=<YOUR_MODEL_SERVING_API_TOKEN>
export ANSIBLE_CHATBOT_INFERENCE_MODEL=<YOUR_INFERENCE_MODEL>
export ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER=<YOUR_INFERENCE_MODEL_TOOLS_FILTERING>

make run-test
```

AAP Chatbot Quality evaluations are available.
Generate and apply the Kubernetes deployment manifests:

```shell
kubectl kustomize . > my-chatbot-stack-deploy.yaml
kubectl apply -f my-chatbot-stack-deploy.yaml
```
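It can help to review the rendered manifests before applying them and to verify the rollout afterwards (a sketch; the deployment name is hypothetical):

```shell
# Inspect what will be created, then wait for the rollout to finish.
less my-chatbot-stack-deploy.yaml
kubectl rollout status deployment/ansible-chatbot-stack
```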
Using the gemini remote inference provider:
- Set the environment variable `OPENAI_API_KEY=<YOUR_API_KEY>`.
- Example of a `v1/query` request:

```json
{
    "query": "hello",
    "system_prompt": "You are a helpful assistant.",
    "model": "gemini/gemini-2.5-flash",
    "provider": "gemini"
}
```
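Put together, a sketch of exercising this variant locally (the key must be set in the environment the stack starts from; host and port are assumptions):

```shell
export OPENAI_API_KEY=<YOUR_API_KEY>
make run   # start the stack with the key set

# Then, from another terminal:
curl -s -X POST http://localhost:8080/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "hello", "system_prompt": "You are a helpful assistant.", "model": "gemini/gemini-2.5-flash", "provider": "gemini"}'
```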
Using the gemini remote inference provider (with Google Service Account credentials):
- Set a dummy value for the environment variable `OPENAI_API_KEY` (so the `gemini` provider within llama-stack does not complain).
- Set the path to your Google Service Account credentials JSON file in the environment variable `GOOGLE_APPLICATION_CREDENTIALS=<PATH_GOOGLE_CRED_JSON_FILE>`.
- Example of a `v1/query` request:

```json
{
    "query": "hello",
    "system_prompt": "You are a helpful assistant.",
    "model": "gemini-2.5-flash",
    "provider": "gemini"
}
```
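The corresponding environment setup for this variant (a sketch; the dummy key value is arbitrary):

```shell
export OPENAI_API_KEY=dummy-value                                   # placeholder only
export GOOGLE_APPLICATION_CREDENTIALS=<PATH_GOOGLE_CRED_JSON_FILE>  # service account JSON
make run
```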
If you need to re-build images, apply the following clean-ups first:

```shell
make clean
```

Obtain a container shell for the Ansible Chatbot Stack:
```shell
make shell
```

- Clone the lightspeed-core/lightspeed-stack repository to your development environment.
- In the ansible-chatbot-stack project root, create a `.env` file and define the following variables:

```shell
PYTHONDONTWRITEBYTECODE=1
PYTHONUNBUFFERED=1
PYTHONCOERCECLOCALE=0
PYTHONUTF8=1
PYTHONIOENCODING=UTF-8
LANG=en_US.UTF-8
VLLM_URL=(VLLM URL Here)
VLLM_API_TOKEN=(VLLM API Token Here)
INFERENCE_MODEL=granite-3.3-8b-instruct
LIBRARY_CLIENT_CONFIG_PATH=./ansible-chatbot-run.yaml
SYSTEM_PROMPT_PATH=./ansible-chatbot-system-prompt.txt
EMBEDDINGS_MODEL=./embeddings_model
VECTOR_DB_DIR=./vector_db
PROVIDERS_DB_DIR=./work
EXTERNAL_PROVIDERS_DIR=./llama-stack/providers.d
```
- Create a Python run configuration with the following values:
  - script/module: script
  - script path: (lightspeed-stack project root)/src/lightspeed_stack.py
  - arguments: --config ./lightspeed-stack_local.yaml
  - working directory: (ansible-chatbot-stack project root)
  - path to ".env" files: (ansible-chatbot-stack project root)/.env
- Run the created configuration from the PyCharm main menu.
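For reference, the same entry point can be launched from a plain shell instead of PyCharm (a sketch; export the `.env` values first):

```shell
# From the ansible-chatbot-stack project root:
set -a; source .env; set +a
python (lightspeed-stack project root)/src/lightspeed_stack.py --config ./lightspeed-stack_local.yaml
```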
If you want to debug code in the lightspeed-providers project, you can add it as a local package dependency with:

```shell
uv add --editable (lightspeed-providers project root)
```

This updates the `pyproject.toml` and `uv.lock` files. Remember that these changes are for debugging purposes only; avoid checking them in.
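One way to ensure those local-only edits never land in a commit (a sketch):

```shell
# Restore the committed dependency files before pushing.
git checkout -- pyproject.toml uv.lock
```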