Skip to content

Conversation

@abhibongale
Copy link
Contributor

@abhibongale abhibongale commented Aug 8, 2025

This commit introduces the --generate=compose option to the ramalama serve command, enabling users
to generate a docker-compose.yaml file for a given model.

Summary by Sourcery

Add support for a new --generate=compose option to ramalama serve to generate Docker Compose configurations for models.

New Features:

  • Introduce Compose class to generate docker-compose.yaml with service configuration, including volumes, ports, environment variables, host devices, GPU deployments, and command specifications
  • Add --generate=compose option in ramalama serve to invoke the Compose generator and write the output to a specified directory
  • Implement genfile helper function to standardize creation and output of docker-compose.yaml files

Tests:

  • Add comprehensive unit tests for the Compose.generate method covering default ports, custom port mappings, environment variables, RAG volumes (OCI images and host paths), chat templates, MMProj mounts, custom service names, NVIDIA GPU support, and handling of missing attributes
  • Add fixtures for expected compose file outputs under test/unit/data/test_compose

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Aug 8, 2025

Reviewer's Guide

This PR adds Docker Compose generation support to the ramalama serve command by integrating a new Compose class into the command flow and covering its functionality with a comprehensive test suite against golden YAML templates.

Sequence diagram for Docker Compose file generation via ramalama serve

sequenceDiagram
    actor User
    participant CLI as ramalama serve
    participant Model
    participant Compose
    participant PlainFile

    User->>CLI: run 'ramalama serve --generate=compose'
    CLI->>Model: generate_container_config(args, exec_args)
    Model->>Compose: compose(...)
    Compose->>Compose: generate()
    Compose->>PlainFile: genfile(name, content)
    PlainFile-->>Compose: PlainFile instance
    Compose-->>Model: PlainFile instance
    Model-->>CLI: docker-compose.yaml generated
    CLI-->>User: Output file ready
Loading

Class diagram for the new Compose class

classDiagram
    class Compose {
        - src_model_path: str
        - dest_model_path: str
        - src_chat_template_path: str
        - dest_chat_template_path: str
        - src_mmproj_path: str
        - dest_mmproj_path: str
        - model_name: str
        - name: str
        - args
        - exec_args
        - image: str
        + __init__(model_name, model_paths, chat_template_paths, mmproj_paths, args, exec_args)
        + _gen_volumes() str
        + _gen_model_volume() str
        + _gen_rag_volume() str
        + _gen_chat_template_volume() str
        + _gen_mmproj_volume() str
        + _gen_devices() str
        + _gen_ports() str
        + _gen_environment() str
        + _gen_gpu_deployment() str
        + _gen_command() str
        + generate() PlainFile
    }

    class PlainFile
    Compose --> PlainFile : returns
Loading

Class diagram for updated Model class methods

classDiagram
    class Model {
        + generate_container_config(args, exec_args)
        + compose(model_paths, chat_template_paths, mmproj_paths, args, exec_args, output_dir)
    }
    Model --> Compose : uses
Loading

File-Level Changes

Change Details Files
Integrate compose generation into the serve command
  • Added branch for --generate=compose in generate_container_config
  • Implemented a new compose() method to mirror kube workflow
  • Wire up Compose.generate().write(output_dir) in model execution
ramalama/model.py
Implement Docker Compose file generation
  • Introduced Compose class with modular helpers for volumes, ports, environment, devices, GPU deployment, and command sections
  • Added genfile helper to produce PlainFile and print generation notice
ramalama/compose.py
Add unit tests for Compose generator
  • Created parameterized tests for Compose.generate across various configurations
  • Added tests for genfile helper and edge cases (missing port, env, devices)
test/unit/test_compose.py
test/unit/data/test_compose/*

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @abhibongale, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented a new feature that allows ramalama users to generate Docker Compose configurations for their models. This enhancement provides a convenient way to create docker-compose.yaml files, streamlining the process of deploying models within Docker environments. The generated files are highly configurable, supporting various model-serving needs and hardware setups, and the new functionality is thoroughly tested to ensure reliability.

Highlights

  • New Docker Compose Generation Feature: I've introduced a new --generate=compose option to the ramalama serve command. This allows users to automatically generate a docker-compose.yaml file tailored for serving a specific model, simplifying deployment.
  • Modular and Configurable Docker Compose Generation: The new ramalama/compose.py module handles all the logic for constructing the Docker Compose file. It intelligently includes configurations for model volumes, RAG data (both OCI images and host paths), chat templates, mmproj files, environment variables, port mappings, and even GPU deployment settings for CUDA images.
  • Extensive Unit Test Coverage: I've added comprehensive unit tests for the Docker Compose generator, covering a wide range of scenarios including different model types, RAG configurations, custom names, port settings, environment variables, and GPU support. This ensures the generated YAML files are correct and robust.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @abhibongale - I've reviewed your changes - here's some feedback:

  • Consider refactoring the shared setup logic in Compose and Kube (e.g. path normalization and arg handling) into a common base or helper to avoid duplication.
  • Add an explicit else branch in generate_container_config for unsupported generate types so users get a clear error when passing an invalid --generate option.
  • The OCI prefix handling is split between stripping oci:// on model paths and using oci: for RAG volumes—unify this normalization in one place to prevent inconsistencies.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring the shared setup logic in Compose and Kube (e.g. path normalization and arg handling) into a common base or helper to avoid duplication.
- Add an explicit else branch in generate_container_config for unsupported generate types so users get a clear error when passing an invalid --generate option.
- The OCI prefix handling is split between stripping `oci://` on model paths and using `oci:` for RAG volumes—unify this normalization in one place to prevent inconsistencies.

## Individual Comments

### Comment 1
<location> `ramalama/compose.py:105` </location>
<code_context>
+            # Default to 8080 if no port is specified
+            return '    ports:\n      - "8080:8080"'
+
+        p = port_arg.split(":", 2)
+        host_port = p[1] if len(p) > 1 else p[0]
+        container_port = p[0]
+        return f'    ports:\n      - "{host_port}:{container_port}"'
+
</code_context>

<issue_to_address>
Port mapping logic may not handle all valid Docker Compose port formats.

Docker Compose allows formats like 'host_ip:host_port:container_port', which aren't handled by the current split logic. Please update the parsing to support all valid Compose port mapping formats.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
    def _gen_ports(self) -> str:
        port_arg = getattr(self.args, "port", None)
        if not port_arg:
            # Default to 8080 if no port is specified
            return '    ports:\n      - "8080:8080"'

=======
    def _gen_ports(self) -> str:
        port_arg = getattr(self.args, "port", None)
        if not port_arg:
            # Default to 8080 if no port is specified
            return '    ports:\n      - "8080:8080"'

        # Docker Compose supports:
        # - container_port
        # - host_port:container_port
        # - host_ip:host_port:container_port
        # Accept comma-separated list for multiple ports (optional, if needed)
        port_mappings = []
        for port_mapping in str(port_arg).split(","):
            port_mapping = port_mapping.strip()
            if not port_mapping:
                continue
            parts = port_mapping.split(":")
            if len(parts) == 1:
                # container_port only
                mapping = f'"{parts[0]}:{parts[0]}"'
            elif len(parts) == 2:
                # host_port:container_port
                mapping = f'"{parts[0]}:{parts[1]}"'
            elif len(parts) == 3:
                # host_ip:host_port:container_port
                mapping = f'"{parts[0]}:{parts[1]}:{parts[2]}"'
            else:
                raise ValueError(f"Invalid port mapping format: {port_mapping}")
            port_mappings.append(f"      - {mapping}")
        return "    ports:\n" + "\n".join(port_mappings)
>>>>>>> REPLACE

</suggested_fix>

### Comment 2
<location> `ramalama/compose.py:113` </location>
<code_context>
+    def _gen_environment(self) -> str:
+        env_vars = get_accel_env_vars()
+        # Allow user to override with --env
+        if getattr(self.args, "env", None):
+            for e in self.args.env:
+                key, val = e.split("=", 1)
+                env_vars[key] = val
+
</code_context>

<issue_to_address>
No error handling for malformed environment variable strings.

If any entry in self.args.env lacks an '=', split will raise ValueError. Please handle this to prevent runtime errors, either by skipping invalid entries or raising a clear exception.
</issue_to_address>

### Comment 3
<location> `ramalama/compose.py:127` </location>
<code_context>
+        return env_spec
+
+    def _gen_gpu_deployment(self) -> str:
+        if "cuda" not in self.image:
+            return ""
+
</code_context>

<issue_to_address>
GPU deployment logic is tightly coupled to 'cuda' substring in image name.

This approach may fail if image names change or use different conventions. Please make the check configurable or clearly document the naming requirement.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
    def _gen_gpu_deployment(self) -> str:
        if "cuda" not in self.image:
            return ""
=======
    # By default, images with GPU support are identified by the presence of the substring
    # specified in `gpu_image_identifier` (default: "cuda") in the image name.
    # This can be overridden by setting the `gpu_image_identifier` attribute.
    def _gen_gpu_deployment(self) -> str:
        if not hasattr(self, "gpu_image_identifier"):
            self.gpu_image_identifier = "cuda"
        if self.gpu_image_identifier not in self.image:
            return ""
>>>>>>> REPLACE

</suggested_fix>

### Comment 4
<location> `test/unit/test_compose.py:240` </location>
<code_context>
+    assert "environment:" not in result
+
+
+def test_compose_no_devices(monkeypatch):
+    """Test Compose generation when no host devices are found."""
+    monkeypatch.setattr("os.path.exists", lambda path: False)
+    monkeypatch.setattr("ramalama.compose.get_accel_env_vars", lambda: {})
+    monkeypatch.setattr("ramalama.compose.version", lambda: "test")
+
+    args = Args()
+    compose = Compose("test", ("/a", "/b"), None, None, args, [])
+    result = compose.generate().content
+
+    assert "devices:" not in result
</code_context>

<issue_to_address>
Consider testing device presence with partial device availability

Adding tests with only some devices present (e.g., just /dev/dri) would help verify that the Compose generator correctly includes only available devices.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 99 to 104
def _gen_ports(self) -> str:
port_arg = getattr(self.args, "port", None)
if not port_arg:
# Default to 8080 if no port is specified
return ' ports:\n - "8080:8080"'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Port mapping logic may not handle all valid Docker Compose port formats.

Docker Compose allows formats like 'host_ip:host_port:container_port', which aren't handled by the current split logic. Please update the parsing to support all valid Compose port mapping formats.

Suggested change
def _gen_ports(self) -> str:
port_arg = getattr(self.args, "port", None)
if not port_arg:
# Default to 8080 if no port is specified
return ' ports:\n - "8080:8080"'
def _gen_ports(self) -> str:
port_arg = getattr(self.args, "port", None)
if not port_arg:
# Default to 8080 if no port is specified
return ' ports:\n - "8080:8080"'
# Docker Compose supports:
# - container_port
# - host_port:container_port
# - host_ip:host_port:container_port
# Accept comma-separated list for multiple ports (optional, if needed)
port_mappings = []
for port_mapping in str(port_arg).split(","):
port_mapping = port_mapping.strip()
if not port_mapping:
continue
parts = port_mapping.split(":")
if len(parts) == 1:
# container_port only
mapping = f'"{parts[0]}:{parts[0]}"'
elif len(parts) == 2:
# host_port:container_port
mapping = f'"{parts[0]}:{parts[1]}"'
elif len(parts) == 3:
# host_ip:host_port:container_port
mapping = f'"{parts[0]}:{parts[1]}:{parts[2]}"'
else:
raise ValueError(f"Invalid port mapping format: {port_mapping}")
port_mappings.append(f" - {mapping}")
return " ports:\n" + "\n".join(port_mappings)

Comment on lines 113 to 115
if getattr(self.args, "env", None):
for e in self.args.env:
key, val = e.split("=", 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: No error handling for malformed environment variable strings.

If any entry in self.args.env lacks an '=', split will raise ValueError. Please handle this to prevent runtime errors, either by skipping invalid entries or raising a clear exception.

Comment on lines 126 to 128
def _gen_gpu_deployment(self) -> str:
if "cuda" not in self.image:
return ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: GPU deployment logic is tightly coupled to 'cuda' substring in image name.

This approach may fail if image names change or use different conventions. Please make the check configurable or clearly document the naming requirement.

Suggested change
def _gen_gpu_deployment(self) -> str:
if "cuda" not in self.image:
return ""
# By default, images with GPU support are identified by the presence of the substring
# specified in `gpu_image_identifier` (default: "cuda") in the image name.
# This can be overridden by setting the `gpu_image_identifier` attribute.
def _gen_gpu_deployment(self) -> str:
if not hasattr(self, "gpu_image_identifier"):
self.gpu_image_identifier = "cuda"
if self.gpu_image_identifier not in self.image:
return ""

Comment on lines 240 to 249
def test_compose_no_devices(monkeypatch):
"""Test Compose generation when no host devices are found."""
monkeypatch.setattr("os.path.exists", lambda path: False)
monkeypatch.setattr("ramalama.compose.get_accel_env_vars", lambda: {})
monkeypatch.setattr("ramalama.compose.version", lambda: "test")

args = Args()
compose = Compose("test", ("/a", "/b"), None, None, args, [])
result = compose.generate().content

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider testing device presence with partial device availability

Adding tests with only some devices present (e.g., just /dev/dri) would help verify that the Compose generator correctly includes only available devices.


self.model_name = model_name
custom_name = getattr(args, "name", None)
self.name = custom_name if custom_name else f"ramalama-{model_name}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Replace if-expression with or (or-if-exp-identity)

Suggested change
self.name = custom_name if custom_name else f"ramalama-{model_name}"
self.name = custom_name or f"ramalama-{model_name}"


ExplanationHere we find ourselves setting a value if it evaluates to True, and otherwise
using a default.

The 'After' case is a bit easier to read and avoids the duplication of
input_currency.

It works because the left-hand side is evaluated first. If it evaluates to
true then currency will be set to this and the right-hand side will not be
evaluated. If it evaluates to false the right-hand side will be evaluated and
currency will be set to DEFAULT_CURRENCY.

Comment on lines 86 to 90
device_list = []
for dev_path in ["/dev/dri", "/dev/kfd", "/dev/accel"]:
if os.path.exists(dev_path):
device_list.append(dev_path)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Convert for loop into list comprehension (list-comprehension)

Suggested change
device_list = []
for dev_path in ["/dev/dri", "/dev/kfd", "/dev/accel"]:
if os.path.exists(dev_path):
device_list.append(dev_path)
device_list = [
dev_path
for dev_path in ["/dev/dri", "/dev/kfd", "/dev/accel"]
if os.path.exists(dev_path)
]

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Docker Compose generator, which is a great addition. The implementation is well-tested and covers many use cases. My review focuses on improving the robustness of the implementation. I've pointed out a bug in port mapping parsing and suggested a more maintainable way to generate the YAML file to avoid potential formatting issues.

Comment on lines 105 to 108
p = port_arg.split(":", 2)
host_port = p[1] if len(p) > 1 else p[0]
container_port = p[0]
return f' ports:\n - "{host_port}:{container_port}"'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The port string parsing p = port_arg.split(":", 2) is not robust. If a user provides a port mapping with an IP address like 127.0.0.1:8080:9090, which is a valid format for Docker, this parsing will produce an incorrect ports entry ("8080:127.0.0.1").

The convention used here (CONTAINER_PORT:HOST_PORT) is also non-standard for Docker and can be confusing for users.

To make the parsing more robust against unexpected formats, consider using split(":", 1). This will at least prevent dropping parts of the port string. A better long-term solution would be to adopt and correctly parse the standard [IP:]HOST_PORT:CONTAINER_PORT format.

Suggested change
p = port_arg.split(":", 2)
host_port = p[1] if len(p) > 1 else p[0]
container_port = p[0]
return f' ports:\n - "{host_port}:{container_port}"'
p = port_arg.split(":", 1)
if len(p) == 2:
container_port, host_port = p
else:
container_port = host_port = p[0]
return f' ports:\n - "{host_port}:{container_port}"'

Comment on lines 146 to 178
def generate(self) -> PlainFile:
_version = version()

# Generate all the dynamic sections of the YAML file
volumes_string = self._gen_volumes()
ports_string = self._gen_ports()
environment_string = self._gen_environment()
devices_string = self._gen_devices()
gpu_deploy_string = self._gen_gpu_deployment()
command_string = self._gen_command()

# Assemble the final file content
content = f"""\
# Save this output to a 'docker-compose.yaml' file and run 'docker compose up'.
#
# Created with ramalama-{_version}

services:
{self.model_name}:
container_name: {self.name}
image: {self.image}
{volumes_string}
{ports_string}
{environment_string}
{devices_string}
{gpu_deploy_string}
{command_string}
restart: unless-stopped
"""
# Clean up any empty lines that might result from empty sections
content = "\n".join(line for line in content.splitlines() if line.strip())

return genfile(self.name, content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Generating YAML by concatenating strings with hardcoded indentation is fragile and error-prone. A small change in one of the _gen_* methods could break the entire YAML structure.

A more robust approach would be to build a Python dictionary that represents the docker-compose.yaml structure and then serialize it to a string. This would abstract away the formatting and indentation details, making the code easier to read and maintain.

If adding a new dependency like PyYAML is not an option, you could still improve this by building up a list of properly indented lines and then joining them, which would be less fragile than the current string concatenation approach.

@rhatdan
Copy link
Member

rhatdan commented Aug 9, 2025

Thanks @abhibongale
You need to sign your commits.

git commit -a --amend -s
git push --force

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants