Add Docker Compose generator #1817
Conversation
Reviewer's Guide

This PR adds Docker Compose generation support to the `ramalama serve` command.

Sequence diagram for Docker Compose file generation via `ramalama serve`:

```mermaid
sequenceDiagram
    actor User
    participant CLI as ramalama serve
    participant Model
    participant Compose
    participant PlainFile
    User->>CLI: run 'ramalama serve --generate=compose'
    CLI->>Model: generate_container_config(args, exec_args)
    Model->>Compose: compose(...)
    Compose->>Compose: generate()
    Compose->>PlainFile: genfile(name, content)
    PlainFile-->>Compose: PlainFile instance
    Compose-->>Model: PlainFile instance
    Model-->>CLI: docker-compose.yaml generated
    CLI-->>User: Output file ready
```
Class diagram for the new Compose class:

```mermaid
classDiagram
    class Compose {
        - src_model_path: str
        - dest_model_path: str
        - src_chat_template_path: str
        - dest_chat_template_path: str
        - src_mmproj_path: str
        - dest_mmproj_path: str
        - model_name: str
        - name: str
        - args
        - exec_args
        - image: str
        + __init__(model_name, model_paths, chat_template_paths, mmproj_paths, args, exec_args)
        + _gen_volumes() str
        + _gen_model_volume() str
        + _gen_rag_volume() str
        + _gen_chat_template_volume() str
        + _gen_mmproj_volume() str
        + _gen_devices() str
        + _gen_ports() str
        + _gen_environment() str
        + _gen_gpu_deployment() str
        + _gen_command() str
        + generate() PlainFile
    }
    class PlainFile
    Compose --> PlainFile : returns
```
Class diagram for updated Model class methods:

```mermaid
classDiagram
    class Model {
        + generate_container_config(args, exec_args)
        + compose(model_paths, chat_template_paths, mmproj_paths, args, exec_args, output_dir)
    }
    Model --> Compose : uses
```
Summary of Changes
Hello @abhibongale, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've implemented a new feature that allows ramalama users to generate Docker Compose configurations for their models. This enhancement provides a convenient way to create docker-compose.yaml files, streamlining the process of deploying models within Docker environments. The generated files are highly configurable, supporting various model-serving needs and hardware setups, and the new functionality is thoroughly tested to ensure reliability.
Highlights
- **New Docker Compose Generation Feature:** I've introduced a new `--generate=compose` option to the `ramalama serve` command. This allows users to automatically generate a `docker-compose.yaml` file tailored for serving a specific model, simplifying deployment.
- **Modular and Configurable Docker Compose Generation:** The new `ramalama/compose.py` module handles all the logic for constructing the Docker Compose file. It intelligently includes configurations for model volumes, RAG data (both OCI images and host paths), chat templates, mmproj files, environment variables, port mappings, and even GPU deployment settings for CUDA images.
- **Extensive Unit Test Coverage:** I've added comprehensive unit tests for the Docker Compose generator, covering a wide range of scenarios including different model types, RAG configurations, custom names, port settings, environment variables, and GPU support. This ensures the generated YAML files are correct and robust.
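As an illustration, a generated file might look roughly like the following. The service name, container name, and image are placeholders; the exact layout (including volumes, devices, environment, and GPU deployment sections) is defined by `ramalama/compose.py`:

```yaml
# Save this output to a 'docker-compose.yaml' file and run 'docker compose up'.
#
# Created with ramalama-<version>
services:
  mymodel:
    container_name: ramalama-mymodel
    image: <serving image>
    ports:
      - "8080:8080"
    restart: unless-stopped
```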
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Hey @abhibongale - I've reviewed your changes - here's some feedback:
- Consider refactoring the shared setup logic in Compose and Kube (e.g. path normalization and arg handling) into a common base or helper to avoid duplication.
- Add an explicit else branch in generate_container_config for unsupported generate types so users get a clear error when passing an invalid --generate option.
- The OCI prefix handling is split between stripping `oci://` on model paths and using `oci:` for RAG volumes—unify this normalization in one place to prevent inconsistencies.
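Minimal sketches of the last two suggestions above. The function names, and the assumption that `kube` is another supported generate target, are illustrative; they are not the PR's actual code:

```python
def dispatch_generate(generate_type, handlers):
    """Look up the generator for a --generate value, with an explicit
    error branch for unsupported types instead of silently doing nothing."""
    if generate_type in handlers:
        return handlers[generate_type]
    supported = ", ".join(sorted(handlers))
    raise ValueError(
        f"Unsupported --generate type {generate_type!r}; expected one of: {supported}"
    )


def strip_oci_prefix(ref):
    """Normalize OCI references in one place; 'oci://' is checked before
    'oci:' so the longer prefix wins."""
    for prefix in ("oci://", "oci:"):
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref
```

With a shared helper like `strip_oci_prefix`, both the model-path and RAG-volume code paths would agree on what remains after the prefix is removed.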
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring the shared setup logic in Compose and Kube (e.g. path normalization and arg handling) into a common base or helper to avoid duplication.
- Add an explicit else branch in generate_container_config for unsupported generate types so users get a clear error when passing an invalid --generate option.
- The OCI prefix handling is split between stripping `oci://` on model paths and using `oci:` for RAG volumes—unify this normalization in one place to prevent inconsistencies.
## Individual Comments
### Comment 1
<location> `ramalama/compose.py:105` </location>
<code_context>
+ # Default to 8080 if no port is specified
+ return ' ports:\n - "8080:8080"'
+
+ p = port_arg.split(":", 2)
+ host_port = p[1] if len(p) > 1 else p[0]
+ container_port = p[0]
+ return f' ports:\n - "{host_port}:{container_port}"'
+
</code_context>
<issue_to_address>
Port mapping logic may not handle all valid Docker Compose port formats.
Docker Compose allows formats like 'host_ip:host_port:container_port', which aren't handled by the current split logic. Please update the parsing to support all valid Compose port mapping formats.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
def _gen_ports(self) -> str:
port_arg = getattr(self.args, "port", None)
if not port_arg:
# Default to 8080 if no port is specified
return ' ports:\n - "8080:8080"'
=======
def _gen_ports(self) -> str:
port_arg = getattr(self.args, "port", None)
if not port_arg:
# Default to 8080 if no port is specified
return ' ports:\n - "8080:8080"'
# Docker Compose supports:
# - container_port
# - host_port:container_port
# - host_ip:host_port:container_port
# Accept comma-separated list for multiple ports (optional, if needed)
port_mappings = []
for port_mapping in str(port_arg).split(","):
port_mapping = port_mapping.strip()
if not port_mapping:
continue
parts = port_mapping.split(":")
if len(parts) == 1:
# container_port only
mapping = f'"{parts[0]}:{parts[0]}"'
elif len(parts) == 2:
# host_port:container_port
mapping = f'"{parts[0]}:{parts[1]}"'
elif len(parts) == 3:
# host_ip:host_port:container_port
mapping = f'"{parts[0]}:{parts[1]}:{parts[2]}"'
else:
raise ValueError(f"Invalid port mapping format: {port_mapping}")
port_mappings.append(f" - {mapping}")
return " ports:\n" + "\n".join(port_mappings)
>>>>>>> REPLACE
</suggested_fix>
### Comment 2
<location> `ramalama/compose.py:113` </location>
<code_context>
+ def _gen_environment(self) -> str:
+ env_vars = get_accel_env_vars()
+ # Allow user to override with --env
+ if getattr(self.args, "env", None):
+ for e in self.args.env:
+ key, val = e.split("=", 1)
+ env_vars[key] = val
+
</code_context>
<issue_to_address>
No error handling for malformed environment variable strings.
If any entry in self.args.env lacks an '=', split will raise ValueError. Please handle this to prevent runtime errors, either by skipping invalid entries or raising a clear exception.
</issue_to_address>
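A sketch of the defensive parsing this comment asks for, as a standalone helper (the PR applies the logic inline in `_gen_environment`, so the name here is illustrative):

```python
def parse_env_entries(entries):
    """Parse KEY=VALUE strings into a dict, raising a clear error for
    malformed entries instead of an unhandled ValueError from split()."""
    env_vars = {}
    for entry in entries:
        key, sep, val = entry.partition("=")
        if not sep or not key:
            raise ValueError(
                f"Invalid environment variable (expected KEY=VALUE): {entry!r}"
            )
        env_vars[key] = val
    return env_vars
```

`str.partition` keeps any further `=` characters in the value, so entries like `B=x=y` still round-trip correctly.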
### Comment 3
<location> `ramalama/compose.py:127` </location>
<code_context>
+ return env_spec
+
+ def _gen_gpu_deployment(self) -> str:
+ if "cuda" not in self.image:
+ return ""
+
</code_context>
<issue_to_address>
GPU deployment logic is tightly coupled to 'cuda' substring in image name.
This approach may fail if image names change or use different conventions. Please make the check configurable or clearly document the naming requirement.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
def _gen_gpu_deployment(self) -> str:
if "cuda" not in self.image:
return ""
=======
# By default, images with GPU support are identified by the presence of the substring
# specified in `gpu_image_identifier` (default: "cuda") in the image name.
# This can be overridden by setting the `gpu_image_identifier` attribute.
def _gen_gpu_deployment(self) -> str:
if not hasattr(self, "gpu_image_identifier"):
self.gpu_image_identifier = "cuda"
if self.gpu_image_identifier not in self.image:
return ""
>>>>>>> REPLACE
</suggested_fix>
### Comment 4
<location> `test/unit/test_compose.py:240` </location>
<code_context>
+ assert "environment:" not in result
+
+
+def test_compose_no_devices(monkeypatch):
+ """Test Compose generation when no host devices are found."""
+ monkeypatch.setattr("os.path.exists", lambda path: False)
+ monkeypatch.setattr("ramalama.compose.get_accel_env_vars", lambda: {})
+ monkeypatch.setattr("ramalama.compose.version", lambda: "test")
+
+ args = Args()
+ compose = Compose("test", ("/a", "/b"), None, None, args, [])
+ result = compose.generate().content
+
+ assert "devices:" not in result
</code_context>
<issue_to_address>
Consider testing device presence with partial device availability
Adding tests with only some devices present (e.g., just /dev/dri) would help verify that the Compose generator correctly includes only available devices.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
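A sketch of the partial-availability test Comment 4 suggests. `gen_devices` is a stand-in for the real `Compose._gen_devices`, with `path_exists` replacing the monkeypatched `os.path.exists`; the actual method's signature and output formatting may differ:

```python
def gen_devices(path_exists):
    # Include only the host devices that actually exist.
    device_list = [
        dev for dev in ["/dev/dri", "/dev/kfd", "/dev/accel"] if path_exists(dev)
    ]
    if not device_list:
        return ""
    return "    devices:\n" + "\n".join(f"      - {dev}" for dev in device_list)


# Partial availability: pretend only /dev/dri exists, akin to
# monkeypatching os.path.exists in the unit test.
partial = gen_devices(lambda p: p == "/dev/dri")
```

A test along these lines would assert that `/dev/dri` appears in the output while `/dev/kfd` and `/dev/accel` do not.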
`ramalama/compose.py` (outdated):

```python
self.model_name = model_name
custom_name = getattr(args, "name", None)
self.name = custom_name if custom_name else f"ramalama-{model_name}"
```

suggestion (code-quality): Replace if-expression with `or` (`or-if-exp-identity`):

```python
self.name = custom_name or f"ramalama-{model_name}"
```
Explanation

Here we find ourselves setting a value if it evaluates to `True`, and otherwise using a default. The 'After' case is a bit easier to read and avoids duplicating the left-hand expression. It works because the left-hand side is evaluated first: if it is truthy, the name is bound to that value and the right-hand side is never evaluated; if it is falsy, the right-hand side is evaluated and used as the default.
`ramalama/compose.py` (outdated):

```python
device_list = []
for dev_path in ["/dev/dri", "/dev/kfd", "/dev/accel"]:
    if os.path.exists(dev_path):
        device_list.append(dev_path)
```

suggestion (code-quality): Convert for loop into list comprehension (`list-comprehension`):

```python
device_list = [
    dev_path
    for dev_path in ["/dev/dri", "/dev/kfd", "/dev/accel"]
    if os.path.exists(dev_path)
]
```
Code Review
This pull request introduces a Docker Compose generator, which is a great addition. The implementation is well-tested and covers many use cases. My review focuses on improving the robustness of the implementation. I've pointed out a bug in port mapping parsing and suggested a more maintainable way to generate the YAML file to avoid potential formatting issues.
`ramalama/compose.py` (outdated):

```python
p = port_arg.split(":", 2)
host_port = p[1] if len(p) > 1 else p[0]
container_port = p[0]
return f' ports:\n - "{host_port}:{container_port}"'
```

The port string parsing `p = port_arg.split(":", 2)` is not robust. If a user provides a port mapping with an IP address like `127.0.0.1:8080:9090`, which is a valid format for Docker, this parsing will produce an incorrect `ports` entry (`"8080:127.0.0.1"`).

The convention used here (`CONTAINER_PORT:HOST_PORT`) is also non-standard for Docker and can be confusing for users.

To make the parsing more robust against unexpected formats, consider using `split(":", 1)`. This will at least prevent dropping parts of the port string. A better long-term solution would be to adopt and correctly parse the standard `[IP:]HOST_PORT:CONTAINER_PORT` format.

Suggested change:

```python
p = port_arg.split(":", 1)
if len(p) == 2:
    container_port, host_port = p
else:
    container_port = host_port = p[0]
return f' ports:\n - "{host_port}:{container_port}"'
```
`ramalama/compose.py` (outdated):

```python
def generate(self) -> PlainFile:
    _version = version()

    # Generate all the dynamic sections of the YAML file
    volumes_string = self._gen_volumes()
    ports_string = self._gen_ports()
    environment_string = self._gen_environment()
    devices_string = self._gen_devices()
    gpu_deploy_string = self._gen_gpu_deployment()
    command_string = self._gen_command()

    # Assemble the final file content
    content = f"""\
# Save this output to a 'docker-compose.yaml' file and run 'docker compose up'.
#
# Created with ramalama-{_version}

services:
  {self.model_name}:
    container_name: {self.name}
    image: {self.image}
{volumes_string}
{ports_string}
{environment_string}
{devices_string}
{gpu_deploy_string}
{command_string}
    restart: unless-stopped
"""
    # Clean up any empty lines that might result from empty sections
    content = "\n".join(line for line in content.splitlines() if line.strip())

    return genfile(self.name, content)
```
Generating YAML by concatenating strings with hardcoded indentation is fragile and error-prone. A small change in one of the _gen_* methods could break the entire YAML structure.
A more robust approach would be to build a Python dictionary that represents the docker-compose.yaml structure and then serialize it to a string. This would abstract away the formatting and indentation details, making the code easier to read and maintain.
If adding a new dependency like PyYAML is not an option, you could still improve this by building up a list of properly indented lines and then joining them, which would be less fragile than the current string concatenation approach.
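A dependency-free sketch of that last suggestion: represent the file as a nested dict and serialize it with a tiny helper. This is not a full YAML emitter (no quoting or escaping of special values), and the service fields shown are placeholders:

```python
def to_yaml_lines(obj, indent=0):
    """Serialize nested dicts/lists of scalars into YAML-style lines."""
    pad = "  " * indent
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(value, (dict, list)) and value:
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml_lines(value, indent + 1))
            else:
                lines.append(f"{pad}{key}: {value}")
    elif isinstance(obj, list):
        for item in obj:
            lines.append(f"{pad}- {item}")
    return lines


compose = {
    "services": {
        "mymodel": {
            "container_name": "ramalama-mymodel",
            "image": "quay.io/ramalama/cuda:latest",
            "ports": ['"8080:8080"'],
            "restart": "unless-stopped",
        }
    }
}
yaml_text = "\n".join(to_yaml_lines(compose))
```

Empty sections simply never enter the dict, so the post-hoc "strip empty lines" cleanup becomes unnecessary and indentation is handled in one place.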
Thanks @abhibongale. Please sign off your commit: `git commit -a --amend -s`
This commit introduces the `--generate=compose` option to the `ramalama serve` command, enabling users to generate a `docker-compose.yaml` file for a given model.

Summary by Sourcery
Add support for a new --generate=compose option to ramalama serve to generate Docker Compose configurations for models.
New Features:
Tests: