Skip to content

Conversation

@rhatdan
Copy link
Member

@rhatdan rhatdan commented Oct 20, 2025

Just showing this off, I am still working on PRs to get into podman 5.7 release.

containers/podman#27329
containers/podman#27328
containers/podman#27325
containers/podman#27324
containers/podman#27319
containers/podman#27253

Summary by Sourcery

Add comprehensive support for storing and managing AI models as OCI artifacts, extending the transport, CLI, and configuration layers, and covering the new functionality with extensive documentation and end-to-end system tests.

New Features:

  • Add support for OCI 'artifact' type for model conversion, push, pull, list, inspect, and removal operations
  • Introduce CLI options and configuration for specifying 'artifact' conversion type alongside 'car' and 'raw'

Enhancements:

  • Extend OCI transport to handle artifact-specific commands (_add_artifact, _rm_artifact, inspect, mount_cmd, is_artifact)
  • Integrate artifact listing into oci_tools and combine with existing image and manifest listings
  • Update base transport for correct mounting and inspect behavior based on artifact flag
  • Allow default conversion type to be set via configuration and environment variable with correct precedence

Documentation:

  • Update command-line help and man pages to include 'artifact' type and describe its behavior
  • Add configuration documentation for the new 'convert_type' setting

Tests:

  • Add comprehensive system tests for artifact workflows including conversion, push, pull, listing, removal, error handling, configuration precedence, and size reporting

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Oct 20, 2025

Reviewer's Guide

This PR extends RamaLama to support OCI artifacts as a new target type by implementing artifact-specific logic across transport and CLI modules, updating configuration and documentation, and adding comprehensive system tests.

File-Level Changes

Change Details Files
Extend OCI transport layer for artifact operations
  • Add artifact detection flag and mount command logic
  • Implement _create_artifact, _add_artifact, _rm_artifact methods
  • Branch _convert, push, remove, exists, inspect to handle artifact type
ramalama/transports/oci.py
Refactor base transport to support artifacts and unify inspect output
  • Introduce artifact flag and mount adjustments in base class
  • Split print_inspect into get_inspect returning JSON
  • Ensure inspect returns consistent JSON regardless of model type
ramalama/transports/base.py
Add artifact listing and size parsing in utility layer
  • Implement list_artifacts with JSON parsing and type filtering
  • Add convert_from_human_readable_size helper
  • Integrate artifact list into overall model listing
ramalama/oci_tools.py
Expand CLI to include 'artifact' type and override behavior
  • Add 'artifact' choice to convert and push commands
  • Respect config and environment defaults for convert_type
  • Adjust push_cli and rm_cli flows for artifact removal
ramalama/cli.py
Introduce convert_type config option with validation
  • Add convert_type field to BaseConfig
  • Default CLI flags to CONFIG.convert_type and allow env overrides
  • Validate convert_type values at startup
ramalama/config.py
Update documentation for artifact support
  • Describe 'artifact' type in convert manpage and config docs
  • List available types and their semantics in docs
  • Provide sample config snippets for default convert_type
docs/ramalama-convert.1.md
docs/ramalama.conf.5.md
docs/ramalama.conf
Add system tests covering artifact workflows
  • Create comprehensive Bats tests for convert, push, list, remove
  • Cover error handling, config precedence, size reporting, concurrency
  • Verify integration with podman artifact commands
test/system/056-artifact.bats

Possibly linked issues

  • Resolve undefined errors in huggingface.py and model.py #123: The PR introduces comprehensive OCI artifact support in Ramalama, including new commands for convert, push, list, and remove, and enables mounting artifacts into containers, fulfilling the issue's requirements.
  • #N/A: The PR introduces comprehensive OCI artifact support, adding new commands like convert --type artifact and updating existing functionalities for AI models.
  • #New OCI artifact type "docker model runner": The PR adds comprehensive support for OCI artifacts, addressing the issue's request for a new OCI artifact type.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @rhatdan, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the ramalama tool by integrating full support for OCI artifacts. It allows users to convert, manage, and interact with AI models packaged as OCI artifacts, providing a more flexible and potentially more efficient way to handle models compared to traditional OCI images. The changes span documentation, command-line interface, configuration, core logic for model handling, and include a robust set of new system tests to ensure functionality and reliability.

Highlights

  • OCI Artifact Support: Introduced 'artifact' as a new type for converting and managing OCI models, allowing AI models to be stored as OCI artifacts rather than traditional OCI images.
  • CLI and Configuration Updates: The --type option for convert and push commands now includes 'artifact', and a new convert_type configuration option has been added to specify the default conversion type.
  • Artifact Management Logic: Implemented new functions and modified existing ones to list, create, push, remove, and inspect OCI artifacts, integrating them seamlessly into the ramalama tool's lifecycle.
  • Comprehensive Testing: Added a new test suite (056-artifact.bats) with extensive test cases covering basic artifact conversion, push operations, listing, error handling, configuration precedence, and performance with large and multi-file artifacts.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Refactor repetitive run_cmd command construction in artifact/image operations (push, remove, exists) into shared helper functions to reduce duplication and improve maintainability.
  • Standardize the inspect and get_inspect return types (JSON string vs Python data structures) so that printing always produces a consistent and predictable output format.
  • Revisit how the CLI --type default is set from CONFIG.convert_type—parsing it at definition time won’t pick up runtime config or environment overrides properly, so consider applying defaults after loading config.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Refactor repetitive run_cmd command construction in artifact/image operations (push, remove, exists) into shared helper functions to reduce duplication and improve maintainability.
- Standardize the inspect and get_inspect return types (JSON string vs Python data structures) so that printing always produces a consistent and predictable output format.
- Revisit how the CLI --type default is set from CONFIG.convert_type—parsing it at definition time won’t pick up runtime config or environment overrides properly, so consider applying defaults after loading config.

## Individual Comments

### Comment 1
<location> `ramalama/transports/oci.py:67-68` </location>
<code_context>
         # Keep this in sync with docs/ramalama-oci.5.md !
         is_car = args.type == "car"
+        is_raw = args.type == "raw"
+        if not is_car and not is_raw:
+            raise ValueError(f"argument --type: invalid choice: '{args.type}' (choose from artifact, car, raw)")
         has_gguf = getattr(args, 'gguf', None) is not None
         content = ""
</code_context>

<issue_to_address>
**issue (bug_risk):** The validation for args.type excludes 'artifact', which is now a valid option.

Please update the validation to accept 'artifact' as a valid type to prevent errors for users and ensure consistency with the CLI and config.
</issue_to_address>

### Comment 2
<location> `ramalama/transports/oci.py:175-176` </location>
<code_context>
+            "rm",
+            self.model,
+        ]
+        if ignore:
+            rm_cmd.extend("--ignore")
+        rm_cmd.append(self.model)
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The use of extend with a string will add each character as a separate argument.

Use append for '--ignore' to avoid splitting the string into individual characters in the command arguments.
</issue_to_address>

### Comment 3
<location> `ramalama/transports/oci.py:177` </location>
<code_context>
+        ]
+        if ignore:
+            rm_cmd.extend("--ignore")
+        rm_cmd.append(self.model)
+
+        run_cmd(
</code_context>

<issue_to_address>
**issue (bug_risk):** The model is appended twice to rm_cmd if ignore is True.

Appending the model twice may lead to duplicate arguments and unintended command behavior.
</issue_to_address>

### Comment 4
<location> `ramalama/transports/oci.py:303-300` </location>
<code_context>
-            perror(f"Failed to push OCI {target} : {e}")
-            raise e
+            try:
+                if args.type != "artifact":
+                    perror(f"Pushing artifact {self.model} ...")
+                    conman_args.insert(1, "artifact")
+                    run_cmd(conman_args)
+            except subprocess.CalledProcessError:
+                perror(f"Failed to push OCI {target} : {e}")
</code_context>

<issue_to_address>
**suggestion:** The error handling in push retries pushing as artifact if the initial push fails and type is not 'artifact'.

Log the original error before retrying, and clarify the fallback logic to ensure maintainers understand when and why the retry occurs.
</issue_to_address>

### Comment 5
<location> `ramalama/oci_tools.py:35` </location>
<code_context>
+    if output == "":
+        return []
+
+    artifacts = json.loads("[" + output[:-1] + "]")
+    models = []
+    for artifact in artifacts:
</code_context>

<issue_to_address>
**issue (bug_risk):** Parsing output with output[:-1] assumes a trailing comma, which may be fragile.

This approach may fail if the output format changes. Use a parsing method that does not rely on a trailing comma to avoid potential errors.
</issue_to_address>

### Comment 6
<location> `ramalama/oci_tools.py:166-167` </location>
<code_context>
+                # drop the id from the model
+                del model["id"]

     models += list_manifests(args)
+    models += list_artifacts(args)
     for model in models:
         # Convert to ISO 8601 format
</code_context>

<issue_to_address>
**suggestion:** The models list is extended with both manifests and artifacts, which may result in duplicate entries if the same model exists in both forms.

Deduplicate models by name or ID to prevent listing the same model multiple times.

Suggested implementation:

```python
    # Combine manifests and artifacts, then deduplicate by 'name'
    manifest_models = list_manifests(args)
    artifact_models = list_artifacts(args)
    combined_models = manifest_models + artifact_models

    # Deduplicate by 'name'
    seen_names = set()
    deduped_models = []
    for model in combined_models:
        model_name = model.get("name")
        if model_name and model_name not in seen_names:
            deduped_models.append(model)
            seen_names.add(model_name)

    models += deduped_models

    for model in models:
        # Convert to ISO 8601 format

```

- If your models use a different unique key (e.g., "id" instead of "name"), replace `"name"` with the appropriate key in the deduplication logic.
- If `models` is not empty before this block, you may want to deduplicate the entire list (including existing entries).
</issue_to_address>

### Comment 7
<location> `ramalama/cli.py:1361-1363` </location>
<code_context>
             m = New(model, args)
-            m.remove(args)
+            # Don't ignore missing so that we attempt OCI as well.
+            newargs=args
+            newargs.ignore=False
+            m.remove(newargs)
+            continue
         except KeyError as e:
</code_context>

<issue_to_address>
**issue (bug_risk):** Reassigning args to newargs and setting ignore=False may have unintended side effects if args is used elsewhere.

To prevent unintended mutations, use a copy of args or a more explicit method when modifying its properties.
</issue_to_address>

### Comment 8
<location> `test/system/056-artifact.bats:377-386` </location>
<code_context>
+@test "ramalama artifact - concurrent operations" {
</code_context>

<issue_to_address>
**suggestion (testing):** Concurrency test for artifact creation and removal.

Consider adding assertions to verify data integrity and detect race conditions, ensuring artifacts remain intact after concurrent creation.
</issue_to_address>

### Comment 9
<location> `ramalama/oci_tools.py:20` </location>
<code_context>
+    return input
+
+
+def list_artifacts(args: EngineArgType):
+    if args.engine == "docker":
+        return []
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring repeated command execution and JSON parsing logic into reusable helper functions for clarity and maintainability.

```suggestion
You can collapse most of the “run‐cmd → decode → strip → wrap JSON → json.loads → filter/inspect → map” logic into a couple of small helpers. For example:

```python
def run_list_json(cmd: list[str]) -> list[dict]:
    out = run_cmd(cmd).stdout.decode().strip()
    if not out:
        return []
    # remove trailing comma and wrap in [ … ]
    return json.loads(f"[{out.rstrip(',')}]")
```

```python
def inspect_and_map(
    items: list[dict],                    # e.g. output of run_list_json(...)
    base_inspect_cmd: list[str],         # e.g. [engine, "artifact", "inspect"]
    keep_fn: Callable[[dict], bool],      # predicate on the inspect JSON
    map_fn: Callable[[dict, dict], dict], # builds final record from (list_item, inspect_json)
) -> list[dict]:
    results = []
    for item in items:
        infos = run_list_json(base_inspect_cmd + [item["ID"]])
        if not infos or not keep_fn(infos[0]):
            continue
        results.append(map_fn(item, infos[0]))
    return results
```

Then `list_artifacts` becomes:

```python
def list_artifacts(args: EngineArgType):
    if args.engine == "docker":
        return []

    list_cmd = [
        args.engine, "artifact", "ls",
        "--format", '{"name":"oci://{{ .Repository }}:{{ .Tag }}","size":"{{ .Size }}","ID":"{{ .Digest }}"},',
    ]
    raw = run_list_json(list_cmd)

    return inspect_and_map(
        raw,
        [args.engine, "artifact", "inspect"],
        lambda info: info.get("Manifest", {}).get("artifactType") == annotations.ArtifactTypeModelManifest,
        lambda item, info: {
            "name": item["name"],
            "modified": datetime.now().isoformat(),
            "size": convert_from_human_readable_size(item["size"]),
        }
    )
```

You can apply the same pattern in your image/list_manifests paths, so you remove the per‐block run_cmd/json.loads boilerplate entirely.
</issue_to_address>

### Comment 10
<location> `ramalama/oci_tools.py:13-17` </location>
<code_context>
def convert_from_human_readable_size(input):
    sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
    for unit, size in sizes:
        if input.endswith(unit) or input.endswith(unit.lower()):
            return float(input[: -len(unit)]) * size

    return input

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))

```suggestion
    return next(
        (
            float(input[: -len(unit)]) * size
            for unit, size in sizes
            if input.endswith(unit) or input.endswith(unit.lower())
        ),
        input,
    )
```
</issue_to_address>

### Comment 11
<location> `ramalama/oci_tools.py:35` </location>
<code_context>
def list_artifacts(args: EngineArgType):
    if args.engine == "docker":
        return []

    conman_args = [
        args.engine,
        "artifact",
        "ls",
        "--format",
        ('{"name":"oci://{{ .Repository }}:{{ .Tag }}", "size":"{{ .Size }}", "ID":"{{ .Digest }}"},'),
    ]
    output = run_cmd(conman_args).stdout.decode("utf-8").strip()
    if output == "":
        return []

    artifacts = json.loads("[" + output[:-1] + "]")
    models = []
    for artifact in artifacts:
        conman_args = [
            args.engine,
            "artifact",
            "inspect",
            artifact["ID"],
        ]
        output = run_cmd(conman_args).stdout.decode("utf-8").strip()

        if output == "":
            continue
        inspect = json.loads(output)
        if "Manifest" not in inspect:
            continue
        if "artifactType" not in inspect["Manifest"]:
            continue
        if inspect["Manifest"]['artifactType'] != annotations.ArtifactTypeModelManifest:
            continue
        models += [
            {
                "name": artifact["name"],
                "modified": str(datetime.now()),
                "size": convert_from_human_readable_size(artifact["size"]),
            }
        ]
    return models

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use f-string instead of string concatenation [×2] ([`use-fstring-for-concatenation`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-fstring-for-concatenation/))

```suggestion
    artifacts = json.loads(f"[{output[:-1]}]")
```
</issue_to_address>

### Comment 12
<location> `ramalama/transports/base.py:678` </location>
<code_context>
    def get_inspect(
        self,
        show_all: bool = False,
        show_all_metadata: bool = False,
        get_field: str = "",
        dryrun: bool = False,
    ) -> None:
        as_json = True
        model_name = self.filename
        model_registry = self.type.lower()
        model_path = self._get_entry_model_path(False, False, dryrun)
        if GGUFInfoParser.is_model_gguf(model_path):
            if not show_all_metadata and get_field == "":
                gguf_info: GGUFModelInfo = GGUFInfoParser.parse(model_name, model_registry, model_path)
                return gguf_info.serialize(json=as_json, all=show_all)

            metadata = GGUFInfoParser.parse_metadata(model_path)
            if show_all_metadata:
                return metadata.serialize(json=as_json)
            elif get_field != "":  # If a specific field is requested, print only that field
                field_value = metadata.get(get_field)
                if field_value is None:
                    raise KeyError(f"Field '{get_field}' not found in GGUF model metadata")
                return field_value

        if SafetensorInfoParser.is_model_safetensor(model_name):
            safetensor_info: SafetensorModelInfo = SafetensorInfoParser.parse(model_name, model_registry, model_path)
            return safetensor_info.serialize(json=as_json, all=show_all)

        return ModelInfoBase(model_name, model_registry, model_path).serialize(json=as_json)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Replaces an empty collection equality with a boolean operation ([`simplify-empty-collection-comparison`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/simplify-empty-collection-comparison/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
- Remove redundant conditional ([`remove-redundant-if`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-if/))
</issue_to_address>

### Comment 13
<location> `ramalama/transports/oci.py:286` </location>
<code_context>
    def push(self, source_model, args):
        target = self.model
        source = source_model.model

        conman_args = [self.conman, "push"]
        type = "image"
        if args.type == "artifact":
            type = args.type
            conman_args.insert(1, "artifact")

        perror(f"Pushing {type} {self.model} ...")
        if args.authfile:
            conman_args.extend([f"--authfile={args.authfile}"])
        if str(args.tlsverify).lower() == "false":
            conman_args.extend([f"--tls-verify={args.tlsverify}"])
        conman_args.extend([target])
        if source != target:
            self._convert(source_model, args)
        try:
            run_cmd(conman_args)
        except subprocess.CalledProcessError as e:
            try:
                if args.type != "artifact":
                    perror(f"Pushing artifact {self.model} ...")
                    conman_args.insert(1, "artifact")
                    run_cmd(conman_args)
            except subprocess.CalledProcessError:
                perror(f"Failed to push OCI {target} : {e}")
                raise e

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
- Don't assign to builtin variable `type` [×2] ([`avoid-builtin-shadow`](https://docs.sourcery.ai/Reference/Default-Rules/comments/avoid-builtin-shadow/))

<br/><details><summary>Explanation</summary>
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.</details>
</issue_to_address>

### Comment 14
<location> `ramalama/transports/oci.py:376-377` </location>
<code_context>
    def inspect(
            self,
            show_all: bool = False,
            show_all_metadata: bool = False,
            get_field: str = "",
            as_json: bool = False,
            dryrun: bool = False,
    ) -> None:
        out = super().get_inspect(show_all, show_all_metadata, get_field, dryrun)
        conman_args = [self.conman, "image", "inspect", self.model]
        try:
            out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip().split('\n')
            print("OCI Image")
        except Exception as e:
            conman_args = [self.conman, "artifact", "inspect", self.model]
            try:
                out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
                print("OCI Artifact")
            except Exception:
                raise e

        print(out)

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
            except Exception as exc:
                raise e from exc
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

if source != target:
self._convert(source_model, args)
try:
run_cmd(conman_args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The error handling in push retries pushing as artifact if the initial push fails and type is not 'artifact'.

Log the original error before retrying, and clarify the fallback logic to ensure maintainers understand when and why the retry occurs.

if output == "":
return []

artifacts = json.loads("[" + output[:-1] + "]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Parsing output with output[:-1] assumes a trailing comma, which may be fragile.

This approach may fail if the output format changes. Use a parsing method that does not rely on a trailing comma to avoid potential errors.

Comment on lines 113 to +172
models += list_manifests(args)
models += list_artifacts(args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The models list is extended with both manifests and artifacts, which may result in duplicate entries if the same model exists in both forms.

Deduplicate models by name or ID to prevent listing the same model multiple times.

Suggested implementation:

    # Combine manifests and artifacts, then deduplicate by 'name'
    manifest_models = list_manifests(args)
    artifact_models = list_artifacts(args)
    combined_models = manifest_models + artifact_models

    # Deduplicate by 'name'
    seen_names = set()
    deduped_models = []
    for model in combined_models:
        model_name = model.get("name")
        if model_name and model_name not in seen_names:
            deduped_models.append(model)
            seen_names.add(model_name)

    models += deduped_models

    for model in models:
        # Convert to ISO 8601 format
  • If your models use a different unique key (e.g., "id" instead of "name"), replace "name" with the appropriate key in the deduplication logic.
  • If models is not empty before this block, you may want to deduplicate the entire list (including existing entries).

Comment on lines +377 to +386
@test "ramalama artifact - concurrent operations" {
skip_if_nocontainer
skip_if_docker
skip_if_ppc64le
skip_if_s390x

echo "test model 1" > $RAMALAMA_TMPDIR/testmodel1
echo "test model 2" > $RAMALAMA_TMPDIR/testmodel2

# Create two artifacts concurrently
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Concurrency test for artifact creation and removal.

Consider adding assertions to verify data integrity and detect race conditions, ensuring artifacts remain intact after concurrent creation.

Comment on lines 13 to 17
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size

return input
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Use the built-in function next instead of a for-loop (use-next)

Suggested change
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
return input
return next(
(
float(input[: -len(unit)]) * size
for unit, size in sizes
if input.endswith(unit) or input.endswith(unit.lower())
),
input,
)


model_path = self._get_entry_model_path(False, False, dryrun)
if GGUFInfoParser.is_model_gguf(model_path):
if not show_all_metadata and get_field == "":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): We've found these issues:

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR introduces support for OCI artifacts, which is a great addition. The changes span across documentation, CLI, configuration, and transports, and include a comprehensive new test suite. The implementation looks promising but is marked as a WIP, and there are a few areas that need attention. I've found some critical issues in the oci transport implementation, particularly in artifact removal and inspection logic, which could lead to runtime errors. There are also some high-severity issues related to incorrect data being presented (e.g., modification timestamps) and potential bugs from object mutation. The new test file has a typo in its shebang that could prevent it from running. I've provided specific comments and suggestions to address these points.

Comment on lines 168 to 181
def _rm_artifact(self, ignore):
rm_cmd = [
self.conman,
"artifact",
"rm",
self.model,
]
if ignore:
rm_cmd.extend("--ignore")
rm_cmd.append(self.model)

run_cmd(
rm_cmd,
ignore_all=True,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There are a couple of issues in the _rm_artifact method:

  1. rm_cmd.extend("--ignore") will add each character of the string as a separate argument to the command list (e.g., ..., '-', '-', 'i', ...). It should be rm_cmd.append("--ignore") or rm_cmd.extend(["--ignore"]).
  2. self.model is added to rm_cmd twice: once during initialization and again with rm_cmd.append(self.model).

This will result in an incorrect command.

    def _rm_artifact(self, ignore):
        rm_cmd = [
            self.conman,
            "artifact",
            "rm",
        ]
        if ignore:
            rm_cmd.append("--ignore")
        rm_cmd.append(self.model)

        run_cmd(
            rm_cmd,
            ignore_all=True,
        )

Comment on lines 358 to 379
def inspect(
self,
show_all: bool = False,
show_all_metadata: bool = False,
get_field: str = "",
as_json: bool = False,
dryrun: bool = False,
) -> None:
out = super().get_inspect(show_all, show_all_metadata, get_field, dryrun)
conman_args = [self.conman, "image", "inspect", self.model]
try:
out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip().split('\n')
print("OCI Image")
except Exception as e:
conman_args = [self.conman, "artifact", "inspect", self.model]
try:
out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
print("OCI Artifact")
except Exception:
raise e

print(out)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The inspect method has a bug that will cause a TypeError. On line 369, out += ... .split('\n') attempts to concatenate a list (the result of .split('\n')) to a string (out), which is not a valid operation. The logic for combining the output from super().get_inspect() and the podman inspect commands needs to be reworked to handle the data correctly, likely by parsing the JSON strings into Python objects and merging them before printing.

ramalama/cli.py Outdated
Comment on lines 1361 to 1373
newargs=args
newargs.ignore=False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The line newargs=args creates a new reference to the args object, not a copy. Modifying newargs.ignore will mutate the original args object, which can lead to unexpected side effects elsewhere in the program. It's safer to work on a shallow copy of the args object.

Suggested change
newargs=args
newargs.ignore=False
import copy
newargs = copy.copy(args)
newargs.ignore=False

models += [
{
"name": artifact["name"],
"modified": str(datetime.now()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The modified timestamp for an artifact is being set to the current time using datetime.now(). This is inaccurate as it reflects when the list_artifacts function was run, not when the artifact was actually created or modified. You can get the creation timestamp from podman artifact ls by adding {{ .CreatedAt }} to the format string on line 29, and then using that value here.

Suggested change
"modified": str(datetime.now()),
"modified": artifact["modified"],

#
# artifact: Store AI Models as artifacts
# car: Traditional OCI image including base image with the model stored in a /models subdir
# raw Traditional OCI image including only the model and a link file `model.file` pointed at it stored at /
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The description for the raw option is missing a colon and is not aligned with artifact: and car:. For consistency in the documentation, it would be better to add the colon and align it.

# raw:      Traditional OCI image including only the model and a link file `model.file` pointed at it stored at /

ramalama/cli.py Outdated
Comment on lines 726 to 742
default="raw",
choices=["car", "raw"],
choices=["artifact", "car", "raw"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The push command's --type argument defaults to "raw", while the convert command's default is configurable via CONFIG.convert_type. For consistency, it would be better if the push command also respected the configured default type.

Suggested change
default="raw",
choices=["car", "raw"],
choices=["artifact", "car", "raw"],
default=CONFIG.convert_type,

Comment on lines 11 to 17
def convert_from_human_readable_size(input):
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size

return input
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function convert_from_human_readable_size returns the input as a string if it doesn't match any of the size units. If the input is a plain number string (e.g., representing bytes), this will result in a string being returned instead of a numeric type, which could cause issues for consumers of this function that expect a number. It would be more robust to attempt to convert the input to a float if no units are found.

Suggested change
def convert_from_human_readable_size(input):
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
return input
def convert_from_human_readable_size(input):
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
try:
return float(input)
except ValueError:
return input

show_all_metadata: bool = False,
get_field: str = "",
dryrun: bool = False,
) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The get_inspect method is declared with a return type hint of -> None, but it actually returns str or other types. This should be corrected to reflect the actual return type, for example -> Any or a more specific union of types.

Suggested change
) -> None:
) -> Any:

@@ -0,0 +1,545 @@
#!/Us/bin/env bats
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's a typo in the shebang. It should be #!/usr/bin/env bats, not #!/Us/bin/env bats. This will prevent the test script from being executed correctly.

#!/usr/bin/env bats

@rhatdan rhatdan force-pushed the artifact branch 4 times, most recently from 7f55115 to 6d468dc Compare October 24, 2025 15:12
@rhatdan
Copy link
Member Author

rhatdan commented Oct 24, 2025

/gemini retest

@gemini-code-assist
Copy link
Contributor

It looks like you tried to use the /gemini retest command, but this is not a recognized command. If you intended to trigger a new code review, please use /gemini review.

@rhatdan
Copy link
Member Author

rhatdan commented Oct 24, 2025

@sourcery-ai review

@rhatdan
Copy link
Member Author

rhatdan commented Oct 24, 2025

/gemini review

@gemini-code-assist
Copy link
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • The push/pull/remove/exists methods have a lot of duplicated branching for 'artifact' vs 'image' commands—consider centralizing this logic into a helper or dispatch table to reduce complexity and repetition.
  • The inspect implementations in both OCI and base transports contain deeply nested try/except blocks and decode/formatting logic—refactor them into a shared helper or strategy to simplify the flow.
  • Listing functions in oci_tools.py repeat JSON parsing and size conversion logic for images and artifacts—extract common parsing routines to keep behavior consistent and avoid code duplication.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The push/pull/remove/exists methods have a lot of duplicated branching for 'artifact' vs 'image' commands—consider centralizing this logic into a helper or dispatch table to reduce complexity and repetition.
- The inspect implementations in both OCI and base transports contain deeply nested try/except blocks and decode/formatting logic—refactor them into a shared helper or strategy to simplify the flow.
- Listing functions in oci_tools.py repeat JSON parsing and size conversion logic for images and artifacts—extract common parsing routines to keep behavior consistent and avoid code duplication.

## Individual Comments

### Comment 1
<location> `ramalama/transports/oci.py:67-68` </location>
<code_context>
         # Keep this in sync with docs/ramalama-oci.5.md !
         is_car = args.type == "car"
+        is_raw = args.type == "raw"
+        if not is_car and not is_raw:
+            raise ValueError(f"argument --type: invalid choice: '{args.type}' (choose from artifact, car, raw)")
         has_gguf = getattr(args, 'gguf', None) is not None
         content = ""
</code_context>

<issue_to_address>
**issue (bug_risk):** The type validation excludes 'artifact', which is now a valid option.

Please update the validation to accept 'artifact' as a valid type to align with recent CLI and config updates.
</issue_to_address>

### Comment 2
<location> `ramalama/oci_tools.py:11-17` </location>
<code_context>


-def engine_supports_manifest_attributes(engine):
+def convert_from_human_readable_size(input) -> str:
+    sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
+    for unit, size in sizes:
+        if input.endswith(unit) or input.endswith(unit.lower()):
+            return float(input[: -len(unit)]) * size
+
+    return str(input)
+
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The function convert_from_human_readable_size returns a string for unrecognized units, which may cause downstream issues.

Consider raising an exception or returning a default numeric value for unrecognized units to maintain consistent return types and prevent type errors.
</issue_to_address>

### Comment 3
<location> `ramalama/transports/base.py:204-206` </location>
<code_context>

         if self.model_type == 'oci':
             if use_container or should_generate:
+                if self.artifact:
+                    return os.path.join(MNT_DIR, self.artifact_name())
                 return os.path.join(MNT_DIR, 'model.file')
</code_context>

<issue_to_address>
**issue (bug_risk):** The artifact_name method is called without checking if it is implemented in all subclasses.

This may cause an AttributeError at runtime. Use hasattr to check for the method or add a default implementation in the base class.
</issue_to_address>

### Comment 4
<location> `ramalama/transports/base.py:676-685` </location>
<code_context>
+        as_json = True
</code_context>

<issue_to_address>
**issue (bug_risk):** The as_json variable is always set to True, which may override user intent.

This assignment ignores the user's input. Use the as_json argument value instead.
</issue_to_address>

### Comment 5
<location> `ramalama/cli.py:1384-1385` </location>
<code_context>
             m = New(model, args)
-            m.remove(args)
+            # Don't ignore missing so that we attempt OCI as well.
+            newargs = args
+            newargs.ignore = False
+            m.remove(newargs)
+            continue
</code_context>

<issue_to_address>
**issue (bug_risk):** Modifying args in-place may have unintended side effects elsewhere in the code.

Copy args before modifying to prevent side effects in other parts of the code.
</issue_to_address>

### Comment 6
<location> `test/system/056-artifact.bats:405` </location>
<code_context>
+    assert "$output" !~ ".*concurrent-artifact2" "second concurrent artifact was removed"
+}
+
+@test "ramalama artifact - error handling for invalid source" {
+    skip_if_nocontainer
+    skip_if_docker
</code_context>

<issue_to_address>
**suggestion (testing):** Invalid source error handling is covered.

Please add a test case for sources with insufficient permissions, such as unreadable files, to enhance error handling coverage.
</issue_to_address>

### Comment 7
<location> `test/system/056-artifact.bats:416-417` </location>
<code_context>
+    is "$output" ".*Error.*" "directory as source is handled gracefully"
+}
+
+@test "ramalama artifact - size reporting accuracy" {
+    skip_if_nocontainer
+    skip_if_docker
</code_context>

<issue_to_address>
**suggestion (testing):** Artifact size reporting is validated.

Please add a test for multi-file artifacts to confirm the reported size matches the total size of all files.

```suggestion
    is "$output" ".*Error.*" "directory as source is handled gracefully"
}

@test "ramalama artifact - multi-file size reporting accuracy" {
    skip_if_nocontainer
    skip_if_docker

    # Create a directory with multiple files
    mkdir -p $RAMALAMA_TMPDIR/multifile_artifact
    echo "abc" > $RAMALAMA_TMPDIR/multifile_artifact/file1.txt   # 3 bytes
    echo "defgh" > $RAMALAMA_TMPDIR/multifile_artifact/file2.txt # 5 bytes
    echo "ijklmno" > $RAMALAMA_TMPDIR/multifile_artifact/file3.txt # 7 bytes

    artifact=artifact-multifile:latest
    run_ramalama ? rm ${artifact}

    run_ramalama convert --type artifact file://$RAMALAMA_TMPDIR/multifile_artifact ${artifact}

    # Get reported size from output (assuming output contains "Size: <bytes>")
    reported_size=$(echo "$output" | grep -Eo 'Size: [0-9]+' | awk '{print $2}')

    # Calculate expected size
    expected_size=$((3 + 5 + 7))

    [ "$reported_size" -eq "$expected_size" ]
    assert_success "Reported artifact size ($reported_size) matches total file size ($expected_size)"
}
```
</issue_to_address>

### Comment 8
<location> `test/system/056-artifact.bats:472-474` </location>
<code_context>
+    assert "$output" !~ ".*config-test-artifact" "artifact was removed"
+}
+
+@test "ramalama config - convert_type validation" {
+    skip_if_nocontainer
+
</code_context>

<issue_to_address>
**suggestion (testing):** Config validation for convert_type is tested.

Please add tests for missing config values and malformed config files to further strengthen config parsing robustness.

```suggestion
@test "ramalama config - convert_type validation" {
    skip_if_nocontainer

}

@test "ramalama config - missing required config value" {
    skip_if_nocontainer

    # Create a config missing a required value (e.g., missing 'convert_type')
    config_file="$(mktemp)"
    cat > "$config_file" <<EOF
artifact:
  name: config-test-artifact
  # convert_type is intentionally missing
EOF

    run ramalama convert --config "$config_file"
    # Expect failure and a message about missing config value
    [ "$status" -ne 0 ]
    [[ "$output" == *"missing required config value"* ]]
    rm -f "$config_file"
}

@test "ramalama config - malformed config file" {
    skip_if_nocontainer

    # Create a malformed config file
    config_file="$(mktemp)"
    echo "artifact: name: config-test-artifact convert_type: [unclosed" > "$config_file"

    run ramalama convert --config "$config_file"
    # Expect failure and a message about malformed config
    [ "$status" -ne 0 ]
    [[ "$output" == *"malformed config"* || "$output" == *"parse error"* ]]
    rm -f "$config_file"
}
```
</issue_to_address>

### Comment 9
<location> `test/system/056-artifact.bats:351` </location>
<code_context>
+    assert "$output" !~ ".*large-artifact" "large artifact was removed"
+}
+
+@test "ramalama artifact - multiple files in artifact" {
+    skip_if_nocontainer
+    skip_if_docker
</code_context>

<issue_to_address>
**suggestion (testing):** Multi-file artifact creation is tested.

Please add a test to check that all expected files are present in the artifact and correctly named.
</issue_to_address>

### Comment 10
<location> `ramalama/transports/oci.py:340` </location>
<code_context>
+            except subprocess.CalledProcessError:
+                self._rm_artifact(args.ignore)

     def exists(self) -> bool:
         if self.conman is None:
</code_context>

<issue_to_address>
**issue (complexity):** Consider extracting a generic helper for conman command invocation to eliminate duplicated logic and simplify method implementations.

```markdown
You’ve got a lot of 1:1 duplication around building/running `conman` commands for “image” vs “artifact” and retry‐handling.  Extract a small generic helper and then simplify each operation to one‐liner calls.  For example:

```python
# add near top of class
def _invoke(self,
            kind: Literal["image", "artifact"],
            action: str,
            *args: str,
            ignore_stderr: bool | None = None,
            retry: bool = False) -> subprocess.CompletedProcess:
    """
    Generic conman invoker for either image or artifact commands.
    """
    cmd = [self.conman, kind, action, *args]
    try:
        return run_cmd(cmd, ignore_stderr=(self.ignore_stderr if ignore_stderr is None else ignore_stderr))
    except subprocess.CalledProcessError:
        if retry:
            return run_cmd(cmd, ignore_stderr=True)
        raise

# refactor exists() to use helper
def exists(self) -> bool:
    if self.conman is None:
        return False
    # try image.inspect, fallback to artifact.inspect
    try:
        self._invoke("image", "inspect", self.model)
        return True
    except subprocess.CalledProcessError:
        try:
            self._invoke("artifact", "inspect", self.model)
            return True
        except subprocess.CalledProcessError:
            return False

# refactor remove() to use helper
def remove(self, args, ignore_stderr=False):
    if self.conman is None:
        raise NotImplementedError("OCI Images require a container engine")
    # try manifest.rm, else rmi, else artifact.rm
    try:
        self._invoke("manifest", "rm", self.model, ignore_stderr=ignore_stderr)
    except subprocess.CalledProcessError:
        try:
            self._invoke("image", "rmi", f"--force={args.ignore}", self.model, ignore_stderr=ignore_stderr)
        except subprocess.CalledProcessError:
            self._invoke("artifact", "rm", self.model, ignore_stderr=ignore_stderr)
```

Then collapse your other methods (`push`, `_inspect`, `mount_cmd`, etc.) to calls like:

```python
kind = "artifact" if args.type == "artifact" else "image"
self._invoke(kind, "push", target, ignore_stderr=False, retry=True)
```

This removes per‐method branching, centralizes retry/stderr logic, and keeps all functionality intact.
</issue_to_address>

### Comment 11
<location> `ramalama/oci_tools.py:20` </location>
<code_context>
+    return str(input)
+
+
+def list_artifacts(args: EngineArgType):
+    if args.engine == "docker":
+        return []
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring by extracting shared logic into helper functions to reduce duplication and simplify list_models and list_artifacts.

Here’s one way to pull out the common bits into small helpers and keep both list_models and list_artifacts thin. You can

1. factor out the “run‐and‐JSON‐parse” step  
2. unify the manifest‐based filtering/inspect loop  
3. share your size‐conversion and timestamp‐normalization  

```python
def _run_and_parse(cmd: list[str]) -> list[dict]:
    out = run_cmd(cmd).stdout.decode().strip()
    return json.loads(f"[{out[:-1]}]") if out else []

def _normalize_timestamp(ts: str) -> str:
    dt = datetime.fromisoformat(ts.replace(" UTC","")
                                .replace("+0000", "+00:00")
                                .replace(" ", "T"))
    return dt.isoformat()

def _inspect_and_filter(engine: str, items: list[dict], subcmd: str, want_type: str) -> list[dict]:
    result = []
    for item in items:
        info = run_cmd([engine, subcmd, "inspect", item["ID"]]).stdout.decode().strip()
        if not info:
            continue
        manifest = json.loads(info).get("Manifest", {})
        if manifest.get("artifactType") != want_type:
            continue
        result.append(item)
    return result
```

Then your two listing functions become:

```python
def list_models(args):
    # 1) grab image‐list
    models = _run_and_parse([args.engine, "images", "--filter", f"label={ocilabeltype}", "--format", formatLine])
    models = [m for m in models if m["name"] != "oci://<none>:<none>"]
    # 2) inspect/filter for model‐manifests
    models = _inspect_and_filter(args.engine, models, "image", annotations.ArtifactTypeModelManifest)
    # 3) size for docker
    if args.engine == "docker":
        for m in models:
            m["size"] = int(run_cmd([args.engine, "image", "inspect", m["ID"], "--format", "{{.Size}}"])
                             .stdout.decode().strip())
    # 4) normalize
    for m in models:
        m["modified"] = _normalize_timestamp(m["modified"])
        m["name"], m.pop("ID")
    return models

def list_artifacts(args):
    if args.engine == "docker":
        return []
    # 1) grab artifact‐list
    arts = _run_and_parse([
        args.engine, "artifact", "ls", "--format",
        '{"name":"oci://{{ .Repository }}:{{ .Tag }}","modified":"{{ .CreatedAt }}","size":"{{ .Size }}","ID":"{{ .Digest }}"},'
    ])
    # 2) inspect/filter for model‐manifests
    models = _inspect_and_filter(args.engine, arts, "artifact", annotations.ArtifactTypeModelManifest)
    # 3) convert sizes & normalize
    for m in models:
        m["size"] = convert_from_human_readable_size(m["size"])
        m["modified"] = _normalize_timestamp(m["modified"])
        m.pop("ID")
    return models
```

This removes ALL the duplicated loops/JSON hacks and keeps each function focused on just “which subcommand to call” and “what to filter on.”
</issue_to_address>

### Comment 12
<location> `ramalama/oci_tools.py:13-17` </location>
<code_context>
def convert_from_human_readable_size(input) -> str:
    sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
    for unit, size in sizes:
        if input.endswith(unit) or input.endswith(unit.lower()):
            return float(input[: -len(unit)]) * size

    return str(input)

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))

```suggestion
    return next(
        (
            float(input[: -len(unit)]) * size
            for unit, size in sizes
            if input.endswith(unit) or input.endswith(unit.lower())
        ),
        str(input),
    )
```
</issue_to_address>

### Comment 13
<location> `ramalama/oci_tools.py:40` </location>
<code_context>
def list_artifacts(args: EngineArgType):
    if args.engine == "docker":
        return []

    conman_args = [
        args.engine,
        "artifact",
        "ls",
        "--format",
        (
            '{"name":"oci://{{ .Repository }}:{{ .Tag }}",\
            "created":"{{ .CreatedAt }}", \
            "size":"{{ .Size }}", \
            "ID":"{{ .Digest }}"},'
        ),
    ]
    output = run_cmd(conman_args).stdout.decode("utf-8").strip()
    if output == "":
        return []

    artifacts = json.loads("[" + output[:-1] + "]")
    models = []
    for artifact in artifacts:
        conman_args = [
            args.engine,
            "artifact",
            "inspect",
            artifact["ID"],
        ]
        output = run_cmd(conman_args).stdout.decode("utf-8").strip()

        if output == "":
            continue
        inspect = json.loads(output)
        if "Manifest" not in inspect:
            continue
        if "artifactType" not in inspect["Manifest"]:
            continue
        if inspect["Manifest"]['artifactType'] != annotations.ArtifactTypeModelManifest:
            continue
        models += [
            {
                "name": artifact["name"],
                "modified": artifact["created"],
                "size": convert_from_human_readable_size(artifact["size"]),
            }
        ]
    return models

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use f-string instead of string concatenation [×2] ([`use-fstring-for-concatenation`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-fstring-for-concatenation/))

```suggestion
    artifacts = json.loads(f"[{output[:-1]}]")
```
</issue_to_address>

### Comment 14
<location> `ramalama/transports/base.py:681` </location>
<code_context>
    def get_inspect(
        self,
        show_all: bool = False,
        show_all_metadata: bool = False,
        get_field: str = "",
        dryrun: bool = False,
    ) -> Any:
        as_json = True
        model_name = self.filename
        model_registry = self.type.lower()
        model_path = self._get_entry_model_path(False, False, dryrun)
        if GGUFInfoParser.is_model_gguf(model_path):
            if not show_all_metadata and get_field == "":
                gguf_info: GGUFModelInfo = GGUFInfoParser.parse(model_name, model_registry, model_path)
                return gguf_info.serialize(json=as_json, all=show_all)

            metadata = GGUFInfoParser.parse_metadata(model_path)
            if show_all_metadata:
                return metadata.serialize(json=as_json)
            elif get_field != "":  # If a specific field is requested, print only that field
                field_value = metadata.get(get_field)
                if field_value is None:
                    raise KeyError(f"Field '{get_field}' not found in GGUF model metadata")
                return field_value

        if SafetensorInfoParser.is_model_safetensor(model_name):
            safetensor_info: SafetensorModelInfo = SafetensorInfoParser.parse(model_name, model_registry, model_path)
            return safetensor_info.serialize(json=as_json, all=show_all)

        return ModelInfoBase(model_name, model_registry, model_path).serialize(json=as_json)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Replaces an empty collection equality with a boolean operation ([`simplify-empty-collection-comparison`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/simplify-empty-collection-comparison/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
- Remove redundant conditional ([`remove-redundant-if`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-if/))
</issue_to_address>

### Comment 15
<location> `ramalama/transports/oci.py:284` </location>
<code_context>
    def push(self, source_model, args):
        target = self.model
        source = source_model.model

        conman_args = [self.conman, "push"]
        type = "image"
        if args.type == "artifact":
            type = args.type
            conman_args.insert(1, "artifact")

        perror(f"Pushing {type} {self.model} ...")
        if args.authfile:
            conman_args.extend([f"--authfile={args.authfile}"])
        if str(args.tlsverify).lower() == "false":
            conman_args.extend([f"--tls-verify={args.tlsverify}"])
        conman_args.extend([target])
        if source != target:
            self._convert(source_model, args)
        try:
            run_cmd(conman_args)
        except subprocess.CalledProcessError as e:
            try:
                if args.type != "artifact":
                    perror(f"Pushing artifact {self.model} ...")
                    conman_args.insert(1, "artifact")
                    run_cmd(conman_args)
            except subprocess.CalledProcessError:
                perror(f"Failed to push OCI {target} : {e}")
                raise e

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
- Don't assign to builtin variable `type` [×2] ([`avoid-builtin-shadow`](https://docs.sourcery.ai/Reference/Default-Rules/comments/avoid-builtin-shadow/))

<br/><details><summary>Explanation</summary>
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.</details>
</issue_to_address>

### Comment 16
<location> `ramalama/transports/oci.py:374-375` </location>
<code_context>
    def _inspect(
        self,
        show_all: bool = False,
        show_all_metadata: bool = False,
        get_field: str = "",
        as_json: bool = False,
        dryrun: bool = False,
    ) -> (str, str):
        out = super().get_inspect(show_all, show_all_metadata, get_field, dryrun)
        conman_args = [self.conman, "image", "inspect", self.model]
        type = "Image"
        try:
            out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
        except Exception as e:
            conman_args = [self.conman, "artifact", "inspect", self.model]
            try:
                out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
                type = "Artifact"
            except Exception:
                raise e

        return out, type

</code_context>

<issue_to_address>
**suggestion (code-quality):** We've found these issues:

- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
- Don't assign to builtin variable `type` [×2] ([`avoid-builtin-shadow`](https://docs.sourcery.ai/Reference/Default-Rules/comments/avoid-builtin-shadow/))

```suggestion
            except Exception as exc:
                raise e from exc
```

<br/><details><summary>Explanation</summary>
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.
Python has a number of `builtin` variables: functions and constants that
form a part of the language, such as `list`, `getattr`, and `type`
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

```python
list = [1, 2, 3]
```
However, this is considered poor practice.
- It will confuse other developers.
- It will confuse syntax highlighters and linters.
- It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable something more specific, such as `integers`.
In a pinch, `my_list` and similar names are colloquially-recognized
placeholders.</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

assert "$output" !~ ".*concurrent-artifact2" "second concurrent artifact was removed"
}

@test "ramalama artifact - error handling for invalid source" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Invalid source error handling is covered.

Please add a test case for sources with insufficient permissions, such as unreadable files, to enhance error handling coverage.

Comment on lines +416 to +417
is "$output" ".*Error.*" "directory as source is handled gracefully"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Artifact size reporting is validated.

Please add a test for multi-file artifacts to confirm the reported size matches the total size of all files.

Suggested change
is "$output" ".*Error.*" "directory as source is handled gracefully"
}
is "$output" ".*Error.*" "directory as source is handled gracefully"
}
@test "ramalama artifact - multi-file size reporting accuracy" {
skip_if_nocontainer
skip_if_docker
# Create a directory with multiple files
mkdir -p $RAMALAMA_TMPDIR/multifile_artifact
echo "abc" > $RAMALAMA_TMPDIR/multifile_artifact/file1.txt # 3 bytes
echo "defgh" > $RAMALAMA_TMPDIR/multifile_artifact/file2.txt # 5 bytes
echo "ijklmno" > $RAMALAMA_TMPDIR/multifile_artifact/file3.txt # 7 bytes
artifact=artifact-multifile:latest
run_ramalama ? rm ${artifact}
run_ramalama convert --type artifact file://$RAMALAMA_TMPDIR/multifile_artifact ${artifact}
# Get reported size from output (assuming output contains "Size: <bytes>")
reported_size=$(echo "$output" | grep -Eo 'Size: [0-9]+' | awk '{print $2}')
# Calculate expected size
expected_size=$((3 + 5 + 7))
[ "$reported_size" -eq "$expected_size" ]
assert_success "Reported artifact size ($reported_size) matches total file size ($expected_size)"
}

Comment on lines +472 to +474
@test "ramalama config - convert_type validation" {
skip_if_nocontainer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Config validation for convert_type is tested.

Please add tests for missing config values and malformed config files to further strengthen config parsing robustness.

Suggested change
@test "ramalama config - convert_type validation" {
skip_if_nocontainer
@test "ramalama config - convert_type validation" {
skip_if_nocontainer
}
@test "ramalama config - missing required config value" {
skip_if_nocontainer
# Create a config missing a required value (e.g., missing 'convert_type')
config_file="$(mktemp)"
cat > "$config_file" <<EOF
artifact:
name: config-test-artifact
# convert_type is intentionally missing
EOF
run ramalama convert --config "$config_file"
# Expect failure and a message about missing config value
[ "$status" -ne 0 ]
[[ "$output" == *"missing required config value"* ]]
rm -f "$config_file"
}
@test "ramalama config - malformed config file" {
skip_if_nocontainer
# Create a malformed config file
config_file="$(mktemp)"
echo "artifact: name: config-test-artifact convert_type: [unclosed" > "$config_file"
run ramalama convert --config "$config_file"
# Expect failure and a message about malformed config
[ "$status" -ne 0 ]
[[ "$output" == *"malformed config"* || "$output" == *"parse error"* ]]
rm -f "$config_file"
}

assert "$output" !~ ".*large-artifact" "large artifact was removed"
}

@test "ramalama artifact - multiple files in artifact" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Multi-file artifact creation is tested.

Please add a test to check that all expected files are present in the artifact and correctly named.

Comment on lines 13 to 17
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size

return str(input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Use the built-in function next instead of a for-loop (use-next)

Suggested change
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
return str(input)
return next(
(
float(input[: -len(unit)]) * size
for unit, size in sizes
if input.endswith(unit) or input.endswith(unit.lower())
),
str(input),
)

@rhatdan rhatdan force-pushed the artifact branch 4 times, most recently from 841e532 to 1127668 Compare October 27, 2025 17:56
@rhatdan
Copy link
Member Author

rhatdan commented Oct 27, 2025

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces comprehensive support for OCI artifacts, a significant and valuable feature. The changes are extensive, touching documentation, configuration, the CLI, and transport layers, and are well-supported by new end-to-end and system tests. The overall implementation is strong, but I have identified a few issues. These include a critical bug in how inspection data is combined, a couple of high-severity bugs related to command construction and data type handling, and a minor formatting inconsistency in the documentation. Addressing these points will help ensure the robustness and correctness of this new functionality.

Comment on lines 399 to 420
def _inspect(
self,
show_all: bool = False,
show_all_metadata: bool = False,
get_field: str = "",
as_json: bool = False,
dryrun: bool = False,
) -> (str, str):
out = super().get_inspect(show_all, show_all_metadata, get_field, dryrun)
conman_args = [self.conman, "image", "inspect", self.model]
type = "Image"
try:
out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
except Exception as e:
conman_args = [self.conman, "artifact", "inspect", self.model]
try:
out += run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
type = "Artifact"
except Exception:
raise e

return out, type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The _inspect method concatenates multiple JSON strings using += on lines 411 and 415. This results in an invalid JSON string (e.g., {"key":"value"}{"other_key":"other_value"}). To correctly combine the information, you should parse each JSON output into a Python dictionary, merge them, and then serialize the result back to a single, valid JSON string.

    def _inspect(
        self,
        show_all: bool = False,
        show_all_metadata: bool = False,
        get_field: str = "",
        as_json: bool = False,
        dryrun: bool = False,
    ) -> (str, str):
        out_data = json.loads(super().get_inspect(show_all, show_all_metadata, get_field, dryrun))
        conman_args = [self.conman, "image", "inspect", self.model]
        oci_type = "Image"
        try:
            inspect_output = run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
            # podman image inspect returns a list of objects
            inspect_data = json.loads(inspect_output)
            if inspect_data:
                out_data.update(inspect_data[0])
        except Exception as e:
            conman_args = [self.conman, "artifact", "inspect", self.model]
            try:
                inspect_output = run_cmd(conman_args, ignore_stderr=True).stdout.decode('utf-8').strip()
                # podman artifact inspect returns a single object
                out_data.update(json.loads(inspect_output))
                oci_type = "Artifact"
            except Exception:
                raise e

        return json.dumps(out_data), oci_type

Comment on lines 11 to 17
def convert_from_human_readable_size(input) -> str:
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size

return str(input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This function has an incorrect return type hint (-> str) and returns inconsistent types. It returns a float if a unit is found, but a str otherwise. This can lead to TypeError exceptions in functions that consume this output and expect a numeric size. The function should consistently return a numeric type, such as float, to ensure type safety and prevent runtime errors.

Suggested change
def convert_from_human_readable_size(input) -> str:
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
return str(input)
def convert_from_human_readable_size(input) -> float:
sizes = [("KB", 1024), ("MB", 1024**2), ("GB", 1024**3), ("TB", 1024**4), ("B", 1)]
for unit, size in sizes:
if input.endswith(unit) or input.endswith(unit.lower()):
return float(input[: -len(unit)]) * size
return float(input)

Comment on lines 188 to 201
def _rm_artifact(self, ignore) -> None:
rm_cmd = [
self.conman,
"artifact",
"rm",
self.model,
]
if ignore:
rm_cmd.append("--ignore")
rm_cmd.append(self.model)

run_cmd(
rm_cmd,
ignore_all=True,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The _rm_artifact method constructs the rm command incorrectly. It appends self.model twice and places the --ignore flag in the wrong position, which will cause the command to fail. Command-line options should precede arguments. This needs to be corrected to ensure the command is valid.

    def _rm_artifact(self, ignore) -> None:
        rm_cmd = [
            self.conman,
            "artifact",
            "rm",
        ]
        if ignore:
            rm_cmd.append("--ignore")
        rm_cmd.append(self.model)

        run_cmd(
            rm_cmd,
            ignore_all=True,
        )

#
# artifact: Store AI Models as artifacts
# car: Traditional OCI image including base image with the model stored in a /models subdir
# raw: Traditional OCI image including only the model and a link file `model.file` pointed at it stored at /
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's a minor alignment issue in the comment for the raw option. Aligning it with the other options would improve readability.

# raw:      Traditional OCI image including only the model and a link file `model.file` pointed at it stored at /

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant