@andreyvelich
Member

As we discussed here, we should move image to the Runtime trainer: #140 (comment)

/assign @kubeflow/kubeflow-sdk-team @Fiona-Waters

@coveralls

coveralls commented Nov 4, 2025

Pull Request Test Coverage Report for Build 19113108843

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 22 of 30 (73.33%) changed or added relevant lines in 7 files are covered.
  • 134 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-1.0%) to 67.33%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| kubeflow/trainer/backends/container/utils.py | 0 | 2 | 0.0% |
| kubeflow/trainer/backends/container/backend.py | 14 | 20 | 70.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| kubeflow/optimizer/backends/base.py | 16 | 0.0% |
| kubeflow/optimizer/api/optimizer_client.py | 19 | 0.0% |
| kubeflow/optimizer/backends/kubernetes/backend.py | 99 | 0.0% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 19111376131: | -1.0% |
| Covered Lines: | 2504 |
| Relevant Lines: | 3719 |

💛 - Coveralls

return UNKNOWN


def resolve_image(runtime: types.Runtime) -> str:
Member Author

@Fiona-Waters It looks like we don't need the image resolver, since we fall back to the default runtime in case we can't get it online:

def _create_default_runtimes() -> list[base_types.Runtime]:

Is that correct?

Contributor

Yes, if image is not optional then we don't need this function.

class RuntimeTrainer:
    trainer_type: TrainerType
    framework: str
    image: str
Member Author

@astefanutti @Fiona-Waters I made image mandatory for the RuntimeTrainer.
The container should always have an image, but for the local subprocess backend, we can populate some const value there.
What do you think about it?

Contributor

I think it makes sense for the image to be mandatory. Adding a dummy const to the local process backend seems fine to me as long as we make sure its purpose is mentioned clearly in a comment.
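
The proposal in this thread could look roughly like the sketch below. It assumes `RuntimeTrainer` is a dataclass and uses a stand-in `TrainerType` enum. The constant name `LOCAL_PROCESS_IMAGE_PLACEHOLDER` and its value are hypothetical, chosen only to show a documented placeholder for a backend that never pulls an image.

```python
from dataclasses import dataclass
from enum import Enum


class TrainerType(Enum):
    # Stand-in for the SDK's enum; the members here are illustrative.
    CUSTOM_TRAINER = "CustomTrainer"
    BUILTIN_TRAINER = "BuiltinTrainer"


# Hypothetical placeholder. The local process backend runs training as a
# subprocess and never pulls a container image; this value exists only to
# satisfy the mandatory `image` field on RuntimeTrainer.
LOCAL_PROCESS_IMAGE_PLACEHOLDER = "local"


@dataclass
class RuntimeTrainer:
    trainer_type: TrainerType
    framework: str
    image: str  # mandatory: no default, so omitting it raises TypeError


local_trainer = RuntimeTrainer(
    trainer_type=TrainerType.CUSTOM_TRAINER,
    framework="torch",
    image=LOCAL_PROCESS_IMAGE_PLACEHOLDER,
)
```

With no default on `image`, a backend that forgets to set it fails at construction time rather than producing an empty image later.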

@andreyvelich andreyvelich changed the title fix(sdk): Fix empty image for Runtime trainer fix(trainer): Fix empty image for Runtime trainer Nov 4, 2025
Contributor

@Fiona-Waters Fiona-Waters left a comment

Thanks @andreyvelich - responded to your comments.

@kramaranya
Contributor

/milestone v0.2

@google-oss-prow google-oss-prow bot added this to the v0.2 milestone Nov 5, 2025
Signed-off-by: Andrey Velichkevich <[email protected]>
Comment on lines +229 to +232
trainer: Optional[
Union[types.CustomTrainer, types.CustomTrainerContainer, types.BuiltinTrainer]
] = None,
options: Optional[list] = None,
Member Author

This should fix E2Es in this PR: kubeflow/trainer#2907

cc @Fiona-Waters @astefanutti @kramaranya

Signed-off-by: Andrey Velichkevich <[email protected]>
Signed-off-by: Andrey Velichkevich <[email protected]>
  self,
  name: str,
- follow: Optional[bool] = False,
+ follow: bool = False,
Contributor

Why do we need this change?

Member Author

Properties should not be Optional if they can't be None.
By default, follow=False.
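
A small sketch of the difference, using two hypothetical free functions (`get_logs_before` and `get_logs_after`) rather than the SDK's actual method: `Optional[bool]` advertises `None` as a valid argument, which forces the body to handle a third state, while plain `bool` with a default of `False` does not.

```python
from typing import Optional, get_type_hints


def get_logs_before(name: str, follow: Optional[bool] = False) -> str:
    # Optional[bool] tells callers and type checkers that None is accepted,
    # so the body has to decide what None means.
    if follow is None:
        follow = False
    return f"{name}: follow={follow}"


def get_logs_after(name: str, follow: bool = False) -> str:
    # follow is always True or False; no None handling is needed.
    return f"{name}: follow={follow}"


# The annotations differ even though the default value is the same.
hints_before = get_type_hints(get_logs_before)["follow"]
hints_after = get_type_hints(get_logs_after)["follow"]
```

Both functions behave the same when called without `follow`; the change only narrows what the signature promises to accept.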

@kramaranya
Contributor

Thanks @andreyvelich!
/lgtm

@andreyvelich
Member Author

/approve

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 8ebd5ed into kubeflow:main Nov 5, 2025
14 checks passed
@andreyvelich andreyvelich deleted the fix-image-runtime branch November 5, 2025 20:52