@ruivieira ruivieira commented Sep 7, 2025

Refer to RHOAIENG-33702.

Summary by Sourcery

Add environment variable support for the Kubernetes settings, and refactor LMEval provider initialization so that the Kubernetes client and custom resource (CR) builder are set up lazily.

New Features:

  • Allow use_k8s flag to be configured via TRUSTYAI_LMEVAL_USE_K8S environment variable in run.yaml
  • Allow Kubernetes namespace to be configured via TRUSTYAI_LM_EVAL_NAMESPACE environment variable

Enhancements:

  • Introduce _ensure_k8s_initialized method for lazy initialization of Kubernetes client, namespace resolution, and CR builder
  • Remove immediate namespace resolution in provider init to defer until runtime
  • Relax configuration validation so that it no longer rejects a non-Kubernetes backend

@ruivieira ruivieira self-assigned this Sep 7, 2025
@ruivieira ruivieira added the enhancement New feature or request label Sep 7, 2025

sourcery-ai bot commented Sep 7, 2025

Reviewer's Guide

This PR refactors LMEval’s Kubernetes setup by deferring namespace resolution and client/CR builder initialization until first use, introduces explicit guards in evaluation entry points, relaxes the configuration to allow non-K8s mode, and updates run.yaml to source the use_k8s flag and namespace from environment variables.
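
The deferred-initialization pattern described here can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: `LazyProvider`, `FakeCRBuilder`, and the `init_calls` counter are hypothetical names, and the Kubernetes client is replaced by a placeholder object so the sketch runs without a cluster.

```python
import logging

logger = logging.getLogger(__name__)


class FakeCRBuilder:
    """Hypothetical stand-in for LMEvalCRBuilder."""

    def __init__(self, namespace, service_account=None):
        self.namespace = namespace
        self.service_account = service_account


class LazyProvider:
    """Illustrative provider that defers Kubernetes setup until first use."""

    def __init__(self, use_k8s, namespace="default"):
        # The constructor only records configuration; nothing is resolved yet.
        self.use_k8s = use_k8s
        self._configured_namespace = namespace
        self._k8s_client = None
        self._namespace = None
        self._cr_builder = None
        self.init_calls = 0  # hypothetical counter, used only to show laziness

    def _init_k8s_client(self):
        # Placeholder: a real provider would build a Kubernetes API client here.
        self.init_calls += 1
        self._k8s_client = object()

    def _ensure_k8s_initialized(self):
        if not self.use_k8s:
            return
        if self._k8s_client is None:
            self._init_k8s_client()
        if self._namespace is None:
            self._namespace = self._configured_namespace
            logger.debug("resolved namespace: %s", self._namespace)
        if self._cr_builder is None:
            self._cr_builder = FakeCRBuilder(namespace=self._namespace)

    def run_eval(self):
        # Guard at the entry point: initialize on demand, then check the result.
        self._ensure_k8s_initialized()
        if self._cr_builder is None:
            raise NotImplementedError("Non-K8s evaluation backend is not implemented yet")
        return f"submitted to {self._namespace}"
```

With this shape, constructing the provider has no side effects, and repeated calls to an entry point reuse the objects created on the first call.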

Sequence diagram for deferred Kubernetes initialization in LMEval

sequenceDiagram
    participant LMEval
    participant "K8s Client"
    participant "CR Builder"
    participant "Config"
    participant "Logger"
    LMEval->>LMEval: Call _ensure_k8s_initialized()
    alt use_k8s is False
        LMEval->>Logger: Log warning (Non-K8s backend not implemented)
        LMEval-->>LMEval: Return
    else use_k8s is True
        alt _k8s_client is None
            LMEval->>LMEval: _init_k8s_client()
        end
        alt _namespace is None
            LMEval->>LMEval: _resolve_namespace(config)
            LMEval->>Logger: Log resolved namespace
        end
        alt _cr_builder is None
            LMEval->>"CR Builder": Initialize with namespace and service_account
            LMEval->>Logger: Log CR builder initialization
        end
    end

Class diagram for updated LMEval initialization and configuration

classDiagram
    class LMEval {
        - _config: LMEvalEvalProviderConfig
        - _namespace: str | None
        - _k8s_client: k8s_client.ApiClient | None
        - _k8s_custom_api: k8s_client.CustomObjectsApi | None
        - _cr_builder: LMEvalCRBuilder | None
        + __init__(config: LMEvalEvalProviderConfig)
        + _ensure_k8s_initialized()
        + _init_k8s_client()
        + run_eval(...)
        + evaluate_rows(...)
        + job_status(...)
        + job_cancel(...)
        + job_result(...)
    }
    class LMEvalCRBuilder {
        - namespace: str | None
        - service_account: str | None
        - _config: LMEvalEvalProviderConfig
        + create_cr(...)
    }
    LMEval --> LMEvalCRBuilder
    LMEval --> k8s_client.ApiClient
    LMEval --> k8s_client.CustomObjectsApi
    LMEval --> LMEvalEvalProviderConfig

File-Level Changes

Change: Deferred Kubernetes client and CR builder initialization
Details:
  • Removed immediate namespace resolution and client setup from the constructor
  • Introduced _ensure_k8s_initialized to lazily initialize k8s client, namespace, and CR builder
  • Added debug logging for resolved namespace and CR builder initialization
Files: src/llama_stack_provider_lmeval/lmeval.py

Change: Added initialization guards to evaluation methods
Details:
  • Inserted calls to _ensure_k8s_initialized at start of run_eval, evaluate_rows, job_status, job_cancel, and job_result
  • Added error raise if CR builder remains uninitialized
Files: src/llama_stack_provider_lmeval/lmeval.py

Change: Relaxed configuration validation for non-K8s mode
Details:
  • Removed exception preventing use_k8s from being False in post_init
Files: src/llama_stack_provider_lmeval/config.py

Change: Enabled environment variable support in run.yaml
Details:
  • Updated use_k8s flag, base_url, and namespace entries to use ${env.VAR:default} syntax
Files: run.yaml
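
To make the `${env.VAR:default}` placeholder form concrete, here is a small resolver sketch. It is an illustration only and is not the substitution code used by the stack that loads run.yaml; `resolve_placeholders` is a hypothetical helper, and raising `KeyError` when a variable is unset and has no default is an assumption made for the example.

```python
import os
import re

# Matches ${env.NAME} and ${env.NAME:default}; the default may itself contain colons.
_PLACEHOLDER = re.compile(
    r"\$\{env\.(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::(?P<default>[^}]*))?\}"
)


def resolve_placeholders(text, environ=None):
    """Replace ${env.NAME:default} placeholders in text (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    def substitute(match):
        name = match.group("name")
        default = match.group("default")
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        # Assumption for this sketch: a missing variable without a default is an error.
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _PLACEHOLDER.sub(substitute, text)
```

Note that the resolved value is always a string, so a flag such as use_k8s still needs to be converted to a boolean before a check like `isinstance(self.use_k8s, bool)` can pass.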

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `src/llama_stack_provider_lmeval/config.py:133` </location>
<code_context>
         """Validate the configuration"""
         if not isinstance(self.use_k8s, bool):
             raise LMEvalConfigError("use_k8s must be a boolean")
-        if self.use_k8s is False:
-            raise LMEvalConfigError(
-                "Only Kubernetes LMEval backend is supported at the moment"
-            )
</code_context>

<issue_to_address>
Removing the check for use_k8s disables early error reporting for unsupported backends.

This change may result in less informative error messages, as misconfigured backends will fail later at runtime instead of during configuration validation.
</issue_to_address>

### Comment 2
<location> `src/llama_stack_provider_lmeval/lmeval.py:939` </location>
<code_context>
+
+    def _ensure_k8s_initialized(self):
+        """Ensure Kubernetes client and namespace are initialized when needed."""
+        if not self.use_k8s:
+            logger.warning("Non-K8s evaluation backend is not implemented yet")
+            return
+            
</code_context>

<issue_to_address>
Logging a warning for non-K8s backend may be redundant given the subsequent NotImplementedError.

Consider removing the warning in _ensure_k8s_initialized, as NotImplementedError is raised elsewhere if use_k8s is False. Alternatively, update the warning to clarify where the error will occur.

Suggested implementation:

```python
    def _ensure_k8s_initialized(self):
        """Ensure Kubernetes client and namespace are initialized when needed."""
        if not self.use_k8s:
            return

```

If you want to clarify in the log that the error will be raised elsewhere, you could replace the warning with:
```python
logger.warning("use_k8s is False; NotImplementedError will be raised in subsequent calls.")
```
But the cleanest approach is simply to remove the warning as shown above.
</issue_to_address>

### Comment 3
<location> `src/llama_stack_provider_lmeval/lmeval.py:950` </location>
<code_context>
+            self._namespace = _resolve_namespace(self._config)
+            logger.debug("LMEval provider resolved namespace: %s", self._namespace)
+            
+        if self._cr_builder is None:
             self._cr_builder = LMEvalCRBuilder(
                 namespace=self._namespace,
</code_context>

<issue_to_address>
Raising LMEvalConfigError for missing CR builder may mask underlying initialization issues.

Consider providing a more specific error or additional diagnostics to help users identify whether the issue is with K8s client initialization or namespace resolution.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        if self._cr_builder is None:
            self._cr_builder = LMEvalCRBuilder(
                namespace=self._namespace,
                service_account=getattr(self._config, "service_account", None),
            )
            self._cr_builder._config = self._config
            logger.debug("Initialized Kubernetes client and CR builder with namespace: %s", self._namespace)
=======
        if self._cr_builder is None:
            self._cr_builder = LMEvalCRBuilder(
                namespace=self._namespace,
                service_account=getattr(self._config, "service_account", None),
            )
            self._cr_builder._config = self._config
            logger.debug("Initialized Kubernetes client and CR builder with namespace: %s", self._namespace)
            # Additional diagnostics for CR builder initialization
            if self._cr_builder is None:
                diagnostics = (
                    f"K8s client initialized: {self._k8s_client is not None}\n"
                    f"Namespace resolved: {self._namespace}\n"
                    f"Config: {self._config}\n"
                )
                raise LMEvalConfigError(
                    "Failed to initialize LMEvalCRBuilder. "
                    "This may be due to issues with Kubernetes client initialization, "
                    "namespace resolution, or CR builder construction.\n"
                    f"Diagnostics:\n{diagnostics}"
                )
>>>>>>> REPLACE

</suggested_fix>

### Comment 4
<location> `run.yaml:19` </location>
<code_context>
       provider_type: remote::trustyai_lmeval
       config:
-        use_k8s: True
+        use_k8s: ${env.TRUSTYAI_LMEVAL_USE_K8S:True}
         base_url: ${env.VLLM_URL:http://localhost:8000/v1}
         namespace: ${env.TRUSTYAI_LM_EVAL_NAMESPACE}
</code_context>

<issue_to_address>
Switching use_k8s to an environment variable increases configurability but may introduce type ambiguity.

Ensure that the environment variable is parsed to a boolean to prevent type-related issues.
</issue_to_address>

Comment on lines -133 to -134
    if self.use_k8s is False:
        raise LMEvalConfigError(

question: Removing the check for use_k8s disables early error reporting for unsupported backends.

This change may result in less informative error messages, as misconfigured backends will fail later at runtime instead of during configuration validation.

Comment on lines +939 to +940
    if not self.use_k8s:
        logger.warning("Non-K8s evaluation backend is not implemented yet")

suggestion: Logging a warning for non-K8s backend may be redundant given the subsequent NotImplementedError.

Consider removing the warning in _ensure_k8s_initialized, as NotImplementedError is raised elsewhere if use_k8s is False. Alternatively, update the warning to clarify where the error will occur.

Suggested implementation:

    def _ensure_k8s_initialized(self):
        """Ensure Kubernetes client and namespace are initialized when needed."""
        if not self.use_k8s:
            return

If you want to clarify in the log that the error will be raised elsewhere, you could replace the warning with:

logger.warning("use_k8s is False; NotImplementedError will be raised in subsequent calls.")

But the cleanest approach is simply to remove the warning as shown above.

Comment on lines 950 to 956
    if self._cr_builder is None:
        self._cr_builder = LMEvalCRBuilder(
            namespace=self._namespace,
            service_account=getattr(self._config, "service_account", None),
        )
        self._cr_builder._config = self._config
        logger.debug("Initialized Kubernetes client and CR builder with namespace: %s", self._namespace)

suggestion (bug_risk): Raising LMEvalConfigError for missing CR builder may mask underlying initialization issues.

Consider providing a more specific error or additional diagnostics to help users identify whether the issue is with K8s client initialization or namespace resolution.

Suggested change
-        if self._cr_builder is None:
-            self._cr_builder = LMEvalCRBuilder(
-                namespace=self._namespace,
-                service_account=getattr(self._config, "service_account", None),
-            )
-            self._cr_builder._config = self._config
-            logger.debug("Initialized Kubernetes client and CR builder with namespace: %s", self._namespace)
+        if self._cr_builder is None:
+            self._cr_builder = LMEvalCRBuilder(
+                namespace=self._namespace,
+                service_account=getattr(self._config, "service_account", None),
+            )
+            self._cr_builder._config = self._config
+            logger.debug("Initialized Kubernetes client and CR builder with namespace: %s", self._namespace)
+            # Additional diagnostics for CR builder initialization
+            if self._cr_builder is None:
+                diagnostics = (
+                    f"K8s client initialized: {self._k8s_client is not None}\n"
+                    f"Namespace resolved: {self._namespace}\n"
+                    f"Config: {self._config}\n"
+                )
+                raise LMEvalConfigError(
+                    "Failed to initialize LMEvalCRBuilder. "
+                    "This may be due to issues with Kubernetes client initialization, "
+                    "namespace resolution, or CR builder construction.\n"
+                    f"Diagnostics:\n{diagnostics}"
+                )
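
A Python constructor call normally either returns an object or raises, so a None check placed directly after constructing the builder is unlikely to ever fire. One alternative way to get the diagnostics this comment asks for is to run each setup step separately and report which one failed. The sketch below assumes that approach; `InitError` and `initialize_in_steps` are hypothetical names and are not part of the PR, which uses LMEvalConfigError.

```python
class InitError(Exception):
    """Hypothetical error type for this sketch; the PR itself uses LMEvalConfigError."""


def initialize_in_steps(steps):
    """Run named setup steps in order and report which one failed.

    `steps` is a list of (name, callable) pairs, for example the client setup,
    the namespace resolution, and the CR builder construction.
    """
    results = {}
    for name, step in steps:
        try:
            results[name] = step()
        except Exception as exc:
            # Name the failing step and list what had already succeeded.
            done = ", ".join(results) or "none"
            raise InitError(
                f"step '{name}' failed: {exc} (completed steps: {done})"
            ) from exc
    return results
```

Chaining with `from exc` keeps the original traceback available, so the underlying Kubernetes or namespace error is not masked by the wrapper.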

run.yaml Outdated
      provider_type: remote::trustyai_lmeval
      config:
-        use_k8s: True
+        use_k8s: ${env.TRUSTYAI_LMEVAL_USE_K8S:True}

issue (bug_risk): Switching use_k8s to an environment variable increases configurability but may introduce type ambiguity.

Ensure that the environment variable is parsed to a boolean to prevent type-related issues.
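
As a rough illustration of the parsing this comment asks for, the sketch below converts an environment-style string into a strict boolean. `parse_bool` and the accepted spellings are assumptions chosen for the example; whether the configuration loader already performs such a conversion is not stated in this thread.

```python
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def parse_bool(value):
    """Convert a bool or an environment-style string to a bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        normalized = value.strip().lower()
        if normalized in _TRUE_VALUES:
            return True
        if normalized in _FALSE_VALUES:
            return False
    # Reject anything ambiguous instead of guessing.
    raise ValueError(f"cannot interpret {value!r} as a boolean")
```

Rejecting unknown spellings avoids the common pitfall where any non-empty string, including "False", is treated as truthy.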

@nathan-weinberg nathan-weinberg left a comment

One comment but generally LGTM!

ruivieira and others added 2 commits September 9, 2025 23:03
Co-authored-by: Sébastien Han <[email protected]>
Co-authored-by: Sébastien Han <[email protected]>

@saichandrapandraju saichandrapandraju left a comment

LGTM

@ruivieira ruivieira merged commit 545f12b into trustyai-explainability:main Sep 10, 2025
6 of 7 checks passed