
kserve: configure servicemesh before deploying manifests #1019

Merged 2 commits on May 24, 2024

Conversation

Contributor

@ykaliuta commented May 21, 2024

Jira: https://issues.redhat.com/browse/RHOAIENG-7312

kserve depends on odh-model-controller, which it starts by deploying
the Dependent Operator manifests. The controller's behaviour depends on
configuration (authorino) that is deployed later, when servicemesh
features are configured. This is a race: there are two checks in
odh-model-controller for the presence of AuthorizationPolicy (which
is deployed by the servicemesh configuration):

  1. to add the type to the scheme
  2. to watch objects of that type.

If the object appears between the two checks, odh-model-controller complains:

```
2024-05-16T06:46:03Z ERROR Reconciler error {"controller": "inferenceservice", "controllerGroup": "serving.kserve.io", "controllerKind": "InferenceService", "InferenceService": {"name":"xf","namespace":"single-model-test"}, "namespace": "single-model-test", "name": "xf", "reconcileID": "e6f42f44-1866-45d4-836a-69e4e93edef4", "error": "1 error occurred:\n\t* could not GET authconfig single-model-test/xf. cause no kind is registered for the type v1beta2.AuthConfig in scheme \"pkg/runtime/scheme.go:100\"\n\n"}   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
```
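The two checks can be replayed with a toy, stdlib-only sketch (the `scheme` type, `raceDemo`, and the flag names are illustrative stand-ins, not the real controller-runtime types): the kind is registered only if AuthorizationPolicy exists at startup, but the object is used whenever it exists at reconcile time.

```go
package main

import "fmt"

// Toy stand-in for runtime.Scheme: a kind must be registered before
// objects of that kind can be decoded or fetched.
type scheme map[string]bool

func (s scheme) get(kind string) error {
	if !s[kind] {
		return fmt.Errorf("no kind is registered for the type %s in scheme", kind)
	}
	return nil
}

// raceDemo replays the two checks: at startup the AuthorizationPolicy
// is absent, so AuthConfig is never registered; by reconcile time the
// policy exists, so the controller tries to use the unregistered kind.
func raceDemo() error {
	s := scheme{}

	policyExists := false // check 1, at startup: not deployed yet
	if policyExists {
		s["v1beta2.AuthConfig"] = true
	}

	policyExists = true // servicemesh configuration lands in between

	if policyExists { // check 2, during reconcile
		return s.get("v1beta2.AuthConfig")
	}
	return nil
}

func main() {
	fmt.Println("reconcile error:", raceDemo())
}
```

Moving the servicemesh configuration earlier shrinks the window between the two checks but cannot eliminate it, which is why the CRD-based check mentioned below is the complete fix.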

Move servicemesh configuration before deploying the manifests to
narrow the race window.

odh-model-controller will change its check to look for the Authorino
CRD instead, which avoids the race completely.

Checking for AuthorizationPolicy existence in operator/kserve would fix
it as well, but that would delay the reconcile loop (whereas in
odh-model-controller the check is done only on startup). Since
odh-model-controller is going to reimplement the check anyway, keep it
as it is.

Also, the modelmeshserving component can deploy odh-model-controller
if it is enabled (thanks to Vedant for pointing this out). The order is
unspecified, but due to the implementation it will happen before the
kserve configuration (the order of the fields in the Components
structure).
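The "order of the fields" point can be sketched with a minimal, stdlib-only example (the `Components` struct here is a hypothetical stand-in; the real field set lives in the opendatahub-operator API types): reflection visits struct fields in declaration order, so iterating over the components yields a deterministic sequence even though no order is formally specified.

```go
package main

import (
	"fmt"
	"reflect"
)

// Illustrative stand-in for the DSC Components struct.
type Components struct {
	ModelMeshServing struct{}
	Kserve           struct{}
}

// reconcileOrder shows why the order, while formally unspecified, is
// deterministic in practice: reflect visits struct fields in
// declaration order, so ModelMeshServing is handled before Kserve.
func reconcileOrder() []string {
	t := reflect.TypeOf(Components{})
	order := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		order = append(order, t.Field(i).Name)
	}
	return order
}

func main() {
	fmt.Println(reconcileOrder())
}
```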

Signed-off-by: Yauheni Kaliuta <[email protected]>

How Has This Been Tested?

  • deploy operator
  • create DSCI and DSC with kserve enabled
  • check for errors in opendatahub-operator-controller-manager and odh-model-controller log

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@VedantMahabaleshwarkar
Contributor

@ykaliuta we would still have a race condition due to modelmesh even with this PR, right?
e.g:

  • Modelmesh and Kserve are both Managed
  • Modelmesh reconciliation is still independent of servicemesh features, so the odh-model-controller rollout might still complete before the kserve+dependency setup is complete.
  • Here odh-model-controller would still run into the same error.

@israel-hdez can correct me if I'm wrong, but in addition to these changes I believe we also wanted to restart the odh-model-controller pods when the management state for kserve is modified. This way even if odh-model-controller exists in an incorrect state (due to modelmesh), it will be corrected when the pods restart

@ykaliuta
Contributor Author

@ykaliuta we would still have a race condition due to modelmesh even with this PR, right? e.g:

  • Modelmesh and Kserve are both Managed
  • Modelmesh reconciliation is still independent of servicemesh features, so the odh-model-controller rollout might still complete before the kserve+dependency setup is complete.
  • Here odh-model-controller would still run into the same error.

Correct, but odh-model-controller will implement another check; I mentioned it in the description.

@israel-hdez can correct me if I'm wrong, but in addition to these changes I believe we also wanted to restart the odh-model-controller pods when the management state for kserve is modified. This way even if odh-model-controller exists in an incorrect state (due to modelmesh), it will be corrected when the pods restart

You are right, but changing the kserve management state is out of scope for this issue. If the restart is supposed to happen when the state changes, that is not the case here: the state stays the same.

@israel-hdez
Contributor

israel-hdez commented May 22, 2024

@ykaliuta we would still have a race condition due to modelmesh even with this PR, right? e.g:

  • Modelmesh and Kserve are both Managed
  • Modelmesh reconciliation is still independent of servicemesh features, so the odh-model-controller rollout might still complete before the kserve+dependency setup is complete.
  • Here odh-model-controller would still run into the same error.

Correct, but odh-model-controller will implement another check; I mentioned it in the description.

I think part of the problem is that both the DSCInitialization and the DataScienceCluster resources can reconcile in parallel.

Because of ModelMesh, the odh-model-controller would still boot, while the reconciliation of the DSCInitialization is still installing OSSM and Authorino. Thus, there is still the chance of a race condition if the Authorino CRD hasn't been created yet.

So, although the changes to odh-model-controller and this PR fix the race condition over the AuthorizationPolicy, we still have another race condition over Authorino CRD creation.

@israel-hdez can correct me if I'm wrong, but in addition to these changes I believe we also wanted to restart the odh-model-controller pods when the management state for kserve is modified. This way even if odh-model-controller exists in an incorrect state (due to modelmesh), it will be corrected when the pods restart

You are right, but changing the kserve management state is out of scope for this issue. If the restart is supposed to happen when the state changes, that is not the case here: the state stays the same.

I'm not sure if it is really out of scope. Users would be unable to query models in such a situation, and the title of the Jira would still describe the issue.

@ykaliuta
Contributor Author

ykaliuta commented May 22, 2024

@ykaliuta we would still have a race condition due to modelmesh even with this PR, right? e.g:

  • Modelmesh and Kserve are both Managed
  • Modelmesh reconciliation is still independent of servicemesh features, so the odh-model-controller rollout might still complete before the kserve+dependency setup is complete.
  • Here odh-model-controller would still run into the same error.

Correct, but odh-model-controller will implement another check; I mentioned it in the description.

I think part of the problem is that both the DSCInitialization and the DataScienceCluster resources can reconcile in parallel.

Because of ModelMesh, the odh-model-controller would still boot, while the reconciliation of the DSCInitialization is still installing OSSM and Authorino. Thus, there is still the chance of a race condition if the Authorino CRD hasn't been created yet.

I do not see a problem here; correct me if I'm wrong. The AuthConfig CRD (not the CR) is created when the Authorino operator is installed (which is supposed to be done manually, before ODH/RHOAI, as a prerequisite). Regardless of the CRs, odh-model-controller will work (it will be able to watch them); it will just do nothing until the operator deploys the configuration, if any.

The CRD stays there if Authorino is uninstalled later, but the operator checks for the subscription before deploying the configuration.

So, although the changes to odh-model-controller and this PR fix the race condition over the AuthorizationPolicy, we still have another race condition over Authorino CRD creation.

@israel-hdez can correct me if I'm wrong, but in addition to these changes I believe we also wanted to restart the odh-model-controller pods when the management state for kserve is modified. This way even if odh-model-controller exists in an incorrect state (due to modelmesh), it will be corrected when the pods restart

You are right, but changing the kserve management state is out of scope for this issue. If the restart is supposed to happen when the state changes, that is not the case here: the state stays the same.

I'm not sure if it is really out of scope. Users would be unable to query models in such a situation, and the title of the Jira would still describe the issue.

Maybe the Jira should be renamed, because its description covers just the race during a fresh install, not changing the state afterwards.

@israel-hdez
Contributor

I do not see a problem here; correct me if I'm wrong. The AuthConfig CRD (not the CR) is created when the Authorino operator is installed (which is supposed to be done manually, before ODH/RHOAI, as a prerequisite).

You are right. I stand corrected.

I'm not sure if it is really out of scope. Users would be unable to query models in such a situation, and the title of the Jira would still describe the issue.

Maybe the Jira should be renamed, because its description covers just the race during a fresh install, not changing the state afterwards.

In the comments, it was identified as something that needs to be fixed.
I'll let @Jooho, @VaishnaviHire, and @mwaykole comment on whether such a situation should also be fixed here, or whether another ticket needs to be created.

@israel-hdez
Contributor

israel-hdez commented May 22, 2024

cc @mattmahoneyrh, because he is the QA contact (see the previous comments)

Contributor

@bartoszmajsak left a comment


It's better to first set up infra bits required for this component before deploying it.

I left one small suggestion around error handling, but other than that LGTM.

components/kserve/kserve.go (outdated; resolved)
To avoid a linter report:

components/kserve/kserve.go:97:1: cyclomatic complexity 31 of func `(*Kserve).ReconcileComponent` is high (> 30) (gocyclo)

move the setupKserveConfig() call under the "if enabled" branch below.

Signed-off-by: Yauheni Kaliuta <[email protected]>
Contributor

@bartoszmajsak left a comment


LGTM

@openshift-ci openshift-ci bot added the lgtm label May 23, 2024
@ykaliuta
Contributor Author

/retest-required

@ykaliuta ykaliuta requested review from zdtsw and VaishnaviHire and removed request for rareddy May 23, 2024 15:07

openshift-ci bot commented May 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bartoszmajsak, zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit df469bb into opendatahub-io:incubation May 24, 2024
8 checks passed
israel-hdez added a commit to israel-hdez/odh-model-controller that referenced this pull request May 30, 2024
A race condition was found that leads to odh-model-controller starting in a state not suitable for KServe, even though KServe is installed with authorization enabled.

This mitigates such a condition by:
* Always adding AuthConfigs to the scheme. Adding types to the scheme doesn't seem to have any bad effects, even if such types do not exist in the cluster. This prevents a "no kind is registered" error when trying to reconcile an InferenceService if the Authorino setup finished after odh-model-controller booted.
* Invoking `Owns` in the inferenceservice_controller.go setup based on the existence of the AuthConfig CRD in the cluster, rather than based on the existence of a specific AuthorizationPolicy.

These changes, together with opendatahub-io/opendatahub-operator#1019, should mitigate/fix the race condition, and odh-model-controller should properly start in a good state suitable for KServe.

Related to https://issues.redhat.com/browse/RHOAIENG-7312

Signed-off-by: Edgar Hernández <[email protected]>
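The CRD-based decision described in that commit can be sketched as a one-time startup choice (a minimal stdlib-only sketch; `setupWatches` and the CRD-name map are illustrative, the real controller uses a discovery/API lookup against the cluster):

```go
package main

import "fmt"

// setupWatches decides once, at startup, which kinds to watch. The fix
// keys the decision off the AuthConfig CRD (installed together with the
// Authorino operator, before ODH/RHOAI) instead of an
// AuthorizationPolicy CR that may only appear mid-reconciliation.
func setupWatches(installedCRDs map[string]bool) []string {
	watches := []string{"InferenceService"}
	if installedCRDs["authconfigs.authorino.kuadrant.io"] {
		watches = append(watches, "AuthConfig")
	}
	return watches
}

func main() {
	crds := map[string]bool{"authconfigs.authorino.kuadrant.io": true}
	fmt.Println(setupWatches(crds))
}
```

Because the CRD exists before ODH/RHOAI is installed and survives later changes, this startup check does not race against the servicemesh configuration.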
VaishnaviHire pushed a commit to VaishnaviHire/opendatahub-operator that referenced this pull request Jun 18, 2024
zdtsw added a commit to red-hat-data-services/rhods-operator that referenced this pull request Jun 19, 2024
* feat: increase QPS and Burst for client (opendatahub-io#1031)

- we might see throttling in some clusters; this is just to uplift the
default value

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 54ee87d)

* kserve: configure servicemesh before deploying manifests (opendatahub-io#1019)


* chore: adds godoc to Feature builder (opendatahub-io#1013)

As the first step of improving the Feature DSL, this PR brings godoc explaining the purpose of each method in the builder chain.

(cherry picked from commit 5ce6306)

* fix: wrong path when use devFlag + wrong default value + special name in (opendatahub-io#1024)

trustyai

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit d9e78b4)

* chore: moves operator/subscriptions operations to pkg/cluster (opendatahub-io#1027)

They do not belong to pkg/deploy as they are about reading/writing cluster resources rather than deploying resources.

(cherry picked from commit d90983b)

* chore: Open up util functions for context propagation (opendatahub-io#1033)

context should be determined by the caller and propagated
down the call chain.

(cherry picked from commit 105adae)

* fix: missing label "opendatahub.io/generated-namespace" on auth (opendatahub-io#1038)

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit d6b108b)

* chore: append ownerRef to resources owned by Features (opendatahub-io#1039)

* rename / chg FeatureOwner func

* remove unused old ownerref func

(cherry picked from commit 6583645)

* Revert "chore: append ownerRef to resources owned by Features (opendatahub-io#1039)"

This reverts commit 6583645.

opendatahub-io#1039 (comment)

Signed-off-by: Yauheni Kaliuta <[email protected]>
(cherry picked from commit 952795a)

* Update Owners-aliases list (opendatahub-io#1040)

(cherry picked from commit cc9aecb)

* RHOAIENG-5426: Updated pull request template with prerequisites (opendatahub-io#1042)

* RHOAIENG-5426: Updated pull request template with prerequisites

* PR template changes

(cherry picked from commit 244ca13)

* chore: renames manifests source to location (opendatahub-io#1050)

- Source is already used elsewhere in Feature DSL, so we might want to
  avoid confusion
- Location fits the purpose of this field better

(cherry picked from commit 8921839)

* Fix trustyai changes

* cluster: GetPlatform: replace CSV list with OperatorExists calls (opendatahub-io#1051)

* tests: envtest: Add OperatorCondition CRD

Add OLM's[1] OperatorCondition/OperatorConditionList to the external
CRDs. Will be used in future patches.

[1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml

Signed-off-by: Yauheni Kaliuta <[email protected]>

* tests: dscinitialization: add OperatorCondition CRD to schema

Following patches will change initialization to list the objects.

Signed-off-by: Yauheni Kaliuta <[email protected]>

* cluster: GetPlatform: replace CSV list with OperatorExists calls

Jira: https://issues.redhat.com/browse/RHOAIENG-8483

Depending on the way operators are installed, CSVs can be seen in all
namespaces, so listing them produces N * M results (where N is the
number of namespaces and M is the number of CSVs), which is not
scalable in general and in practice causes timeouts on large clusters.

The function basically checks whether the ODH or RHOAI operator is
installed, and there is already such a function in the package,
OperatorExists(). So, reuse it.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
(cherry picked from commit 261bbab)
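The N * M cost described in that commit message comes from OLM copying a CSV into every watched namespace for AllNamespaces-mode operators, so a cluster-wide list returns one entry per namespace per operator. A tiny sketch (the cluster sizes are hypothetical, chosen only to illustrate the scale):

```go
package main

import "fmt"

// listedCSVs approximates how many items a cluster-wide CSV list
// returns when each operator's CSV is copied into every namespace.
func listedCSVs(namespaces, operators int) int {
	return namespaces * operators
}

func main() {
	// Hypothetical large cluster: 500 namespaces, 40 operators.
	fmt.Println(listedCSVs(500, 40)) // 20000 list items
}
```

A single existence check per operator avoids the list entirely, which is why reusing OperatorExists() scales where the CSV listing did not.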

* chore: remove duplicated platform call in each component (opendatahub-io#1055)

- get in DSC and pass into compoment

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 1b04761)

* chore: update toolbox sdk version and remove duplicated addtoschema (opendatahub-io#1061)

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 1b23c9f)

* chore: funcs to create kustomize plugins (opendatahub-io#1062)

The actual use with resMap is inlined and moved to the caller.

This way we can construct plugins on demand and use them as building blocks instead.

(cherry picked from commit 7bf56a4)

* Fix linter errors

* Update scheme

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
Co-authored-by: Wen Zhou <[email protected]>
Co-authored-by: Yauheni Kaliuta <[email protected]>
Co-authored-by: Bartosz Majsak <[email protected]>
Co-authored-by: Aslak Knutsen <[email protected]>
Co-authored-by: Cameron Garrison <[email protected]>
Co-authored-by: Saravana Balaji Srinivasan <[email protected]>