Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Operator #2388

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Add Operator #2388

wants to merge 4 commits into from

Conversation

damemi
Copy link
Contributor

@damemi damemi commented Feb 4, 2025

This adds a very basic operator for installing, uninstalling, and upgrading Odigos. It is based on the same functions used by the CLI, which should be refactored into a common directory at some point in the future.

Important files:

  • operator/api/v1alpha1/odigos_types.go: This creates the operator.odigos.io self-contained API. The CRD in this file represents an installation of Odigos, and supports most of the current config fields. This API can be versioned separately from the main Odigos API fairly easily because it is self-contained, allowing us to support future breaking changes with conversion webhooks.
  • operator/internal/controller/odigos_controller.go: This is the main reconciler for Odigos CRs. It handles new installations (by creating a CR), modifying an existing installation (modifying the CR), uninstalling (deleting the CR), or upgrades (by upgrading the operator, see below)

Almost everything else is generated by operator-sdk and shouldn't need much, if any review.

Usage

The Operator is installed in any namespace the user wants (such as default or odigos-operator-system). The Operator then watches only its own namespace for Odigos objects.

When an Odigos object is created, the Operator will install Odigos in its own namespace. So the operator and Odigos run in the same namespace. Only one Odigos object can exist.

If the Odigos object is modified, the Operator will re-run the install, patching any changes.

If the Odigos object is deleted, the Operator will uninstall Odigos. The Odigos object uses a finalizer to allow resources to be cleaned up.

OpenShift's Operator Lifecycle Manager will handle upgrading the Operator pod from one version to the next. When the new operator pod starts, it will pick up the existing Odigos config and attempt an upgrade.

Important notes

The version of Odigos is pinned by the VERSION variable, defined in operator/Makefile. This uses kustomize to define a configmap that the operator uses to know what version of Odigos to pull.

Because the operator code depends on the current CLI code, the operator should stay pinned to the same version as Odigos for now. This means that upgrades can be done by simply upgrading the operator, which will find the existing Odigos config, and perform the same upgrade logic as the upgrade cli command.

The operator does depend on a broad range of RBAC rules. It requires every permission that Odigos has (to be able to grant Odigos those permissions), plus the ability to create and delete all of Odigos's resources (deployments, serviceaccounts, etc). We might be able to scope the latter permissions to just the operator's namespace.

Screenshots

Here's how it looks to install with OpenShift:

Odigos listed in Operators view

Screenshot 2025-02-07 at 3 39 29 PM

Odigos Operator config page

Screenshot 2025-02-07 at 3 39 39 PM

Creating an Odigos object (installing Odigos)

Screenshot 2025-02-07 at 3 40 05 PM

Odigos object status post-install/upgrade

Screenshot 2025-02-07 at 3 40 30 PM

@damemi damemi force-pushed the operator-bundle branch 5 times, most recently from fdbd0e1 to f7bd445 Compare February 7, 2025 20:24
@damemi damemi marked this pull request as ready for review February 7, 2025 20:45
Copy link
Collaborator

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks really useful.

Didn't have the chance to run it yet, and didn't dive into the bundle stuff, but wrote few nits / minors.

Eventually, we would also want to add it to docs here, perhaps we can break them into 3 separate pages since the list is getting longer? anyway we can also do that later

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
Metrics: metricsServerOptions,
WebhookServer: webhookServer,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need webhook server here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is scaffolded code from kubebuilder/operator-sdk, I don't think we are technically using it right now. But even if we don't set it, controller-runtime still sets it to a default empty Server anyway

I'd rather not modify the default scaffolded code without really understanding it, and this seems like something that could cause headaches if we do add a webhook later on

Comment on lines 57 to 68
// APIKey is an API Key for Odigos Cloud
// +operator-sdk:csv:customresourcedefinitions:type=spec
APIKey string `json:"apiKey,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the usage in odigos install too? (not in this pr, just asking since I copied it from there):

if odigosCloudApiKeyFlag != "" {
odigosTier = common.CloudOdigosTier
odigosProToken = odigosCloudApiKeyFlag
err = verifyOdigosCloudApiKey(odigosCloudApiKeyFlag)
if err != nil {
fmt.Println("Odigos install failed - invalid api-key format.")
os.Exit(1)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have this one:
https://github.com/odigos-io/odigos/blob/main/.golangci.yml

Do we need a specific lint configuration only for the operator?

ConfigVersion: 1,
}
} else {
odigosConfig = *config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The objects from kubebuilder usually have this deepCopy functions which I see are used for this. Do we have this here also?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have that for comon.OdigosConfiguration, looks like it's just a struct, not generated to a runtime object anywhere?

type OdigosConfiguration struct {

odigosConfig.OdigletImage = k8sconsts.OdigletImageUBI9
odigosConfig.InstrumentorImage = k8sconsts.InstrumentorImageUBI9
odigosConfig.AutoscalerImage = k8sconsts.AutoScalerImageUBI9
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do schedualer collector etc get's pulled from redhat here if not set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They parse it in their own cli install functions based on the odigos config:

imageName := k8sconsts.SchedulerImage
if a.config.OpenshiftEnabled {
imageName = k8sconsts.SchedulerImageUBI9
}

since we want to remove odiglet/instrumentor/autoscaler flags anyway, we should probably update those components to just check the config too

@@ -0,0 +1,22 @@
# This kustomization.yaml is not intended to be run by itself,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these kubsomize things relevant to odigos? do we need to keep them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this is how the ready-to-deploy operator manifests are built and packaged into the bundle with operator-sdk/kubebuilder

@damemi damemi force-pushed the operator-bundle branch 5 times, most recently from 3151416 to 45bb49c Compare February 24, 2025 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants