
Create ServiceMonitor to monitor target allocator #2416

Merged: 36 commits into open-telemetry:main, Dec 19, 2023

Conversation

@changexd (Contributor) commented Dec 5, 2023

Description:
This resolves #2311.

Link to tracking Issue:
#2311

Testing:
I've added checks to make sure the collector gets the target-allocator jobs.

Side note:
I basically followed the implementation from #1874; however, I encountered a couple of issues I wasn't sure were expected or not. If they need to be fixed, I'll open issues and am willing to fix them.

  1. The observability params in the collector are not validated, so I did not add validation either. Should we add validation for these params?

  2. The label app.kubernetes.io/instance on the target allocator's default service is $namespace.$collectorName instead of $namespace.$targetAllocatorName. Should we change this?

  3. The TA service port name "targetallocation" is mapped to container port "http", while the collector service maps "prometheus" to container port "prometheus". Is this intentional? If not, should it be fixed?

  4. After disabling observability, the manager doesn't remove the ServiceMonitor; it is only removed once the otelcol resource is deleted. This happens for both the collector and the target allocator.

@changexd changed the title from "Ta service monitor" to "Create ServiceMonitor to monitor target allocator" Dec 5, 2023
@changexd changexd marked this pull request as ready for review December 6, 2023 09:20
@changexd changexd requested review from a team, yuriolisa and pavolloffay December 6, 2023 09:20
@jaronoff97 (Contributor) left a comment:

I think this LGTM. @changexd are all of the open questions resolved?

@changexd (Contributor, Author) commented Dec 12, 2023

Not yet; there are two questions waiting for confirmation. Should I make the changes first and wait for reviews?

edit: I've made some final changes; I think it's ready for review :)

@changexd changexd requested a review from swiatekm December 12, 2023 02:28
@changexd changexd requested a review from jaronoff97 December 14, 2023 02:22
Review threads (outdated, resolved): internal/manifests/targetallocator/servicemonitor.go, kuttl-test.yaml
@swiatekm (Contributor) left a comment:

LGTM overall, but the label generation should use existing functions. I'd also like to clarify where we should do the top-level checks, though that shouldn't block this PR imo.

Review threads (outdated, resolved) on internal/manifests/targetallocator/servicemonitor.go
A Contributor commented on lines +37 to +39:

```go
if params.OtelCol.Spec.TargetAllocator.Observability.Metrics.EnableMetrics && featuregate.PrometheusOperatorIsAvailable.IsEnabled() {
	resourceFactories = append(resourceFactories, manifests.FactoryWithoutError(ServiceMonitor))
}
```

Should we create an explicit convention for where we want these checks to live @jaronoff97 @pavolloffay? My impression was that they should be in the manifest generation functions, which should return nil if there's nothing to generate. But I looked at the same logic for the collector, and there the check is also done in the builder.

I don't think this question necessarily needs to block the PR, we can change it later. But I'd like to have clarity here.

A Member replied:

+1 on being consistent and putting this knowledge into the manifest functions.

The manifest generation functions get manifests.Params, so they can make that decision and encapsulate when the objects should be generated.
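A minimal sketch of the pattern described above, where the manifest function itself encapsulates the "should this exist?" decision; all type and function names here are illustrative stand-ins, not the operator's real manifests.Params or generation code:

```go
package main

import "fmt"

// Params is an illustrative stand-in for the operator's manifests.Params;
// the real struct carries the full OpenTelemetryCollector spec.
type Params struct {
	EnableMetrics bool
}

// ServiceMonitor is an illustrative stand-in for the generated object.
type ServiceMonitor struct {
	Name string
}

// ServiceMonitorManifest encapsulates the decision itself: it returns nil
// when metrics are disabled, so a builder can call it unconditionally and
// simply skip nil results.
func ServiceMonitorManifest(params Params) *ServiceMonitor {
	if !params.EnableMetrics {
		return nil
	}
	return &ServiceMonitor{Name: "example-targetallocator"}
}

func main() {
	fmt.Println(ServiceMonitorManifest(Params{EnableMetrics: false}) == nil) // true
	fmt.Println(ServiceMonitorManifest(Params{EnableMetrics: true}).Name)    // example-targetallocator
}
```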

A Contributor replied:

I was originally thinking the check would be done in the resource's builder (like where it is now) so it's easy to understand what should be appearing. This also makes unit testing easier for each of the manifests, as they don't need to check feature gates. However, checking whether a feature is enabled before creating the manifest should indeed happen within the manifest function. I would therefore recommend the following:

  • feature gates belong in the builder for the resource
  • fields of the resource controlling whether a manifest is created (hpa, ingress, etc.) should be handled in the manifest itself

I think regardless of what we choose, we should do that in a followup.
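The split recommended above could look roughly like this; every name below is a hypothetical stand-in for the operator's actual builder and manifest code, used only to illustrate the two-level check:

```go
package main

import "fmt"

// Params bundles a feature-gate state and a resource-spec field for the
// sketch; in the operator these come from separate places.
type Params struct {
	PrometheusOperatorAvailable bool // stands in for the feature gate
	EnableMetrics               bool // field on the resource spec
}

type Manifest struct{ Kind string }

// serviceMonitor handles the resource-field check itself, returning nil
// when the spec says nothing should be created.
func serviceMonitor(p Params) *Manifest {
	if !p.EnableMetrics {
		return nil
	}
	return &Manifest{Kind: "ServiceMonitor"}
}

// build plays the resource builder's role: the feature-gate check lives
// here, so unit tests of individual manifest functions never touch gates.
func build(p Params) []Manifest {
	var out []Manifest
	if p.PrometheusOperatorAvailable {
		if m := serviceMonitor(p); m != nil {
			out = append(out, *m)
		}
	}
	return out
}

func main() {
	fmt.Println(len(build(Params{PrometheusOperatorAvailable: true, EnableMetrics: true})))  // 1
	fmt.Println(len(build(Params{PrometheusOperatorAvailable: false, EnableMetrics: true}))) // 0
}
```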

@jaronoff97 (Contributor):

@changexd looks like the approvals are in good shape. When you get a chance, could you resolve the conflicts so we can get this merged before the next release?

@changexd (Contributor, Author) commented Dec 19, 2023

@jaronoff97 I've resolved the conflicts in the code. As for api.md, I used make api-docs; is this okay, since it also updates APIs unrelated to this PR, or should I resolve it manually?

@swiatekm swiatekm self-requested a review December 19, 2023 11:22
@jaronoff97 (Contributor):

@changexd that's actually fine; it looks like we missed some api-docs changes in a recent version upgrade. I ran it off of main and there was a diff, so this resolves that 😅

@jaronoff97 jaronoff97 merged commit 661e6b3 into open-telemetry:main Dec 19, 2023
27 checks passed
@jaronoff97 (Contributor):

thank you for your contribution :D 🎉

@changexd changexd deleted the ta-service-monitor branch December 20, 2023 02:36
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
* add create service monitor to targetallocator

* add chlog

* add e2e test to ta service monitor

* fix linting

* make bundle

* fix chlog

* add a condition to avoid creating sm when running in unsupported modes

* update bundle

* move ta sm ports into ta definition

* remove changes to config/manager

* assert ta service monitor creation in e2e

* enable prometheus feature in e2e test

* change description of observability field

* move ta observability check to build

* add target allocator to builder test

* fix lint

* fix ta sm manifest test

* remove changes to manager kustomization

* make ta sm return directly

* move ta service monitor test to dedicated tests

* remove commands in e2e

* use function to generate label for sm

* ta sm labels in builder test

* fix api.md

* fix bundling

* make api-docs again

* fix kustomization manager

* fix ta service account in test according to fixed issues
Linked issue that this PR may close: [target allocator] Create ServiceMonitor to monitor target allocator (#2311)