Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
6 changes: 4 additions & 2 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -14,15 +14,17 @@ require (
github.com/openshift/client-go v0.0.0-20210112160336-8889f8b15bd6
github.com/openshift/library-go v0.0.0-20210330121117-68dd4a4c9d9e
github.com/pkg/profile v1.4.0 // indirect
github.com/spf13/cobra v1.1.1
github.com/spf13/cobra v1.1.3
github.com/spf13/pflag v1.0.5
gopkg.in/yaml.v2 v2.4.0
helm.sh/helm/v3 v3.6.3
k8s.io/api v0.21.1
k8s.io/apiextensions-apiserver v0.21.0 // indirect
k8s.io/apimachinery v0.21.1
k8s.io/cli-runtime v0.21.0
k8s.io/client-go v0.21.0
k8s.io/component-base v0.21.0
k8s.io/klog/v2 v2.8.0
rsc.io/letsencrypt v0.0.3 // indirect
)

// points to temporary-watch-reduction-patch-1.21 to pick up k/k/pull/101102 - please remove it once the pr merges and a new Z release is cut
Expand Down
409 changes: 401 additions & 8 deletions go.sum

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions manifests/03-rbac-role-cluster.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,14 @@ rules:
- create
- update
- delete
- apiGroups:
- ""
resources:
- secrets

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the operator need to be able to fetch any secret from the cluster? Looks like a potential security issue. If we need we should rather create a Role for a specific namespace. But on the other hand I dont see any usage in the added code, where we are fetching any secret for the metrics.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helm list --all-namespaces action from metrics.go requires this particular permission. If I remove this it would produce:

metrics.go:40] metric helm_chart_release_health_status unhandled: list: failed to list: secrets is forbidden: User "system:serviceaccount:openshift-console-operator:console-operator" cannot list resource "secrets" in API group "" at the cluster scope

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty concerning because console-operator is effectively running as root on the cluster with this change. Is there any way to add the metrics without needing this permission?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a hard no. We won't escalate the console operator to have read on Secrets cluster-wide.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand, we are looking for alternative options to mitigate. It won't be a full solution if we have to use the users identity on the console side, but that is what we will pivot towards. For future reference @spadgett and @jwforres, If we had a way to filter Allow Rules by Resource Type like for example "type": "helm.sh/release.v1", would that work to reduce the access escalation? Trying to figure out if I propose a change in Kubernetes RBAC upstream on this.

verbs:
- get
- watch
- list
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
3 changes: 3 additions & 0 deletions pkg/console/operator/sync_v400.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ import (
"github.com/openshift/library-go/pkg/route/routeapihelpers"

// operator
helmmetrics "github.com/openshift/console-operator/pkg/helm/metrics"

customerrors "github.com/openshift/console-operator/pkg/console/errors"
"github.com/openshift/console-operator/pkg/console/metrics"
"github.com/openshift/console-operator/pkg/console/status"
Expand Down Expand Up @@ -218,6 +220,7 @@ func (co *consoleOperator) GetActiveRouteInfo(ctx context.Context, activeRouteNa
func (co *consoleOperator) SyncConsoleConfig(ctx context.Context, consoleConfig *configv1.Console, consoleURL string) (*configv1.Console, error) {
oldURL := consoleConfig.Status.ConsoleURL
metrics.HandleConsoleURL(oldURL, consoleURL)
helmmetrics.HandleHelmChartReleaseHealthStatus()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we call the helmmetrics.HandleHelmChartReleaseHealthStatus() inside the SyncConsoleConfig method? Its not using any of its arguments.
I would suggest to eight put it directly to the main sync_v400 method or create a new controller.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will move it to the main sync_v400. The original intent was to piggyback on the console_url metric. Also willing to add a new controller but that controller will only consist of this one line helmmetrics.HandleHelmChartReleaseHealthStatus().

if oldURL != consoleURL {
klog.V(4).Infof("updating console.config.openshift.io with url: %v", consoleURL)
updated := consoleConfig.DeepCopy()
Expand Down
90 changes: 90 additions & 0 deletions pkg/helm/metrics/metrics.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
package metrics

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets put this into the pkg/helm/metrics/helm_metrics.go so we can reuse functions like recoverMetricPanic() instead of duplicating them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to pkg/console/metrics/helm_metrics.go


import (
"log"
"os"

"helm.sh/helm/v3/pkg/action"
"k8s.io/cli-runtime/pkg/genericclioptions"
"k8s.io/client-go/rest"
k8smetrics "k8s.io/component-base/metrics"
"k8s.io/component-base/metrics/legacyregistry"
"k8s.io/klog/v2"
)

var (
helmChartReleaseHealthStatus = k8smetrics.NewGaugeVec(
&k8smetrics.GaugeOpts{
Name: "helm_chart_release_health_status",
Help: "Health of the Helm release",
},
[]string{"releaseName", "chartName", "chartVersion"},
)
)

func init() {
legacyregistry.MustRegister(helmChartReleaseHealthStatus)
}

func HandleHelmChartReleaseHealthStatus() {
defer recoverMetricPanic()

actionConfig, err := getActionConfig()
if err != nil {
klog.Errorf("metric helm_chart_release_health_status unhandled: %v", err)
return
}
listAction := action.NewList(actionConfig)
releases, err := listAction.Run()
if err != nil {
klog.Errorf("metric helm_chart_release_health_status unhandled: %v", err)
return
}

if len(releases) == 0 {
// Initialize metrics with value 0
// Reference: https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics
helmChartReleaseHealthStatus.WithLabelValues("", "", "").Set(0)
return
}

for _, release := range releases {
releaseStatus := release.Info.Status.String()
healthStatus := 1
if releaseStatus == "failed" || releaseStatus == "unknown" {
healthStatus = 0
}
klog.V(4).Infof("metric helm_chart_release_health_status %d: %s %s %s", healthStatus, release.Name, release.Chart.Metadata.Name, release.Chart.Metadata.Version)
helmChartReleaseHealthStatus.WithLabelValues(release.Name, release.Chart.Metadata.Name, release.Chart.Metadata.Version).Set(float64(healthStatus))
}
}

// Reference: https://github.com/helm/helm/issues/7430#issuecomment-620489002
func getActionConfig() (*action.Configuration, error) {
actionConfig := new(action.Configuration)
var kubeConfig *genericclioptions.ConfigFlags

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont think kubeconfig is the right name

Suggested change
var kubeConfig *genericclioptions.ConfigFlags
var configFlags *genericclioptions.ConfigFlags

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also why do we need to initialize the configFlags, since on line in 72 you are setting it ? cant we jusst set it directly ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated :)

// Create the rest config instance with ServiceAccount values loaded in them
config, err := rest.InClusterConfig()
if err != nil {
return nil, err
}
// Create the ConfigFlags struct instance with initialized values from ServiceAccount
kubeConfig = genericclioptions.NewConfigFlags(false)
kubeConfig.APIServer = &config.Host
kubeConfig.BearerToken = &config.BearerToken
kubeConfig.CAFile = &config.CAFile
// Empty string for all namespaces
if err := actionConfig.Init(kubeConfig, "", os.Getenv("HELM_DRIVER"), log.Printf); err != nil {
return nil, err
}
return actionConfig, nil
}

// We will never want to panic our operator because of metric saving.
// Therefore, we will recover our panics here and error log them
// for later diagnosis but will never fail the operator.
func recoverMetricPanic() {
if r := recover(); r != nil {
klog.Errorf("Recovering from metric function - %v", r)
}
}
9 changes: 8 additions & 1 deletion test/e2e/metrics_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,8 @@ import (
)

const (
consoleURLMetric = "console_url"
consoleURLMetric = "console_url"
helmChartReleaseHealthStatusMetric = "helm_chart_release_health_status"
)

func setupMetricsEndpointTestCase(t *testing.T) (*framework.ClientSet, *operatorsv1.Console) {
Expand Down Expand Up @@ -66,6 +67,12 @@ func TestMetricsEndpoint(t *testing.T) {
if !found {
t.Fatalf("did not find %s", consoleURLMetric)
}

t.Logf("searching for %s in metrics data...\n", helmChartReleaseHealthStatusMetric)
found = findLineInResponse(t, respString, helmChartReleaseHealthStatusMetric)
if !found {
t.Fatalf("did not find %s", helmChartReleaseHealthStatusMetric)
}
}

func findLineInResponse(t *testing.T, haystack, needle string) (found bool) {
Expand Down
21 changes: 21 additions & 0 deletions vendor/github.com/Azure/go-ansiterm/LICENSE

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

188 changes: 188 additions & 0 deletions vendor/github.com/Azure/go-ansiterm/constants.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading