VMAlertmanager permission problems with persistent storage configuration #762

Closed
Munsio opened this issue Sep 20, 2023 · 4 comments
Labels
bug Something isn't working


Munsio commented Sep 20, 2023

Describe the bug

When deploying through the operator, a VMAlertmanager custom resource can define spec.storage.volumeClaimTemplate for persistent storage of Alertmanager's logs and silences.

Unfortunately, the directory is mounted as owned by the root user while the pod runs as nobody, so with this configuration Alertmanager is not able to persist its data.

I know this is only applicable when not using multiple replicas of Alertmanager.

I was able to work around it with an initContainer that chowns the mounted directory before Alertmanager starts. Maybe this could be added to the documentation?

To Reproduce

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanager
metadata:
  name: example-alertmanager
spec:
  replicaCount: 1
  selectAllByDefault: true
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Mi
        storageClassName: ceph-rbd-sc
  extraArgs:
    data.maintenance-interval: 5m
  # Workaround: chown the mounted volume to nobody (65534:65534) before Alertmanager starts.
  initContainers:
    - name: volume-mount-hack
      image: busybox
      command: ['sh', '-c', 'chown -R 65534:65534 /alertmanager']
      volumeMounts:
        - name: vmalertmanager-example-alertmanager-db
          mountPath: /alertmanager

Version

victoria-metrics-operator-0.27.0 chart with version 0.38.0

Logs

No response

Screenshots

No response

Used command-line flags

No response

Additional information

No response

@Munsio Munsio added the bug Something isn't working label Sep 20, 2023
Contributor

Haleygo commented Sep 21, 2023

Hello! @Munsio
v0.38.0 should contain the fix. Did you update the operator after creating the vmalertmanager?

I'll also transfer this issue to the operator repo, since it's about the operator :)

Update: you're right, it's a different issue. I'll come up with a fix soon, thanks!

@Haleygo Haleygo transferred this issue from VictoriaMetrics/VictoriaMetrics Sep 21, 2023
@Haleygo Haleygo self-assigned this Sep 21, 2023
Contributor

Haleygo commented Sep 21, 2023

VMAlertmanager actually uses the alertmanager image from prom/alertmanager, which has used nobody as its default user for a long time [maybe VM images should follow the same pattern and not run as root by default].
So whenever spec.storage.volumeClaimTemplate is set, users will hit this error. It can be fixed in three ways:

  1. add initContainers like @Munsio did
  2. enable VM_ENABLESTRICTSECURITY in the operator
  3. specify a securityContext on the pod or container
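
For illustration, option 3 could look roughly like the sketch below. It assumes the VMAlertmanager spec accepts a pod-level `securityContext` and that the storage driver honors `fsGroup`; the UID/GID 65534 is taken from the initContainer workaround in the original report, and the rest of the manifest is trimmed from that report. Treat it as a starting point, not a verified configuration.

```yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanager
metadata:
  name: example-alertmanager
spec:
  replicaCount: 1
  # Assumption: pod-level securityContext on the VMAlertmanager spec.
  # fsGroup asks the kubelet to make the mounted volume accessible to
  # GID 65534 (nobody); whether it takes effect depends on the volume driver.
  securityContext:
    runAsUser: 65534
    runAsGroup: 65534
    fsGroup: 65534
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Mi
```

With this in place the chown initContainer should no longer be needed, provided the volume plugin applies `fsGroup` ownership on mount.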

The Prometheus operator doesn't handle or document this, and its charts set a default securityContext using user 2000.
We can either document this or add some code that applies the needed securityContext when an extra volumeClaimTemplate is used. I prefer the second option for a better user experience, but it might break some users' existing vmalertmanager if they have already given root permissions to the volume.
So I will just add some notes for the field. Do you have other opinions on this? @f41gh7 @Amper

Collaborator

f41gh7 commented Sep 26, 2023

> We can either document this or add some code that applies the needed securityContext when an extra volumeClaimTemplate is used. I prefer the second option for a better user experience, but it might break some users' existing vmalertmanager if they have already given root permissions to the volume.
> So I will just add some notes for the field. Do you have other opinions on this? @f41gh7 @Amper

I think it'd be great to provide a per-resource spec.strictSecurity: true setting. It solves the current issue with alertmanager without affecting other components.

It also allows insecure components to be migrated through granular changes.

E.g. you can start by adding strict security to new components, migrate old ones one by one to the new model, and enforce the global parameter in the operator env once the migration is finished.
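
As a rough sketch of that last step, the global parameter would be set as an environment variable on the operator itself. The variable name VM_ENABLESTRICTSECURITY comes from the earlier comment in this thread; the container name and the surrounding Deployment layout are illustrative assumptions and will differ depending on how the operator is installed.

```yaml
# Fragment of the operator Deployment's pod template (names are illustrative).
spec:
  template:
    spec:
      containers:
        - name: operator
          env:
            # Enforce strict security for all managed components
            # once the per-resource migration is finished.
            - name: VM_ENABLESTRICTSECURITY
              value: "true"
```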

f41gh7 added a commit that referenced this issue Oct 4, 2023
it allows to migrate from insecure deployments to strictly secured per component.
#762
Contributor

Haleygo commented Nov 2, 2023

Hello @Munsio
The operator has supported UseStrictSecurity for all components, including vmalertmanager, since v0.39.0.
You can solve this by enabling UseStrictSecurity on the vmalertmanager.
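
A minimal sketch of what that might look like, assuming the option is exposed in the manifest as `useStrictSecurity` (the YAML spelling is an assumption here; check the CRD reference for your operator version). The rest is trimmed from the manifest in the original report.

```yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanager
metadata:
  name: example-alertmanager
spec:
  replicaCount: 1
  # Assumed field name for the per-resource UseStrictSecurity option (operator v0.39.0+).
  useStrictSecurity: true
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Mi
```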

Closing this as completed; feel free to reopen if there is a problem.

@Haleygo Haleygo closed this as completed Nov 2, 2023