
Allow customizing etcd podSpec by sourcing from podTemplate #552

Closed

aaronlevy opened this issue Dec 29, 2016 · 12 comments

Comments

@aaronlevy

In some cases, a user may want to customize the etcd pods which are created by the etcd-operator, but this is not currently possible, as the podSpec is hard-coded in the operator itself.

An example use case:

Adding a custom annotation, such as checkpointer.alpha.coreos.com/checkpoint=true so that the CoreOS pod-checkpointer will save local copies of the launched etcd pods.

This could be hard-coded in the etcd-operator spec, but there may be other use cases for customization.

One option here would be to allow (maybe by flag) sourcing a podTemplate, which the user is responsible for creating and which the etcd-operator can then use as the basis for the pods it creates.

Example podTemplate:

apiVersion: v1
kind: PodTemplate
metadata:
  name: etcd-pod
  namespace: kube-system
template:
  metadata:
    name: etcd-pod
    namespace: kube-system
    annotations:
      foo: bar
  spec:
    [...]

Then maybe source by flag:

etcd-operator --pod-template=kube-system/etcd-pod

Not sure if this is the right way forward, but I can foresee the need for users to make minor customizations to the launched etcd pods, and wanted to track this as an option.
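
To make the idea a bit more concrete, here is a rough sketch of what the operator side could look like, assuming client-go; the helper name and the merge rules are invented for illustration, and this is not existing etcd-operator code:

package podtemplate

import (
    "context"
    "fmt"
    "strings"

    v1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// applyPodTemplate copies annotations from a user-provided PodTemplate
// (referenced as "namespace/name", i.e. the value of the proposed
// --pod-template flag) onto a pod the operator is about to create.
// Hypothetical helper; not part of etcd-operator.
func applyPodTemplate(ctx context.Context, kc kubernetes.Interface, ref string, pod *v1.Pod) error {
    parts := strings.SplitN(ref, "/", 2)
    if len(parts) != 2 {
        return fmt.Errorf("expected namespace/name, got %q", ref)
    }
    tmpl, err := kc.CoreV1().PodTemplates(parts[0]).Get(ctx, parts[1], metav1.GetOptions{})
    if err != nil {
        return err
    }
    if pod.Annotations == nil {
        pod.Annotations = map[string]string{}
    }
    // Copy only annotations the operator has not already set, so the
    // operator keeps ownership of the fields it needs to control.
    for k, v := range tmpl.Template.Annotations {
        if _, exists := pod.Annotations[k]; !exists {
            pod.Annotations[k] = v
        }
    }
    return nil
}

Labels, tolerations, or other whitelisted fields could be merged the same way.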

@hongchaodeng
Member

My thoughts:
Advantage: some use cases might want to customize the etcd pod spec, e.g. annotations, resources.
Disadvantage: users can now do arbitrary things to the spec.

Because of that disadvantage, we need to define what users are allowed to change.

@xiang90
Collaborator

xiang90 commented Dec 29, 2016

@aaronlevy @hongchaodeng

We should hard-code the self-hosted related annotation for now; we already have a special case for self-hosted. If more users start to want a customized spec, we can do it, but I would like to wait and see the demand first.
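
For reference, a minimal sketch of what hard-coding that annotation could look like when the operator builds the pod (hypothetical helper, reusing the annotation from the opening comment; not the operator's actual code path):

package selfhosted

import v1 "k8s.io/api/core/v1"

// addCheckpointAnnotation marks a generated etcd pod so the CoreOS
// pod-checkpointer keeps a local copy of it. Hypothetical helper name.
func addCheckpointAnnotation(pod *v1.Pod) {
    if pod.Annotations == nil {
        pod.Annotations = map[string]string{}
    }
    pod.Annotations["checkpointer.alpha.coreos.com/checkpoint"] = "true"
}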

@davidquarles

I'd love the ability to add custom labels so that we can more effectively target these pods in our Prometheus aggregates.

@hongchaodeng
Member

hongchaodeng commented Mar 4, 2017 via email

@davidquarles

Sure thing -- we put a generic role label on everything. The primary purpose is to be able to track how much of our allocated CPU/memory is being used across each bucket (and at higher levels). In practice this means a couple of recording rules:

namespace_role:memory_usage_ratio =
  sum by(namespace, role)(
    sum by (namespace, pod_name)(container_memory_usage_bytes)
      * on (namespace, pod_name) group_left(role) (sum by (namespace, pod_name) keep_common(pod_info)))
  /
  sum by (namespace, role)(
    sum by (namespace, pod_name)(container_spec_memory_limit_bytes)
      * on (namespace, pod_name) group_left(role) (sum by (namespace, pod_name) keep_common(pod_info)))

namespace_role:cpu_usage_ratio =
  sum by (namespace, role)(
    sum by (namespace, pod_name)(rate(container_cpu_usage_seconds_total[1m]))
      * on (namespace, pod_name) group_left(role) (sum by (namespace, pod_name) keep_common(pod_info))) * 1024
  /
  sum by (namespace, role)(
    sum by (namespace, pod_name)(container_spec_cpu_shares)
      * on (namespace, pod_name) group_left(role) (sum by (namespace, pod_name) keep_common(pod_info)))

We then have Grafana utilization dashboards that allow parameterization by namespace and/or role, and we target this label in generic Alertmanager rules as well.

@hongchaodeng
Member

@davidquarles
Your use case is valid, but I don't want to expose the entire pod template only for labels. We might just expose labels. Can you create another issue?

@davidquarles

Cool, totally -- access to the raw pod template is definitely way more than I need.

Given direct access to the pod template, the only other thing I might have modified / am curious about in terms of the spec is member storage. Does exposing stable / persistent storage (as opposed to emptyDir volumes) not make sense? Consuming the root partition seems a bit scary (and somewhat cumbersome to manage), especially if these aren't dedicated nodes. Is it because the pods themselves don't have a stable identity, the way they do under StatefulSet? I think we're going to use this in a new production system, so any insight into how to best scale this pattern (on GCE) is appreciated.

I'm also happy to contribute if you'd like a hand. I'm just trying to map everything out. Thanks!

@xiang90
Collaborator

xiang90 commented Mar 8, 2017

@davidquarles There are a few things in the pod spec that the etcd operator wants to take full control of. We do not want to expose these at the moment.

Stable storage is actually one of them. We suggest using backups instead of local "stable" storage. etcd has a built-in replication mechanism, so over-replication is mostly unnecessary. We can revisit that in the future. We do not use StatefulSet for the same reason.

I am OK with adding customized labels, though.

@aaronlevy
Author

One other random "this future feature might be useful" item is pod presets: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/pod-preset.md

This might allow a user to set presets on a pod, but they won't override anything etcd sets on the pod if there is a conflict.

@davidquarles

@xiang90 Thanks, that's all good to know. Would it make sense for these nodes to mount a separate disk for /var/lib/kubelet/pods, then, to allow online resizing?

@xiang90
Collaborator

xiang90 commented Mar 8, 2017

@davidquarles

That is also a valid requirement. Do you mind opening a separate issue to track it?

@xiang90
Collaborator

xiang90 commented Apr 26, 2017

I am closing this. We solved this issue by providing a pod policy, which contains the pod options that we want users to be able to customize. That way, we can do better validation and restrict what can be changed. It works well for now. If we see a real requirement for exposing all pod fields, we might expose a pod template in the pod policy itself, but we have not seen that requirement so far.
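
For concreteness, the pod policy is a small, validatable set of pod options on the cluster spec rather than a raw pod template. A sketch of the general shape of such a policy, with field names that are illustrative only (consult the etcd-operator spec for the actual PodPolicy type):

package spec

import v1 "k8s.io/api/core/v1"

// PodPolicy-style struct: a restricted subset of pod options that the
// operator lets users customize and can validate, rather than exposing
// a full pod template. Field names are illustrative only.
type PodPolicy struct {
    // Extra labels and annotations merged onto generated etcd pods
    // (covers the Prometheus-labeling and checkpointer use cases above).
    Labels      map[string]string `json:"labels,omitempty"`
    Annotations map[string]string `json:"annotations,omitempty"`

    // Scheduling knobs the operator chooses to pass through.
    NodeSelector map[string]string `json:"nodeSelector,omitempty"`
    AntiAffinity bool              `json:"antiAffinity,omitempty"`

    // Resource requests/limits for the etcd container.
    Resources v1.ResourceRequirements `json:"resources,omitempty"`
}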
