How can I specify volumes spec? #957
Comments
Duplicate of #873. We can provide such options in PodPolicy.
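For context, a purely hypothetical sketch of what such an option could look like in a cluster manifest is below; the `dataVolume` field does not exist today, and every name here is illustrative:

```yaml
# Hypothetical sketch only: a "dataVolume" option like this is NOT part of the
# operator's API; it just illustrates how a volume choice could be surfaced
# through PodPolicy.
apiVersion: "etcd.database.coreos.com/v1beta2"   # assumed; match your operator's CRD version
kind: EtcdCluster
metadata:
  name: example-etcd-cluster
spec:
  size: 3
  version: "3.2.13"            # assumed etcd version
  pod:                         # PodPolicy
    dataVolume:                # hypothetical field, not implemented
      hostPath:
        path: /mnt/disks/ssd0/etcd
```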
One problem with enabling hostPath is that it could reuse data that already exists on the node.
I see. Is there any workaround for now?
There isn't any workaround currently. But in the future I envision the solution would be that, like a human operator, the etcd operator will mount the data dir on a hostPath so it endures restarts, and clean up/prepare the data dir before starting the etcd process, or clean it up after stopping the etcd process.
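A rough sketch of that envisioned flow, written as a plain pod rather than anything the operator generates today; the init container, paths, names, and image version are all assumptions:

```yaml
# Illustrative only: a member pod whose data dir lives on the host, with an init
# container that wipes/prepares the data dir before the etcd process starts.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-member-0
spec:
  restartPolicy: Never
  initContainers:
  - name: prepare-datadir
    image: busybox
    command: ["sh", "-c", "rm -rf /var/etcd/data/*"]   # clean up leftover data
    volumeMounts:
    - name: datadir
      mountPath: /var/etcd/data
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.2.13   # assumed version
    command: ["etcd", "--data-dir=/var/etcd/data"]
    volumeMounts:
    - name: datadir
      mountPath: /var/etcd/data
  volumes:
  - name: datadir
    hostPath:
      path: /var/lib/etcd-member-0       # assumed host location for the data dir
```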
@hongchaodeng That would be great. Thanks! For now, I'm trying to hack PodPolicy for this, but I failed to build it.
That's not an etcd operator issue. It's a glide issue.
A long-term solution for this would be to use persistent local storage: kubernetes/community#306
hostPath can be a quick hack. However, you have to pre-configure the hostPath as far as I can tell (set permissions, for example). We can use emptyDir for now, since we never restart a failed etcd server for non-self-hosted etcd anyway; in theory, tmpfs should work just fine. I do not really know how to work around this unless kubernetes/community#306 lands. We also want to be able to specify stable storage on a local node, so that we can try to restart a failed pod on the same node to save replication cost.
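For reference, the in-memory emptyDir variant mentioned above is plain Kubernetes; a minimal sketch (names and image version are illustrative):

```yaml
# In-memory emptyDir: the data dir lives on tmpfs and disappears with the pod,
# which matches the never-restart-a-failed-member approach described above.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-tmpfs-example
spec:
  restartPolicy: Never
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.2.13   # assumed version
    command: ["etcd", "--data-dir=/var/etcd/data"]
    volumeMounts:
    - name: datadir
      mountPath: /var/etcd/data
  volumes:
  - name: datadir
    emptyDir:
      medium: Memory           # tmpfs-backed
```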
/cc @junghoahnsc
You need to install Mercurial, which is similar to git; some deps are hg deps. That is not really an etcd operator issue, so do not create an issue for it.
Forgive my naive question: why can't it use the previous data? Wouldn't it catch up with the other peers after restarting? And what is the problem with restarting? I feel it needs less effort for replication.
I could run a cluster with a hostPath hack for now, but I hope k8s supports this soon :)
@GrapeBaBa Yes and no. Let's say I have etcd-0, etcd-1, and etcd-2, and etcd-0 went down. Then restarting etcd-0 and reusing the previous data should be fine. However, this makes assumptions about specific nodes and specific failure scenarios. First, how do we know the node for etcd-0 is still there? Second, how do we know etcd-0 was removed? Third, how do we know the existing data was not created by another member like etcd-3? Etc. We want generic abstractions over hostnames and storage. hostPath is just a hack to use a specific mounted volume at the moment; it's not a good abstraction for cluster-level storage management. This is a feature where we should work with k8s upstream to get better volume support for new disk (e.g. SSD) mount partitions.
@junghoahnsc
@hongchaodeng Besides hostPath, why does the operator use the restart-never strategy when using emptyDir?
When an etcd member crashes and becomes unresponsive, we simply delete the pod instead of relying on it to restart on the same node. This is because of the limited restart policy k8s provides: there is no way to tell a pod that it should not restart on failure type X but should restart on failure type Y. You have to restart in all cases or in none. In quite a few cases, such as a disk corruption or a raft panic, the member should not restart or it will run into a restart loop, and there is simply no way to specify that. Never restarting is just the current compromise, which happens to enable you to use emptyDir with tmpfs to achieve max throughput (/cc @junghoahnsc). We might revisit this if the kubelet can provide a better restart policy (or we do better liveness checking). Or, when local PV lands, never restarting a pod won't be a problem at all, since if we restart the pod on the same node at a higher priority, the data stickiness is still there anyway.
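To make the constraint concrete: `restartPolicy` is a single pod-level setting, so a member pod can only opt out of restarts entirely. A minimal sketch is below; the liveness probe is an assumption about what better health checking could look like, not the operator's current configuration:

```yaml
# restartPolicy is one pod-level knob (Always | OnFailure | Never) that applies
# to every failure type alike; there is no "restart on Y but not on X".
apiVersion: v1
kind: Pod
metadata:
  name: etcd-restart-example
spec:
  restartPolicy: Never         # the current compromise described above
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.2.13   # assumed version
    command: ["etcd", "--data-dir=/var/etcd/data"]
    livenessProbe:             # hypothetical "better liveness checking"
      exec:
        command: ["/bin/sh", "-ec", "ETCDCTL_API=3 etcdctl endpoint health"]
      initialDelaySeconds: 10
      periodSeconds: 30
```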
@xiang90 @hongchaodeng Thanks.
My naive hack with using
@junghoahnsc If you can afford the memory usage, probably a better approach for you is to use an in-memory emptyDir for now to max your performance.
@xiang90 Thanks for the suggestion, but I don't think we can provide enough memory to hold all the data. BTW, I have two questions on backup.
No, it is not expected. Can you create a new issue for this with steps to reproduce?
The sidecar might need to save an intermediate backup to disk before it can upload it to S3. The sidecar has no anti-affinity with etcd pods, so it can end up on the same machine as an etcd pod. Does this cause you any issue?
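For reference, the kind of anti-affinity rule being discussed looks roughly like this on the sidecar pod, assuming the etcd member pods carry a label such as `app: etcd` (the label, names, and image are assumptions):

```yaml
# A hard anti-affinity rule: only schedule this pod onto nodes that do not
# already run a pod labeled app=etcd.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-backup-sidecar
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: etcd          # assumed label on the etcd member pods
        topologyKey: kubernetes.io/hostname
  containers:
  - name: backup
    image: busybox             # stand-in image for the sketch
    command: ["sleep", "3600"]
```

With a required rule like this, the sidecar simply stays Pending if no node is free of etcd members.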
I created a new issue. For the sidecar, I set anti-affinity with the etcd pods. When I enabled backup, one pod was not scheduled.
@junghoahnsc OK, I understand the issue now. Can you also create a new issue to track the sidecar pod problem? We will resolve it.
Sure, created.
Hello,
I'm trying to deploy an etcd cluster in GKE. I created a node pool with SSD, and I think I need to specify a volumes spec to use it (something like the sketch below). But I couldn't find any way to specify this from the examples. How can I do that?
Thanks,
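For reference, the kind of volumes spec being asked about, written against a GKE local SSD (GKE typically mounts local SSDs under /mnt/disks/ssd0); as the comments above note, the operator does not currently provide a supported way to set this, so every name here is illustrative:

```yaml
# Illustrative only: mounting a GKE local SSD (commonly under /mnt/disks/ssd0)
# as the etcd data dir via hostPath.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-on-local-ssd
spec:
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.2.13   # assumed version
    command: ["etcd", "--data-dir=/var/etcd/data"]
    volumeMounts:
    - name: ssd-data
      mountPath: /var/etcd/data
  volumes:
  - name: ssd-data
    hostPath:
      path: /mnt/disks/ssd0/etcd         # GKE local SSD mount point plus a subdirectory
```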