reserve ports so node retains same address:port across restarts #72

Open
seanjensengrey opened this issue Aug 1, 2016 · 7 comments

Comments

@seanjensengrey

seanjensengrey commented Aug 1, 2016

Reserve Riak ports so that they are retained across a cluster restart.

This would allow us to do a hot cluster resize, where nodes are added (and currently get stuck waiting for handoff), by also doing a rolling restart of the stuck nodes. With the ports retained, connection information also wouldn't have to be updated in the clients (basho bench).

@mdigan

mdigan commented Aug 1, 2016

@seanjensengrey I'm not sure I understand the issue completely. Can you list the order of events as you'd like them to happen in this scenario? Also, would a service discovery mechanism be a more flexible solution for the client issue you mentioned?

@seanjensengrey
Author

This would be advantageous for a couple of reasons:

  • the cluster could be restarted w/o clients getting updated
  • it would allow us to run basho bench against the cluster during a node addition / rolling restart scenario

Currently, when we do a cluster restart, the IPs stay the same and the persistent volumes are retained, but the port numbers are different. The new ports then need to be communicated to every downstream client that connects directly. In a high-throughput environment (bulk loads, Spark, etc.), going through a proxy like the RMF Director or HAProxy can be a performance hit.

If we did a reservation for those port numbers, they would stay fixed for the life of the cluster.

We are currently seeing a handoff issue when adding nodes to a live cluster that requires restarting the existing nodes. By adding reservations, we allow hot node additions via a rolling restart.
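To make the request concrete, here is a minimal sketch of the bookkeeping a scheduler could do: remember the ports each node was first given, and on restart accept an offer only if it still contains those ports. `PortBook`, `ports_for`, and the node names are hypothetical and not part of the framework; a real scheduler would also need to persist this record and back it with an actual reservation.

```python
from typing import Dict, List, Optional, Tuple

PortRange = Tuple[int, int]  # inclusive (begin, end), the shape of an offered port range


def offered(port: int, ranges: List[PortRange]) -> bool:
    """True if `port` falls inside any inclusive range from the offer."""
    return any(begin <= port <= end for begin, end in ranges)


class PortBook:
    """Hypothetical in-memory record of which ports each node uses."""

    def __init__(self) -> None:
        self._assigned: Dict[str, List[int]] = {}

    def ports_for(
        self, node: str, ranges: List[PortRange], count: int
    ) -> Optional[List[int]]:
        previous = self._assigned.get(node)
        if previous is not None:
            # Restart: use the offer only if it still contains the old ports.
            if all(offered(p, ranges) for p in previous):
                return previous
            return None  # caller should decline this offer and wait
        # First launch: take the first `count` offered ports and remember them.
        fresh = [p for begin, end in ranges for p in range(begin, end + 1)][:count]
        if len(fresh) < count:
            return None
        self._assigned[node] = fresh
        return fresh


book = PortBook()
first = book.ports_for("riak-1", [(31000, 31010)], 3)
again = book.ports_for("riak-1", [(31000, 31005)], 3)
print(first == again)
```

With this in place a restarted node either comes back on the same ports or is not launched from that offer at all, so clients that connect directly never see a changed address:port pair.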

@sanmiguel
Contributor

Retaining ports across task restarts like this is something we've typically assumed is not possible in Mesos, or at least not guaranteed. My instinctive answer to this was "it's simply not how Mesos works", but I'm now struggling to find the documentation that says not to assume specific ports are available. I had thought it was a criterion in DC/OS Certification...

There might be actions we can have the scheduler take to try to do this, but I don't know if we can guarantee it succeeding every time. I will need to experiment with this in the coming days to get you a more definitive answer.

WRT IPs remaining the same: this is an implementation detail of the current persistent volume machinery in Mesos. A persistent volume, by default, lives on the filesystem of the host it is created upon initially. We re-use that persistent volume when restarting the node, so the node ends up on the same agent after restart simply because that's where the volume is. If we change to using a different persistent volume setup, or the way this one works by default changes, this will no longer hold true.

Meanwhile, could you please share some more details on the handoff issue? What are the symptoms? How can I recreate it?
I feel like the handoff issue ought to be a separate issue...

@seanjensengrey
Author

I have created RTS-1275 to track the handoff issue. If you add nodes to a cluster while it is under load (RMF + Riak TS 1.3.1) the new nodes will not take any ring ownership until the cluster is restarted.

From what Drew said, it sounded like if we use reserved ports, they can stay fixed across restarts. If this is in any way a kludge or is not DC/OS certifiable, let's not implement this.

https://gist.github.com/travisbhartwell/4ab563b62b3a48e128fb356806a5df33

@seanjensengrey
Author

I found the DC/OS certification spec: https://docs.google.com/document/d/1rtuddOSyZwg7gC3Uye3TqdqT4fm1wWuJiMYQZUwLAM8/edit#heading=h.q5rzjg7ij60y

It looks like we can make dynamic reservations for IP ports and IP port ranges.
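As a rough illustration of what a dynamic reservation for ports might carry, the sketch below builds a single `ports` resource and collapses consecutive ports into ranges. The field layout follows my understanding of the resource format in the Mesos reservation documentation of this era and should be checked against the Mesos version in use; the helper name `port_reservation`, the role `riak`, and the principal `riak-framework` are assumptions made up for the example.

```python
import json
from typing import Dict, List


def port_reservation(ports: List[int], role: str, principal: str) -> Dict:
    """Build one `ports` resource, merging consecutive ports into ranges."""
    ranges: List[Dict[str, int]] = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1]["end"] + 1:
            ranges[-1]["end"] = port  # extend the current range
        else:
            ranges.append({"begin": port, "end": port})  # start a new range
    return {
        "name": "ports",
        "type": "RANGES",
        "ranges": {"range": ranges},
        "role": role,  # assumed role name
        "reservation": {"principal": principal},  # assumed principal
    }


resource = port_reservation([31000, 31001, 31002, 31010], "riak", "riak-framework")
print(json.dumps(resource, indent=2))
```

A scheduler would attach a resource like this to a reserve operation when accepting an offer; that call is left out here because its exact shape depends on the scheduler API being used.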

@mdigan

mdigan commented Aug 2, 2016

@sanmiguel I think this is the most recent published version of the DC/OS certification spec: https://docs.mesosphere.com/1.7/usage/managing-services/developing-services/certification-requirements/

and confusingly enough, there is also this doc:
https://docs.mesosphere.com/1.7/usage/managing-services/developing-services/service-requirements-spec/

I do see some differences between the Google Doc version and the docs.mesosphere.com versions, but at first glance I can't find anything about dynamic reservations for IP ports and IP port ranges.

@sanmiguel
Contributor

You're both right - I think I conflated not being able to reserve a specific port (a la epmd requiring 4369) and not being able to re-use a specific port from an offer.

@seanjensengrey it should be possible for us to get this working in the way you wanted.

Sorry for the confusion.

@sanmiguel sanmiguel self-assigned this Aug 3, 2016