Allow an optional instance name, use it for consistent hashing #25

antirez · 2012-12-05T18:09:06Z

The problem

Twemproxy can be configured to avoid auto ejecting nodes, and when it is configured this way the user can rely on the fact that a given key will always be mapped to the same server, as long as the list of hosts remains the same.

This is very useful when using the proxy with Redis, especially when Redis is not used as a cache but as a data store, because we are sure keys are never moved to other instances, never leaked, and so forth, so the cluster is consistent.

However, since Twemproxy adds a given host into the hash ring by hashing the ip:port:priority string directly, it is not possible for users to relocate instances without changing the key-instance mapping as a side effect. This little detail makes it very hard to work with Twemproxy and Redis in production environments where network addresses can change.

Actually this is a problem with Memcached as well. For instance, if our memcached cluster changes subnet, the consistent hashing will completely shuffle the map, and this will result in many cache misses after the reconfiguration.

Proposed solution

The proposed solution is to change the configuration so that instead of a list of instances like:

servers:
   - 127.0.0.1:6379:1
   - 127.0.0.1:6380:1
   - 127.0.0.1:6381:1
   - 127.0.0.1:6382:1

It is (optionally) possible to specify a host / name pair for every instance:

servers:
   - 127.0.0.1:6379:1 server1
   - 127.0.0.1:6380:1 server2
   - 127.0.0.1:6381:1 server3
   - 127.0.0.1:6382:1 server4

When an instance name is specified, it is used to insert the node in the hash ring instead of hashing the ip:port:priority.
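The rule above can be sketched in C as follows. This is only an illustration under stated assumptions: `struct server_entry` and `ring_key` are hypothetical names invented for this example, not twemproxy's actual types or functions, and the fallback format simply follows the ip:port:priority description given in this issue.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical server entry; the field names are illustrative only. */
struct server_entry {
    const char *host;     /* e.g. "127.0.0.1" */
    unsigned    port;     /* e.g. 6379 */
    unsigned    priority; /* e.g. 1 */
    const char *name;     /* optional instance name, NULL when absent */
};

/*
 * Write the string that identifies this server on the hash ring into buf.
 * If an instance name is present it is used as-is; otherwise the key falls
 * back to "host:port:priority". Returns the key length, or -1 if the key
 * does not fit into buf.
 */
static int ring_key(const struct server_entry *s, char *buf, size_t buflen)
{
    int n;

    assert(s != NULL && buf != NULL);

    if (s->name != NULL && s->name[0] != '\0') {
        n = snprintf(buf, buflen, "%s", s->name);
    } else {
        n = snprintf(buf, buflen, "%s:%u:%u", s->host, s->port, s->priority);
    }

    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

With this rule, two entries that share the name "server1" but live at different addresses produce the same ring key, so relocating an instance would not move its keys.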

Open problems

One open problem with this solution is that modifying the priority will still mess with the mapping.
There are several solutions to this problem:

Simply ignore the problem and warn the user in the documentation.
Ignore the priority when an instance name is specified.
Ignore the priority when an instance name is specified, but read it instead from the name. For instance an instance name like "myserver:100" has priority 100. In this way it is obvious that to change the priority the user is forced to change the name, and hence the map.

jzawodn · 2012-12-05T18:26:12Z

+1

This is pretty much exactly how we do it at craigslist with our sharding setup. We hash to a "node name" rather than directly to an IP:PORT pair, so it's possible to move data without losing any keys.

http://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/

antirez · 2012-12-05T18:40:20Z

Thanks for the ACK Jeremy! I also did the same when trying to implement Dynamo concepts on top of Redis.

manjuraj · 2012-12-05T19:27:08Z

I like this idea of using the "node name" (when specified) instead of the "host:port" pair as input to consistent hashing. I also believe that this should be fairly easy to implement.

Regarding the open problem of priority, we can just use the priority from the "host:port:priority" triplet. For example, for an input like "127.0.0.1:6382:1 server4" we will use "1" as the priority of server4.

antirez · 2012-12-05T19:54:03Z

@manjuraj doesn't the priority affect the way the hash ring is populated (more replicas of the same node if the priority is higher)? If not, I was addressing a nonexistent problem (that just changing the priority would change the map).

charsyam · 2012-12-05T20:12:16Z

@manjuraj I have a question. If server->port is 11211, why isn't the port number included in the hash string? Is there a specific reason?

if (server->port == KETAMA_DEFAULT_PORT) {
    hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s-%u",
                       server->name.len, server->name.data,
                       pointer_index - 1);
} else {
    hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s:%u-%u",
                       server->name.len, server->name.data,
                       server->port, pointer_index - 1);
}

manjuraj · 2012-12-06T00:35:32Z

@charsyam This code exists for backward compatibility reasons.

When we deployed twemproxy inside twitter for memcached protocol, for a while we would do dual reads - read data through proxy and read data directly from backend server cluster and ensure that we read the same data from both code paths. Since the client was using libmemcached, we had to make sure that we used the same consistent hashing algorithm as that used by libmemcached library to ensure that keys get mapped to the same server.

I guess we can now update this code so that the port number is omitted only when the server pool is a memcache server pool.

manjuraj · 2012-12-06T00:47:05Z

@antirez priority refers to the weight of a server. For example, if I am running redis on server1 with 4G and another redis on server2 with 8G, I would want to give server2 twice the weight given to server1 in order for the keys to distribute evenly across the total cluster memory.

So, if a server migrates from "127.0.0.1:6379:X server1" to "1.2.3.4:8888:Y server1", we ensure that we keep the weights X and Y the same to keep the key mapping stable.
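To make the effect of weights concrete, here is a minimal sketch that assumes a ketama-style ring where each server receives a number of ring points proportional to its share of the total weight. The constant `BASE_POINTS` and the function `points_for` are assumptions made for this example; the real twemproxy computation may round differently.

```c
#include <assert.h>

/* Assumed number of ring points a server gets when all weights are equal. */
#define BASE_POINTS 160u

/*
 * Ring points for one server: its share of the total weight, scaled by the
 * number of servers. Integer division truncates the result.
 */
static unsigned points_for(unsigned weight, unsigned total_weight,
                           unsigned nservers)
{
    assert(total_weight > 0 && weight <= total_weight);

    return (unsigned)(((unsigned long long)BASE_POINTS * nservers * weight)
                      / total_weight);
}
```

In a two-server pool with weights 1 and 1, each server gets 160 points under this sketch. Raising one weight to 2 gives that server 213 points and leaves the other with 106, so the point sets of both servers change, and with them the key mapping.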

antirez · 2012-12-06T08:20:40Z

@manjuraj yes, you and I understand this, but IMHO this is what a typical user will do:

"Hey we got this new fast box with BIG RAM! Holy Shit let's move one of our instances there"

- 192.168.1.3:6379:10 server1
+ 192.168.1.5:6379:99 server1

"Look, I updated the priority because this box is so much bigger!"

And the user ends up with data shuffled around instances in a way that is very hard to recover from.

So back to my proposals, honestly, both ignoring priority and putting it into the name sound wrong to me. For the following reasons:

Ignoring priority is a surprising behavior.
Forcing it to be part of the name could work, but there is a numerical part anyway, like "myserver:1000", and users may still think that the numerical part can be changed without problems.

It's probably better just to use warnings inside the documentation to make sure people understand that changing priority OR instance name will result in different mapping of keys.

charsyam · 2012-12-06T09:47:29Z

@antirez @manjuraj it is a complicated problem. I also think ignoring priority is a good approach when redis is true, but it can also cause some confusion because twemproxy also has to support memcache.

Like craigslist, some users may also configure names like the ones below.

192.168.1.3:2000:1 server1-1
192.168.1.3:2001:1 server1-2
192.168.1.3:2002:1 server1-3
192.168.1.3:2003:1 server1-4

but, no one can deny that users will easily make a mistake.

antirez · 2012-12-06T09:54:20Z

Maybe the ultimate solution is that:

If node ejection is false.
If redis is true
If for every node the user specified a node name

THEN -> Exit with an error if the specified priority is not always "1", with an error message that makes sense, like:
"You are proxying Redis protocol with node ejection disabled and explicit names for all the nodes. In this setup usually a static map between keys and hosts is needed, so all the instances must be configured with priority 1 (otherwise changing the priority may change how keys are mapped to servers)."

Optionally one may support an option to still allow non-1 priority with Redis server in this setup.

Ok I think so far this is absolutely the best option we have.
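The check proposed above could look roughly like the following sketch. The structs and the function `pool_priorities_ok` are hypothetical and written only for illustration; the fields `redis` and `auto_eject_hosts` are named after the conditions discussed in this comment, and nothing here is taken from the actual twemproxy configuration code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node settings: optional name plus priority. */
struct node_conf {
    const char *name;     /* NULL when no instance name was given */
    unsigned    priority;
};

/* Hypothetical pool settings relevant to the proposed rule. */
struct pool_conf {
    bool                    redis;            /* proxying Redis protocol */
    bool                    auto_eject_hosts; /* node ejection enabled */
    const struct node_conf *nodes;
    size_t                  nnodes;
};

/*
 * Returns false only when all three conditions hold (Redis, no ejection,
 * every node named) and some node has a priority other than 1.
 */
static bool pool_priorities_ok(const struct pool_conf *p)
{
    size_t i;

    assert(p != NULL);

    if (!p->redis || p->auto_eject_hosts) {
        return true; /* rule does not apply */
    }

    for (i = 0; i < p->nnodes; i++) {
        if (p->nodes[i].name == NULL || p->nodes[i].name[0] == '\0') {
            return true; /* not every node is named: rule does not apply */
        }
    }

    for (i = 0; i < p->nnodes; i++) {
        if (p->nodes[i].priority != 1) {
            return false; /* static map expected, but priority is not 1 */
        }
    }

    return true;
}
```

A caller would print the error message quoted above and exit when this function returns false, unless the optional override mentioned in this comment is set.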

manjuraj · 2012-12-10T19:11:55Z

fixed by @charsyam; docs updated: https://github.com/twitter/twemproxy/blob/master/notes/recommendation.md#node-names-for-consistent-hashing

charsyam mentioned this issue Dec 6, 2012

Allow an optional instance name for consistent hashing #26

Merged

manjuraj closed this as completed Dec 10, 2012
