twemproxy work with redis-sentinel #324
Open
andyqzb wants to merge 36 commits into twitter:master from andyqzb:ha_with_sentinel_rebase
+1,387 −31
36 commits
All commits are by andyqzb.

a63cd7a Merge pull request #1 from twitter/master
47afc28 Merge pull request #2 from twitter/master
49d62f7 Merge pull request #3 from twitter/master
dcc2613 Merge pull request #4 from twitter/master
1914e4a add sentinels config to nc_conf
a0ba197 add nc_sentinel.c which works with sentinel
334b7e5 add mbuf_read_string and server_switch function
9fde4c5 add conf_server member to struct server
a6abc7e modify sentinel_status to member of struct conn and fix some warning
e0d8cf2 modify sentinel connect always no matter what preconnect is
431889d don't log fake request in req_log
49c2d7b add sentinel tag to struct server, add sentinel disconnect and deinit…
b935675 fix sentinel sentinel disconnect bug, and add sentinel deinit check
e27b51c skip sentinel server's stats
ca5311b add sentinel reconnect after conn is closed
36c5c2f add sentinel conf, merge conn_get_sentinel to conn_get
70cb70a delete conn_get_sentinel prototype
948ffb2 modify aligning of struct server
2612dcf add assert in server_init
06303aa add a sentinel address to pool sigma in example configure
4faefb0 add sentinels description to README.md and redis.md
95c87eb merge twitter/master and resolve conflict
b954260 update README.md
b6b2c5d log error msg when open temp conf failed
be5c36d add sentinel test cases
9e77b58 copy sentinel binary in travis.sh
a910ca2 update travis.sh to support sentinel
131712c support set keepalive interval parameter
ea45e69 Merge pull request #5 from twitter/master
fcc80f7 modify sentinel req to RESP prototol
89ddda9 Merge branch 'twitter:master' into master
081b327 Merge branch 'twitter:master' into master
a62fa1d modify sentinel tests to support python3 and fix merge conflict
a7dc446 modify nc_sentinel.c to support multi version redis-sentinel
c48722f check the error code of conf_write
ef3d8d0 continue to process left master info when proxy met a unknown server …
mishan commented:
nit: don't you need three sentinels for quorum in this example?
andyqzb commented:
@mishan You can configure any number of sentinels, no matter how many you actually run, because twemproxy only needs to connect to one of them to fetch the redis address.
Of course, configuring all the sentinels you use is better: if some of them are dead, twemproxy can still connect to the ones that are alive.
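For reference, fetching a master's address from a sentinel uses Sentinel's standard query; the master name and addresses below are illustrative, not from this PR:

```
$ redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"
```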
andyqzb commented:
To avoid confusion, I added a sentinel address to the example configuration.
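A minimal sketch of what such a pool could look like under this patch, assuming the sentinels: key uses the same ip:port:weight form discussed later in this thread (the pool name "sigma" comes from the commit log; addresses and ports are illustrative):

```yaml
sigma:
  listen: 127.0.0.1:22124
  redis: true
  sentinels:
   - 127.0.0.1:26379:1
   - 127.0.0.1:26380:1
   - 127.0.0.1:26381:1
  servers:
   - 127.0.0.1:6379:1 server1
```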
therealbill commented:
There is actually a good reason to connect to multiple Sentinels. If the Sentinel you are connected to is on the minority side of a network partition, and the original master is in that partition, it could report its "old" master. By querying all of the Sentinels you'd detect that and route to the majority's master.
andyqzb commented:
@therealbill In a partition, twemproxy may only be able to connect to some of the sentinels, with some reporting the new master and some the old one. Twemproxy can't know which is right if neither side has more than half of the instances. Besides, sentinel itself doesn't work well under network partitions, so I don't think designing around them makes much sense.
therealbill commented:
I think it is quite the contrary. If Twemproxy can't verify the current master, it should refuse to proxy commands for that server. If you've only got one sentinel you can communicate with, it would be wise to assume the current master is no longer the master, or at minimum to simply refuse connections for that slot. That is preferable to having data go to multiple places. One of the big benefits of using Twemproxy+sentinel is the ability, if done correctly, to make the split-brain scenario something the clients are immune to. I've done it outside of Twemproxy, so it is certainly doable - and not terribly complicated.
Conceptually the best place to run those sentinels is the Twemproxy nodes themselves. However, that isn't always possible. As such, the network between Sentinels and Sentinels+Redis can partition while twemproxy still sees the whole picture. Regardless, by having the proxy not accept commands (or at minimum not accept write commands) when the results it gets aren't confirmable, you remove split-brain by nipping it at the very point it can happen - you stop proxying that node/slot.
The design of sentinel works fine in network partitions; it is the client which needs to be informed of changes, and Twemproxy is functionally the client in this scenario. Ideally, Twemproxy would subscribe to each sentinel's master-change event channel.[1] Then, on a master-change event, it would update that slot's master setting and proxy to the new master. By checking only one sentinel you can't reliably handle this case - and you don't get the information as fast when you do get it. By checking each of the sentinel servers you will either know you can't communicate, and can fall back to a deny-to-be-safe mode, or catch the event from the majority's side and know you need to reconfigure even though the other sentinel(s) haven't caught the change. Incidentally, there is a lag between the failover and the non-leader Sentinels, so even outside of a network split the proper route is to catch it from the leader.
Ideally you want these actions to be idempotent as well. In other words, if you get a change notification multiple times, or via the current poll, and the "new-master" is the same as the last-updated master, don't make changes. I suspect you've already done that, but sometimes people miss it, so I mention it solely out of that possibility. But always communicate with more than just one sentinel.
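A rough sketch of the subscription described here, assuming redis-py and a single sentinel (the address and the reconfigure step are illustrative; a real implementation would watch every sentinel and keep the idempotency check):

```python
import redis

# Hypothetical watcher for one sentinel's failover announcements.
sentinel = redis.Redis(host="127.0.0.1", port=26379)
pubsub = sentinel.pubsub()
pubsub.subscribe("+switch-master")  # sentinel publishes here on every failover

last_applied = {}  # master name -> (ip, port) most recently routed to

for msg in pubsub.listen():
    if msg["type"] != "message":
        continue
    # Payload: "<master-name> <old-ip> <old-port> <new-ip> <new-port>"
    name, _, _, new_ip, new_port = msg["data"].decode().split()
    if last_applied.get(name) == (new_ip, new_port):
        continue  # idempotent: this master is already in effect
    last_applied[name] = (new_ip, new_port)
    print(f"reconfigure slot {name} -> {new_ip}:{new_port}")
```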
Perhaps to do this we could either:
On the idea of not listing servers, perhaps we could do something like:
Where the server name is used by twemproxy for slot identification, but the podname is the name you pass to sentinel to get the master & slaves (remember, a sentinel can manage multiple Redis master/slave combos). Then you can do all discovery via sentinel.
On a related note, I'm not seeing which code maps a given pod ("server" in Twemproxy parlance) to a given set of Sentinels. Each pod (server) may be managed by a different sentinel pool. I've also not found where you'd be allowed to use a pod-name other than the server name in the servers section. We should cover that as well. Perhaps something along the lines of (assuming we do not do full discovery via sentinel):
```yaml
sentinel: true
servers:
```

Or (better, IMO):

```yaml
sentinel: true
servers:
sentinels:
  - IP:PORT:constellation1
  - IP:PORT:constellation1
  - IP:PORT:constellation1
  - IP:PORT:constellation2
  - IP:PORT:constellation2
  - IP:PORT:constellation2
```
Where constellation = "the group of sentinels managing a given pod". This latter option could also work well for full discovery via sentinel. I also think it looks the cleanest and most readable. We might consider quoting the pod-name; IIRC you can use pretty much any ascii character other than a space in a sentinel name.
Just throwing out some ideas to hopefully express better what I find as missing or ways we can do it even better. To say I use sentinel heavily might be a bit of an understatement.
I see we currently have IP:port:weight for sentinels, but I hope we aren’t using weight at all - ideally we should not have it at all to avoid confusion.
It's been a long day, and I'll have more time over the next week or two to really dig into this. I apologize for not getting to it sooner, and am really glad to see someone taking the first steps toward what I maintain is a significant improvement to Twemproxy! So thanks for getting this ball rolling. :D
Cheers,
Bill
P.S. I’ll probably work on or find someone to work on adding Red Skull (http://redskull.io) connectivity as an alternative to Sentinel connectivity to simplify things - but not until I’ve added non-JSON API interfaces to Red Skull.
andyqzb commented:
@therealbill Thanks for your suggestions.
On the first suggestion, that twemproxy should connect to all the sentinels: I think we shouldn't make twemproxy too complicated. It's just a client that supports sentinels, and the problems of network partitions should be solved by sentinel. For example, if the minority sentinels refuse to serve clients, twemproxy will keep retrying until it successfully connects to a sentinel in the majority partition. Doing this in sentinel is simpler than doing it in twemproxy.
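That retry-until-reachable behavior can be sketched as follows; this is an illustration in Python with redis-py, not the patch's actual C code, and the sentinel addresses and pod name are made up:

```python
import redis

# Illustrative sentinel list; in the patch these would come from the
# pool's sentinels: entries in the YAML config.
SENTINELS = [("127.0.0.1", 26379), ("127.0.0.1", 26380), ("127.0.0.1", 26381)]

def current_master(pod_name):
    """Ask each sentinel in turn; the first reachable one answers."""
    for host, port in SENTINELS:
        try:
            s = redis.Redis(host=host, port=port, socket_timeout=0.5)
            addr = s.execute_command("SENTINEL", "GET-MASTER-ADDR-BY-NAME", pod_name)
            if addr is not None:
                return addr[0].decode(), int(addr[1])
        except redis.RedisError:
            continue  # dead or refusing sentinel: move on to the next one
    raise RuntimeError("no sentinel reachable")
```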
The podname suggestion is a good idea. I had thought about making the master name in sentinel a compound like poolname-servername, so we could manage the servers of different pools in the same sentinel. But I don't think specifying a separate sentinel pool for each server is needed; I doubt anyone wants to configure one pool's servers across different sentinel pools.
The weight in the sentinel config is not used; I just wanted to reuse the server-config loading code when loading the sentinel config. Modifying the code to drop the weight when loading sentinels would be better.
therealbill commented:
If a net split does occur and the original master is in the minority partition, twemproxy will still be connected to it. When sentinel initiates a failover normally, it sends a client kill to the old master (if available) to disconnect existing connections. However, in this scenario that won't disconnect twemproxy. Thus Twemproxy needs to be checking/monitoring for failovers and updating/reconnecting as appropriate. Anything short of that means that, to have reliable redundancy and avoid or minimize the split-brain scenario, you have to use the current method of rewriting the config and restarting twemproxy.
It just occurred to me you might be trying to implement this as "just" a TCP proxy/load balancer to Sentinels. I surely hope that isn't the case, as you can't do it reliably. The way the Sentinel and client setup works, clients need direct access to every sentinel, for the reasons I listed above regarding why Twemproxy would need to do the things I'm talking about. Clients need to connect to the sentinel and do more than just get the current master: they need PUBSUB and the ability to talk to each sentinel directly to ensure they aren't talking to a minority sentinel which contains the original master. Please tell me you're not trying to make Twemproxy a load balancer for Sentinel. :) It would be as complicated or, more likely, more complicated than what I am talking about above. If you're using twemproxy you don't want to talk directly to the backend redis instances - that defeats the main purpose of twemproxy.
You can read about some of these issues in redis/redis#2257 and redis/redis#2045.
Cheers,
Bill
andyqzb commented:
@therealbill I have read the guidelines for sentinel clients. They say a client just needs to connect to one of the sentinels, which is what this patch does. And as the guidelines themselves say, that won't work well under a network partition.
I think twemproxy shouldn't take on too much distributed-systems work. Sentinel is responsible for the cluster, so if you want the cluster to work well under a network partition, the minority sentinels should refuse to serve clients, just like zookeeper or chubby.
I have read redis issue 2257. I think it's a suggestion to reduce the number of connections to the sentinels. Sentinel is fine with 2000+ connections (under the conditions in that issue); epoll can handle tens of thousands of connections easily. It might become a problem when the cluster has thousands of redis instances.
therealbill commented:
Then don't use it with sentinel. By placing Twemproxy in front and hiding sentinel from the clients, you prevent the clients from doing what you describe. You take on the responsibility of protecting the clients when you prevent them from protecting themselves.
Doing the simple thing is not always the right thing. If you're going to make it impossible for the client to protect itself, you must take the precautions on its behalf. Twemproxy partitions data, and with the addition of sentinel support it becomes a core part of a distributed system. At that point the assertion that it shouldn't do the right thing with regard to proper behavior in distributed systems is rather moot.
You can wax on all day about what sentinel could or should do, but you're dealing with Twemproxy, not sentinel. I'm not sure you understand what sentinel is: it doesn't serve clients, it is a lookup service.
When a network partition happens, a sentinel has no way to know whether a new master has been elected, and as such takes no action. It doesn't know whether all sentinels are isolated (so no master change happens) or it alone is in the minority, and even then it doesn't know who the new master is. Thus it can't know it is "in the minority" partition, and therefore it can't tell the clients to stop talking to the server, as they may still be talking to the right server.
But Twemproxy can figure it out. Any client can, until you put Twemproxy in front of it and prevent it from doing so.
The point of this issue, Andy, is showing that the condition you think is rare is actually common, and a non-trivial amount of the time it requires action. This is just one of many conditions I've seen. I've seen thousands of these setups; multiple sets of sentinels across a bank of pods is not a rare condition, nor is it going to become one anytime soon.
Cheers,
Bill