Proposal: Real-world test cases #10

Open
eric opened this issue Apr 22, 2012 · 2 comments


@eric
Collaborator

eric commented Apr 22, 2012

I wanted to document some of the real-world test cases I've been envisioning for a test suite for this library.

The Setup

It seems like it would be pretty easy to set up a local environment to test some of this stuff:

  • 3 zookeeper servers
  • 2 redis servers
  • 2 clients
  • 2 node monitors

This gives us a chance to kill or hang each component and make sure everything reacts appropriately.

Scenarios

Here is an incomplete list of tests that I think should be run against a real set of redis servers and clients.

  • Kill a redis server with SIGKILL (a kill -9): ensure the failover happens immediately
  • Pause a redis server (causing a hang) with SIGSTOP: ensure the monitor process notices the hang and starts a failover
  • Kill the master monitor process with SIGKILL: ensure another monitor takes over
  • Pause the master monitor process with SIGSTOP and then kill redis with SIGKILL: how long does this take to fail over?
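
As a rough sketch of how a test harness might drive these scenarios, the signals can be sent from Ruby with `Process.kill`. The helper names (`alive?`, `pause`, `resume`, `hard_kill`) are made up for illustration and are not part of this library, and a plain `sleep` child stands in for a redis server or monitor process:

```ruby
# Hypothetical helpers; none of these names come from the library.

# Signal 0 sends nothing; it only checks that the pid still exists.
def alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
end

# SIGSTOP suspends the process without closing its sockets (a "hang").
def pause(pid)
  Process.kill('STOP', pid)
end

# SIGCONT resumes a process that was suspended with SIGSTOP.
def resume(pid)
  Process.kill('CONT', pid)
end

# SIGKILL cannot be trapped, so the process gets no chance to clean up.
def hard_kill(pid)
  Process.kill('KILL', pid)
  Process.wait(pid) # reap the child so it does not linger as a zombie
end

# Stand-in for a redis-server or monitor process started by the harness.
pid = spawn('sleep', '60')

pause(pid)
paused_but_alive = alive?(pid) # a stopped process still exists
resume(pid)
hard_kill(pid)
dead_after_kill = !alive?(pid)
```

`Process.wait` only works because the harness spawned the process itself; for a process started elsewhere, the harness would have to poll `alive?` instead.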

Monitoring

While running these tests, it would be worthwhile for the redis clients to be constantly running SET commands against redis.

Tracking the average and max times for requests would be helpful in understanding how long failover really takes. Using my metriks library may be helpful in getting those statistics easily.

I envision the redis client processes having an at_exit defined that would output statistics like the number of keys set, the number of errors, and the average and max times per SET. We could easily compare the number of keys they thought they set with the number that the final master has, to see what sort of failures happened.
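
A minimal sketch of what that client-side bookkeeping could look like in plain Ruby. `SetStats` is a hypothetical name, and a hash stands in for the redis connection so the example runs on its own; a real client would issue its SET inside the `record` block, and metriks could replace the hand-rolled timing:

```ruby
# Hypothetical per-client statistics collector; not part of the library.
class SetStats
  attr_reader :successes, :errors, :max_time

  def initialize
    @successes = 0
    @errors = 0
    @total_time = 0.0
    @max_time = 0.0
  end

  # Times the block and counts it as one SET attempt.
  # Failed attempts are timed too, since they show how long a failover stalls.
  def record
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    @successes += 1
  rescue StandardError
    @errors += 1
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @total_time += elapsed
    @max_time = elapsed if elapsed > @max_time
  end

  def average_time
    attempts = @successes + @errors
    attempts.zero? ? 0.0 : @total_time / attempts
  end

  def report
    format('keys set: %d, errors: %d, avg: %.6fs, max: %.6fs',
           @successes, @errors, average_time, @max_time)
  end
end

stats = SetStats.new
at_exit { puts stats.report } # print the summary when the client exits

# Stand-in for the redis connection; the fourth write simulates an error.
fake_store = {}
5.times do |i|
  stats.record do
    raise IOError, 'simulated error during failover' if i == 3
    fake_store["key#{i}"] = i
  end
end
```

Comparing `stats.successes` across clients with the key count on the final master would then show how many acknowledged writes were lost.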

@ryanlecompte
Owner

Nice! Thanks for putting these testing scenarios together. I have been doing similar testing locally with a 5-node Redis cluster and a 5-node ZK cluster. I also have 2 node managers. All of my testing has been with SIGKILL, however. I'd love to get your help on setting this up too. You have some great ideas here.

@eric
Collaborator Author

eric commented Apr 22, 2012

Using SIGSTOP and SIGCONT is a great way to ensure that everything works properly with a hung process instead of just a killed one — both cases are important to handle, but the hung case can be harder.
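
To illustrate why the hung case is harder, here is a small self-contained sketch. It uses a toy TCP server in a forked child, not redis, and the `probe` helper and its return symbols are invented for the example. A killed server refuses connections right away, while a stopped one still accepts them at the kernel level and never answers, so only a timeout reveals the problem:

```ruby
require 'socket'
require 'timeout'

# Toy line-based server in a child process, standing in for redis.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
pid = fork do
  loop do
    client = server.accept
    client.gets
    client.puts 'PONG'
    client.close
  end
end
server.close # the parent keeps no copy of the listening socket

# Hypothetical health check: :ok, :hung (timed out) or :refused.
def probe(port, timeout_s)
  socket = nil
  Timeout.timeout(timeout_s) do
    socket = TCPSocket.new('127.0.0.1', port)
    socket.puts 'PING'
    socket.gets ? :ok : :closed
  end
rescue Timeout::Error
  :hung
rescue Errno::ECONNREFUSED, Errno::ECONNRESET
  :refused
ensure
  socket&.close
end

healthy = probe(port, 0.5)

Process.kill('STOP', pid) # hang: the connect succeeds but no reply arrives
paused = probe(port, 0.5)

Process.kill('KILL', pid) # kill: the listening socket goes away
Process.wait(pid)
killed = probe(port, 0.5)
```

The killed case is detected as fast as the connect fails, whereas the hung case costs the full timeout, which is the delay a monitor has to budget for.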
