[BUG] The engine isn't usable when running with a large number of databases. #1597
Comments
This was mentioned in this thread: redis/redis#12738 (comment). @soloestoy Thoughts?
Can't we just keep a linked list of DBs with expiry? Or use the BIT as @zuiderkwast suggested before?
I think we should set a reasonable limit on the number of databases. Even with a bitmap, covering INT_MAX databases would still require 256MB of space, which is clearly not suitable. Moreover, with such a large bitmap, the cost of scanning it each time would also be an issue.
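The 256MB figure above can be checked with a little arithmetic. The following is only a sketch: the helper names `db_bitmap_bytes` and `db_bitmap_words` are hypothetical and not part of Valkey, and it assumes one bit per database and a 32-bit `int`.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed for a bitmap holding one bit per
 * database, rounded up to a whole byte. */
static size_t db_bitmap_bytes(size_t num_dbs) {
    return (num_dbs + 7) / 8;
}

/* Hypothetical helper: number of 64-bit words a full scan of that
 * bitmap would have to visit. */
static size_t db_bitmap_words(size_t num_dbs) {
    return (num_dbs + 63) / 64;
}
```

With `num_dbs` set to `INT_MAX` (2^31 - 1), `db_bitmap_bytes` gives 2^28 bytes, i.e. 256 MiB, and a full scan would touch 2^25 words, which illustrates both the space and the scan-cost concern.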
@ranshid
Private_Dirty is 0, so it seems that memset isn't actually invoked, but this can be verified. Regarding the bitmap, I'm not sure why we need it: is changing databases that common and latency sensitive? I thought maybe we could use a hashtable directly. If there's a reason to use a bitmap and we're worried about memory usage, we could use a Bloom filter; occasionally getting false positives shouldn't be an issue. But the question really is whether this bug is worth fixing.
Personally, I'm only concerned that if we enable databases for cluster mode, the used memory for an empty node jumps tenfold, from 1MB to 10MB. It can be seen as a regression, and it sticks out. If users use thousands of databases, it's their own problem IMAO.
If we think #1601 is worth implementing, this change could be further justified.
I kind of agree with this. As long as the default behavior stays the same, it should be OK that setting databases to some high value causes issues. The only concern I have is that the default database value is 16, which we artificially rewrite to be 1, but some users could be passing in 16 at startup. We shouldn't regress too much in that case.
I am not sure what you are referring to. I meant using a Binary Indexed Tree (not exactly a bitmap) to track the sizes of the DBs and their expiry counts, so that active expiry can be managed better. That said, I am not sure a simple linked list of DBs with expiry would not do the job just as well. If each DB uses 3 kvstores (expiry, keys, sharded pubsub), each allocating 256KB upon creation, then each DB will have about 770KB of zmalloc size, right? That does not sound like a disaster for a single DB, but it scales badly for thousands.
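To make the Binary Indexed Tree idea concrete, here is a minimal sketch, not Valkey code. All names (`dbFenwick`, `dbFenwickAdd`, `dbFenwickNextNonEmpty`, and so on) are hypothetical, and it assumes one non-negative counter per database, for example the number of keys with an expiry. The point is that the next database with pending work can be found in O(log n) instead of walking every empty database.

```c
#include <stdlib.h>

/* Hypothetical Fenwick (Binary Indexed) tree over per-database counters.
 * Counters are assumed to stay non-negative. */
typedef struct {
    size_t n;        /* number of databases */
    long long *tree; /* 1-based partial sums, n + 1 slots */
} dbFenwick;

static dbFenwick *dbFenwickCreate(size_t n) {
    dbFenwick *f = malloc(sizeof(*f));
    if (f == NULL) return NULL;
    f->tree = calloc(n + 1, sizeof(long long));
    if (f->tree == NULL) { free(f); return NULL; }
    f->n = n;
    return f;
}

static void dbFenwickFree(dbFenwick *f) {
    if (f != NULL) { free(f->tree); free(f); }
}

/* Add delta to the counter of database dbid (0-based). */
static void dbFenwickAdd(dbFenwick *f, size_t dbid, long long delta) {
    for (size_t i = dbid + 1; i <= f->n; i += i & (~i + 1))
        f->tree[i] += delta;
}

/* Sum of the counters of databases 0..dbid inclusive. */
static long long dbFenwickPrefixSum(const dbFenwick *f, size_t dbid) {
    long long sum = 0;
    for (size_t i = dbid + 1; i > 0; i -= i & (~i + 1))
        sum += f->tree[i];
    return sum;
}

/* Smallest 0-based dbid whose prefix sum reaches target (target >= 1),
 * or f->n if the total is smaller than target. */
static size_t dbFenwickFindByRank(const dbFenwick *f, long long target) {
    size_t pos = 0, step = 1;
    while ((step << 1) <= f->n) step <<= 1;
    for (; step > 0; step >>= 1) {
        size_t next = pos + step;
        if (next <= f->n && f->tree[next] < target) {
            pos = next;
            target -= f->tree[next];
        }
    }
    return pos;
}

/* First database >= from with a non-zero counter, or f->n if none. */
static size_t dbFenwickNextNonEmpty(const dbFenwick *f, size_t from) {
    long long before = (from == 0) ? 0 : dbFenwickPrefixSum(f, from - 1);
    return dbFenwickFindByRank(f, before + 1);
}
```

A periodic job could call `dbFenwickNextNonEmpty` in a loop and visit only databases that actually hold keys with an expiry. The cost is one `long long` per database, so it does not remove the need for a sane upper limit on the database count; a linked list of non-empty DBs, as suggested earlier in the thread, would be a simpler alternative.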
IMHO, if we support multiple databases in cluster mode, an empty node's memory jumping from 1MB to 10MB is not a big deal. Empty is just a temporary state; users don't create empty nodes just for fun. And remember that we have already changed the default
I agree that I would not invest in a complicated change just to solve this. I think that if we do have a quick and simple way to better support the default config, it might be valid to do it, but we can also do it as a follow-up.
I think we're all coming at this from the perspective of a managed service provider, where 9MB is not a big deal. I've seen self-hosted folks complain about this size of memory jump though, since it's often multi-tenant and sometimes running on limited hardware. If it's easy to do, it's better not to have to announce a big memory jump in a new version; it would be better for it to behave the same or better when a user upgrades.
Valkey supports up to INT_MAX databases. While most customers use the default (16 databases), if Valkey is started with a large number of databases, the engine's CPU usage increases significantly, rendering it almost unusable, even when the databases are empty and there is no incoming traffic.

For example, an engine which was started with
exhibits a latency of over 200 milliseconds:
Perf shows:
We can see that the CPU is busy (70% utilization):
Expected Behavior
The engine should remain responsive with low CPU usage when handling empty databases without any load.
Proposed Solutions