[Data Buckets] Distributed Databucket Caching #3500
# Notes

- Adds a data bucket cache so we're not needlessly hitting the database every time we need to read a data bucket value.

Putting all of my tests here:

- Bulk NPC Loading
- Bucket Miss Cache
- Benchmark (before we implemented caching vs. after: 35x at 100,000 iterations)
- General Lifecycle Testing

```perl
sub EVENT_SAY {
    if ($text =~ /set/i) {
        quest::set_data("test", 100);
    }
    if ($text =~ /get/i) {
        $client->Message(15, quest::get_data("test"));
    }
    if ($text =~ /delete/i) {
        quest::delete_data("test");
    }
}
```
Been pairing with Kingly on this PR. The distributed cache stays in sync with what is maintained in the database. The database is always the source of truth and database writes are always done immediately.

Test Cases:

- Creation (created in zone, and then propagated)
- Update (zone to zone)
- Bucket deletion (with propagation)
- Reading of an expired key (deletion propagated)
Tested using the same key across scoped and global and found an interesting condition that was fixed in 87c6a77.

Test script:

```perl
sub EVENT_SAY {
    # Character-scoped bucket
    if ($text =~ /set/i) {
        $client->SetBucket("Test", 107, "10s");
    } elsif ($text =~ /get/i) {
        my $data = $client->GetBucket("Test");
        quest::message(315, $data ne "" ? "player: " . $data : 0);
    } elsif ($text =~ /delete/i) {
        $client->DeleteBucket("Test");
    }

    # Global-scoped bucket using the same key
    if ($text =~ /set/i) {
        quest::set_data("Test", 3, "20s");
    } elsif ($text =~ /get/i) {
        my $data = quest::get_data("Test");
        quest::message(315, $data ne "" ? "global: " . $data : 0);
    } elsif ($text =~ /delete/i) {
        quest::delete_data("Test");
    }
}
```
Tested spamming deletes and creates back to back.
Been running for several days on Wayfarers.
Goal(s)
As a server operator, I want keys cached in memory for readily available access instead of hitting the database. This would allow me to use data buckets more aggressively in cases where I would like to reference values more frequently without incurring a database I/O penalty.
Problem(s)
Round Trip Database Hits
If you want to use a single data bucket entry intensively across a series of quests, and potentially poll it frequently, you end up with more frequent database hits, which can add up over time.
Frequent Quest Utilization
If using data buckets and iterating through an entity list, you can easily rack up 30-50 queries just to check the existence of a single key if it is not already cached. Those are round trips to the database, and on a game server we want to avoid them when we can.
Solution(s)
Distributed Memory Cache
If we turn the server back-end into a distributed memory cache, we can keep the flags that are most important and most relevant to an entity's progress in memory, available at minimal cost.

This would consist of having all zones be part of the distributed cache, with world acting as the intermediary between zones for updates.
Challenge(s)
Race Conditions
Keeping data unique across all zone processes can become a problem of race conditions.
When updates are created in one zone, they need to be propagated to other zones. Packets can storm cross-zone, especially when the same flag is being manipulated in multiple zones at the same time (albeit a rarer case).
De-Duplication
You need a way to keep updates unique, discarding and de-duping the ones that are not the latest.

The update message needs to include not only what is to be updated (key, value, expires, scope) for the other processes, but also a timestamp down to the nanosecond, so that other processes can use the same reference of time to decide whether to discard the message or update their internal reference to the cached entry.

When a zone process receives a request to update its cache, it compares the message's timestamp with the one it has locally. If the new message is older than the locally cached timestamp, it discards the message.
Updates

Update messages are generated from within the existing data-bucket `Set` methods and still generate a database write while preparing a message/packet to be sent to the other zone processes. The update includes a struct that carries the necessary update and communicates it to the other zone processes.

`update_time` will be a Unix timestamp at nanosecond resolution, which should be enough resolution to delineate duplicate update messages relative to the system clock.

When Updates are Sent

After a message is committed to the database and a database record `id` is set, the `id` can be used to pack a `DataBucketEntry` struct and propagate the message.

When Updates are Processed

When a message is received (zone <-> world <-> zone), the decision logic is as follows:

- Search the local `std::vector<DataBucketCacheEntry>` to find an entry that matches the `id` sent in the update
- If there is no `id` that matches, simply discard the message
- If there is a matching `id`, determine if the new message's `update_time` > local `update_time`; if it is, update the local record using `vector.at(index) = (new) DataBucketCacheEntry`
Deletions
Deletions would need to be handled in the same way as updates, by deleting entries located locally within a zone.
Loading Scoped Entries
Scoped entries would be loaded when that entity enters a zone and loaded within that zone. This would enable any bucket entries or flags that are checked on zone-in to simply reference what is in memory instead of making repeated hits to the database.
Other zone processes don't need to be aware of that entity's flags, because it is unlikely they will need to be referenced cross-zone; if one does need to be referenced, a database call may be necessary, but that would be an infrequent event.
Unloading Scoped Entries

Scoped entries would be unloaded when an entity is destructed, i.e. when a client leaves a zone.
Testing
Putting all of my tests here
Bulk NPC Loading
Bucket Miss Cache
Benchmark

Before we implemented caching vs. after:

- 35x at 100,000 iterations
- 340x at 1,000,000 iterations
General Lifecycle Testing
Distributed Cache Testing
The distributed cache stays in sync with what is maintained in the database. The database is always the source of truth and database writes are always done immediately.
Test Cases

- Ran `get` test commands simultaneously, both showing expected results when a key is deleted, updated, or set
- Ran `get` test commands simultaneously while using `set` on alternating characters with no expiration
- Ran `get` test commands while using `set` on alternating characters with a 3 second expiration. We ensure that the expiration is renewed in destination zones. Also alternated this between the zones; the deletion/creation/update life cycle works as expected in all respects, both showing expected results when a key is deleted, updated, or set
- Creation (created in zone, and then propagated)
- Update (zone to zone)
- Bucket deletion (with propagation)
- Reading of an expired key (deletion propagated)