Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Databuckets] Improved Reliability and Performance of Databuckets #4562

Merged
merged 15 commits into from
Dec 12, 2024

Conversation

Akkadius
Copy link
Member

@Akkadius Akkadius commented Nov 29, 2024

Description

This PR aims to solve the following -

  • Performance - Address observed high-scale performance issues of the distributed cache model under higher load scenarios
  • Reliability - Address rarer inconsistencies of out of sync distributed cache. Data guaranteed reliability is a high priority
  • Code Simplicity - By addressing the two we have big opportunities to simplify some of the code and make it easier to reason about

With this PR, distributed caching has been removed. The distributed caching was beneficial before the scoped caching model was introduced, but also the complexity was not worth the observed performance issues and data reliability issues observed at scale since the initial implementation.

Changes

  • Distributed caching has been removed entirely, massively reducing World network strain and (N x zoneserver) network load
  • Client databucket cache is now cleared from the zone when a client zones
  • Client databucket cache is now cleared from the zone when a client loads in a zone
  • Bot databucket cache is now cleared from the zone when a client zones
  • Bot databucket cache is cleared from the zone when it is newly spawned
  • Simplify the cache code to no longer include an updated_timestamp and update_type wrapper type and now we simply use the native DataBucketsRepository::DataBuckets type
  • We no longer bulk load databuckets for NPCs on zone spawn because they no longer get cached local or distributed
  • We only cache misses as defined by entities scoped in DataBucket::CanCache

Stays the same

  • Bulk load cache for Client and Bot (defined by DataBucket::CanCache)
  • Cache local to zone for Client and Bot (defined by DataBucket::CanCache)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Performance Improvement

Testing

This has been tested on THJ for a few weeks now and has dramatically reduced strain on World sending updates to 700x+ zoneservers. It has posed no issues.

select * from data_buckets where character_id = 1;
+----+----------+-----------+---------+--------------+--------+--------+
| id | key      | value     | expires | character_id | npc_id | bot_id |
+----+----------+-----------+---------+--------------+--------+--------+
|  2 | test_key | something |       0 |            1 |      0 |      0 |
+----+----------+-----------+---------+--------------+--------+--------+
sub EVENT_SAY {
    if ($text=~/hail/i) {
        $client->Message(15, $client->GetBucket("test_key"));
    }
}

image

Below you can see the first cache miss, returning from database and then subsequent cache return. (No queries)

  Zone | DataBucket | GetData Getting bucket key [test_key] bot_id [0] character_id [1] npc_id [0] -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Getting bucket key [test_key] bot_id [0] character_id [1] npc_id [0] -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Returning key [test_key] value [something] from cache -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Getting bucket key [test_key] bot_id [0] character_id [1] npc_id [0] -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Returning key [test_key] value [something] from cache -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Getting bucket key [test_key] bot_id [0] character_id [1] npc_id [0] -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Returning key [test_key] value [something] from cache -- [overthere] (The Overthere) inst_id [0]
  Zone | DataBucket | GetData Getting bucket key [test_key] bot_id [0] character_id [1] npc_id [0] -- [overthere] (The Overthere) inst_id [0]

When a client camps / zones, the cached keys are purged correctly.

   Zone | DataBucket | DeleteCachedBuckets LoadType [Client] id [1] cache size before [2] after [1] -- [overthere] (The Overthere) inst_id [0]

Below is a test of a global bucket (doesn't cache so performs queries)

  Zone | DataBucket | GetData Getting bucket key [test_key_global] bot_id [0] character_id [0] npc_id [0] -- [overthere] (The Overthere) inst_id [0]
  Zone |   Query    | QueryDatabase SELECT id, `key`, value, expires, character_id, npc_id, bot_id FROM data_buckets WHERE character_id = 0 AND npc_id = 0 AND bot_id = 0 AND `key` = 'test_key_global' LIMIT 1 -- (1 row returned) (0.000061s)-- [overthere] (The Overthere) inst_id [0]

Checklist

  • I have tested my changes
  • I have performed a self-review of my code. Ensuring variables, functions and methods are named in a human-readable way, comments are added only where naming of variables, functions and methods can't give enough context.
  • I own the changes of my code and take responsibility for the potential issues that occur

@Akkadius Akkadius changed the title [Databuckets] Performance: Don't Propagate Client Updates [Databuckets] Optimize Client-Scoped Databucket Updates to Reduce Network Load Nov 29, 2024
@Akkadius Akkadius changed the title [Databuckets] Optimize Client-Scoped Databucket Updates to Reduce Network Load [Databuckets] Improved Reliability and Performance of Databuckets Dec 6, 2024
@EQEmu EQEmu deleted a comment from catapultam-habeo Dec 8, 2024
@Akkadius Akkadius force-pushed the akkadius/databuckets-perf branch from 8c81441 to 5bb8789 Compare December 8, 2024 04:32
@catapultam-habeo
Copy link
Contributor

This works as advertised. Noticeable improvement in performance, and several recurring but relatively inconsistent issues around unscoped bucket reliability have disappeared since implementing this, with no clear downsides.

@Akkadius Akkadius merged commit 66a7dd0 into master Dec 12, 2024
2 checks passed
@Akkadius Akkadius deleted the akkadius/databuckets-perf branch December 12, 2024 07:17
MortimerGreenwald pushed a commit to MortimerGreenwald/Server that referenced this pull request Dec 13, 2024
…u#4570)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.21.0 to 0.31.0.
- [Commits](golang/crypto@v0.21.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

[Rules] Add rules for requiring custom files from client (EQEmu#4561)

* rules for enabling requiring custom files

* shorten default

* variable name

* check account status for enforcing client key

* rule for custom files admin level

---------

[Commands] Add #find ldon_theme Subcommand (EQEmu#4564)

[Cleanup] Remove Unused Group Methods (EQEmu#4559)

[Feature] Enable bazaar window 'Find Trader' functionality (EQEmu#4560)

* First pass to enable trader 'Find Trader' functionality

* Move SendBulkTraders out of zoning routines and send as part of the opening of the bazaar search window.
Add zone instance to SendBulkTraders to support multi-instanced bazaars.

[Databuckets] Improved Reliability and Performance of Databuckets (EQEmu#4562)

* [Databuckets] Don't broadcast client-scoped updates

* Remove temp feature flag

* Remove distributed caching, only cache for character scoped data, simplify

* Update bot.cpp

* Cleanup

* Update data_bucket.cpp

* Cleanup

* Cleanup

* Remove BulkLoadEntities from LoadNPCTypes

* Update data_bucket.cpp

* Cleanup

* More cleanup

* More cleanup

* BulkLoadEntities to BulkLoadEntitiesToCache

* Add CanCache in DeleteData to gate an unnecessary call

[Cleanup] Convert Event Parses to Single Line (EQEmu#4569)

* [Cleanup] Convert Event Parses to Single Line

* Push

* Update spells.cpp

* Update spells.cpp

---------

Co-Authored-By: Paul Johnson <[email protected]>
Co-Authored-By: Akkadius <[email protected]>
Co-Authored-By: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
MortimerGreenwald pushed a commit to MortimerGreenwald/Server that referenced this pull request Dec 13, 2024
…u#4570)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.21.0 to 0.31.0.
- [Commits](golang/crypto@v0.21.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

[Rules] Add rules for requiring custom files from client (EQEmu#4561)

* rules for enabling requiring custom files

* shorten default

* variable name

* check account status for enforcing client key

* rule for custom files admin level

---------

[Commands] Add #find ldon_theme Subcommand (EQEmu#4564)

[Cleanup] Remove Unused Group Methods (EQEmu#4559)

[Feature] Enable bazaar window 'Find Trader' functionality (EQEmu#4560)

* First pass to enable trader 'Find Trader' functionality

* Move SendBulkTraders out of zoning routines and send as part of the opening of the bazaar search window.
Add zone instance to SendBulkTraders to support multi-instanced bazaars.

[Databuckets] Improved Reliability and Performance of Databuckets (EQEmu#4562)

* [Databuckets] Don't broadcast client-scoped updates

* Remove temp feature flag

* Remove distributed caching, only cache for character scoped data, simplify

* Update bot.cpp

* Cleanup

* Update data_bucket.cpp

* Cleanup

* Cleanup

* Remove BulkLoadEntities from LoadNPCTypes

* Update data_bucket.cpp

* Cleanup

* More cleanup

* More cleanup

* BulkLoadEntities to BulkLoadEntitiesToCache

* Add CanCache in DeleteData to gate an unnecessary call

[Cleanup] Convert Event Parses to Single Line (EQEmu#4569)

* [Cleanup] Convert Event Parses to Single Line

* Push

* Update spells.cpp

* Update spells.cpp

---------

Co-Authored-By: Paul Johnson <[email protected]>
Co-Authored-By: Akkadius <[email protected]>
Co-Authored-By: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Revert "Bump golang.org/x/crypto in /utils/scripts/build/should-release (EQEmu#4570)"

This reverts commit 612b03a.
@Akkadius Akkadius mentioned this pull request Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants