
fix: size calculation for zstd frame cache#5859

Merged
hanabi1224 merged 8 commits into main from hm/fix-zstd-frame-cache-size-calculation
Jul 28, 2025

Conversation

@hanabi1224
Contributor

@hanabi1224 hanabi1224 commented Jul 23, 2025

Summary of changes

Changes introduced in this pull request:

  • fix size calculation for zstd frame cache
  • make max size configurable via env var
  • switch to SizeTrackingLruCache
  • update default max size to 256MiB

The cache size is ~120 MiB after running Forest on mainnet for a while. How about changing the default max size to 512 MiB or even 256 MiB, @LesnyRumcajs?

# HELP zstd_frame_cache_0_size_bytes Size of LruCache zstd_frame_cache_0 in bytes
# TYPE zstd_frame_cache_0_size_bytes gauge
# UNIT zstd_frame_cache_0_size_bytes bytes
zstd_frame_cache_0_size_bytes 125976889
# HELP zstd_frame_cache_0_len Length of LruCache zstd_frame_cache_0
# TYPE zstd_frame_cache_0_len gauge
zstd_frame_cache_0_len 5699
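The undercount this PR fixes came from measuring only the cached value bytes; the review discussion of `size_of_entry` shows the old accounting summed `Vec::len` alone, while the fix also charges for the keys. A simplified, std-only sketch of the two accountings (the 38-byte array stands in for a CID; names are hypothetical):

```rust
use std::collections::HashMap;

/// Old-style estimate: only the value bytes (undercounts the keys).
fn value_bytes_only(entry: &HashMap<[u8; 38], Vec<u8>>) -> usize {
    entry.values().map(Vec::len).sum()
}

/// Fixed-style estimate: stack size of each key plus the value bytes.
fn keys_and_values(entry: &HashMap<[u8; 38], Vec<u8>>) -> usize {
    entry
        .iter()
        .map(|(k, v)| std::mem::size_of_val(k) + v.len())
        .sum()
}

fn main() {
    let mut entry = HashMap::new();
    entry.insert([0u8; 38], vec![1, 2, 3, 4]);
    entry.insert([1u8; 38], vec![5, 6]);
    // The key bytes alone add 2 * 38 = 76 bytes the old estimate missed.
    assert_eq!(value_bytes_only(&entry), 6);
    assert_eq!(keys_and_values(&entry), 6 + 76);
}
```

With thousands of small entries (5699 in the metrics above), the per-key bytes are a meaningful share of the real footprint, which is why the reported size jumped after the fix.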

Reference issue to close (if applicable)

Closes #5858

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Added documentation for a new environment variable to configure the default maximum size of the zstd frame cache.
  • Documentation

    • Updated glossary with the term "zstd".
    • Documented the FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE environment variable, including its purpose, default, and example values.
  • Refactor

    • Improved cache concurrency and performance by removing unnecessary locking (mutexes) from the zstd frame cache.
    • Enhanced cache size tracking with atomic operations and more accurate entry size accounting.
    • Introduced environment-based configuration for cache size limits.
    • Expanded cache utility with new methods for unbounded cache creation and LRU entry removal.
  • Chores

    • Added a lint rule to disallow unbounded LRU cache creation to prevent potential memory leaks.

@coderabbitai
Contributor

coderabbitai Bot commented Jul 23, 2025

Walkthrough

The changes refactor the zstd frame cache to use a size-limited, atomically tracked LRU cache, configurable via an environment variable. Mutex-based synchronization is removed in favor of atomic operations, and the cache API is updated accordingly. Documentation is updated to reflect the new environment variable and terminology.
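The mutex removal means callers now share a plain `Arc<ZstdFrameCache>` and call `get`/`put` through `&self`, with synchronization pushed inside the type. A std-only sketch of that pattern (types drastically simplified; this is not the actual Forest API):

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the refactored cache: interior locking,
/// so shared access needs only `Arc<Cache>`, not `Arc<Mutex<Cache>>`.
struct Cache {
    current_size: AtomicUsize,
    map: RwLock<HashMap<u64, Vec<u8>>>,
}

impl Cache {
    fn new() -> Self {
        Cache {
            current_size: AtomicUsize::new(0),
            map: RwLock::new(HashMap::new()),
        }
    }

    /// `&self` instead of `&mut self`: callers need no external mutex.
    fn put(&self, key: u64, value: Vec<u8>) {
        self.current_size.fetch_add(value.len(), Ordering::Relaxed);
        self.map.write().unwrap().insert(key, value);
    }

    fn get(&self, key: u64) -> Option<Vec<u8>> {
        self.map.read().unwrap().get(&key).cloned()
    }
}

fn main() {
    let cache = Arc::new(Cache::new());
    let handles: Vec<_> = (0..4u64)
        .map(|i| {
            let c = Arc::clone(&cache);
            std::thread::spawn(move || c.put(i, vec![0u8; 10]))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(cache.current_size.load(Ordering::Relaxed), 40);
    assert!(cache.get(2).is_some());
}
```

The atomic counter lets size queries (and the Prometheus gauge shown in the PR description) read the current size without taking any lock.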

Changes

  • docs/dictionary.txt: Added "zstd" entry to the dictionary.
  • docs/docs/users/reference/env_variables.md: Documented new environment variable FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE.
  • src/db/car/any.rs, src/db/car/forest.rs, src/db/car/many.rs: Replaced Arc<Mutex<ZstdFrameCache>> with Arc<ZstdFrameCache> in struct fields and method signatures; removed mutex usage.
  • src/db/car/mod.rs: Refactored ZstdFrameCache to use atomic size tracking, a custom size-aware LRU cache, and environment-configurable max size. Added tests.
  • src/utils/cache/lru.rs: Added unbounded cache constructors and pop_lru to SizeTrackingLruCache; refactored cache creation logic.
  • .clippy.toml: Added ban on the lru::LruCache::unbounded method to avoid unbounded cache usage.
  • CHANGELOG.md: Added changelog entries documenting zstd frame cache size metrics and environment variable configuration.

Sequence Diagram(s)

sequenceDiagram
    participant Env as Environment
    participant App as Application Startup
    participant Cache as ZstdFrameCache
    participant LRU as SizeTrackingLruCache

    Env->>App: FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE
    App->>Cache: Initialize with max size from env (default 256 MiB)
    App->>LRU: Create size-tracking LRU cache
    Cache->>LRU: get(offset, key, cid)
    Cache->>LRU: put(offset, key, index)
    LRU-->>Cache: Evict if over max size, update atomic size

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~45 minutes

Assessment against linked issues

  • The cache is limited and optionally configurable (from #5858)
  • Lru::unbounded is banned in clippy (from #5858)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Possibly related PRs

Suggested reviewers

  • elmattic
  • sudo-shashank


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 297605a and 56bdcf4.

📒 Files selected for processing (3)
  • CHANGELOG.md (2 hunks)
  • src/db/car/any.rs (1 hunks)
  • src/db/car/forest.rs (5 hunks)
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/db/car/any.rs
  • src/db/car/forest.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: Deploy to Cloudflare Pages


@hanabi1224 hanabi1224 force-pushed the hm/fix-zstd-frame-cache-size-calculation branch from d0ddf50 to 2cdded9 on July 23, 2025 09:52
@hanabi1224 hanabi1224 marked this pull request as ready for review July 23, 2025 09:52
@hanabi1224 hanabi1224 requested a review from a team as a code owner July 23, 2025 09:52
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and elmattic and removed request for a team July 23, 2025 09:52
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/db/car/forest.rs (1)

232-234: Consider adding type annotation for clarity.

The .into() conversions suggest the HashMap now uses a wrapped key type (likely CidWrapper for size tracking). Consider adding an explicit type annotation to the block_map declaration for better code clarity.

-let mut block_map = HashMap::new();
+let mut block_map: HashMap<CidWrapper, Vec<u8>> = HashMap::new();
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ee2b92 and 2cdded9.

📒 Files selected for processing (7)
  • docs/dictionary.txt (1 hunks)
  • docs/docs/users/reference/env_variables.md (1 hunks)
  • src/db/car/any.rs (1 hunks)
  • src/db/car/forest.rs (6 hunks)
  • src/db/car/many.rs (2 hunks)
  • src/db/car/mod.rs (2 hunks)
  • src/utils/cache/lru.rs (3 hunks)
🧠 Learnings (1)
src/db/car/mod.rs (1)

Learnt from: hanabi1224
PR: #5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like impl GetSize for MyType {} is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
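The defaults described in this learning can be mimicked in a few lines. This is a std-only mock for illustration, not the real get_size2 crate, and `CidWrapper` here is only a hypothetical heap-free wrapper:

```rust
/// Std-only mock of the defaults described for get_size2's `GetSize`.
trait GetSize: Sized {
    fn get_stack_size() -> usize {
        std::mem::size_of::<Self>()
    }
    fn get_heap_size(&self) -> usize {
        0 // default: assume no heap allocation
    }
    fn get_size(&self) -> usize {
        Self::get_stack_size() + self.get_heap_size()
    }
}

/// A simple wrapper with no heap data: an empty impl is enough.
struct CidWrapper([u8; 38]);
impl GetSize for CidWrapper {}

fn main() {
    let cid = CidWrapper([0u8; 38]);
    // Stack size of the 38-byte array, zero heap bytes.
    assert_eq!(cid.get_size(), 38);
}
```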

🧬 Code Graph Analysis (1)
src/db/car/mod.rs (2)
src/db/car/forest.rs (3)
  • std (107-107)
  • new (104-122)
  • get (207-245)
src/utils/cache/lru.rs (2)
  • cache (117-119)
  • unbounded_with_default_metrics_registry (113-115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
🔇 Additional comments (15)
docs/dictionary.txt (1)

120-120: LGTM!

The addition of "zstd" to the dictionary is appropriate given the PR's focus on zstd frame cache functionality.

docs/docs/users/reference/env_variables.md (1)

52-52: Verify the default cache size choice.

The documentation shows a default of 1073741824 bytes (1 GiB), but the PR objectives suggest considering 256 MiB or 512 MiB as the default. Please confirm if 1 GiB is the intended default value.

src/db/car/any.rs (1)

92-98: LGTM!

The removal of the Mutex wrapper aligns with the refactoring to make ZstdFrameCache internally thread-safe.

src/db/car/many.rs (1)

26-26: LGTM!

The removal of Mutex wrapper from shared_cache is consistent with the refactoring across the codebase to use an internally thread-safe ZstdFrameCache.

Also applies to: 67-67, 75-75

src/db/car/forest.rs (1)

65-65: LGTM!

The removal of Mutex wrapper and direct method calls on ZstdFrameCache are consistent with the refactoring to use an internally thread-safe cache implementation.

Also applies to: 98-98, 118-118, 186-191, 215-215, 235-235

src/utils/cache/lru.rs (4)

62-74: LGTM! Clean refactoring with idiomatic Rust patterns.

The new_inner helper method effectively consolidates the cache creation logic, using Option<NonZeroUsize> to elegantly handle both bounded and unbounded cases.


76-81: Good refactoring to use the new helper method.

The method maintains its original signature while delegating to new_inner, preserving backward compatibility.


100-115: Well-structured API additions for unbounded cache support.

The three new unbounded constructors follow the established naming pattern and delegation hierarchy, providing a consistent API for both bounded and unbounded cache creation.


141-143: Essential addition for LRU eviction support.

The pop_lru method correctly exposes the underlying cache's eviction functionality with proper thread-safety through the write lock.

src/db/car/mod.rs (6)

10-10: Appropriate imports for the new functionality.

The added imports support atomic size tracking, lazy initialization, and size calculations needed for the refactored cache implementation.

Also applies to: 17-25


55-56: Correct atomic types for thread-safe size tracking.

The use of AtomicUsize for current_size and SizeTrackingLruCache enables lock-free size tracking, while CidWrapper provides the necessary GetSize implementation.


69-72: Proper initialization with metrics support.

Creating an unbounded cache is appropriate since size limits are enforced manually through the eviction logic. The metrics registration provides valuable observability.


78-84: Thread-safe implementation with proper type conversion.

The method correctly takes &self (enabling shared access) and acquires a write lock since LRU caches update access order on reads. The Cid to CidWrapper conversion is handled properly.


87-115: Well-implemented size-aware insertion with atomic operations.

The method correctly:

  • Skips oversized entries to prevent thrashing
  • Uses atomic operations for thread-safe size tracking
  • Implements proper eviction to maintain the size limit
  • Uses saturating arithmetic to prevent overflow

Note: There's a benign race condition where the cache might temporarily exceed max_size between threads, but this is acceptable for a cache implementation.
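The four bullets above can be sketched with std types only. The `VecDeque` is a single-threaded FIFO stand-in for the real LRU, and all names are hypothetical, not Forest's actual implementation:

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};

struct SizeBoundedCache {
    max_size: usize,
    current_size: AtomicUsize,
    // Front = least recently inserted; stands in for the LRU order.
    entries: VecDeque<(u64, Vec<u8>)>,
}

impl SizeBoundedCache {
    fn put(&mut self, key: u64, value: Vec<u8>) {
        let entry_size = std::mem::size_of::<u64>().saturating_add(value.len());
        // 1. Skip oversized entries so one item cannot flush the whole cache.
        if entry_size > self.max_size {
            return;
        }
        // 2. Atomic size tracking.
        self.current_size.fetch_add(entry_size, Ordering::Relaxed);
        self.entries.push_back((key, value));
        // 3. Evict oldest entries until back under the limit,
        // 4. using saturating arithmetic for the accounting.
        while self.current_size.load(Ordering::Relaxed) > self.max_size {
            match self.entries.pop_front() {
                Some((k, v)) => {
                    let freed = std::mem::size_of_val(&k).saturating_add(v.len());
                    self.current_size.fetch_sub(freed, Ordering::Relaxed);
                }
                None => break,
            }
        }
    }
}

fn main() {
    let mut cache = SizeBoundedCache {
        max_size: 100,
        current_size: AtomicUsize::new(0),
        entries: VecDeque::new(),
    };
    for i in 0..10 {
        cache.put(i, vec![0u8; 20]); // 8 + 20 = 28 bytes per entry
    }
    assert!(cache.current_size.load(Ordering::Relaxed) <= 100);
    cache.put(99, vec![0u8; 200]); // oversized: silently skipped
    assert!(cache.entries.iter().all(|(k, _)| *k != 99));
}
```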


36-49: Environment variable properly documented and implementation approved

The environment variable FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE is listed in docs/docs/users/reference/env_variables.md with a clear description and default value. The code uses LazyLock, parses into NonZeroUsize, logs successes and failures, and falls back to 1 GiB—no further changes needed.

  • Documentation location:
    • docs/docs/users/reference/env_variables.md: entry for FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE
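The parsing path described above (env var into NonZeroUsize, with a logged fallback) can be sketched with std only. The helper name and fallback constant here are illustrative, not Forest's actual code, and the real implementation wraps the lookup in a `LazyLock`:

```rust
use std::num::NonZeroUsize;

const FALLBACK_MAX_SIZE: usize = 256 * 1024 * 1024; // illustrative default

/// Parse an optional env-var value into a NonZeroUsize, falling back on
/// absence or parse failure (the real code also logs both outcomes).
fn parse_max_size(raw: Option<&str>) -> NonZeroUsize {
    raw.and_then(|s| s.trim().parse::<NonZeroUsize>().ok())
        .unwrap_or_else(|| NonZeroUsize::new(FALLBACK_MAX_SIZE).expect("non-zero"))
}

fn main() {
    assert_eq!(parse_max_size(Some("1048576")).get(), 1 << 20);
    // "0" fails NonZeroUsize parsing, so it falls back.
    assert_eq!(parse_max_size(Some("0")).get(), FALLBACK_MAX_SIZE);
    assert_eq!(parse_max_size(None).get(), FALLBACK_MAX_SIZE);
}
```

Using `NonZeroUsize` in the signature makes the "zero disables the cache" misconfiguration unrepresentable rather than a runtime special case.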

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/db/car/mod.rs (2)

78-84: Consider using read lock for get operations.

The method correctly changes to &self for concurrent access, but acquiring a write lock for read operations will prevent concurrent reads and impact performance.

Consider using a read lock if the SizeTrackingLruCache supports it:

-        self.lru
-            .cache()
-            .write()
-            .get(&(offset, key))
+        self.lru
+            .cache()
+            .read()
+            .get(&(offset, key))

However, if LRU updates require write access for "recently used" tracking, the current implementation may be necessary.


87-115: Well-implemented put method with minor concurrency considerations.

The size-based insertion and eviction logic is well-structured and handles edge cases correctly.

Consider these improvements for robustness:

  1. Memory ordering: For critical size updates, consider using Ordering::AcqRel instead of Relaxed to ensure consistency:
-            self.current_size.fetch_add(entry_size, Ordering::Relaxed);
+            self.current_size.fetch_add(entry_size, Ordering::AcqRel);
  1. Eviction loop bounds: Add a safety limit to prevent excessive eviction iterations:
+        let mut eviction_count = 0;
+        const MAX_EVICTIONS: usize = 1000;
         while self.current_size.load(Ordering::Relaxed) > self.max_size {
+            if eviction_count >= MAX_EVICTIONS {
+                tracing::warn!("Eviction loop exceeded maximum iterations");
+                break;
+            }
             if let Some((prev_key, prev_entry)) = self.lru.pop_lru() {
                 self.current_size.fetch_sub(
                     prev_key.get_size().saturating_add(prev_entry.get_size()),
                     Ordering::Relaxed,
                 );
+                eviction_count += 1;
             } else {
                 break;
             }
         }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cdded9 and 6235c17.

📒 Files selected for processing (2)
  • docs/docs/users/reference/env_variables.md (1 hunks)
  • src/db/car/mod.rs (2 hunks)
🧠 Learnings (1)
src/db/car/mod.rs (1)

Learnt from: hanabi1224
PR: #5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like impl GetSize for MyType {} is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/docs/users/reference/env_variables.md

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: Check
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Deploy to Cloudflare Pages
🔇 Additional comments (5)
src/db/car/mod.rs (5)

10-10: LGTM: Import changes support the new cache implementation.

The new imports are appropriate for the atomic-based cache implementation with size tracking.

Also applies to: 17-25


36-49: LGTM: Well-implemented configurable cache size.

The lazy initialization with environment variable support is well-designed:

  • Proper error handling with logging
  • Uses NonZeroUsize to prevent invalid values
  • Reasonable 256 MiB default as mentioned in PR objectives

55-56: LGTM: Struct fields updated for atomic operations.

The change to AtomicUsize and SizeTrackingLruCache properly supports the new thread-safe, size-aware cache implementation.


61-61: LGTM: Default implementation uses the new configurable size.

Correctly references the new static configuration value.


69-73: LGTM: Constructor properly initializes atomic fields.

The initialization of AtomicUsize and SizeTrackingLruCache is correct for the new implementation.

Comment thread on src/db/car/mod.rs (outdated)
Comment thread on src/db/car/mod.rs, with code context:

    pub fn put(&mut self, offset: FrameOffset, key: CacheKey, index: HashMap<Cid, Vec<u8>>) {
    fn size_of_entry(entry: &HashMap<Cid, Vec<u8>>) -> usize {
        entry.values().map(Vec::len).sum::<usize>()
    pub fn put(&self, offset: FrameOffset, key: CacheKey, index: HashMap<CidWrapper, Vec<u8>>) {
Member


I'd love it if we could have some coverage here, both on the logic level and size calculation correctness.

Contributor Author


Fixed.

Comment thread on src/utils/cache/lru.rs, with code context:

    cache: Arc::new(RwLock::new(
        capacity
            .map(LruCache::new)
            .unwrap_or_else(LruCache::unbounded),
Member


Can we ban the LruCache::unbounded and mark this as exception with proper comment?

Contributor Author


Fixed.
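The ban-and-exception pattern asked for above typically pairs a `disallowed-methods` entry in `.clippy.toml` with a scoped `#[allow]` at the single sanctioned call site. A minimal sketch; the `Vec::new` call merely stands in for the banned `LruCache::unbounded`, and the comment wording is hypothetical:

```rust
// In .clippy.toml (config, not Rust):
//   disallowed-methods = [
//     { path = "lru::LruCache::unbounded", reason = "size-unbounded caches leak memory" },
//   ]

/// At the one sanctioned call site, the lint is allowed with a justification.
fn make_inner_cache() -> Vec<u8> {
    // Exception: bounded externally by SizeTrackingLruCache's eviction logic.
    #[allow(clippy::disallowed_methods)]
    let cache = Vec::new(); // stands in for LruCache::unbounded()
    cache
}

fn main() {
    assert!(make_inner_cache().is_empty());
}
```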

hanabi1224 and others added 2 commits July 23, 2025 22:12
Co-authored-by: Hubert <hubert@chainsafe.io>
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
src/db/car/mod.rs (2)

36-51: Environment variable configuration looks good, but consider the past review suggestion.

The implementation properly handles environment variable parsing with appropriate error handling and logging. However, there's a past review comment suggesting a specific format for the error message.

The error message format should match the suggestion from the previous review to maintain consistency.


89-117: Complex but well-implemented size-aware cache insertion with proper eviction logic.

The implementation correctly handles:

  • Size calculation for both keys and values
  • Atomic size tracking for thread safety
  • Differentiated logic for replacements vs new insertions
  • Proper eviction when cache exceeds max size

However, this is complex logic that benefits from thorough testing as noted in the past review comments.

The size calculation correctness and logic-level coverage mentioned in the previous review would be valuable for this implementation.

🧹 Nitpick comments (1)
src/db/car/mod.rs (1)

80-85: Method signature change enables concurrent access, but consider read performance.

The change to &self and internal locking enables concurrent cache usage. However, using a write lock for read operations (even for LRU updates) may impact read performance under high concurrency.

Consider whether the LRU update on reads is worth the write lock overhead, or if read-only cache lookups might be acceptable in some use cases.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fec0565 and 297605a.

📒 Files selected for processing (3)
  • .clippy.toml (1 hunks)
  • src/db/car/mod.rs (2 hunks)
  • src/utils/cache/lru.rs (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .clippy.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/utils/cache/lru.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: Check
  • GitHub Check: Deploy to Cloudflare Pages
🔇 Additional comments (5)
src/db/car/mod.rs (5)

10-10: LGTM! Import changes support the new atomic cache implementation.

The new imports are appropriate for the refactored cache implementation using atomic operations and size tracking.

Also applies to: 17-25


57-58: Struct refactoring supports thread-safe cache operations.

The change from regular size tracking to AtomicUsize and the switch to SizeTrackingLruCache properly supports the new concurrent access pattern without requiring mutex synchronization.


63-63: Default implementation correctly uses the configurable max size.

The change to use the environment-configurable default max size is implemented correctly.


71-74: Constructor properly initializes the new cache implementation.

The initialization of AtomicUsize and SizeTrackingLruCache with metrics is implemented correctly and provides good observability.


120-160: Excellent test coverage addressing previous review concerns.

The test implementation effectively addresses the past review comment about needing coverage for size calculation correctness and logic-level testing. The randomized testing approach with both insertion and replacement scenarios provides good coverage of the complex cache logic.

The test correctly verifies that the atomic current_size stays consistent with the actual LRU cache size, which is the critical invariant for this implementation.
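That invariant, the tracked counter equaling the sum of the live entries' sizes after every insertion or replacement, can be exercised with a tiny deterministic pseudo-random walk. This is a std-only sketch of the idea, not the actual test in the PR:

```rust
use std::collections::HashMap;

/// Deterministic insert/replace walk; returns true if the manually
/// tracked size matched the live entries after every step.
fn invariant_holds(steps: usize) -> bool {
    let mut map: HashMap<u64, Vec<u8>> = HashMap::new();
    let mut tracked: usize = 0; // stands in for the AtomicUsize counter
    let mut seed: u64 = 42;
    for _ in 0..steps {
        // Simple LCG keeps the walk reproducible without external crates.
        seed = seed.wrapping_mul(6364136223846793005).wrapping_add(1);
        let key = seed % 16;
        let len = (seed >> 32) as usize % 64;
        // Replacement must subtract the old entry's size before adding the new.
        if let Some(old) = map.insert(key, vec![0u8; len]) {
            tracked -= old.len();
        }
        tracked += len;
        let actual: usize = map.values().map(Vec::len).sum();
        if tracked != actual {
            return false;
        }
    }
    true
}

fn main() {
    assert!(invariant_holds(1000));
}
```

Forgetting the subtraction on replacement is exactly the kind of accounting drift such a randomized check catches, since the small key space forces frequent replacements.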

LesnyRumcajs previously approved these changes Jul 24, 2025
@sudo-shashank
Contributor

@elmattic this needs your review

Contributor

@elmattic elmattic left a comment


LGTM. If we introduce a new environment variable, it should be added to the CHANGELOG though.

@hanabi1224 hanabi1224 enabled auto-merge July 28, 2025 11:51
@hanabi1224 hanabi1224 requested a review from LesnyRumcajs July 28, 2025 11:51
@hanabi1224 hanabi1224 added this pull request to the merge queue Jul 28, 2025
Merged via the queue into main with commit 5e9a527 Jul 28, 2025
47 checks passed
@hanabi1224 hanabi1224 deleted the hm/fix-zstd-frame-cache-size-calculation branch July 28, 2025 16:52


Development

Successfully merging this pull request may close these issues.

Limit unbounded LRU cache

4 participants