Skip to content

Reimplement index scan iter to not use hold_range_in_memory#6917

Closed
HaoranYi wants to merge 1 commit intoanza-xyz:masterfrom
HaoranYi:clean_hold_range
Closed

Reimplement index scan iter to not use hold_range_in_memory#6917
HaoranYi wants to merge 1 commit intoanza-xyz:masterfrom
HaoranYi:clean_hold_range

Conversation

@HaoranYi
Copy link
Copy Markdown

@HaoranYi HaoranYi commented Jul 10, 2025

Problem

hold_range_in_memory was originally introduced to support rent scanning by keeping certain key ranges in memory. However, since we no longer perform range-based rent scanning, this mechanism is now obsolete.

This PR refactors the index scan iterator to stop relying on hold_range_in_memory. With this change, there are no remaining dependencies on hold_range_in_memory in the index scan path.

This lays the groundwork for a future PR (#6920) to fully remove hold_range_in_memory from the codebase.

Summary of Changes

  • reimplement index scan iter to not use hold_range_in_memory.

Fixes #

@HaoranYi HaoranYi force-pushed the clean_hold_range branch 2 times, most recently from 3b79ae9 to e4a25b3 Compare July 10, 2025 14:49
@HaoranYi HaoranYi changed the title reimplement index scan iter to not using hold_range Reimplement index scan iter to not using hold_range Jul 10, 2025
@HaoranYi HaoranYi changed the title Reimplement index scan iter to not using hold_range Reimplement index scan iter to not use hold_range_in_memory Jul 10, 2025
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 95.65217% with 1 line in your changes missing coverage. Please review.

Project coverage is 83.3%. Comparing base (553fc1d) to head (2d57db1).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6917   +/-   ##
=======================================
  Coverage    83.2%    83.3%           
=======================================
  Files         853      853           
  Lines      377587   377598   +11     
=======================================
+ Hits       314488   314569   +81     
+ Misses      63099    63029   -70     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines +278 to +286
// We need to hold the flushing_active lock first before gathering the items from the index.
#[allow(unused_assignments)]
let mut flush_guard = None;
loop {
flush_guard = FlushGuard::lock(&self.flushing_active);
if flush_guard.is_some() {
break;
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the FlushGuard instead of the EvictionsGuard?

(forgive me, I've forgotten a lot of the details here. I see in `hold_range_in_memory() we grab the EvictionsGuard.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EvictionsGuard will be removed too, see #6920. It is only relevant when we are holding range in the index.
Here we just need to lock flush_guard, which prevent evict_from_cache.
flush_internal calls evict_from_cache.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To give a broader context, please see #6920, which is the PR does the final removal.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going back over the code, I think we use the FlushGuard to indicate that we are flushing. We aren't flushing here, but I guess it is being used to prevent other flushing from happening. However, the existing flush() functions do not spin on the FlushGuard. I haven't thought through all the implications here.

Also, just below we do self.map_internal.write().unwrap() and grab the write lock on the in-mem entries for this bin. That seems sufficient enough to prevent flushing, since in flush_scan() it grabs the read lock and evict_from_cache() grabs the write lock.

I think that means we can remove the FlushGuard here too?

(And probably change the write lock below to a read lock?)

@jeffwashington
Copy link
Copy Markdown

Please consider something like this as a first step.
#6923

@HaoranYi
Copy link
Copy Markdown
Author

Closing this PR as it has been superseded by #6923

@HaoranYi HaoranYi closed this Jul 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants