
[WIP] JetStream: Make timestamp-based seeks O(log n) with binary search in filestore #7352

Closed
darioalessandro wants to merge 3 commits into nats-io:main from darioalessandro:fetch-msgs-using-binary-search

darioalessandro commented Sep 24, 2025

Description

Improves timestamp-based message lookups used by JetStream consumers (OptStartTime) by replacing linear scans with binary search.

  • Block selection (O(log n))

    • server/filestore.go: selectMsgBlockForStart now uses binary search across fs.blks leveraging each block’s last.ts.
  • Intra-block lookup (O(log n))

    • server/filestore.go: GetSeqFromTime now performs a binary search within the selected block via binarySearchSeqFromTime, probing timestamps with mb.fetchMsgNoCopy.

No on-disk format changes; semantics unchanged (lower-bound: first seq with ts ≥ target).
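To illustrate the two-level lookup described above, here is a minimal in-memory sketch. It is not the code in server/filestore.go: the `block` type, `lastTS`, and `seqFromTime` are hypothetical names, timestamps are plain `int64` nanoseconds, there are no deleted messages, and returning 0 for an empty store is an assumption of this sketch.

```go
package main

import "sort"

// block is a hypothetical stand-in for a filestore message block.
// ts[i] is the timestamp of sequence firstSeq+i and is non-decreasing.
type block struct {
	firstSeq uint64
	ts       []int64
}

// lastTS returns the timestamp of the final message in the block.
func (b block) lastTS() int64 { return b.ts[len(b.ts)-1] }

// seqFromTime returns the first sequence whose timestamp is >= target
// (lower-bound semantics), or lastSeq+1 if every message is older.
// blocks must be non-empty and ordered by sequence.
// Assumption: an empty store yields 0.
func seqFromTime(blocks []block, target int64) uint64 {
	if len(blocks) == 0 {
		return 0
	}
	// Step 1: binary search for the first block whose last timestamp
	// is >= target. Earlier blocks cannot contain the answer.
	bi := sort.Search(len(blocks), func(i int) bool {
		return blocks[i].lastTS() >= target
	})
	if bi == len(blocks) {
		// Target is after the last stored message.
		last := blocks[len(blocks)-1]
		return last.firstSeq + uint64(len(last.ts))
	}
	// Step 2: binary search inside the selected block.
	b := blocks[bi]
	mi := sort.Search(len(b.ts), func(i int) bool {
		return b.ts[i] >= target
	})
	return b.firstSeq + uint64(mi)
}
```

Both steps rely on `sort.Search`, which returns the smallest index for which the predicate is true, so the result is the lower bound in each case.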

Motivation

Fetching by timestamp was slow due to:

  • Linear block selection.
  • Linear intra-block scan, fetching each message to check timestamps.

This change reduces lookup complexity from O(#blocks + #msgs-in-block) to O(log #blocks + log #msgs-in-block), which speeds up consumer startup by timestamp and any server API that uses GetSeqFromTime.

Changes

  • server/filestore.go

    • Switch selectMsgBlockForStart to binary search across blocks.
    • Replace linear scan in GetSeqFromTime with a binary search (binarySearchSeqFromTime).
  • server/filestore_binary_search_test.go

    • Correctness tests using real stored timestamps.
    • Edge cases: empty store, single message, after-last (returns lastSeq+1).
    • Deletions: verifies that gaps are skipped and the next valid sequence is returned.
    • Concurrency: multiple goroutines.
    • Benchmark for binary-search path.

Benchmark

```
goos: linux
goarch: amd64
pkg: github.com/nats-io/nats-server/v2/server
cpu: AMD Ryzen 9 5950X 16-Core Processor
BenchmarkGetSeqFromTimeComparison/N=10000/Mid/BinarySearch-32         	 2190444	       524.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=10000/Mid/LinearSearch-32         	    7725	    163145 ns/op	       0 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=10000/NearEnd/BinarySearch-32     	 2243874	       522.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=10000/NearEnd/LinearSearch-32     	    3777	    323088 ns/op	       1 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=100000/Mid/BinarySearch-32        	 1950970	       629.7 ns/op	       0 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=100000/Mid/LinearSearch-32        	     768	   1615759 ns/op	       2 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=100000/NearEnd/BinarySearch-32    	 1957323	       621.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkGetSeqFromTimeComparison/N=100000/NearEnd/LinearSearch-32    	     348	   3285581 ns/op	      12 B/op	       0 allocs/op
PASS
ok  	github.com/nats-io/nats-server/v2/server	15.116s
```

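The prior linear scan was O(n): in the runs above it is roughly 300x slower at N=10000 (Mid) and over 5,000x slower at N=100000 (NearEnd), while the binary search stays between about 520 and 630 ns/op.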

Compatibility

  • No API changes.
  • No persistence format changes.
  • Works with compressed/encrypted blocks (the block is still loaded as before, but the O(n) scan is avoided).
  • Maintains lower-bound semantics and consumer clamping behavior.

Risks/Considerations

  • Deleted sequences inside a block are handled by probing forward to the next available message during the search.
  • Cache-loading behavior unchanged; only the search strategy is improved.
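One way the deleted-sequence handling could work is sketched below. This is an illustration under stated assumptions, not the PR's binarySearchSeqFromTime: `firstSeqAtOrAfter` and the `load` callback are hypothetical names, with `load` standing in for a per-sequence fetch that reports `false` for a deleted message.

```go
package main

// firstSeqAtOrAfter returns the first live sequence in [first, last] whose
// timestamp is >= target, or last+1 if there is none. load is a hypothetical
// probe that returns the timestamp of seq, or ok=false if seq was deleted.
// Timestamps of live messages must be non-decreasing in sequence order.
func firstSeqAtOrAfter(first, last uint64, target int64,
	load func(seq uint64) (ts int64, ok bool)) uint64 {

	lo, hi := first, last+1 // unexplored half-open range [lo, hi)
	result := last + 1      // best answer found so far
	for lo < hi {
		mid := lo + (hi-lo)/2
		// If mid is deleted, probe forward to the next live message.
		probe, found := mid, false
		var ts int64
		for ; probe < hi; probe++ {
			if ts, found = load(probe); found {
				break
			}
		}
		switch {
		case !found:
			// Everything in [mid, hi) is deleted; discard it.
			hi = mid
		case ts >= target:
			// probe qualifies; anything better must be below mid.
			result = probe
			hi = mid
		default:
			// probe and everything before it is too old.
			lo = probe + 1
		}
	}
	return result
}
```

The forward probe keeps lower-bound semantics across gaps, at the cost of a linear scan over any run of deleted sequences it lands in, so a block with long deleted runs would see less benefit.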

Follow-ups (optional)

  • Consider a sparse timestamp index (in-memory or sidecar) for even faster repeated timestamp seeks on very large blocks (not included here).
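For discussion, a sparse timestamp index could look roughly like the sketch below. Nothing here exists in the PR or in nats-server: `indexEntry`, `buildSparseIndex`, `narrow`, and the stride parameter are all hypothetical, and deleted messages are ignored.

```go
package main

import "sort"

// indexEntry is a hypothetical sampled (timestamp, sequence) pair.
type indexEntry struct {
	ts  int64
	seq uint64
}

// buildSparseIndex samples every stride-th message of a block.
// ts[i] is the timestamp of sequence firstSeq+i, non-decreasing.
func buildSparseIndex(firstSeq uint64, ts []int64, stride int) []indexEntry {
	if stride < 1 {
		stride = 1
	}
	idx := make([]indexEntry, 0, len(ts)/stride+1)
	for i := 0; i < len(ts); i += stride {
		idx = append(idx, indexEntry{ts: ts[i], seq: firstSeq + uint64(i)})
	}
	return idx
}

// narrow returns a conservative inclusive sequence range [lo, hi] within
// [first, last] that contains the first message with ts >= target, if the
// block has one. The exact search then only needs to cover that range.
func narrow(idx []indexEntry, first, last uint64, target int64) (lo, hi uint64) {
	// i is the first sampled entry at or after target.
	i := sort.Search(len(idx), func(i int) bool { return idx[i].ts >= target })
	lo, hi = first, last
	if i > 0 {
		lo = idx[i-1].seq // last sample known to be older than target
	}
	if i < len(idx) {
		hi = idx[i].seq // first sample known to qualify
	}
	return lo, hi
}
```

With this shape, a repeated seek would search the small in-memory index first and then touch at most about one stride's worth of messages in the block.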

Signed-off-by: Dario Lencina <darioalessandrolencina@gmail.com>

neilalexander added a commit that referenced this pull request Sep 26, 2025
Replaces #7352
Fixes #7353

Signed-off-by: Neil Twigg <neil@nats.io>
neilalexander added a commit that referenced this pull request Sep 26, 2025
This updates the filestore `GetSeqFromTime` function to use a binary
search, such that the computational complexity is more predictable. This
closes #7352 and fixes #7353 as this is a more idiomatic approach.

Beforehand, the benchmark was very fast for earlier timestamps (as
expected with a linear scan) but comparatively glacial for later ones:
```
go test -v ./server -run=XXX -bench=BenchmarkFileStoreGetSeqFromTime
goos: darwin
goarch: arm64
pkg: github.com/nats-io/nats-server/v2/server
cpu: Apple M2 Ultra
BenchmarkFileStoreGetSeqFromTime
BenchmarkFileStoreGetSeqFromTime/Start
BenchmarkFileStoreGetSeqFromTime/Start-24         	18652426	        61.56 ns/op	       0 B/op	       0 allocs/op
BenchmarkFileStoreGetSeqFromTime/Middle
BenchmarkFileStoreGetSeqFromTime/Middle-24        	  114956	     10275 ns/op	       2 B/op	       0 allocs/op
BenchmarkFileStoreGetSeqFromTime/End
BenchmarkFileStoreGetSeqFromTime/End-24           	   60656	     19740 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	github.com/nats-io/nats-server/v2/server	11.673s
```

After the change, this is now more predictable in the entire range:
```
go test -v ./server -run=XXX -bench=BenchmarkFileStoreGetSeqFromTime
goos: darwin
goarch: arm64
pkg: github.com/nats-io/nats-server/v2/server
cpu: Apple M2 Ultra
BenchmarkFileStoreGetSeqFromTime
BenchmarkFileStoreGetSeqFromTime/Start
BenchmarkFileStoreGetSeqFromTime/Start-24         	 9555956	       130.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkFileStoreGetSeqFromTime/Middle
BenchmarkFileStoreGetSeqFromTime/Middle-24        	 9335430	       127.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkFileStoreGetSeqFromTime/End
BenchmarkFileStoreGetSeqFromTime/End-24           	10001110	       118.6 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	github.com/nats-io/nats-server/v2/server	11.296s
```

Signed-off-by: Neil Twigg <neil@nats.io>
neilalexander added a commit that referenced this pull request Sep 29, 2025
Replaces #7352
Fixes #7353

Signed-off-by: Neil Twigg <neil@nats.io>