Checkpoint(ClientInfo) Caching System#1424
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds a thread-safe, adaptive write-back in-memory ClientInfo cache to the MongoDB backend; integrates cache-first read/write flows with DB fallbacks, targeted invalidation, metrics, lifecycle APIs, a cache-bypass housekeeping deactivation API, and integration tests for cross-server routing and cache semantics. (50 words) Changes
Sequence Diagram(s)sequenceDiagram
participant API as Caller
participant MongoClient as MongoClient (client.go)
participant Cache as ClientInfoCache
participant DB as MongoDB
API->>MongoClient: Activate / TryAttach / Deactivate / Find / Update
MongoClient->>Cache: Get(refKey)
alt cache hit
Cache-->>MongoClient: ClientInfo (cached)
MongoClient->>Cache: Update / MarkDirty / Invalidate (as needed)
MongoClient-->>API: return cached response
else cache miss
MongoClient->>DB: Query / Upsert ClientInfo
DB-->>MongoClient: DB result
MongoClient->>Cache: Set(refKey, ClientInfo)
MongoClient-->>API: return DB response
end
Note over Cache,DB: Background flushes adapt to write-pressure (batched or single)
Cache->>DB: FlushToDB (bulk or single)
DB-->>Cache: ack
API->>MongoClient: Shutdown
MongoClient->>Cache: CloseClientInfoCache() -> final flush
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.2.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
server/backend/database/mongo/client.go (1)
875-877: Consider caching the activated client instead of invalidating.After successfully activating a client in the database, the code invalidates the cache entry. This seems counterproductive as the client info is fresh and could benefit future reads.
Replace the invalidation with a cache update:
- // Invalidate cache for this client - c.clientInfoCache.Invalidate(clientInfo.RefKey()) + // Update cache with the activated client info + c.clientInfoCache.Set(clientInfo.RefKey(), &clientInfo)server/backend/database/mongo/client_cache.go (1)
450-482: Optimize update model to avoid empty $max operator.The code always includes a
$maxfield even when it might be empty, which is unnecessary.Only include
$maxwhen needed:func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey, clientInfo *database.ClientInfo) mongo.WriteModel { // Build the update document updateDoc := bson.M{ "$set": bson.M{ "updated_at": clientInfo.UpdatedAt, }, - "$max": bson.M{}, } + maxUpdates := bson.M{} // Add document-specific updates for docID, docInfo := range clientInfo.Documents { docKey := clientDocInfoKey(docID, "server_seq") - updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq + maxUpdates[docKey] = docInfo.ServerSeq docKey = clientDocInfoKey(docID, "client_seq") - updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq + maxUpdates[docKey] = docInfo.ClientSeq docKey = clientDocInfoKey(docID, StatusKey) updateDoc["$set"].(bson.M)[docKey] = docInfo.Status } + if len(maxUpdates) > 0 { + updateDoc["$max"] = maxUpdates + } + // Update attached_docs array with correct state attachedDocs := clientInfo.AttachedDocuments() updateDoc["$set"].(bson.M)["attached_docs"] = attachedDocs
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/backend/database/mongo/client.go(13 hunks)server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...
Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.
Applied to files:
server/backend/database/mongo/client.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: load-test
- GitHub Check: complex-test
- GitHub Check: build
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
server/backend/database/mongo/client.go (2)
1397-1414: LGTM! Selective cache invalidation implementation.The selective invalidation logic is well-implemented with proper fallback to full invalidation when selective invalidation fails.
2216-2233: LGTM! Consistent selective invalidation pattern.The same selective invalidation pattern is correctly applied in PurgeDocument, maintaining consistency with UpdateDocInfoStatusToRemoved.
server/backend/database/mongo/client_cache.go (1)
135-146: Fix O(n) lookup performance in GetByKey.The current implementation iterates through all cache entries to find a client by key, which is inefficient for large caches.
Consider maintaining a secondary index for efficient lookups:
type ClientInfoCache struct { mu sync.RWMutex cache map[types.ClientRefKey]*CachedClientInfo + keyIndex map[string]types.ClientRefKey // Maps "projectID:key" to ClientRefKey // ... other fields } func (c *ClientInfoCache) GetByKey(projectID types.ID, key string) *database.ClientInfo { c.mu.RLock() defer c.mu.RUnlock() + indexKey := string(projectID) + ":" + key + if refKey, exists := c.keyIndex[indexKey]; exists { + if cached, exists := c.cache[refKey]; exists && time.Now().Before(cached.ExpiresAt) { + return cached.ClientInfo + } + } return nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/backend/database/mongo/client.go(13 hunks)server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: in `pkg/document/change/id.go`, the `initialid` variable represents the initial state where nothing ...
Learnt from: chacha912
PR: yorkie-team/yorkie#1164
File: test/helper/helper.go:120-125
Timestamp: 2025-02-21T04:54:29.326Z
Learning: In `pkg/document/change/id.go`, the `InitialID` variable represents the initial state where nothing has been edited. Consider using a function like `GetInitialID()` instead of the package-level variable to enforce proper initialization and prevent shared mutable state.
Applied to files:
server/backend/database/mongo/client.go
📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...
Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.
Applied to files:
server/backend/database/mongo/client.go
📚 Learning: in `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional cont...
Learnt from: hackerwins
PR: yorkie-team/yorkie#1036
File: server/rpc/cluster_server.go:56-121
Timestamp: 2024-10-17T06:34:51.208Z
Learning: In `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional context in the `DetachDocument` method.
Applied to files:
server/backend/database/mongo/client.go
🧬 Code Graph Analysis (1)
server/backend/database/mongo/client.go (5)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/mongo/client_cache.go (2)
ClientInfoCache(86-95)NewClientInfoCache(98-117)server/backend/database/client_info.go (8)
ClientActivated(43-43)ErrClientNotActivated(32-32)DocumentAttached(49-49)ClientDocInfo(56-60)DocumentAttaching(48-48)ErrAttachedDocumentExists(37-37)ClientDeactivated(42-42)ClientInfo(66-92)server/backend/database/database.go (1)
ErrClientNotFound(46-46)server/logging/logging.go (1)
DefaultLogger(85-90)
🔇 Additional comments (19)
server/backend/database/mongo/client.go (6)
54-58: LGTM! Clean struct field addition.The new
clientInfoCachefield is properly added to theClientstruct alongside existing cache fields.
119-131: LGTM! Proper cache initialization pattern.The cache initialization correctly creates the client instance first, then initializes the cache with a reference to it. This avoids the double initialization issue mentioned in past reviews.
142-148: LGTM! Proper cache cleanup.The cache cleanup is properly implemented in the
Closemethod with appropriate error handling.
878-918: LGTM! Cache integration in TryAttaching looks correct.The method properly checks cache first, validates state, updates cache, and flushes immediately for consistency.
963-1004: LGTM! Proper cache handling in DeactivateClient.The method correctly validates client state from cache and flushes changes immediately for critical operations.
1051-1086: LGTM! Cache-first pattern in FindClientInfoByRefKey.The method properly implements cache-first lookup with fallback to database and cache population.
server/backend/database/mongo/client_cache.go (13)
33-51: LGTM! Well-structured cache configuration.The configuration struct provides comprehensive tuning parameters for cache behavior including TTL, flush intervals, and pressure thresholds.
53-65: LGTM! Reasonable default configuration values.The default values provide a good balance between performance and resource usage. The 5-second base flush interval and 10-minute TTL are appropriate for most use cases.
67-84: LGTM! Comprehensive cache entry metadata.The
CachedClientInfoandWritePressurestructs capture all necessary metadata for cache management and adaptive flushing.
97-117: LGTM! Proper cache initialization.The constructor properly initializes all fields and starts background goroutines for adaptive flushing and cleanup.
171-201: LGTM! Fixed deadlock issue from past reviews.The
UpdateClientInfomethod now correctly avoids the deadlock by implementing the logic inline rather than callingSet()recursively.
203-240: LGTM! Sophisticated merge logic for ClientInfo.The merge logic properly handles sequence number updates with $max semantics and correctly resets sequences for detached/removed documents.
242-257: LGTM! Improved error handling for invalidation.The
Invalidatemethod now properly returns errors instead of silently ignoring them, addressing the concern from past reviews.
259-270: LGTM! Proper error handling in InvalidateAll.The method correctly flushes all dirty entries before clearing and propagates any flush errors.
354-366: LGTM! Well-designed adaptive flush interval calculation.The pressure-based interval calculation provides good responsiveness under high write load while maintaining efficiency during low activity.
395-420: LGTM! Comprehensive expired entry cleanup.The cleanup logic properly handles dirty expired entries by flushing them before removal and includes appropriate logging.
443-482: LGTM! Efficient bulk flush implementation.The bulk flush implementation efficiently batches updates and uses MongoDB's bulk write operations for optimal performance.
484-517: LGTM! Comprehensive MongoDB update model.The update model correctly uses $max operations for sequence numbers and $set for other fields, maintaining consistency with MongoDB semantics.
531-537: LGTM! Proper cache shutdown.The
Closemethod correctly stops background goroutines and flushes remaining dirty entries.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
server/backend/database/mongo/client.go (2)
795-846: Remove metadata from upsert filter to prevent duplicate documents.Including
metadatain the filter can cause duplicate client documents if metadata changes between calls. This should be in the$setor$setOnInsertclause instead.Apply this diff:
res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{ "project_id": projectID, "key": key, - "metadata": metadata, }, bson.M{ "$set": bson.M{ StatusKey: database.ClientActivated, "updated_at": now, + "metadata": metadata, }, }, options.UpdateOne().SetUpsert(true))
1043-1080: Ensure flush reliability for critical push-pull operations.The write-back strategy here relies on background flushing which could fail. Consider adding retry logic or monitoring for flush failures.
🧹 Nitpick comments (1)
server/backend/database/mongo/client_cache.go (1)
466-470: Consider re-enabling cleanup after resolving the underlying issue.The cleanup is disabled to prevent "client not found" errors in housekeeping. This seems like a temporary workaround - consider tracking this as a TODO with an issue number.
// Cleanup cache after successful flush -// Temporarily disabled to prevent "client not found" errors in housekeeping -// if c.config.EnableFlushCleanup { -// c.cleanupAfterFlush() -// } +// TODO(#issue): Re-enable cleanup after fixing "client not found" errors in housekeeping +// Temporarily disabled to prevent errors when housekeeping queries recently flushed clients +// if c.config.EnableFlushCleanup { +// c.cleanupAfterFlush() +// }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
server/backend/backend.go(1 hunks)server/backend/database/mongo/client.go(17 hunks)server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: in `pkg/document/change/id.go`, the `initialid` variable represents the initial state where nothing ...
Learnt from: chacha912
PR: yorkie-team/yorkie#1164
File: test/helper/helper.go:120-125
Timestamp: 2025-02-21T04:54:29.326Z
Learning: In `pkg/document/change/id.go`, the `InitialID` variable represents the initial state where nothing has been edited. Consider using a function like `GetInitialID()` instead of the package-level variable to enforce proper initialization and prevent shared mutable state.
Applied to files:
server/backend/database/mongo/client.go
📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...
Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.
Applied to files:
server/backend/database/mongo/client.go
📚 Learning: in `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional cont...
Learnt from: hackerwins
PR: yorkie-team/yorkie#1036
File: server/rpc/cluster_server.go:56-121
Timestamp: 2024-10-17T06:34:51.208Z
Learning: In `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional context in the `DetachDocument` method.
Applied to files:
server/backend/database/mongo/client.go
📚 Learning: go 1.22+ supports the integer range syntax (`for range n` and `for i := range n`), which is particul...
Learnt from: window9u
PR: yorkie-team/yorkie#1166
File: test/bench/webhook_bench_test.go:122-163
Timestamp: 2025-03-12T06:17:36.021Z
Learning: Go 1.22+ supports the integer range syntax (`for range n` and `for i := range n`), which is particularly useful for benchmarking code. This allows for more concise iteration over integer values compared to traditional three-clause for loops.
Applied to files:
server/backend/database/mongo/client.go
🧬 Code Graph Analysis (1)
server/backend/backend.go (2)
server/backend/database/mongo/client.go (1)
Client(51-59)server/logging/logging.go (1)
DefaultLogger(85-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: load-test
- GitHub Check: complex-test
- GitHub Check: bench
- GitHub Check: build
🔇 Additional comments (17)
server/backend/backend.go (1)
231-236: LGTM! Proper cache cleanup during shutdown.The cache closure is correctly placed before database disconnection, and the error handling follows the existing pattern in the shutdown sequence.
server/backend/database/mongo/client.go (9)
55-59: LGTM! Clean cache field addition to Client struct.The clientInfoCache field is properly placed alongside other cache fields.
120-131: LGTM! Correct single-instance initialization pattern.The client instance is created once and properly passed to the cache constructor, avoiding the previous double-initialization issue.
135-156: LGTM! Well-structured cache lifecycle management.The separation of cache closure from the main Close() method is appropriate, with clear documentation and defensive nil checking.
855-923: LGTM! Well-implemented cache-first approach for TryAttaching.The method properly checks cache first, performs in-memory updates with deep copying to prevent corruption, and falls back to database operations when needed. Metrics recording is consistently applied.
1175-1216: LGTM! Efficient cache utilization for finding attached clients.The method effectively uses cache-first strategy and ensures consistent ordering of results. The fallback to database is properly implemented.
1373-1390: LGTM! Optimized selective cache invalidation.The selective invalidation of only attached clients is much more efficient than the previous full invalidation approach. The fallback ensures correctness.
2013-2044: LGTM! Efficient cache-first document attachment check.The method properly utilizes the cache to avoid database queries when possible, with correct handling of the exclude client parameter.
2208-2225: LGTM! Consistent selective invalidation for document purge.The invalidation logic matches the pattern used in UpdateDocInfoStatusToRemoved, maintaining consistency across the codebase.
2317-2331: LGTM! Clean metrics exposure methods.The wrapper methods provide safe access to cache metrics with appropriate nil checking.
server/backend/database/mongo/client_cache.go (7)
50-135: LGTM! Well-structured cache configuration and initialization.The configuration provides good tuning knobs with sensible defaults. The initialization properly starts background tasks.
137-179: LGTM! Proper use of deep copying for cache safety.All getter methods correctly return deep copies to prevent external modifications from corrupting the cache.
181-225: LGTM! Correct cache entry management with proper dirty tracking.The Set and UpdateClientInfo methods properly manage cache entries with appropriate dirty flag handling and size-based eviction.
227-265: LGTM! Correct implementation of MongoDB $max semantics.The merge logic properly handles sequence number updates with $max behavior and correctly resets sequences for detached/removed documents.
267-295: LGTM! Improved error handling for cache invalidation.The invalidation methods now properly return errors from flush operations, addressing the previous concern about silent failures.
360-427: LGTM! Smart adaptive flush mechanism with improved error handling.The adaptive flush properly adjusts intervals based on write pressure and now logs flush errors as suggested in previous reviews.
530-693: LGTM! Comprehensive metrics tracking and reporting.The metrics system provides detailed insights into cache performance with proper thread-safe access and informative logging output.
There was a problem hiding this comment.
Go Benchmark Analysis 📊
This is a comparison result between the previous(aeb1b83) and the current commit(5a3ee2e).
Significant Changes (≥20% difference)
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| BenchmarkDeletionConcurrency/concurrent_text_delete_range_100/ (ns/op) | 757.10 ms | 504.78 ms | 🟢 -33.33% |
| BenchmarkDeletionConcurrency/concurrent_text_delete_range_1000/ (ns/op) | 6.27 s | 4.04 s | 🟢 -35.56% |
| BenchmarkDeletionConcurrency/concurrent_tree_delete_range_100/ (ns/op) | 772.82 ms | 517.34 ms | 🟢 -33.06% |
| BenchmarkDeletionConcurrency/concurrent_tree_delete_range_1000/ (ns/op) | 6.84 s | 4.87 s | 🟢 -28.75% |
| BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_100/ (ns/op) | 655.70 ms | 434.95 ms | 🟢 -33.67% |
| BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_1000/ (ns/op) | 5.16 s | 3.37 s | 🟢 -34.61% |
| BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_1000/ (B/op) | 347.88 MB | 254.87 MB | 🟢 -26.74% |
| BenchmarkDeletionConcurrency/concurrent_tree_edit_delete_all_100/ (ns/op) | 668.00 ms | 464.00 ms | 🟢 -30.54% |
| BenchmarkDeletionConcurrency/concurrent_tree_edit_delete_all_1000/ (ns/op) | 5.76 s | 3.91 s | 🟢 -32.05% |
| BenchmarkRPC/client_to_server/ (ns/op) | 264.43 ms | 175.80 ms | 🟢 -33.52% |
| BenchmarkRPC/client_to_server/ (allocs/op) | 186,886 allocs | 144,093 allocs | 🟢 -22.90% |
| BenchmarkRPC/client_to_client_via_server/ (ns/op) | 199.82 ms | 143.14 ms | 🟢 -28.37% |
| BenchmarkTree/20000_vertices_from_protobuf/ (ns/op) | 898.93 ms | 717.70 ms | 🟢 -20.16% |
| BenchmarkTree/30000_vertices_from_protobuf/ (ns/op) | 2.01 s | 1.58 s | 🟢 -21.32% |
| BenchmarkVersionVector/clients_10/ (3_pushpull(ms)) | 4.00 ms | 2.00 ms | 🟢 -50.00% |
| BenchmarkVersionVector/clients_10/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| BenchmarkVersionVector/clients_100/ (3_pushpull(ms)) | 4.00 ms | 2.00 ms | 🟢 -50.00% |
| BenchmarkVersionVector/clients_100/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| BenchmarkVersionVector/clients_100/ (B/op) | 77.37 MB | 60.55 MB | 🟢 -21.73% |
| BenchmarkVersionVector/clients_1000/ (3_pushpull(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| BenchmarkVersionVector/clients_1000/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
Key Observations 🔍
- The benchmark data shows significant improvements in various benchmarks, with notable reductions in operation timings ranging from about 20% to over 50%.
- Overall, there is a trend towards performance enhancements, as seen in the majority of benchmarks, including document manipulation, synchronization, and webhook processing.
- The improvements in data processing efficiency can lead to faster execution times and decreased resource consumption, indicating positive advancements in the system's performance.
Given the significant improvements in multiple benchmarks, the system seems to have undergone optimizations resulting in better performance across various operations.
Clock Analysis
Lamport vs VersionVector
#### BenchmarkVersionVector Test
##### Lamport (v0.5.2)
| Metric | 10 clients | 100 clients | 1000 clients |
| ---------------------------------- | --------------- | --------------- | --------------- |
| Total Operation Time | 84.96 ms | 793.94 ms | 34.79 s |
| Memory Allocations | 35.11 MB | 219.92 MB | 4.91 GB |
| Number of Allocations (allocs/op) | 69,271 | 1,246,728 | 81,485,288 |
|||||
| ChangePack Size | 138.0 B | 137.0 B | 141.0 B |
| Snapshot Size | 379.0 B | 3.08 KB | 30.08 KB |
| Push-Pull Time | 2.0 ms | 1.5 ms | 4.0 ms |
|||||
| Attach Time | 4.5 ms | 11.0 ms | 31.0 ms |
| ChangePack After Detach | 138.0 B | 140.0 B | 141.0 B |
| Snapshot After Detach | 136.0 B | 137.0 B | 139.0 B |
| Push-Pull After Detach | 2.5 ms | 5.0 ms | 9.5 ms |
##### Version Vector (current)
| Metric | 10 clients | 100 clients | 1000 clients |
| ---------------------------------- | --------------- | --------------- | --------------- |
| Total Operation Time | 104.73 ms | 882.57 ms | 11.01 s |
| Memory Allocations | 6.44 MB | 60.55 MB | 1.42 GB |
| Number of Allocations (allocs/op) | 56,805 | 582,131 | 17,889,022 |
|||||
| ChangePack Size | 185.0 B | 185.0 B | 186.0 B |
| Snapshot Size | 399.0 B | 3.10 KB | 30.10 KB |
| Push-Pull Time | 2.0 ms | 2.0 ms | 3.0 ms |
|||||
| Attach Time | 5.0 ms | 6.0 ms | 20.0 ms |
| ChangePack After Detach | 230.0 B | 231.0 B | 231.0 B |
| Snapshot After Detach | 156.0 B | 156.0 B | 156.0 B |
| Push-Pull After Detach | 3.0 ms | 3.0 ms | 3.0 ms |
---
#### BenchmarkSyncConcurrency Test
| Metric | Clients | Lamport (v0.5.2) | Version Vector (current) |
|-----------------------------------|------------------|------------------|---------------------------|
| Total Operation Time | 1-100-10-10 | 7.48 s | 7.63 s |
| | 100-100-10-10 | 9.62 s | 8.55 s |
| | 300_100-10-10 | 14.77 s | 10.37 s |
| Memory Allocations | 1-100-10-10 | 1.15 GB | 3.95 GB |
| | 100-100-10-10 | 1.45 GB | 4.03 GB |
| | 300_100-10-10 | 2.19 GB | 4.28 GB |
| Number of Allocations (allocs/op) | 1-100-10-10 | 17,107,766 | 106,227,255 |
| | 100-100-10-10 | 20,084,480 | 107,190,889 |
| | 300_100-10-10 | 30,359,215 | 109,596,723 |
---
#### Summary
- The Version Vector clock system shows improved performance compared to Lamport based on the benchmarks. It demonstrates lower total operation times, reduced memory allocations, and a decrease in the number of allocations per operation.
- The Version Vector clock system's push-pull times are consistently better across different client configurations compared to Lamport, indicating its efficiency in handling synchronization between distributed nodes.
- The Version Vector clock system, particularly in scenarios with higher client concurrency, showcases better scalability and resource utilization, as evidenced by the lower memory allocations and allocations per operation metrics.Detailed Test Results
BenchmarkDeletionConcurrency
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| concurrent_text_delete_range_100/ (ns/op) | 757.10 ms | 504.78 ms | 🟢 -33.33% |
| concurrent_text_delete_range_100/ (B/op) | 63.61 MB | 51.73 MB | 🟢 -18.67% |
| concurrent_text_delete_range_100/ (allocs/op) | 671,892 allocs | 577,357 allocs | 🟢 -14.07% |
| concurrent_text_delete_range_1000/ (ns/op) | 6.27 s | 4.04 s | 🟢 -35.56% |
| concurrent_text_delete_range_1000/ (B/op) | 365.95 MB | 294.14 MB | 🟢 -19.62% |
| concurrent_text_delete_range_1000/ (allocs/op) | 6,301,300 allocs | 5,503,534 allocs | 🟢 -12.66% |
| concurrent_tree_delete_range_100/ (ns/op) | 772.82 ms | 517.34 ms | 🟢 -33.06% |
| concurrent_tree_delete_range_100/ (B/op) | 72.48 MB | 60.46 MB | 🟢 -16.58% |
| concurrent_tree_delete_range_100/ (allocs/op) | 761,429 allocs | 620,362 allocs | 🟢 -18.53% |
| concurrent_tree_delete_range_1000/ (ns/op) | 6.84 s | 4.87 s | 🟢 -28.75% |
| concurrent_tree_delete_range_1000/ (B/op) | 894.85 MB | 827.58 MB | 🟢 -7.52% |
| concurrent_tree_delete_range_1000/ (allocs/op) | 6,679,344 allocs | 5,966,831 allocs | 🟢 -10.67% |
| concurrent_text_edit_delete_all_100/ (ns/op) | 655.70 ms | 434.95 ms | 🟢 -33.67% |
| concurrent_text_edit_delete_all_100/ (B/op) | 53.97 MB | 46.48 MB | 🟢 -13.89% |
| concurrent_text_edit_delete_all_100/ (allocs/op) | 568,989 allocs | 487,134 allocs | 🟢 -14.39% |
| concurrent_text_edit_delete_all_1000/ (ns/op) | 5.16 s | 3.37 s | 🟢 -34.61% |
| concurrent_text_edit_delete_all_1000/ (B/op) | 347.88 MB | 254.87 MB | 🟢 -26.74% |
| concurrent_text_edit_delete_all_1000/ (allocs/op) | 5,242,863 allocs | 4,461,782 allocs | 🟢 -14.90% |
| concurrent_tree_edit_delete_all_100/ (ns/op) | 668.00 ms | 464.00 ms | 🟢 -30.54% |
| concurrent_tree_edit_delete_all_100/ (B/op) | 62.32 MB | 51.06 MB | 🟢 -18.07% |
| concurrent_tree_edit_delete_all_100/ (allocs/op) | 615,778 allocs | 561,938 allocs | 🟢 -8.74% |
| concurrent_tree_edit_delete_all_1000/ (ns/op) | 5.76 s | 3.91 s | 🟢 -32.05% |
| concurrent_tree_edit_delete_all_1000/ (B/op) | 723.04 MB | 644.14 MB | 🟢 -10.91% |
| concurrent_tree_edit_delete_all_1000/ (allocs/op) | 5,660,492 allocs | 4,927,157 allocs | 🟢 -12.96% |
BenchmarkDocument
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| constructor_test/ (ns/op) | 1663.00 ns | 1608.00 ns | 🟢 -3.31% |
| constructor_test/ (B/op) | 1.59 KB | 1.59 KB | ⚪ 0% |
| constructor_test/ (allocs/op) | 27 allocs | 27 allocs | ⚪ 0% |
| status_test/ (ns/op) | 1250.00 ns | 1186.00 ns | 🟢 -5.12% |
| status_test/ (B/op) | 1.56 KB | 1.56 KB | ⚪ 0% |
| status_test/ (allocs/op) | 25 allocs | 25 allocs | ⚪ 0% |
| equals_test/ (ns/op) | 8722.00 ns | 8405.00 ns | 🟢 -3.63% |
| equals_test/ (B/op) | 8.33 KB | 8.33 KB | ⚪ 0% |
| equals_test/ (allocs/op) | 139 allocs | 139 allocs | ⚪ 0% |
| nested_update_test/ (ns/op) | 18450.00 ns | 17989.00 ns | 🟢 -2.50% |
| nested_update_test/ (B/op) | 12.86 KB | 12.86 KB | ⚪ 0% |
| nested_update_test/ (allocs/op) | 272 allocs | 272 allocs | ⚪ 0% |
| delete_test/ (ns/op) | 24388.00 ns | 23992.00 ns | 🟢 -1.62% |
| delete_test/ (B/op) | 16.37 KB | 16.37 KB | ⚪ 0% |
| delete_test/ (allocs/op) | 352 allocs | 352 allocs | ⚪ 0% |
| object_test/ (ns/op) | 9598.00 ns | 9119.00 ns | 🟢 -4.99% |
| object_test/ (B/op) | 7.42 KB | 7.42 KB | ⚪ 0% |
| object_test/ (allocs/op) | 124 allocs | 124 allocs | ⚪ 0% |
| array_test/ (ns/op) | 30959.00 ns | 29377.00 ns | 🟢 -5.11% |
| array_test/ (B/op) | 12.64 KB | 12.64 KB | ⚪ 0% |
| array_test/ (allocs/op) | 283 allocs | 283 allocs | ⚪ 0% |
| text_test/ (ns/op) | 34547.00 ns | 33265.00 ns | 🟢 -3.71% |
| text_test/ (B/op) | 15.31 KB | 15.31 KB | ⚪ 0% |
| text_test/ (allocs/op) | 503 allocs | 503 allocs | ⚪ 0% |
| text_composition_test/ (ns/op) | 33060.00 ns | 32385.00 ns | 🟢 -2.04% |
| text_composition_test/ (B/op) | 16.94 KB | 16.94 KB | ⚪ 0% |
| text_composition_test/ (allocs/op) | 493 allocs | 493 allocs | ⚪ 0% |
| rich_text_test/ (ns/op) | 90928.00 ns | 88777.00 ns | 🟢 -2.37% |
| rich_text_test/ (B/op) | 38.48 KB | 38.48 KB | ⚪ 0% |
| rich_text_test/ (allocs/op) | 1,185 allocs | 1,185 allocs | ⚪ 0% |
| counter_test/ (ns/op) | 18838.00 ns | 18269.00 ns | 🟢 -3.02% |
| counter_test/ (B/op) | 12.32 KB | 12.31 KB | 🟢 -0.02% |
| counter_test/ (allocs/op) | 254 allocs | 254 allocs | ⚪ 0% |
| text_edit_gc_100/ (ns/op) | 1.54 ms | 1.47 ms | 🟢 -4.47% |
| text_edit_gc_100/ (B/op) | 808.43 KB | 808.64 KB | 🔴 +0.03% |
| text_edit_gc_100/ (allocs/op) | 16,884 allocs | 16,885 allocs | ⚪ 0% |
| text_edit_gc_1000/ (ns/op) | 59.20 ms | 58.74 ms | 🟢 -0.77% |
| text_edit_gc_1000/ (B/op) | 46.27 MB | 46.27 MB | ⚪ 0% |
| text_edit_gc_1000/ (allocs/op) | 181,584 allocs | 181,597 allocs | ⚪ 0% |
| text_split_gc_100/ (ns/op) | 2.28 ms | 2.18 ms | 🟢 -4.33% |
| text_split_gc_100/ (B/op) | 1.53 MB | 1.53 MB | ⚪ 0% |
| text_split_gc_100/ (allocs/op) | 15,553 allocs | 15,552 allocs | ⚪ 0% |
| text_split_gc_1000/ (ns/op) | 136.84 ms | 133.19 ms | 🟢 -2.67% |
| text_split_gc_1000/ (B/op) | 137.28 MB | 137.28 MB | ⚪ 0% |
| text_split_gc_1000/ (allocs/op) | 180,995 allocs | 180,993 allocs | ⚪ 0% |
| text_100/ (ns/op) | 245920.00 ns | 234855.00 ns | 🟢 -4.50% |
| text_100/ (B/op) | 113.26 KB | 113.26 KB | ⚪ 0% |
| text_100/ (allocs/op) | 4,985 allocs | 4,985 allocs | ⚪ 0% |
| text_1000/ (ns/op) | 2.61 ms | 2.57 ms | 🟢 -1.52% |
| text_1000/ (B/op) | 1.08 MB | 1.08 MB | ⚪ 0% |
| text_1000/ (allocs/op) | 49,088 allocs | 49,088 allocs | ⚪ 0% |
| array_1000/ (ns/op) | 1.42 ms | 1.35 ms | 🟢 -4.45% |
| array_1000/ (B/op) | 1.13 MB | 1.13 MB | ⚪ 0% |
| array_1000/ (allocs/op) | 13,884 allocs | 13,883 allocs | ⚪ 0% |
| array_10000/ (ns/op) | 14.72 ms | 15.26 ms | 🔴 +3.68% |
| array_10000/ (B/op) | 10.29 MB | 10.29 MB | ⚪ 0% |
| array_10000/ (allocs/op) | 140,734 allocs | 140,734 allocs | ⚪ 0% |
| array_gc_100/ (ns/op) | 149378.00 ns | 145222.00 ns | 🟢 -2.78% |
| array_gc_100/ (B/op) | 104.29 KB | 104.28 KB | 🟢 -0.02% |
| array_gc_100/ (allocs/op) | 1,470 allocs | 1,469 allocs | 🟢 -0.07% |
| array_gc_1000/ (ns/op) | 1.61 ms | 1.55 ms | 🟢 -3.73% |
| array_gc_1000/ (B/op) | 1.18 MB | 1.18 MB | 🔴 +0.02% |
| array_gc_1000/ (allocs/op) | 14,929 allocs | 14,930 allocs | ⚪ 0% |
| counter_1000/ (ns/op) | 215926.00 ns | 211844.00 ns | 🟢 -1.89% |
| counter_1000/ (B/op) | 194.39 KB | 194.39 KB | ⚪ 0% |
| counter_1000/ (allocs/op) | 5,773 allocs | 5,773 allocs | ⚪ 0% |
| counter_10000/ (ns/op) | 2.26 ms | 2.22 ms | 🟢 -1.66% |
| counter_10000/ (B/op) | 2.23 MB | 2.23 MB | ⚪ 0% |
| counter_10000/ (allocs/op) | 59,780 allocs | 59,780 allocs | ⚪ 0% |
| object_1000/ (ns/op) | 1.61 ms | 1.57 ms | 🟢 -2.49% |
| object_1000/ (B/op) | 1.48 MB | 1.48 MB | 🟢 -0.02% |
| object_1000/ (allocs/op) | 11,927 allocs | 11,926 allocs | ⚪ 0% |
| object_10000/ (ns/op) | 17.04 ms | 17.68 ms | 🔴 +3.72% |
| object_10000/ (B/op) | 12.75 MB | 12.75 MB | ⚪ 0% |
| object_10000/ (allocs/op) | 121,238 allocs | 121,233 allocs | ⚪ 0% |
| tree_100/ (ns/op) | 783357.00 ns | 761794.00 ns | 🟢 -2.75% |
| tree_100/ (B/op) | 524.17 KB | 524.16 KB | ⚪ 0% |
| tree_100/ (allocs/op) | 4,722 allocs | 4,722 allocs | ⚪ 0% |
| tree_1000/ (ns/op) | 52.99 ms | 52.19 ms | 🟢 -1.52% |
| tree_1000/ (B/op) | 43.74 MB | 43.74 MB | ⚪ 0% |
| tree_1000/ (allocs/op) | 46,131 allocs | 46,131 allocs | ⚪ 0% |
| tree_10000/ (ns/op) | 6.74 s | 7.01 s | 🔴 +4.08% |
| tree_10000/ (B/op) | 4.30 GB | 4.30 GB | ⚪ 0% |
| tree_10000/ (allocs/op) | 460,206 allocs | 460,190 allocs | ⚪ 0% |
| tree_edit_gc_100/ (ns/op) | 2.79 ms | 2.73 ms | 🟢 -1.93% |
| tree_edit_gc_100/ (B/op) | 2.40 MB | 2.40 MB | ⚪ 0% |
| tree_edit_gc_100/ (allocs/op) | 11,668 allocs | 11,668 allocs | ⚪ 0% |
| tree_edit_gc_1000/ (ns/op) | 225.37 ms | 219.76 ms | 🟢 -2.49% |
| tree_edit_gc_1000/ (B/op) | 213.95 MB | 213.95 MB | ⚪ 0% |
| tree_edit_gc_1000/ (allocs/op) | 116,143 allocs | 116,150 allocs | ⚪ 0% |
| tree_split_gc_100/ (ns/op) | 2.17 ms | 2.10 ms | 🟢 -3.20% |
| tree_split_gc_100/ (B/op) | 1.61 MB | 1.61 MB | ⚪ 0% |
| tree_split_gc_100/ (allocs/op) | 9,606 allocs | 9,606 allocs | ⚪ 0% |
| tree_split_gc_1000/ (ns/op) | 159.50 ms | 150.24 ms | 🟢 -5.81% |
| tree_split_gc_1000/ (B/op) | 149.85 MB | 149.85 MB | ⚪ 0% |
| tree_split_gc_1000/ (allocs/op) | 107,758 allocs | 107,760 allocs | ⚪ 0% |
BenchmarkDocumentDeletion
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| single_text_delete_all_10000/ (ns/op) | 18.85 ms | 19.42 ms | 🔴 +3.06% |
| single_text_delete_all_10000/ (B/op) | 10.52 MB | 10.51 MB | 🟢 -0.05% |
| single_text_delete_all_10000/ (allocs/op) | 76,131 allocs | 76,129 allocs | ⚪ 0% |
| single_text_delete_all_100000/ (ns/op) | 279.60 ms | 317.53 ms | 🔴 +13.57% |
| single_text_delete_all_100000/ (B/op) | 104.71 MB | 104.76 MB | 🔴 +0.05% |
| single_text_delete_all_100000/ (allocs/op) | 765,993 allocs | 766,185 allocs | 🔴 +0.03% |
| single_tree_delete_all_1000/ (ns/op) | 53.44 ms | 56.35 ms | 🔴 +5.44% |
| single_tree_delete_all_1000/ (B/op) | 44.66 MB | 44.66 MB | ⚪ 0% |
| single_tree_delete_all_1000/ (allocs/op) | 58,202 allocs | 58,203 allocs | ⚪ 0% |
| single_text_delete_range_100/ (ns/op) | 446712.00 ns | 481346.00 ns | 🔴 +7.75% |
| single_text_delete_range_100/ (B/op) | 243.27 KB | 243.28 KB | ⚪ 0% |
| single_text_delete_range_100/ (allocs/op) | 7,074 allocs | 7,074 allocs | ⚪ 0% |
| single_text_delete_range_1000/ (ns/op) | 9.21 ms | 9.72 ms | 🔴 +5.50% |
| single_text_delete_range_1000/ (B/op) | 6.37 MB | 6.37 MB | ⚪ 0% |
| single_text_delete_range_1000/ (allocs/op) | 73,370 allocs | 73,370 allocs | ⚪ 0% |
| single_tree_delete_range_100/ (ns/op) | 978320.00 ns | 1.01 ms | 🔴 +2.96% |
| single_tree_delete_range_100/ (B/op) | 708.37 KB | 708.43 KB | ⚪ 0% |
| single_tree_delete_range_100/ (allocs/op) | 6,392 allocs | 6,392 allocs | ⚪ 0% |
| single_tree_delete_range_1000/ (ns/op) | 62.00 ms | 64.66 ms | 🔴 +4.29% |
| single_tree_delete_range_1000/ (B/op) | 54.51 MB | 54.51 MB | ⚪ 0% |
| single_tree_delete_range_1000/ (allocs/op) | 63,432 allocs | 63,434 allocs | ⚪ 0% |
BenchmarkGetDocuments
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| without_root_presence_10/ (ns/op) | 1.54 ms | 1.53 ms | 🟢 -0.48% |
| without_root_presence_10/ (B/op) | 132.29 KB | 133.98 KB | 🔴 +1.28% |
| without_root_presence_10/ (allocs/op) | 1,176 allocs | 1,176 allocs | ⚪ 0% |
| with_root_presence_10/ (ns/op) | 4.36 ms | 4.33 ms | 🟢 -0.53% |
| with_root_presence_10/ (B/op) | 627.62 KB | 608.07 KB | 🟢 -3.11% |
| with_root_presence_10/ (allocs/op) | 5,774 allocs | 5,761 allocs | 🟢 -0.23% |
| without_root_presence_100/ (ns/op) | 2.79 ms | 2.82 ms | 🔴 +0.81% |
| without_root_presence_100/ (B/op) | 821.06 KB | 834.08 KB | 🔴 +1.58% |
| without_root_presence_100/ (allocs/op) | 6,117 allocs | 6,117 allocs | ⚪ 0% |
| with_root_presence_100/ (ns/op) | 42.20 ms | 42.65 ms | 🔴 +1.07% |
| with_root_presence_100/ (B/op) | 7.36 MB | 7.56 MB | 🔴 +2.71% |
| with_root_presence_100/ (allocs/op) | 66,935 allocs | 66,869 allocs | 🟢 -0.10% |
| without_root_presence_1000/ (ns/op) | 15.05 ms | 15.68 ms | 🔴 +4.21% |
| without_root_presence_1000/ (B/op) | 7.90 MB | 7.98 MB | 🔴 +1.08% |
| without_root_presence_1000/ (allocs/op) | 54,838 allocs | 54,851 allocs | 🔴 +0.02% |
| with_root_presence_1000/ (ns/op) | 397.87 ms | 406.44 ms | 🔴 +2.15% |
| with_root_presence_1000/ (B/op) | 73.25 MB | 73.21 MB | 🟢 -0.06% |
| with_root_presence_1000/ (allocs/op) | 665,185 allocs | 664,791 allocs | 🟢 -0.06% |
BenchmarkRPC
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| client_to_server/ (ns/op) | 264.43 ms | 175.80 ms | 🟢 -33.52% |
| client_to_server/ (B/op) | 14.67 MB | 11.23 MB | 🟢 -23.40% |
| client_to_server/ (allocs/op) | 186,886 allocs | 144,093 allocs | 🟢 -22.90% |
| client_to_client_via_server/ (ns/op) | 199.82 ms | 143.14 ms | 🟢 -28.37% |
| client_to_client_via_server/ (B/op) | 23.32 MB | 21.01 MB | 🟢 -9.93% |
| client_to_client_via_server/ (allocs/op) | 298,676 allocs | 254,197 allocs | 🟢 -14.89% |
| attach_large_document/ (ns/op) | 1.31 s | 1.20 s | 🟢 -8.12% |
| attach_large_document/ (B/op) | 1.85 GB | 1.88 GB | 🔴 +1.36% |
| attach_large_document/ (allocs/op) | 11,942 allocs | 10,417 allocs | 🟢 -12.77% |
| adminCli_to_server/ (ns/op) | 600.28 ms | 591.88 ms | 🟢 -1.40% |
| adminCli_to_server/ (B/op) | 34.89 MB | 34.93 MB | 🔴 +0.13% |
| adminCli_to_server/ (allocs/op) | 402,709 allocs | 402,851 allocs | 🔴 +0.04% |
BenchmarkLocker
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| (ns/op) | 85.00 ns | 83.48 ns | 🟢 -1.79% |
| (B/op) | 32.00 B | 32.00 B | ⚪ 0% |
| (allocs/op) | 1 allocs | 1 allocs | ⚪ 0% |
BenchmarkLockerParallel
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| (ns/op) | 46.35 ns | 46.44 ns | 🔴 +0.19% |
| (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
BenchmarkLockerMoreKeys
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| (ns/op) | 188.50 ns | 189.60 ns | 🔴 +0.58% |
| (B/op) | 30.00 B | 30.00 B | ⚪ 0% |
| (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
BenchmarkRWLocker
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| RWLock_rate_2/ (ns/op) | 49.77 ns | 50.30 ns | 🔴 +1.06% |
| RWLock_rate_2/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| RWLock_rate_2/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| RWLock_rate_10/ (ns/op) | 44.57 ns | 44.76 ns | 🔴 +0.43% |
| RWLock_rate_10/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| RWLock_rate_10/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| RWLock_rate_100/ (ns/op) | 60.12 ns | 62.13 ns | 🔴 +3.34% |
| RWLock_rate_100/ (B/op) | 1.00 B | 2.00 B | 🔴 +100.00% |
| RWLock_rate_100/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| RWLock_rate_1000/ (ns/op) | 89.66 ns | 92.20 ns | 🔴 +2.83% |
| RWLock_rate_1000/ (B/op) | 8.00 B | 8.00 B | ⚪ 0% |
| RWLock_rate_1000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
BenchmarkPresenceConcurrency
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| 0-100-10/ (ns/op) | 953.00 ms | 867.79 ms | 🟢 -8.94% |
| 0-100-10/ (B/op) | 405.06 MB | 394.70 MB | 🟢 -2.56% |
| 0-100-10/ (allocs/op) | 5,161,259 allocs | 5,030,839 allocs | 🟢 -2.53% |
| 100-100-10/ (ns/op) | 1.01 s | 911.34 ms | 🟢 -10.12% |
| 100-100-10/ (B/op) | 433.74 MB | 420.86 MB | 🟢 -2.97% |
| 100-100-10/ (allocs/op) | 5,649,452 allocs | 5,516,167 allocs | 🟢 -2.36% |
| 300-100-10/ (ns/op) | 1.06 s | 974.87 ms | 🟢 -8.41% |
| 300-100-10/ (B/op) | 495.06 MB | 483.57 MB | 🟢 -2.32% |
| 300-100-10/ (allocs/op) | 6,623,330 allocs | 6,494,601 allocs | 🟢 -1.94% |
BenchmarkChange
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| Push_10_Changes/ (ns/op) | 4.12 ms | 3.68 ms | 🟢 -10.61% |
| Push_10_Changes/ (B/op) | 204.74 KB | 216.52 KB | 🔴 +5.75% |
| Push_10_Changes/ (allocs/op) | 1,975 allocs | 2,089 allocs | 🔴 +5.77% |
| Push_100_Changes/ (ns/op) | 17.28 ms | 17.23 ms | 🟢 -0.30% |
| Push_100_Changes/ (B/op) | 1.32 MB | 1.36 MB | 🔴 +3.27% |
| Push_100_Changes/ (allocs/op) | 10,624 allocs | 10,944 allocs | 🔴 +3.01% |
| Push_1000_Changes/ (ns/op) | 148.46 ms | 149.89 ms | 🔴 +0.96% |
| Push_1000_Changes/ (B/op) | 13.21 MB | 13.36 MB | 🔴 +1.18% |
| Push_1000_Changes/ (allocs/op) | 99,898 allocs | 100,591 allocs | 🔴 +0.69% |
| Pull_10_Changes/ (ns/op) | 2.40 ms | 1.98 ms | 🟢 -17.60% |
| Pull_10_Changes/ (B/op) | 143.44 KB | 167.91 KB | 🔴 +17.06% |
| Pull_10_Changes/ (allocs/op) | 1,571 allocs | 1,842 allocs | 🔴 +17.25% |
| Pull_100_Changes/ (ns/op) | 3.47 ms | 3.07 ms | 🟢 -11.48% |
| Pull_100_Changes/ (B/op) | 701.24 KB | 781.28 KB | 🔴 +11.41% |
| Pull_100_Changes/ (allocs/op) | 6,129 allocs | 7,004 allocs | 🔴 +14.28% |
| Pull_1000_Changes/ (ns/op) | 10.98 ms | 10.90 ms | 🟢 -0.80% |
| Pull_1000_Changes/ (B/op) | 6.16 MB | 6.25 MB | 🔴 +1.61% |
| Pull_1000_Changes/ (allocs/op) | 53,115 allocs | 54,602 allocs | 🔴 +2.80% |
BenchmarkSnapshot
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| Push_3KB_snapshot/ (ns/op) | 17.81 ms | 17.61 ms | 🟢 -1.11% |
| Push_3KB_snapshot/ (B/op) | 1.51 MB | 1.62 MB | 🔴 +7.65% |
| Push_3KB_snapshot/ (allocs/op) | 13,149 allocs | 14,961 allocs | 🔴 +13.78% |
| Push_30KB_snapshot/ (ns/op) | 153.99 ms | 154.38 ms | 🔴 +0.25% |
| Push_30KB_snapshot/ (B/op) | 18.84 MB | 19.34 MB | 🔴 +2.65% |
| Push_30KB_snapshot/ (allocs/op) | 185,402 allocs | 196,567 allocs | 🔴 +6.02% |
| Pull_3KB_snapshot/ (ns/op) | 4.13 ms | 3.73 ms | 🟢 -9.72% |
| Pull_3KB_snapshot/ (B/op) | 1.16 MB | 1.28 MB | 🔴 +9.96% |
| Pull_3KB_snapshot/ (allocs/op) | 15,285 allocs | 16,730 allocs | 🔴 +9.45% |
| Pull_30KB_snapshot/ (ns/op) | 17.58 ms | 15.00 ms | 🟢 -14.66% |
| Pull_30KB_snapshot/ (B/op) | 10.80 MB | 10.96 MB | 🔴 +1.51% |
| Pull_30KB_snapshot/ (allocs/op) | 142,560 allocs | 144,249 allocs | 🔴 +1.18% |
BenchmarkSplayTree
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| stress_test_100000/ (ns/op) | 0.18 ns | 0.17 ns | 🟢 -4.92% |
| stress_test_100000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| stress_test_100000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| stress_test_200000/ (ns/op) | 0.36 ns | 0.35 ns | 🟢 -1.40% |
| stress_test_200000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| stress_test_200000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| stress_test_300000/ (ns/op) | 0.54 ns | 0.53 ns | 🟢 -2.43% |
| stress_test_300000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| stress_test_300000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| random_access_100000/ (ns/op) | 0.01 ns | 0.02 ns | 🔴 +32.46% |
| random_access_100000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| random_access_100000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| random_access_200000/ (ns/op) | 0.03 ns | 0.03 ns | 🟢 -8.00% |
| random_access_200000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| random_access_200000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| random_access_300000/ (ns/op) | 0.05 ns | 0.05 ns | 🟢 -1.83% |
| random_access_300000/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| random_access_300000/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
| editing_trace_bench/ (ns/op) | 0.00 ns | 0.00 ns | 🔴 +19.93% |
| editing_trace_bench/ (B/op) | 0.00 B | 0.00 B | ⚪ 0% |
| editing_trace_bench/ (allocs/op) | 0 allocs | 0 allocs | ⚪ 0% |
BenchmarkSync
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| memory_sync_10_test/ (ns/op) | 6759.00 ns | 6874.00 ns | 🔴 +1.70% |
| memory_sync_10_test/ (B/op) | 1.34 KB | 1.34 KB | 🟢 -0.15% |
| memory_sync_10_test/ (allocs/op) | 35 allocs | 35 allocs | ⚪ 0% |
| memory_sync_100_test/ (ns/op) | 54975.00 ns | 55461.00 ns | 🔴 +0.88% |
| memory_sync_100_test/ (B/op) | 9.53 KB | 9.47 KB | 🟢 -0.61% |
| memory_sync_100_test/ (allocs/op) | 269 allocs | 267 allocs | 🟢 -0.74% |
| memory_sync_1000_test/ (ns/op) | 597216.00 ns | 598385.00 ns | 🔴 +0.20% |
| memory_sync_1000_test/ (B/op) | 76.53 KB | 75.93 KB | 🟢 -0.78% |
| memory_sync_1000_test/ (allocs/op) | 2,133 allocs | 2,119 allocs | 🟢 -0.66% |
| memory_sync_10000_test/ (ns/op) | 7.59 ms | 7.31 ms | 🟢 -3.73% |
| memory_sync_10000_test/ (B/op) | 762.26 KB | 731.50 KB | 🟢 -4.03% |
| memory_sync_10000_test/ (allocs/op) | 20,551 allocs | 20,257 allocs | 🟢 -1.43% |
BenchmarkSyncConcurrency
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| 1-100-10/ (ns/op) | 8.01 s | 7.63 s | 🟢 -4.71% |
| 1-100-10/ (B/op) | 4.33 GB | 3.95 GB | 🟢 -8.90% |
| 1-100-10/ (allocs/op) | 106,562,743 allocs | 106,227,255 allocs | 🟢 -0.31% |
| 100-100-10/ (ns/op) | 9.05 s | 8.55 s | 🟢 -5.49% |
| 100-100-10/ (B/op) | 4.43 GB | 4.03 GB | 🟢 -9.15% |
| 100-100-10/ (allocs/op) | 107,728,269 allocs | 107,190,889 allocs | 🟢 -0.50% |
| 300_100-10/ (ns/op) | 11.26 s | 10.37 s | 🟢 -7.91% |
| 300_100-10/ (B/op) | 4.67 GB | 4.28 GB | 🟢 -8.36% |
| 300_100-10/ (allocs/op) | 110,650,518 allocs | 109,596,723 allocs | 🟢 -0.95% |
BenchmarkTextEditing
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| (ns/op) | 5.64 s | 5.24 s | 🟢 -7.02% |
| (B/op) | 3.88 GB | 3.88 GB | ⚪ 0% |
| (allocs/op) | 20,100,376 allocs | 20,100,549 allocs | ⚪ 0% |
BenchmarkTree
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| 10000_vertices_to_protobuf/ (ns/op) | 4.84 ms | 4.55 ms | 🟢 -5.93% |
| 10000_vertices_to_protobuf/ (B/op) | 6.68 MB | 6.68 MB | ⚪ 0% |
| 10000_vertices_to_protobuf/ (allocs/op) | 90,026 allocs | 90,026 allocs | ⚪ 0% |
| 10000_vertices_from_protobuf/ (ns/op) | 223.90 ms | 188.89 ms | 🟢 -15.64% |
| 10000_vertices_from_protobuf/ (B/op) | 442.15 MB | 442.15 MB | ⚪ 0% |
| 10000_vertices_from_protobuf/ (allocs/op) | 280,047 allocs | 280,045 allocs | ⚪ 0% |
| 20000_vertices_to_protobuf/ (ns/op) | 10.28 ms | 9.24 ms | 🟢 -10.19% |
| 20000_vertices_to_protobuf/ (B/op) | 13.53 MB | 13.53 MB | ⚪ 0% |
| 20000_vertices_to_protobuf/ (allocs/op) | 180,029 allocs | 180,029 allocs | ⚪ 0% |
| 20000_vertices_from_protobuf/ (ns/op) | 898.93 ms | 717.70 ms | 🟢 -20.16% |
| 20000_vertices_from_protobuf/ (B/op) | 1.70 GB | 1.70 GB | ⚪ 0% |
| 20000_vertices_from_protobuf/ (allocs/op) | 560,078 allocs | 560,125 allocs | ⚪ 0% |
| 30000_vertices_to_protobuf/ (ns/op) | 15.84 ms | 14.52 ms | 🟢 -8.30% |
| 30000_vertices_to_protobuf/ (B/op) | 19.94 MB | 19.94 MB | ⚪ 0% |
| 30000_vertices_to_protobuf/ (allocs/op) | 270,030 allocs | 270,030 allocs | ⚪ 0% |
| 30000_vertices_from_protobuf/ (ns/op) | 2.01 s | 1.58 s | 🟢 -21.32% |
| 30000_vertices_from_protobuf/ (B/op) | 3.75 GB | 3.75 GB | ⚪ 0% |
| 30000_vertices_from_protobuf/ (allocs/op) | 840,235 allocs | 840,106 allocs | 🟢 -0.02% |
BenchmarkVersionVector
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| clients_10/ (ns/op) | 128.24 ms | 104.73 ms | 🟢 -18.33% |
| clients_10/ (1_changepack(bytes)) | 185.00 B | 185.00 B | ⚪ 0% |
| clients_10/ (2_snapshot(bytes)) | 399.00 B | 399.00 B | ⚪ 0% |
| clients_10/ (3_pushpull(ms)) | 4.00 ms | 2.00 ms | 🟢 -50.00% |
| clients_10/ (4_attach(ms)) | 5.00 ms | 5.00 ms | ⚪ 0% |
| clients_10/ (5_changepack_after_detach(bytes)) | 230.00 B | 230.00 B | ⚪ 0% |
| clients_10/ (6_snapshot_after_detach(bytes)) | 156.00 B | 156.00 B | ⚪ 0% |
| clients_10/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| clients_10/ (B/op) | 9.08 MB | 6.44 MB | 🟢 -29.06% |
| clients_10/ (allocs/op) | 66,128 allocs | 56,805 allocs | 🟢 -14.10% |
| clients_100/ (ns/op) | 1.05 s | 882.57 ms | 🟢 -16.28% |
| clients_100/ (1_changepack(bytes)) | 185.00 B | 185.00 B | ⚪ 0% |
| clients_100/ (2_snapshot(bytes)) | 3.10 KB | 3.10 KB | ⚪ 0% |
| clients_100/ (3_pushpull(ms)) | 4.00 ms | 2.00 ms | 🟢 -50.00% |
| clients_100/ (4_attach(ms)) | 6.00 ms | 6.00 ms | ⚪ 0% |
| clients_100/ (5_changepack_after_detach(bytes)) | 231.00 B | 231.00 B | ⚪ 0% |
| clients_100/ (6_snapshot_after_detach(bytes)) | 156.00 B | 156.00 B | ⚪ 0% |
| clients_100/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| clients_100/ (B/op) | 77.37 MB | 60.55 MB | 🟢 -21.73% |
| clients_100/ (allocs/op) | 650,781 allocs | 582,131 allocs | 🟢 -10.55% |
| clients_1000/ (ns/op) | 12.26 s | 11.01 s | 🟢 -10.20% |
| clients_1000/ (1_changepack(bytes)) | 186.00 B | 186.00 B | ⚪ 0% |
| clients_1000/ (2_snapshot(bytes)) | 30.10 KB | 30.10 KB | ⚪ 0% |
| clients_1000/ (3_pushpull(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| clients_1000/ (4_attach(ms)) | 19.00 ms | 20.00 ms | 🔴 +5.26% |
| clients_1000/ (5_changepack_after_detach(bytes)) | 231.00 B | 231.00 B | ⚪ 0% |
| clients_1000/ (6_snapshot_after_detach(bytes)) | 156.00 B | 156.00 B | ⚪ 0% |
| clients_1000/ (7_pushpull_after_detach(ms)) | 5.00 ms | 3.00 ms | 🟢 -40.00% |
| clients_1000/ (B/op) | 1.48 GB | 1.42 GB | 🟢 -4.02% |
| clients_1000/ (allocs/op) | 18,228,809 allocs | 17,889,022 allocs | 🟢 -1.86% |
BenchmarkWebhook
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| Send_10_Webhooks_to_10_Endpoints/ (ns/op) | 12.74 ms | 12.36 ms | 🟢 -2.97% |
| Send_10_Webhooks_to_10_Endpoints/ (B/op) | 808.37 KB | 807.97 KB | 🟢 -0.05% |
| Send_10_Webhooks_to_10_Endpoints/ (allocs/op) | 10,116 allocs | 10,114 allocs | 🟢 -0.02% |
| Send_100_Webhooks_to_10_Endpoints/ (ns/op) | 128.82 ms | 124.16 ms | 🟢 -3.61% |
| Send_100_Webhooks_to_10_Endpoints/ (B/op) | 8.09 MB | 8.08 MB | 🟢 -0.05% |
| Send_100_Webhooks_to_10_Endpoints/ (allocs/op) | 101,169 allocs | 101,144 allocs | 🟢 -0.02% |
| Send_10_Webhooks_to_100_Endpoints/ (ns/op) | 130.98 ms | 126.87 ms | 🟢 -3.14% |
| Send_10_Webhooks_to_100_Endpoints/ (B/op) | 8.28 MB | 8.28 MB | 🟢 -0.04% |
| Send_10_Webhooks_to_100_Endpoints/ (allocs/op) | 102,534 allocs | 102,524 allocs | ⚪ 0% |
| Send_100_Webhooks_to_100_Endpoints/ (ns/op) | 1.29 s | 1.25 s | 🟢 -3.37% |
| Send_100_Webhooks_to_100_Endpoints/ (B/op) | 82.37 MB | 82.34 MB | 🟢 -0.03% |
| Send_100_Webhooks_to_100_Endpoints/ (allocs/op) | 1,022,424 allocs | 1,022,219 allocs | 🟢 -0.02% |
| Send_10_Webhooks_to_1000_Endpoints/ (ns/op) | 2.82 s | 2.83 s | 🔴 +0.31% |
| Send_10_Webhooks_to_1000_Endpoints/ (B/op) | 209.68 MB | 209.12 MB | 🟢 -0.27% |
| Send_10_Webhooks_to_1000_Endpoints/ (allocs/op) | 1,716,612 allocs | 1,715,619 allocs | 🟢 -0.06% |
BenchmarkWebhookWithLimit
| Benchmark suite | Previous | Current | Change |
|---|---|---|---|
| Send_10_Webhooks_to_10_Endpoints_with_limit/ (ns/op) | 3.90 ms | 3.84 ms | 🟢 -1.44% |
| Send_10_Webhooks_to_10_Endpoints_with_limit/ (B/op) | 306.37 KB | 306.17 KB | 🟢 -0.06% |
| Send_10_Webhooks_to_10_Endpoints_with_limit/ (allocs/op) | 2,913 allocs | 2,912 allocs | 🟢 -0.03% |
| Send_100_Webhooks_to_10_Endpoints_with_limit/ (ns/op) | 4.13 ms | 4.15 ms | 🔴 +0.42% |
| Send_100_Webhooks_to_10_Endpoints_with_limit/ (B/op) | 428.82 KB | 428.56 KB | 🟢 -0.06% |
| Send_100_Webhooks_to_10_Endpoints_with_limit/ (allocs/op) | 4,713 allocs | 4,712 allocs | 🟢 -0.02% |
| Send_10_Webhooks_to_100_Endpoints_with_limit/ (ns/op) | 38.94 ms | 38.63 ms | 🟢 -0.79% |
| Send_10_Webhooks_to_100_Endpoints_with_limit/ (B/op) | 3.08 MB | 3.08 MB | 🟢 -0.14% |
| Send_10_Webhooks_to_100_Endpoints_with_limit/ (allocs/op) | 29,179 allocs | 29,151 allocs | 🟢 -0.10% |
| Send_100_Webhooks_to_100_Endpoints_with_limit/ (ns/op) | 41.65 ms | 42.35 ms | 🔴 +1.66% |
| Send_100_Webhooks_to_100_Endpoints_with_limit/ (B/op) | 4.37 MB | 4.37 MB | 🟢 -0.12% |
| Send_100_Webhooks_to_100_Endpoints_with_limit/ (allocs/op) | 47,188 allocs | 47,162 allocs | 🟢 -0.06% |
| Send_10_Webhooks_to_1000_Endpoints_with_limit/ (ns/op) | 373.38 ms | 366.87 ms | 🟢 -1.74% |
| Send_10_Webhooks_to_1000_Endpoints_with_limit/ (B/op) | 44.51 MB | 44.43 MB | 🟢 -0.18% |
| Send_10_Webhooks_to_1000_Endpoints_with_limit/ (allocs/op) | 380,231 allocs | 379,933 allocs | 🟢 -0.08% |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1424 +/- ##
==========================================
- Coverage 38.77% 38.54% -0.23%
==========================================
Files 184 185 +1
Lines 31249 32447 +1198
==========================================
+ Hits 12116 12507 +391
- Misses 18159 18940 +781
- Partials 974 1000 +26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
k6 Load Test 📊
This is a comparison result between the previous(aeb1b83) and the current commit(5a3ee2e).
Change
| Metric | Previous | Current | Change |
|---|---|---|---|
| transaction_time.avg | 6812.78 | 6641.67 | 🟢 -2.51% |
| http_req_duration.avg | 85.20 | 62.95 | 🟢 -26.12% |
| http_req_duration.p(95) | 378.06 | 286.55 | 🟢 -24.21% |
| iteration_duration.avg | 6812.82 | 6641.70 | 🟢 -2.51% |
| data_received.count | 1,097,509,209 | 1,164,465,978 | 🔴 +6.10% |
| data_sent.count | 59,663,788 | 61,282,612 | 🔴 +2.71% |
Result
Details
script: test/k6/presence.ts
scenarios: (100.00%) 1 scenario, 500 max VUs, 3m0s max duration (incl. graceful stop):
* default: Up to 500 looping VUs for 2m30s over 3 stages (gracefulRampDown: 30s, gracefulStop: 30s)
█ THRESHOLDS
active_clients_time
✓ 'p(95)<500' p(95)=95ms
attach_documents_time
✓ 'p(95)<10000' p(95)=328.94ms
deactivate_clients_time
✓ 'p(95)<10000' p(95)=460ms
http_req_duration
✓ 'p(95)<10000' p(95)=286.54ms
pushpulls_time
✓ 'p(95)<1000' p(95)=304ms
transaction_success_rate
✓ 'rate>0.99' rate=100.00%
transaction_time
✓ 'p(95)<30000' p(95)=7.5s
█ TOTAL RESULTS
checks_total.......: 130272 836.639481/s
checks_succeeded...: 100.00% 130272 out of 130272
checks_failed......: 0.00% 0 out of 130272
✓ status is 200
✓ response has valid body
CUSTOM
active_clients....................: 8142 52.289968/s
active_clients_success_rate.......: 100.00% 8142 out of 8142
active_clients_time...............: avg=21ms min=1ms med=5ms max=469ms p(90)=58ms p(95)=95ms
attach_documents..................: 8142 52.289968/s
attach_documents_success_rate.....: 100.00% 8142 out of 8142
attach_documents_time.............: avg=92.83ms min=3ms med=40ms max=1.15s p(90)=250ms p(95)=328.94ms
deactivate_clients................: 8142 52.289968/s
deactivate_clients_success_rate...: 100.00% 8142 out of 8142
deactivate_clients_time...........: avg=116.96ms min=3ms med=42ms max=1.18s p(90)=346ms p(95)=460ms
pushpulls.........................: 40710 261.449838/s
pushpulls_success_rate............: 100.00% 40710 out of 40710
pushpulls_time....................: avg=67.52ms min=1ms med=21ms max=1.2s p(90)=193ms p(95)=304ms
transaction_success_rate..........: 100.00% 8142 out of 8142
transaction_time..................: avg=6.64s min=6.01s med=6.64s max=8.67s p(90)=7.28s p(95)=7.5s
HTTP
http_req_duration.................: avg=62.95ms min=708.95µs med=14.73ms max=1.18s p(90)=183.55ms p(95)=286.54ms
{ expected_response:true }......: avg=62.95ms min=708.95µs med=14.73ms max=1.18s p(90)=183.55ms p(95)=286.54ms
http_req_failed...................: 0.00% 0 out of 65136
http_reqs.........................: 65136 418.31974/s
EXECUTION
iteration_duration................: avg=6.64s min=6.01s med=6.64s max=8.67s p(90)=7.28s p(95)=7.5s
iterations........................: 8142 52.289968/s
vus...............................: 1 min=1 max=500
vus_max...........................: 500 min=500 max=500
NETWORK
data_received.....................: 1.2 GB 7.5 MB/s
data_sent.........................: 61 MB 394 kB/s
There was a problem hiding this comment.
Pull Request Overview
This PR introduces an in-memory caching system for client information to optimize MongoDB write performance during push-pull operations. The caching layer aims to reduce database load and latency by serving frequently accessed client data from memory.
Key changes include:
- Implementation of a comprehensive client info caching system with adaptive flushing
- Integration of cache operations into existing client lifecycle methods
- Addition of cache metrics and monitoring capabilities
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| server/backend/database/mongo/client_cache.go | New cache implementation with adaptive flushing, metrics tracking, and cache management |
| server/backend/database/mongo/client.go | Integration of cache operations into client lifecycle methods and database queries |
| server/backend/backend.go | Added cache cleanup in shutdown sequence |
| if err == mongo.ErrNoDocuments { | ||
| return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound) | ||
| } | ||
| return nil, fmt.Errorf("decode client info: %w", err) |
There was a problem hiding this comment.
This error handling is incorrect. The code should return the original error from result.Err() when it's not mongo.ErrNoDocuments, but instead it's trying to decode and returning a decode error. The logic should handle result.Err() first before attempting to decode.
| return nil, fmt.Errorf("decode client info: %w", err) | ||
| } | ||
|
|
||
| if err := c.clientInfoCache.Set(refKey, &clientInfo); err != nil { |
There was a problem hiding this comment.
This line is unreachable because the function returns in the previous error handling block on line 1033. The decode logic and cache setting should only happen after successful result.Err() handling.
There was a problem hiding this comment.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
server/backend/database/mongo/client.go (1)
802-818: Consider cache-first approach in ActivateClientThe method records a cache miss (line 804) but doesn't actually check the cache first. Consider checking if the client already exists in cache before performing the database operation.
Additionally, as mentioned in past reviews, consider using
$maxoperator forupdated_atto prevent timestamp regression in concurrent scenarios:res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{ "project_id": projectID, "key": key, }, bson.M{ "$set": bson.M{ StatusKey: database.ClientActivated, - "updated_at": now, "metadata": metadata, }, + "$max": bson.M{ + "updated_at": now, + }, }, options.UpdateOne().SetUpsert(true))
♻️ Duplicate comments (1)
server/backend/database/mongo/client.go (1)
1024-1034: Fix incorrect error handling logicThe error handling is incorrect. When
result.Err()is notmongo.ErrNoDocuments, the original error should be returned instead of attempting to decode (which is unreachable).clientInfo := database.ClientInfo{} -if err := result.Decode(&clientInfo); err != nil { - if err == mongo.ErrNoDocuments { - return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound) - } - return nil, fmt.Errorf("decode client info: %w", err) -} +if result.Err() != nil { + if result.Err() == mongo.ErrNoDocuments { + return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound) + } + return nil, fmt.Errorf("find client: %w", result.Err()) +} +if err := result.Decode(&clientInfo); err != nil { + return nil, fmt.Errorf("decode client info: %w", err) +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/backend/database/mongo/client.go(16 hunks)server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
server/backend/database/mongo/client.go (9)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/doc_info.go (1)
DocInfo(27-60)server/backend/database/mongo/changestore.go (1)
ChangeStore(41-44)pkg/cmap/cmap.go (1)
Map(36-38)api/types/id.go (1)
ID(35-35)pkg/document/time/version_vector.go (1)
VersionVector(33-33)server/backend/database/mongo/client_cache.go (3)
ClientInfoCache(99-110)NewClientInfoCache(113-135)CacheMetrics(34-48)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/client_info.go (6)
ClientActivated(43-43)DocumentAttached(49-49)ClientDocInfo(56-60)DocumentAttaching(48-48)ClientDeactivated(42-42)ClientInfo(66-92)
server/backend/database/mongo/client_cache.go (5)
server/backend/database/client_info.go (4)
ClientInfo(66-92)DocumentDetached(50-50)DocumentRemoved(51-51)ClientDocInfo(56-60)api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/mongo/client.go (2)
Client(51-59)StatusKey(47-47)server/logging/logging.go (1)
DefaultLogger(85-90)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: complex-test
- GitHub Check: load-test
- GitHub Check: build
- GitHub Check: bench
🔇 Additional comments (6)
server/backend/database/mongo/client_cache.go (3)
549-692: Well-implemented metrics systemThe metrics implementation is comprehensive with proper synchronization, detailed tracking per operation type, and informative logging with calculated hit rates.
236-239: Use database package constants for document statusThe code uses string literals "detached" and "removed" but should use the constants from the database package for consistency.
- if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved { + if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved {Note: Based on the relevant code snippets, these constants are defined in
server/backend/database/mongo/client.goasDocumentDetached = "detached"andDocumentRemoved = "removed". Ensure they're properly accessible or consider moving them to the database package for shared use.Likely an incorrect or invalid review comment.
50-79: Missing TTL field initialization in DefaultCacheConfigThe
CacheConfigstruct defines aTTLfield that is used in the cache operations (e.g., line 214 inUpdateClientInfo), but it's never initialized inDefaultCacheConfig(). This will cause the zero value (0s) to be used, potentially causing immediate expiration.Add the TTL field to the config:
type CacheConfig struct { // BaseFlushInterval is the base interval for flushing cache to DB BaseFlushInterval time.Duration // MaxFlushInterval is the maximum interval for flushing cache to DB MaxFlushInterval time.Duration // MinFlushInterval is the minimum interval for flushing cache to DB MinFlushInterval time.Duration // MaxCacheSize is the maximum number of cached entries MaxCacheSize int // WritePressureThreshold is the threshold for write pressure detection WritePressureThreshold int // PressureCheckInterval is the interval for checking write pressure PressureCheckInterval time.Duration // EnableFlushCleanup enables cache cleanup after flush EnableFlushCleanup bool + // TTL is the time-to-live for cache entries + TTL time.Duration } func DefaultCacheConfig() *CacheConfig { return &CacheConfig{ BaseFlushInterval: 10 * time.Second, MaxFlushInterval: 60 * time.Second, MinFlushInterval: 5 * time.Second, MaxCacheSize: 20000, WritePressureThreshold: 500, PressureCheckInterval: 10 * time.Second, EnableFlushCleanup: true, + TTL: 5 * time.Minute, } }Likely an incorrect or invalid review comment.
server/backend/database/mongo/client.go (3)
120-131: Proper Client initialization patternThe initialization now correctly creates a single Client instance and assigns the cache, fixing the previous double-initialization issue.
1369-1386: Good implementation of selective cache invalidationThe selective cache invalidation has been properly implemented as suggested in past reviews, with appropriate fallback to full invalidation when needed.
2313-2327: Clean implementation of cache metrics methodsThe cache metrics methods are well-implemented with proper nil checks for safety.
| docKey := clientDocInfoKey(docID, "server_seq") | ||
| updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq | ||
|
|
||
| docKey = clientDocInfoKey(docID, "client_seq") | ||
| updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq | ||
|
|
||
| docKey = clientDocInfoKey(docID, StatusKey) | ||
| updateDoc["$set"].(bson.M)[docKey] = docInfo.Status |
There was a problem hiding this comment.
Missing clientDocInfoKey helper function
The code uses clientDocInfoKey function which is not defined in this file. This function is defined in client.go but creates an unnecessary coupling between files.
Either:
- Import or define the function in this file:
// clientDocInfoKey returns the key for the client document info.
func clientDocInfoKey(docID types.ID, prefix string) string {
return fmt.Sprintf("documents.%s.%s", docID, prefix)
}- Or make it a method on the cache struct to keep it encapsulated.
🤖 Prompt for AI Agents
In server/backend/database/mongo/client_cache.go around lines 487 to 494, the
helper clientDocInfoKey used to build document map keys is missing from this
file (it currently exists in client.go and causes cross-file coupling); add a
local implementation or move it to the cache type as a method so this file is
self-contained — implement a function or method that accepts (docID types.ID,
prefix string) and returns the key string formatted as "documents.%s.%s" (use
fmt.Sprintf and the correct types import), then replace calls to
clientDocInfoKey with the new local function/method and remove any dependency on
client.go for this helper.
There was a problem hiding this comment.
Actionable comments posted: 4
🔭 Outside diff range comments (2)
server/clients/housekeeping.go (1)
62-67: Count only actual deactivations; skip no-ops cleanlyWith the new housekeeping path, the DB may return (nil, nil) when no state change was performed (not found or already deactivated). The current code increments deactivatedCount unconditionally, which can inflate metrics. Increment only when a deactivation actually occurred.
Apply this diff:
- if _, err := DeactivateForHousekeeping(ctx, be, pair.Project.ToProject(), pair.Client.RefKey()); err != nil { - return database.DefaultProjectID, err - } - - deactivatedCount++ + if info, err := DeactivateForHousekeeping(ctx, be, pair.Project.ToProject(), pair.Client.RefKey()); err != nil { + return database.DefaultProjectID, err + } else if info != nil { + deactivatedCount++ + }server/backend/database/mongo/client.go (1)
804-845: Consolidate to a single atomic upsert and prevent updated_at regressionThe current flow performs UpdateOne, followed by a conditional FindOneAndUpdate for created_at, and then an extra FindOne in the non-upsert case. This has two issues:
- Multiple round-trips that can be collapsed.
- Potential timestamp regression if concurrent calls write out-of-order (an older updated_at can overwrite a newer one).
Use a single FindOneAndUpdate with:
- $set for status and metadata
- $setOnInsert for created_at
- $max for updated_at to avoid regressing to an older timestamp
- ReturnDocument(After) to return the final state in one call.
now := gotime.Now() - - c.clientInfoCache.recordActivateClientMiss() - res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }, bson.M{ - "$set": bson.M{ - StatusKey: database.ClientActivated, - "updated_at": now, - "metadata": metadata, - }, - }, options.UpdateOne().SetUpsert(true)) - if err != nil { - return nil, fmt.Errorf("upsert client: %w", err) - } - - var result *mongo.SingleResult - if res.UpsertedCount > 0 { - result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{ - "project_id": projectID, - "_id": res.UpsertedID, - }, bson.M{ - "$set": bson.M{ - "created_at": now, - }, - }) - } else { - result = c.collection(ColClients).FindOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }) - } + c.clientInfoCache.recordActivateClientMiss() + result := c.collection(ColClients).FindOneAndUpdate( + ctx, + bson.M{ + "project_id": projectID, + "key": key, + }, + bson.M{ + "$set": bson.M{ + StatusKey: database.ClientActivated, + "metadata": metadata, + }, + "$setOnInsert": bson.M{ + "created_at": now, + }, + "$max": bson.M{ + "updated_at": now, + }, + }, + options.FindOneAndUpdate(). + SetUpsert(true). + SetReturnDocument(options.After), + ) clientInfo := database.ClientInfo{} if err = result.Decode(&clientInfo); err != nil { return nil, fmt.Errorf("decode client info: %w", err) }
♻️ Duplicate comments (2)
server/backend/database/mongo/client_cache.go (2)
488-488: Missing clientDocInfoKey helper functionThe code uses
clientDocInfoKeyfunction which is not defined in this file. This function is referenced on lines 488, 491, 494, 878, 883, and 884 but is not implemented.Based on the past review comments, this issue was already identified. Please add the missing function to make this file self-contained:
+// clientDocInfoKey returns the key for the client document info. +func clientDocInfoKey(docID types.ID, prefix string) string { + return fmt.Sprintf("documents.%s.%s", docID, prefix) +}
83-89: Missing ExpiresAt field in CachedClientInfo structThe struct is missing the
ExpiresAtfield that was referenced in previous code versions and might be needed for TTL functionality mentioned in the configuration.Add the missing field if TTL functionality is planned:
type CachedClientInfo struct { ClientInfo *database.ClientInfo UpdatedAt time.Time Dirty bool LastFlush time.Time + ExpiresAt time.Time }
🧹 Nitpick comments (5)
server/backend/database/mongo/client.go (2)
1088-1117: Write-back only on UpdateClientInfoAfterPushPull: add resilience to flush failuresThis switches to a write-back-only path for client updates. If FlushToDB() fails, dirty data can linger longer than desired.
- Add retry/backoff around periodic flush failures.
- Consider immediate flush on specific transitions (e.g., detach, deactivate) or when write pressure is high.
- Ensure Backend.Shutdown always calls CloseClientInfoCache() before disconnecting Mongo to flush outstanding changes (verified elsewhere).
1213-1254: Cache-first attached-clients lookup is good; verify DB fallback field/indexThe DB fallback filters by "attached_docs". Ensure:
- The cache flusher maintains "attached_docs" alongside Documents.
- An appropriate index exists on {project_id, status, attached_docs} to make this query efficient.
If not, consider updating indexes and the update model builder.
server/backend/database/mongo/client_cache.go (3)
674-743: Consider write-back consistency for UpdateCheckpointThe
UpdateCheckpointmethod uses write-back strategy with $max operator logic, but there's no mechanism to ensure consistency if multiple concurrent updates occur between cache and database flushes. Consider adding optimistic concurrency control or documenting the expected usage pattern.Consider adding a comment to document the concurrency expectations:
// UpdateCheckpoint updates only the checkpoint (ServerSeq, ClientSeq) for a specific document // This uses Write-back strategy with max operator for consistency +// Note: Concurrent updates between cache flushes may result in lost updates. +// Callers should ensure proper synchronization at the application level. func (c *ClientInfoCache) UpdateCheckpoint(refKey types.ClientRefKey, docID types.ID, cp change.Checkpoint) error {
745-774: Potential race condition in write-through operationsThe
UpdateStatusandUpdateDocumentStatusmethods unlock the mutex before calling the DB update function, then re-acquire it. This creates a window where the cache entry could be modified by another goroutine between the DB update and cache update, potentially leading to inconsistent state.Consider refactoring to maintain consistency:
// UpdateStatus updates the client status using CAS Write-through strategy func (c *ClientInfoCache) UpdateStatus(refKey types.ClientRefKey, status string) error { - // Check if update is needed by comparing with cache first - c.mu.Lock() - if cached, exists := c.cache[refKey]; exists { - if cached.ClientInfo.Status == status { - // No change needed, skip database update - c.mu.Unlock() - return nil - } - } - c.mu.Unlock() + c.mu.Lock() + defer c.mu.Unlock() + + // Check if update is needed + if cached, exists := c.cache[refKey]; exists { + if cached.ClientInfo.Status == status { + return nil // No change needed + } + } - // Update in DB (Write-through) + // Temporarily unlock for DB operation + c.mu.Unlock() if err := c.updateStatusInDB(refKey, status); err != nil { + c.mu.Lock() // Re-acquire for defer return fmt.Errorf("update status in DB: %w", err) } + c.mu.Lock() - // Then update cache - c.mu.Lock() - defer c.mu.Unlock() - + // Update cache after successful DB write if cached, exists := c.cache[refKey]; exists { cached.ClientInfo.Status = status cached.UpdatedAt = time.Now() - // Not dirty since already written to DB cached.Dirty = false } return nil }Also applies to: 776-827
429-473: Optimize bulk write performanceThe
FlushToDBmethod processes all dirty entries in a single bulk operation, which is good for performance. However, consider adding batch size limits to avoid potential memory issues or MongoDB limits with very large caches.Consider adding batch size configuration:
// CacheConfig defines configuration parameters for the cache type CacheConfig struct { // ... existing fields ... + // MaxBatchSize limits the number of entries processed in a single bulk write + MaxBatchSize int } // DefaultCacheConfig returns default cache configuration func DefaultCacheConfig() *CacheConfig { return &CacheConfig{ // ... existing values ... + MaxBatchSize: 1000, } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
server/backend/database/database.go(1 hunks)server/backend/database/memory/database.go(1 hunks)server/backend/database/mongo/client.go(16 hunks)server/backend/database/mongo/client_cache.go(1 hunks)server/clients/clients.go(2 hunks)server/clients/housekeeping.go(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)
server/clients/housekeeping.go (1)
server/clients/clients.go (1)
DeactivateForHousekeeping(96-144)
server/backend/database/database.go (2)
api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/client_info.go (1)
ClientInfo(66-92)
server/backend/database/memory/database.go (2)
api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/client_info.go (3)
ClientInfo(66-92)ClientActivated(43-43)ClientDeactivated(42-42)
server/clients/clients.go (5)
server/backend/backend.go (1)
Backend(47-77)api/types/resource_ref_key.go (2)
ClientRefKey(35-38)DocRefKey(24-27)server/backend/database/client_info.go (3)
ClientInfo(66-92)ClientDeactivated(42-42)DocumentAttached(49-49)server/documents/documents.go (1)
FindDocInfoByRefKey(382-388)api/types/id.go (1)
ID(35-35)
server/backend/database/mongo/client.go (7)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/doc_info.go (1)
DocInfo(27-60)pkg/cmap/cmap.go (1)
Map(36-38)pkg/document/time/version_vector.go (1)
VersionVector(33-33)server/backend/database/mongo/client_cache.go (3)
ClientInfoCache(101-111)NewClientInfoCache(114-136)CacheMetrics(36-50)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/client_info.go (6)
ClientActivated(43-43)DocumentAttached(49-49)ClientDocInfo(56-60)DocumentAttaching(48-48)ClientDeactivated(42-42)ClientInfo(66-92)
server/backend/database/mongo/client_cache.go (5)
api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/mongo/client.go (2)
Client(51-59)StatusKey(47-47)server/logging/logging.go (1)
DefaultLogger(85-90)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
🔇 Additional comments (17)
server/backend/database/mongo/client.go (11)
120-132: Client initialization and cache wiring look correctCreating the client instance first and then wiring the cache into the same instance avoids stale references and duplicate allocations. LGTM.
136-147: Ensure cache lifecycle is closed on shutdown pathsClose() intentionally does not close the clientInfoCache, relying on Backend.Shutdown to call CloseClientInfoCache() beforehand. Verify all shutdown paths (including tests/utilities that may call Client.Close directly) invoke CloseClientInfoCache() to avoid lost write-back flushes.
149-155: LGTM on explicit cache close APIProviding an explicit CloseClientInfoCache() improves lifecycle clarity and allows deterministic flush. Looks good.
855-886: Cache-first TryAttaching looks good; ensure DB state gets flushed reliablyThe cache hit path updates only the in-memory entry and returns. This depends on the write-back flusher to persist changes. Make sure flush reliability (retry/backoff, flush on shutdown) is sufficient for this critical state transition, otherwise attaching state may be lost on crash.
932-1007: LGTM on cache-first DeactivateClient with DB fallbackInput validation and preconditions are preserved in both cache and DB paths. The $expr condition correctly rejects clients with any attaching/attached documents. Looks good.
1009-1048: Mongo returns nil on no-op; ensure memory backend matchesThis method returns (nil, nil) when the client is not found or already deactivated, which is desirable for idempotent housekeeping. Please align the memory backend to match this behavior (see separate comment).
1053-1079: LGTM on cache-first FindClientInfoByRefKeyStraightforward cache read-through; on miss, populate cache. Error mapping remains consistent.
1411-1427: Selective cache invalidation is a solid improvementInvalidating only clients attached to the affected document should significantly boost cache hit rates compared to blanket invalidation. LGTM.
2246-2262: LGTM: targeted invalidation on purgeSame as above: selective invalidation over full cache clear is the right choice here.
2051-2082: LGTM on cache-first IsDocumentAttachedFast path via cache, with DB fallback that honors excludeClientID. Ensure "attached_docs" field is kept in sync by the cache flusher for correctness.
2355-2369: Metrics surfacing looks goodExposing and logging cache metrics will help validate effectiveness and tune thresholds. LGTM.
server/backend/database/mongo/client_cache.go (6)
372-374: Improved error logging for adaptive flushThe error logging has been improved compared to the previous version that silently ignored errors.
205-226: Fixed deadlock issue in UpdateClientInfoThe method correctly avoids the deadlock by directly manipulating the cache instead of calling
Set()while holding the lock.
269-283: Improved error handling for dirty entry flushesBoth
InvalidateandInvalidateAllmethods now properly return errors when flushing dirty entries fails, addressing the previous concern about silent error handling.Also applies to: 285-296
612-672: Comprehensive metrics logging implementationThe metrics logging provides detailed insights into cache performance across different operations, which is excellent for monitoring and debugging cache effectiveness.
114-136: Well-structured cache initializationThe constructor properly initializes all components and starts background goroutines for adaptive flushing and metrics logging.
229-266: Add unit tests for sequence reset on detach/removeThere are no existing tests covering how
mergeClientInfohandlesDocumentDetachedorDocumentRemoved, so we can’t be sure the reset-to-0 logic is behaving as intended. Please:
- Review the intended semantics when a document’s status changes to detached/removed:
- Should both
ServerSeqandClientSeqalways reset to0, even ifnewDocInfohas higher sequence numbers?- Or should you apply
$maxafter resetting?- Add unit tests in
*_test.gothat exercise:
- A document transitioning to detached/removed with higher
newDocInfosequences (verify you either drop or keep the new values per spec).- Normal merges (status other than detached/removed) to confirm your
$maxlogic still applies.- Target tests against
mergeClientInfoinserver/backend/database/mongo/client_cache.go(lines 229–266).
| // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes. | ||
| // This method bypasses cache and directly updates the database. | ||
| DeactivateClientForHousekeeping(ctx context.Context, refKey types.ClientRefKey) (*ClientInfo, error) | ||
|
|
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Clarify contract and align semantics across backends for housekeeping deactivation
Right now, Mongo returns (nil, nil) when the client is not found or already deactivated, while the memory backend can return a non-nil ClientInfo in the "already deactivated" case. Please document the intended contract here and align implementations accordingly.
I recommend standardizing on: return (nil, nil) if the client does not exist, belongs to another project, or is already deactivated. This enables callers (e.g., housekeeping) to treat nil as "no-op" consistently.
Apply this comment update to make the contract explicit:
- // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes.
- // This method bypasses cache and directly updates the database.
+ // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes.
+ // This method bypasses any in-memory cache and directly updates the database.
+ // Returns (nil, nil) if the client does not exist, is not in the given project,
+ // or is already deactivated (i.e., no state change was performed).
DeactivateClientForHousekeeping(ctx context.Context, refKey types.ClientRefKey) (*ClientInfo, error)🏁 Script executed:
#!/bin/bash
set -e
echo "Searching for DeactivateClientForHousekeeping occurrences in Go files"
rg -n "DeactivateClientForHousekeeping" -C2 --type=goLength of output: 2279
🏁 Script executed:
#!/bin/bash
set -e
echo "Extracting memory DB implementation for DeactivateClientForHousekeeping:"
sed -n '860,950p' server/backend/database/memory/database.go
echo
echo "Extracting Mongo client implementation for DeactivateClientForHousekeeping:"
sed -n '1000,1100p' server/backend/database/mongo/client.goLength of output: 6175
Clarify DeactivateClientForHousekeeping contract and align semantics across backends
The memory backend still returns a non-nil ClientInfo when the client is already deactivated, whereas the Mongo backend returns (nil, nil) in that case. Let’s standardize on returning (nil, nil) whenever no state change is needed (client not found, wrong project, or already deactivated).
• server/backend/database/database.go (interface): update doc comment
• server/backend/database/memory/database.go: change the “already deactivated” branch to return nil, nil
Suggested doc-comment diff in database.go:
- // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes.
- // This method bypasses cache and directly updates the database.
+ // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes.
+ // This method bypasses any in-memory cache and directly updates the database.
+ // Returns (nil, nil) if the client does not exist, is not in the given project,
+ // or is already deactivated (i.e., no state change was performed).Then, in memory/database.go, replace:
if clientInfo.Status != database.ClientActivated {
return clientInfo, nil // Already deactivated, return current state
}with:
if clientInfo.Status != database.ClientActivated {
return nil, nil // Already deactivated, no-op
}🤖 Prompt for AI Agents
In server/backend/database/database.go around lines 189-192, update the
DeactivateClientForHousekeeping doc comment to state that the method returns
(nil, nil) when no state change is needed (client not found, wrong project, or
already deactivated); then in server/backend/database/memory/database.go locate
the branch that checks if clientInfo.Status != database.ClientActivated and
change its return to nil, nil (instead of returning the current ClientInfo) so
the memory backend semantics match Mongo and the interface doc.
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
server/backend/database/mongo/client.go (1)
810-835: Prevent timestamp regression and collapse to a single atomic upsert.Concurrent ActivateClient calls may overwrite a newer updated_at with an older one. Also, you’re doing a second hop to set created_at only when inserted. Use FindOneAndUpdate with $max(updated_at) and $setOnInsert(created_at) to atomically return the updated doc and eliminate the follow-up query.
Apply this refactor:
- c.clientInfoCache.recordActivateClientMiss() - res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }, bson.M{ - "$set": bson.M{ - StatusKey: database.ClientActivated, - "updated_at": now, - "metadata": metadata, - }, - }, options.UpdateOne().SetUpsert(true)) - if err != nil { - return nil, fmt.Errorf("activate client of %s: %w", key, err) - } - - var result *mongo.SingleResult - if res.UpsertedCount > 0 { - result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{ - "project_id": projectID, - "_id": res.UpsertedID, - }, bson.M{ - "$set": bson.M{ - "created_at": now, - }, - }) - } else { - result = c.collection(ColClients).FindOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }) - } + c.clientInfoCache.recordActivateClientMiss() + result := c.collection(ColClients).FindOneAndUpdate( + ctx, + bson.M{ + "project_id": projectID, + "key": key, + }, + bson.M{ + "$set": bson.M{ + StatusKey: database.ClientActivated, + "metadata": metadata, // keep behavior of updating metadata on activation + }, + "$max": bson.M{ + "updated_at": now, // only move forward in time + }, + "$setOnInsert": bson.M{ + "created_at": now, // set at creation only + }, + }, + options.FindOneAndUpdate(). + SetUpsert(true). + SetReturnDocument(options.After), + ) + if result.Err() != nil { + return nil, fmt.Errorf("activate client of %s: %w", key, result.Err()) + }This also reduces round-trips under contention.
🧹 Nitpick comments (4)
server/backend/database/mongo/client.go (4)
1103-1106: Redundant IsAttached() check.You already validated membership via clientInfo.Documents[docInfo.ID]. Re-evaluating IsAttached without using its boolean is redundant and can be removed for clarity.
- _, err := clientInfo.IsAttached(docInfo.ID) - if err != nil { - return err - } + if _, err := clientInfo.IsAttached(docInfo.ID); err != nil { + return err + }Alternatively, rely solely on the earlier map check and drop this call if it doesn’t add semantics you need.
1108-1113: Write-back reliability for critical post-PushPull updates.This path only updates the cache; durability depends on asynchronous flush. Consider adding retry/backoff around flush on dirty entries or a write-through option for this critical path, plus metrics/logging for failed flushes to avoid silent staleness under DB issues.
Would you like me to propose a minimal retry wrapper in ClientInfoCache.FlushToDB() and surface a counter for failed flush attempts in CacheMetrics?
Also applies to: 1126-1129
1250-1267: Sort DB fallback results for deterministic ordering.The cache path returns a sorted slice; the DB fallback should match this to avoid nondeterministic order across executions.
Apply this after decoding:
var clientInfos []*database.ClientInfo if err := cursor.All(ctx, &clientInfos); err != nil { return nil, fmt.Errorf("find attached clients of %s: %w", docRefKey, err) } + sort.Slice(clientInfos, func(i, j int) bool { + return string(clientInfos[i].ID) < string(clientInfos[j].ID) + }) + return clientInfos, nil
136-138: Add changeCache.Purge() to Client.Close for symmetryThe
Shutdownmethod already callsCloseClientInfoCache()beforeDB.Close(), so the comment about closing the client-info cache is accurate. We still need to free the change cache in theClient.Close()method:In server/backend/database/mongo/client.go, update
func (c *Client) Close():if err := c.client.Disconnect(context.Background()); err != nil { return fmt.Errorf("close MongoDB client: %w", err) } c.docCache.Purge() + c.changeCache.Purge() c.vectorCache.Purge() return nil
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
server/backend/database/memory/database.go(1 hunks)server/backend/database/mongo/client.go(16 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- server/backend/database/memory/database.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
server/backend/database/mongo/client.go (10)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/doc_info.go (1)
DocInfo(27-60)server/backend/database/mongo/changestore.go (1)
ChangeStore(41-44)pkg/cmap/cmap.go (1)
Map(36-38)api/types/id.go (1)
ID(35-35)pkg/document/time/version_vector.go (1)
VersionVector(33-33)server/backend/database/mongo/client_cache.go (3)
ClientInfoCache(101-111)NewClientInfoCache(114-136)CacheMetrics(36-50)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/client_info.go (4)
ClientActivated(43-43)DocumentAttached(49-49)ClientDocInfo(56-60)ClientInfo(66-92)server/logging/logging.go (1)
DefaultLogger(85-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: bench
- GitHub Check: build
- GitHub Check: complex-test
- GitHub Check: load-test
🔇 Additional comments (15)
server/backend/database/mongo/client.go (15)
24-24: Import of sort looks good.Required for deterministic ordering in cache-returned results. No concerns.
55-58: ClientInfo cache field addition is appropriate.The new clientInfoCache field is well-scoped on Client alongside existing caches.
120-131: Single Client initialization: LGTM.This pattern avoids creating a throwaway Client and correctly wires the cache to the returned instance.
149-155: Dedicated CloseClientInfoCache(): LGTM.Clear ownership and lifecycle of the cache are well-defined.
861-893: Cache-first TryAttaching flow: LGTM.Precondition checks and in-cache transition to Attaching are clear and consistent with the write-back strategy.
894-915: DB fallback path and cache update: LGTM.The conditional update ensures the doc isn’t already attached; caching the fresh result keeps cache coherent.
Also applies to: 924-927
938-1014: Cache-first DeactivateClient flow: LGTM.Precondition validation and in-cache transition to Deactivated are sound.
1016-1055: Housekeeping path bypass: LGTM.Direct DB update with permissive nil return for “no-op” is reasonable; updating the cache afterward maintains coherence.
1060-1083: Read-through FindClientInfoByRefKey: LGTM.Hit/miss metrics and caching the DB result are correctly handled.
1425-1441: Selective invalidation on document removal: LGTM.Targeting only affected clients with a safe fallback to full invalidation is a solid balance between correctness and performance.
2272-2288: Selective invalidation on purge: LGTM.Same here—the selective-first with full fallback approach is appropriate.
2077-2091: Cache-first IsDocumentAttached: LGTM.The early-return based on cache reduces DB load; skip logic for excludeClientID is correct.
2381-2395: Cache metrics API: LGTM.Simple, useful hooks to bubble up cache health and performance.
1250-1261: attached_docs field is defined, indexed, and maintainedConfirmed that:
- An index on “attached_docs” exists in server/backend/database/mongo/indexes.go (Keys: “attached_docs”).
- Updates to the field are applied in server/backend/database/mongo/client_cache.go (lines ~498–501 via
clientInfo.AttachedDocuments()).No further action needed.
1227-1249: types.ID implements a String() method; the compile concern is unfounded
We’ve confirmed in api/types/id.go thattype ID stringis paired withfunc (id *ID) String() string { return string(*id) }As a result,
attachedClients[i].ID.String()will compile correctly. Casting tostring(ID)remains a stylistic choice, not a necessity. Please ignore this suggestion.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
server/backend/database/mongo/client.go (3)
810-851: Critical: upsert decode can fail and updated_at can regress under concurrency
- Upsert branch uses FindOneAndUpdate without ReturnDocument=After, which can decode “before” state (or none), causing ErrNoDocuments after a successful upsert.
- Plain $set of updated_at risks a slower concurrent call overwriting a newer timestamp. This can break deactivation thresholds.
Refactor to a single atomic FindOneAndUpdate with SetUpsert(true), ReturnDocument(After), use $max for updated_at, and $setOnInsert for created_at. Also set metadata through $set (or $setOnInsert if you want it immutable at creation).
Apply this diff:
- c.clientInfoCache.recordActivateClientMiss() - res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }, bson.M{ - "$set": bson.M{ - StatusKey: database.ClientActivated, - "updated_at": now, - "metadata": metadata, - }, - }, options.UpdateOne().SetUpsert(true)) - if err != nil { - return nil, fmt.Errorf("activate client of %s: %w", key, err) - } - - var result *mongo.SingleResult - if res.UpsertedCount > 0 { - result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{ - "project_id": projectID, - "_id": res.UpsertedID, - }, bson.M{ - "$set": bson.M{ - "created_at": now, - }, - }) - } else { - result = c.collection(ColClients).FindOne(ctx, bson.M{ - "project_id": projectID, - "key": key, - }) - } + result := c.collection(ColClients).FindOneAndUpdate( + ctx, + bson.M{ + "project_id": projectID, + "key": key, + }, + bson.M{ + "$set": bson.M{ + StatusKey: database.ClientActivated, + "metadata": metadata, + }, + // Prevent timestamp regression on concurrent writes + "$max": bson.M{ + "updated_at": now, + }, + // Ensure created_at exists only on insert + "$setOnInsert": bson.M{ + "created_at": now, + }, + }, + options.FindOneAndUpdate(). + SetUpsert(true). + SetReturnDocument(options.After), + )
1060-1085: Bug: swallows non-ErrNoDocuments decode errors and caches a zero-value structIf result.Decode fails with an error other than ErrNoDocuments, the function falls through, caches an empty ClientInfo, and returns it. Return the decode error instead and only cache on success.
Apply this diff:
clientInfo := database.ClientInfo{} if err := result.Decode(&clientInfo); err != nil { if err == mongo.ErrNoDocuments { return nil, fmt.Errorf("find client of %s: %w", refKey, database.ErrClientNotFound) } + return nil, fmt.Errorf("find client of %s: %w", refKey, err) } if err := c.clientInfoCache.Set(refKey, &clientInfo); err != nil { return nil, fmt.Errorf("set client info in cache: %w", err) }
2076-2107: Bug: DB errors in IsDocumentAttached return true, masking failuresThe fallback returns true for any non-ErrNoDocuments error, which can erroneously block compaction/purges and hide operational issues. Return the error instead.
Apply this diff:
result := c.collection(ColClients).FindOne(ctx, filter) if result.Err() == mongo.ErrNoDocuments { return false, nil } - - return true, nil + if result.Err() != nil { + return false, fmt.Errorf("check attachment of %s: %w", docRefKey, result.Err()) + } + return true, nil
🧹 Nitpick comments (4)
server/backend/database/mongo/client.go (4)
136-147: Also purge changeCache in Close()Minor resource hygiene: changeCache isn’t purged on shutdown while doc/vector caches are. Purge it to avoid lingering memory.
Apply this diff:
if err := c.client.Disconnect(context.Background()); err != nil { return fmt.Errorf("close MongoDB client: %w", err) } c.docCache.Purge() + c.changeCache.Purge() c.vectorCache.Purge() return nil
810-810: Metric mislabel: recording a “miss” without a cache lookupYou record an ActivateClient miss but never check the cache. Either check cache and record hit/miss or drop the metric.
861-893: Cache write-back path: ensure timely persistence and retriesOn cache hit, you mutate cached ClientInfo to DocumentAttaching and return without a guaranteed immediate DB write. Confirm UpdateClientInfo marks the entry dirty and triggers/adheres to bounded-latency flush with retries/backoff. If not, attach/deactivate races may occur or be lost on crash.
Would you like a follow-up patch to add retry/backoff and metrics for FlushToDB failures?
917-922: Nit: fix error message formattingRemove the stray space before the colon in the error message.
Apply this diff:
- return nil, fmt.Errorf("try attaching %s to %s : %w", docID, refKey.ClientID, err) + return nil, fmt.Errorf("try attaching %s to %s: %w", docID, refKey.ClientID, err)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
server/backend/database/memory/database.go(1 hunks)server/backend/database/mongo/client.go(17 hunks)server/backend/database/mongo/client_cache.go(1 hunks)server/clients/clients.go(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- server/backend/database/memory/database.go
- server/clients/clients.go
- server/backend/database/mongo/client_cache.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
server/backend/database/mongo/client.go (4)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/mongo/client_cache.go (3)
ClientInfoCache(101-111)NewClientInfoCache(114-136)CacheMetrics(36-50)server/backend/database/client_info.go (6)
ClientActivated(43-43)DocumentAttached(49-49)ClientDocInfo(56-60)DocumentAttaching(48-48)ClientDeactivated(42-42)ClientInfo(66-92)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
🔇 Additional comments (9)
server/backend/database/mongo/client.go (9)
120-131: Good fix: single Client instance initializationThis addresses the double-initialization issue and ensures the cache references the correct Client instance. Clean and correct.
938-1014: LGTM: cache-first deactivate with correctness checksProperly validates activation and no-attached-docs precondition in cache path, mirrors DB filter with $expr fallback, and updates cache on DB path. Looks solid.
1016-1055: LGTM: housekeeping deactivation bypasses cache safelyDirect DB update semantics are clear. Returning (nil, nil) on “not found/already deactivated” is acceptable for housekeeping flows and you still refresh the cache when a change occurs.
1094-1131: LGTM: safer error handling in DB-read pathUsing result.Err() for existence check and populating the cache with provided clientInfo avoids stale reads. This addresses the earlier “wrap wrong error” issue.
1226-1247: Deterministic ordering for cached attached clients — good improvementSorting by client ID after a cache hit provides stable results. Reasonable tradeoff vs. pushing sort to callers.
1424-1441: Targeted cache invalidation is a solid optimizationSelective invalidation for clients attached to the removed doc, with a safe fallback to full invalidation on failure, is the right balance.
2271-2287: Targeted cache invalidation on purge — goodMatches the removed-doc path. Fallback to full invalidation is pragmatic.
2380-2394: LGTM: cache metrics accessorsNon-intrusive accessors with nil-guard are fine. Useful for observability.
149-155: Verified: clientInfoCache closed in Backend.ShutdownThe
Backend.Shutdownmethod (server/backend/backend.go lines 229–235) already invokesmongoClient.CloseClientInfoCache()before disconnecting the DB. All shutdown paths (including Yorkie.Shutdown and tests) go through this, so no additional calls are required.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (3)
server/backend/database/mongo/client.go (1)
2342-2358: LGTM! Consistent cache invalidation pattern.The invalidation logic for document purging follows the same selective pattern as document removal, maintaining consistency.
server/backend/database/mongo/client_cache.go (2)
473-475: LGTM! Error handling improved for adaptive flush.The error logging for flush failures has been properly implemented.
612-619: Add the missing clientDocInfoKey helper function.The function is currently defined in client.go but should be local to this file to avoid cross-file coupling.
+// clientDocInfoKey returns the key for the client document info. +func clientDocInfoKey(docID types.ID, prefix string) string { + return fmt.Sprintf("documents.%s.%s", docID, prefix) +} + // Add document-specific updates for docID, docInfo := range clientInfo.Documents { docKey := clientDocInfoKey(docID, "server_seq")
🧹 Nitpick comments (7)
test/integration/routing_test.go (2)
40-114: Consider extracting test constants and improving documentation.The test validates cross-server routing well, but could benefit from:
- Extracting magic numbers into named constants
- Adding documentation about the test scenario
func TestClientRouting(t *testing.T) { ctx := context.Background() + // Test constants + const ( + testDelay = 150 * time.Millisecond // Delay to reduce cross-server write races + numUpdates = 2 // Number of updates per client + maxRetries = 5 // Maximum retry attempts for attachment + ) + const clientKey = "C1-routing-test" docKey := helper.TestDocKey(t) + // This test validates that a client can seamlessly move between servers + // while maintaining document state through a shared database backend. + // Test flow: Server A -> Server B -> Server A with the same client key + // Start two servers A and B sharing the same DB. svrA := helper.TestServer() assert.NoError(t, svrA.Start()) defer func() { _ = svrA.Shutdown(true) }() svrB := helper.TestServer() assert.NoError(t, svrB.Start()) defer func() { _ = svrB.Shutdown(true) }() // t0: Connect to A with fixed client key cliA, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey)) assert.NoError(t, err) assert.NoError(t, cliA.Activate(ctx)) docA := document.New(docKey) assert.NoError(t, cliA.Attach(ctx, docA)) - for i := 0; i < 2; i++ { + for i := 0; i < numUpdates; i++ { err := docA.Update(func(r *json.Object, p *presence.Presence) error { r.SetString("k", "vA") return nil }) assert.NoError(t, err) assert.NoError(t, cliA.Sync(ctx)) } // Ensure the server has the latest state from A assert.NoError(t, cliA.Detach(ctx, docA)) // Small delay to reduce cross-server write races - time.Sleep(150 * time.Millisecond) + time.Sleep(testDelay) // t1: Connect to B with SAME client key cliB, err := client.Dial(svrB.RPCAddr(), client.WithKey(clientKey)) assert.NoError(t, err) assert.NoError(t, cliB.Activate(ctx)) docB := document.New(docKey) - assert.NoError(t, attachWithRetry(ctx, cliB, docB, 5)) + assert.NoError(t, attachWithRetry(ctx, cliB, docB, maxRetries)) assert.NoError(t, cliB.Sync(ctx)) gotB := docB.Marshal() assert.Contains(t, gotB, "\"k\":\"vA\"") - for i := 0; i < 2; i++ { + for i := 0; i < numUpdates; i++ { err := docB.Update(func(r *json.Object, p *presence.Presence) error { r.SetString("k", "vB") return nil }) assert.NoError(t, err) assert.NoError(t, cliB.Sync(ctx)) } assert.NoError(t, cliB.Detach(ctx, docB)) // Small delay to reduce cross-server write races - time.Sleep(150 * time.Millisecond) + time.Sleep(testDelay) // t2: Reconnect to A with SAME client key cliA2, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey)) assert.NoError(t, err) assert.NoError(t, cliA2.Activate(ctx)) docA2 := document.New(docKey) - assert.NoError(t, attachWithRetry(ctx, cliA2, docA2, 5)) + assert.NoError(t, attachWithRetry(ctx, cliA2, docA2, maxRetries)) assert.NoError(t, cliA2.Sync(ctx)) gotA2 := docA2.Marshal() assert.Contains(t, gotA2, "\"k\":\"vB\"") err = docA2.Update(func(r *json.Object, p *presence.Presence) error { r.SetString("k", "vA2") return nil }) assert.NoError(t, err) assert.NoError(t, cliA2.Sync(ctx)) }
216-216: Reduce or extract hardcoded sleep duration.The 4500ms sleep appears multiple times and could block test execution. Consider extracting this to a constant or using a polling approach.
+const cacheTTLWaitTime = 4500 * time.Millisecond // Wait for cache TTL to expire + // t2: Deactivate client on backend A _, err = backendA.DB.DeactivateClient(ctx, clientInfoA.RefKey()) assert.NoError(t, err) // Small delay to work ttl cleanup - time.Sleep(4500 * time.Millisecond) + time.Sleep(cacheTTLWaitTime)Also applies to: 245-245, 263-263
server/backend/database/mongo/client.go (3)
1495-1511: Consider implementing more targeted cache invalidation.While the current implementation attempts selective invalidation, consider maintaining a reverse index of document-to-client mappings in the cache for more efficient invalidation without the fallback to
InvalidateAll().The fallback to full cache invalidation could impact performance in high-traffic scenarios. Consider:
- Maintaining a reverse mapping in the cache (document -> clients)
- Using a more resilient error handling that doesn't require full invalidation
- Adding metrics to track how often the fallback is triggered
2467-2514: Consider consolidating these helper methods.The
updateDocumentStatusInDBDirectandupdateStatusInDBDirectmethods duplicate logic that exists in the cache implementation. Consider reusing the cache's DB update methods or extracting common logic to avoid duplication.-// updateDocumentStatusInDBDirect updates document status in database directly -func (c *Client) updateDocumentStatusInDBDirect(refKey types.ClientRefKey, docID types.ID, status string) error { - updateDoc := bson.M{ - "$set": bson.M{ - "updated_at": gotime.Now(), - }, - } - - // Add document-specific updates - docKey := clientDocInfoKey(docID, StatusKey) - updateDoc["$set"].(bson.M)[docKey] = status - - // Reset sequence numbers for detached/removed/attaching documents - if status == database.DocumentDetached || status == database.DocumentRemoved || status == database.DocumentAttaching { - serverSeqKey := clientDocInfoKey(docID, "server_seq") - clientSeqKey := clientDocInfoKey(docID, "client_seq") - updateDoc["$set"].(bson.M)[serverSeqKey] = int64(0) - updateDoc["$set"].(bson.M)[clientSeqKey] = uint32(0) - } - - _, err := c.collection(ColClients).UpdateOne( - context.Background(), - bson.M{ - "project_id": refKey.ProjectID, - "_id": refKey.ClientID, - }, - updateDoc, - ) - return err -} +// updateDocumentStatusInDBDirect updates document status in database directly +func (c *Client) updateDocumentStatusInDBDirect(refKey types.ClientRefKey, docID types.ID, status string) error { + // Delegate to cache's DB update method which has the same logic + return c.clientInfoCache.updateDocumentStatusInDB(refKey, docID, status) +}
810-850: Verify metadata upsert semanticsI didn’t find any existing tests or code paths that relied on the old behavior of including
metadatain the filter—no tests assert that differing metadata produces distinct client documents. That said, movingmetadatainto the$setdoes change semantics: now metadata changes update the existing client instead of inserting a new one.• If you intend this as a deliberate behavior change, please add a note to the PR description (and any migration guide) clarifying that clients are now uniqued by
(project_id, key)only, with metadata merged on each activation.
• Otherwise, consider restoring metadata in the filter if callers expect each unique metadata set to create its own client document.server/backend/database/mongo/client_cache.go (2)
78-89: Review default cache configuration values.The default TTL of 3 seconds seems quite aggressive and could lead to frequent cache misses. Consider increasing it based on typical client session durations and update patterns.
func DefaultCacheConfig() *CacheConfig { return &CacheConfig{ BaseFlushInterval: 10 * time.Second, MaxFlushInterval: 60 * time.Second, MinFlushInterval: 5 * time.Second, MaxCacheSize: 20000, WritePressureThreshold: 500, PressureCheckInterval: 10 * time.Second, EnableFlushCleanup: true, - TTL: 3 * time.Second, + TTL: 30 * time.Second, // Increase TTL for better cache hit rates } }
983-987: Consider removing or improving the warning log.The warning log at line 986 could be noisy in production if this is an expected scenario for new document attachments.
- logging.DefaultLogger().Warnf("Client not found in DB for document attachment, creating minimal cache entry: %v", err) + // Expected for new document attachments - client will be loaded when needed + logging.DefaultLogger().Debugf("Client not found in DB for document attachment: %v", err)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
server/backend/database/mongo/client.go(17 hunks)server/backend/database/mongo/client_cache.go(1 hunks)test/integration/routing_test.go(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
test/integration/routing_test.go (5)
server/rpc/connecthelper/status.go (1)
CodeOf(165-176)test/helper/helper.go (13)
TestDocKey(344-373)AdminUser(71-71)AdminPassword(72-72)UseDefaultProject(74-74)ClientDeactivateThreshold(81-81)ProjectCacheTTL(94-94)AdminTokenDuration(80-80)MongoConnectionURI(97-97)MongoConnectionTimeout(98-98)MongoPingTimeout(99-99)HousekeepingInterval(75-75)HousekeepingCandidatesLimitPerProject(76-76)HousekeepingProjectFetchSize(77-77)client/options.go (1)
WithKey(56-58)server/profiling/prometheus/metrics.go (1)
NewMetrics(77-221)server/backend/database/client_info.go (3)
ClientActivated(43-43)ClientDeactivated(42-42)DocumentAttaching(48-48)
server/backend/database/mongo/client_cache.go (5)
server/backend/database/client_info.go (6)
ClientInfo(66-92)ClientDocInfo(56-60)DocumentDetached(50-50)DocumentRemoved(51-51)DocumentAttached(49-49)DocumentAttaching(48-48)api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/mongo/client.go (2)
Client(51-59)StatusKey(47-47)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
server/backend/database/mongo/client.go (4)
api/types/resource_ref_key.go (2)
DocRefKey(24-27)ClientRefKey(35-38)server/backend/database/mongo/client_cache.go (3)
ClientInfoCache(110-120)NewClientInfoCache(123-146)CacheMetrics(37-55)server/backend/database/client_info.go (8)
ClientActivated(43-43)DocumentAttached(49-49)ClientDocInfo(56-60)DocumentAttaching(48-48)ClientDeactivated(42-42)ClientInfo(66-92)DocumentDetached(50-50)DocumentRemoved(51-51)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
🔇 Additional comments (10)
test/integration/routing_test.go (1)
24-38: LGTM! Helper function handles retries appropriately.The retry logic with exponential backoff is appropriate for handling temporary conflicts during attachment operations in distributed systems.
server/backend/database/mongo/client.go (6)
120-131: LGTM! Clean initialization of client with cache.The client initialization now properly creates a single instance and assigns the cache with the correct reference.
862-898: LGTM! Efficient cache-first approach for TryAttaching.The implementation correctly handles the cache-first path with proper validation and write-through updates to the database.
1029-1068: LGTM! Clean separation of housekeeping deactivation logic.The dedicated housekeeping deactivation method bypasses cache checks appropriately and handles the "not found" case gracefully by returning nil.
1298-1318: LGTM! Efficient cache utilization for finding attached clients.The implementation leverages cache effectively and includes proper sorting for deterministic results.
2147-2161: Cache-aware document attachment check is well-implemented.The method efficiently checks cache first before falling back to database queries.
1119-1199: Error path on cache miss is correct; no stale cache entry to invalidateThe
UpdateClientInfoAfterPushPullbranch for cache misses only executes whenc.clientInfoCache.Get(...)returnsnil, so there is no existing cache entry that could go stale. It correctly returnsErrClientNotFoundwithout callingSet, and thus requires no additional invalidation here. Invalidation of stale entries is already handled elsewhere viaclientInfoCache.Invalidatecalls.server/backend/database/mongo/client_cache.go (3)
148-179: LGTM! Well-designed expired entry handling.The background handling of expired entries with checkpoint preservation is a good design choice that balances performance with data consistency.
805-876: Well-implemented checkpoint update with write-back strategy.The UpdateCheckpoint method correctly implements the write-back strategy with max operator semantics for consistency.
878-925: Efficient CAS write-through implementation for status updates.The method correctly checks for changes before updating the database, avoiding unnecessary writes.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
test/integration/routing_test.go (4)
29-29: Replace deprecatedconnect.CodeOfwithconnecthelper.CodeOf.The
connect.CodeOffunction is deprecated. Useconnecthelper.CodeOfinstead, which is already available in the codebase.Add the import:
+ "github.com/yorkie-team/yorkie/server/rpc/connecthelper"Update the code:
- if connect.CodeOf(err) == connect.CodeFailedPrecondition { + if connecthelper.CodeOf(err) == connecthelper.CodeFailedPrecondition {
40-114: Improve test isolation and resource cleanup.The test has good coverage but could benefit from:
- Using
deferfor client cleanup immediately after creation- Detaching documents in deferred cleanup to ensure proper cleanup even on test failure
Apply these improvements:
func TestClientRouting(t *testing.T) { ctx := context.Background() const clientKey = "C1-routing-test" docKey := helper.TestDocKey(t) // Start two servers A and B sharing the same DB. svrA := helper.TestServer() assert.NoError(t, svrA.Start()) defer func() { _ = svrA.Shutdown(true) }() svrB := helper.TestServer() assert.NoError(t, svrB.Start()) defer func() { _ = svrB.Shutdown(true) }() // t0: Connect to A with fixed client key cliA, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey)) assert.NoError(t, err) + defer func() { _ = cliA.Deactivate(ctx) }() assert.NoError(t, cliA.Activate(ctx)) docA := document.New(docKey) assert.NoError(t, cliA.Attach(ctx, docA)) + defer func() { _ = cliA.Detach(ctx, docA) }()
120-199: Extract backend configuration to reduce duplication.The backend configuration is nearly identical for both backends A and B. Consider extracting it to a helper function to improve maintainability.
Create a helper function:
func createTestBackend(t *testing.T) (*backend.Backend, error) { met, err := prometheus.NewMetrics() if err != nil { return nil, err } return backend.New( &backend.Config{ AdminUser: helper.AdminUser, AdminPassword: helper.AdminPassword, UseDefaultProject: helper.UseDefaultProject, ClientDeactivateThreshold: helper.ClientDeactivateThreshold, SnapshotThreshold: helper.SnapshotThreshold, SnapshotCacheSize: helper.SnapshotCacheSize, AuthWebhookCacheSize: helper.AuthWebhookSize, AuthWebhookCacheTTL: helper.AuthWebhookCacheTTL.String(), AuthWebhookMaxWaitInterval: helper.AuthWebhookMaxWaitInterval.String(), AuthWebhookMinWaitInterval: helper.AuthWebhookMinWaitInterval.String(), AuthWebhookRequestTimeout: helper.AuthWebhookRequestTimeout.String(), EventWebhookMaxWaitInterval: helper.EventWebhookMaxWaitInterval.String(), EventWebhookMinWaitInterval: helper.EventWebhookMinWaitInterval.String(), EventWebhookRequestTimeout: helper.EventWebhookRequestTimeout.String(), ProjectCacheSize: helper.ProjectCacheSize, ProjectCacheTTL: helper.ProjectCacheTTL.String(), AdminTokenDuration: helper.AdminTokenDuration, }, &mongo.Config{ ConnectionURI: helper.MongoConnectionURI, YorkieDatabase: helper.TestDBName(), ConnectionTimeout: helper.MongoConnectionTimeout, PingTimeout: helper.MongoPingTimeout, }, &housekeeping.Config{ Interval: helper.HousekeepingInterval.String(), CandidatesLimitPerProject: helper.HousekeepingCandidatesLimitPerProject, ProjectFetchSize: helper.HousekeepingProjectFetchSize, }, met, nil, nil) }Then update the test:
- metA, err := prometheus.NewMetrics() - assert.NoError(t, err) - - backendA, err := backend.New( - // ... long config ... - ) + backendA, err := createTestBackend(t) assert.NoError(t, err) defer func() { _ = backendA.Shutdown() }() - metB, err := prometheus.NewMetrics() - assert.NoError(t, err) - - backendB, err := backend.New( - // ... long config ... - ) + backendB, err := createTestBackend(t) assert.NoError(t, err) defer func() { _ = backendB.Shutdown() }()
230-251: Consider adding assertion for document existence.The test verifies that the document is in the
DocumentAttachingstatus, but doesn't verify that backend B can actually find the document by key. This would provide more comprehensive coverage.Add this assertion after line 250:
assert.NoError(t, err) assert.Equal(t, database.DocumentAttaching, clientInfoB.Documents[docInfoA.ID].Status) + + // Verify backend B can find the document + docInfoB, err := backendB.DB.FindDocInfo(ctx, clientInfoA.RefKey(), docKey) + assert.NoError(t, err) + assert.Equal(t, docInfoA.ID, docInfoB.ID)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
server/backend/database/mongo/client_cache.go(1 hunks)test/integration/routing_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- server/backend/database/mongo/client_cache.go
🧰 Additional context used
🧬 Code graph analysis (1)
test/integration/routing_test.go (6)
server/backend/database/mongo/client.go (2)
Client(51-59)Dial(62-132)server/rpc/connecthelper/status.go (1)
CodeOf(165-176)test/helper/helper.go (1)
TestDocKey(344-373)client/options.go (1)
WithKey(56-58)server/profiling/prometheus/metrics.go (1)
NewMetrics(77-221)server/backend/database/client_info.go (3)
ClientActivated(43-43)ClientDeactivated(42-42)DocumentAttaching(48-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: load-test
- GitHub Check: complex-test
- GitHub Check: bench
- GitHub Check: build
🔇 Additional comments (2)
test/integration/routing_test.go (2)
200-228: Good test structure with clear state transitions!The test clearly documents the progression through different states (t0-t4) and validates cache behavior across backends. The assertion that the same client key results in the same client ID is particularly important for consistency.
253-275: Test effectively validates cross-backend housekeeping behavior.The test properly verifies that housekeeping deactivation on one backend is visible to another backend after cache TTL, and that the client can be reactivated with the same ID.
| _, err = backendA.DB.DeactivateClient(ctx, clientInfoA.RefKey()) | ||
| assert.NoError(t, err) | ||
| // Small delay to work ttl cleanup | ||
| time.Sleep(11 * time.Second) |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Replace hard-coded cache TTL delays with configurable constants.
The tests use hard-coded 11 * time.Second delays in multiple places. This value appears to be related to cache TTL behavior but is not documented. Consider extracting this to a named constant with documentation.
Add a constant at the package level:
// cacheTTLWait is the time to wait for cache TTL expiration.
// This should be slightly longer than the actual cache TTL to ensure reliable test behavior.
const cacheTTLWait = 11 * time.SecondThen update the code:
- time.Sleep(11 * time.Second)
+ time.Sleep(cacheTTLWait)Also applies to: 245-245, 263-263
🤖 Prompt for AI Agents
In test/integration/routing_test.go around lines 216, 245 and 263 the tests use
hard-coded sleeps like "11 * time.Second"; extract that magic value into a
documented package-level constant named cacheTTLWait (e.g. "// cacheTTLWait is
the time to wait for cache TTL expiration. This should be slightly longer than
the actual cache TTL to ensure reliable test behavior." const cacheTTLWait = 11
* time.Second) and replace the three occurrences of time.Sleep(11 * time.Second)
with time.Sleep(cacheTTLWait); ensure the file imports time (or keeps it) and
that the constant is placed at the top of the file so all tests can use it.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
server/backend/database/mongo/client_cache.go (1)
614-614: Missing clientDocInfoKey helper function.The code uses
clientDocInfoKeyfunction which is not defined in this file, causing a compilation error.Add the missing helper function:
+// clientDocInfoKey returns the key for the client document info. +func clientDocInfoKey(docID types.ID, prefix string) string { + return fmt.Sprintf("documents.%s.%s", docID, prefix) +} + // buildUpdateModel builds a MongoDB update model for the given client info func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey, clientInfo *database.ClientInfo) mongo.WriteModel {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/backend/database/mongo/client_cache.go (5)
server/backend/database/client_info.go (6)
ClientInfo(66-92)ClientDocInfo(56-60)DocumentDetached(50-50)DocumentRemoved(51-51)DocumentAttached(49-49)DocumentAttaching(48-48)api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/mongo/client.go (2)
Client(51-59)StatusKey(47-47)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: load-test
- GitHub Check: complex-test
- GitHub Check: bench
- GitHub Check: build
🔇 Additional comments (4)
server/backend/database/mongo/client_cache.go (4)
182-201: LGTM! Efficient cache retrieval with TTL handling.The implementation correctly handles TTL expiration by checking the expiration time and scheduling background cleanup while returning a cache miss. The use of
DeepCopy()prevents data races.
329-367: LGTM! Solid merge logic with MongoDB $max semantics.The merge implementation correctly applies MongoDB's $max operator semantics for sequence numbers and handles special cases for detached/removed documents by resetting sequences. The logic is well-structured and matches the expected behavior.
553-599: LGTM! Well-designed bulk flush implementation.The flush logic correctly snapshots dirty entries, releases locks during database operations, and includes proper cleanup. The bulk write approach is efficient for performance.
807-878: LGTM! Proper write-back strategy for checkpoint updates.The checkpoint update implementation correctly uses write-back caching with MongoDB's $max operator semantics, handles cache misses by loading from DB, and maintains consistency.
| interval := c.config.MinFlushInterval + | ||
| time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio) | ||
|
|
||
| return interval |
There was a problem hiding this comment.
Fix potential panic in flush interval calculation.
When MaxFlushInterval is shorter than MinFlushInterval, the subtraction c.config.MaxFlushInterval-c.config.MinFlushInterval will result in a negative duration, which can cause unexpected behavior.
Apply this diff to add validation:
// calculateFlushInterval calculates adaptive flush interval based on pressure
func (c *ClientInfoCache) calculateFlushInterval() time.Duration {
c.pressureMu.RLock()
pressureLevel := c.writePressure.PressureLevel
c.pressureMu.RUnlock()
+ // Validate configuration
+ if c.config.MaxFlushInterval < c.config.MinFlushInterval {
+ return c.config.MinFlushInterval
+ }
+
// Higher pressure = shorter interval
ratio := 1.0 - pressureLevel
interval := c.config.MinFlushInterval +
time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio)
return interval
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| interval := c.config.MinFlushInterval + | |
| time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio) | |
| return interval | |
| // calculateFlushInterval calculates adaptive flush interval based on pressure | |
| func (c *ClientInfoCache) calculateFlushInterval() time.Duration { | |
| c.pressureMu.RLock() | |
| pressureLevel := c.writePressure.PressureLevel | |
| c.pressureMu.RUnlock() | |
| // Validate configuration | |
| if c.config.MaxFlushInterval < c.config.MinFlushInterval { | |
| return c.config.MinFlushInterval | |
| } | |
| // Higher pressure = shorter interval | |
| ratio := 1.0 - pressureLevel | |
| interval := c.config.MinFlushInterval + | |
| time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio) | |
| return interval | |
| } |
🤖 Prompt for AI Agents
In server/backend/database/mongo/client_cache.go around lines 538 to 541, the
flush interval calculation can panic or behave unexpectedly if MaxFlushInterval
< MinFlushInterval; compute diff := c.config.MaxFlushInterval -
c.config.MinFlushInterval, if diff < 0 set diff = 0 (or swap intervals) and
clamp ratio to [0,1], then calculate interval := c.config.MinFlushInterval +
time.Duration(float64(diff)*ratio) and return interval; optionally add a
debug/warn log when the config is invalid.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
server/backend/database/mongo/client_cache.go (1)
536-542: Clamp invalid flush interval configs; avoid negative durations.If MaxFlushInterval < MinFlushInterval (misconfig), interval becomes negative. Add a guard and clamp.
func (c *ClientInfoCache) calculateFlushInterval() time.Duration { @@ - // Higher pressure = shorter interval + // Guard against invalid configuration. + if c.config.MaxFlushInterval < c.config.MinFlushInterval { + return c.config.MinFlushInterval + } + // Higher pressure = shorter interval ratio := 1.0 - pressureLevel interval := c.config.MinFlushInterval + time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio) return interval }
🧹 Nitpick comments (8)
server/backend/database/mongo/client_cache.go (8)
204-212: Avoid DB operations while holding the cache lock in expiration paths.handleExpiration and ttlCleanup call handleExpiredEntry (which performs DB IO) while c.mu is held, blocking all cache operations and risking priority inversion.
Refactor to capture-and-release before flushing:
func (c *ClientInfoCache) handleExpiration(refKey types.ClientRefKey) { - c.mu.Lock() - defer c.mu.Unlock() - - if cached, exists := c.cache[refKey]; exists && time.Now().After(cached.ExpiresAt) { - c.handleExpiredEntry(refKey, cached) - delete(c.cache, refKey) - } + var cached *CachedClientInfo + c.mu.Lock() + if e, exists := c.cache[refKey]; exists && time.Now().After(e.ExpiresAt) { + cached = e + delete(c.cache, refKey) + } + c.mu.Unlock() + if cached != nil { + c.handleExpiredEntry(refKey, cached) + } } func (c *ClientInfoCache) ttlCleanup() { ticker := time.NewTicker(c.config.TTL) defer ticker.Stop() @@ - c.mu.Lock() - now := time.Now() - for refKey, cached := range c.cache { - if now.After(cached.ExpiresAt) { - c.handleExpiredEntry(refKey, cached) - delete(c.cache, refKey) - } - } - c.mu.Unlock() + var toFlush map[types.ClientRefKey]*CachedClientInfo + now := time.Now() + c.mu.Lock() + for refKey, cached := range c.cache { + if now.After(cached.ExpiresAt) { + if toFlush == nil { + toFlush = make(map[types.ClientRefKey]*CachedClientInfo) + } + toFlush[refKey] = cached + delete(c.cache, refKey) + } + } + c.mu.Unlock() + for refKey, cached := range toFlush { + c.handleExpiredEntry(refKey, cached) + }Do the same “capture, unlock, flush” pattern for handleBatchExpiration:
func (c *ClientInfoCache) handleBatchExpiration(refKeys []types.ClientRefKey) { - c.mu.Lock() - defer c.mu.Unlock() - - now := time.Now() - for _, refKey := range refKeys { - if cached, exists := c.cache[refKey]; exists && now.After(cached.ExpiresAt) { - c.handleExpiredEntry(refKey, cached) - delete(c.cache, refKey) - } - } + now := time.Now() + toFlush := make(map[types.ClientRefKey]*CachedClientInfo, len(refKeys)) + c.mu.Lock() + for _, refKey := range refKeys { + if cached, exists := c.cache[refKey]; exists && now.After(cached.ExpiresAt) { + toFlush[refKey] = cached + delete(c.cache, refKey) + } + } + c.mu.Unlock() + for refKey, cached := range toFlush { + c.handleExpiredEntry(refKey, cached) + } }Also applies to: 488-500, 266-278
338-352: Consistency: Reset seq on 'attaching' as well.Elsewhere (UpdateDocumentStatus) you reset sequences for attaching, detached, and removed. mergeClientInfo resets only for detached/removed. Align behavior.
- // Reset sequence numbers when document status changes - if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved { + // Reset sequence numbers when status transitions require it + if newDocInfo.Status == database.DocumentDetached || + newDocInfo.Status == database.DocumentRemoved || + newDocInfo.Status == database.DocumentAttaching { existingDocInfo.ServerSeq = 0 existingDocInfo.ClientSeq = 0 existingDocInfo.Status = newDocInfo.Status
663-667: Make metrics logging idempotent; avoid duplicate goroutines.StartMetricsLogging is called in the constructor and is also public. If called again, it spawns duplicate loggers. Guard with sync.Once.
type ClientInfoCache struct { mu sync.RWMutex cache map[types.ClientRefKey]*CachedClientInfo - flushCh chan struct{} stopCh chan struct{} config *CacheConfig client *Client writePressure *WritePressure pressureMu sync.RWMutex metrics *CacheMetrics + metricsOnce sync.Once } @@ // Start metrics logging - cache.StartMetricsLogging() + cache.StartMetricsLogging() @@ func (c *ClientInfoCache) StartMetricsLogging() { - go c.periodicMetricsLogging() + c.metricsOnce.Do(func() { + go c.periodicMetricsLogging() + }) }Optional: if flushCh is not used, remove it entirely (see separate comment).
Also applies to: 110-120, 138-146
411-434: Avoid DB writes while holding c.mu in evictOldest.evictOldest calls flushSingleToDB while Set holds c.mu. This blocks all writers during IO. Not a correctness bug, but it can stall cache ops under load.
Refactor to identify the oldest under lock, release the lock to flush, then re-lock and delete if still present. This two-phase approach prevents long critical sections. I can provide a concrete diff if you prefer this change now.
122-136: Validate configuration to prevent zero/negative tickers.If TTL or PressureCheckInterval are zero or negative, time.NewTicker will panic. Add simple validation in the constructor.
Example change:
func NewClientInfoCache(config *CacheConfig, client *Client) *ClientInfoCache { if config == nil { config = DefaultCacheConfig() } + // Basic validation + if config.TTL <= 0 { + config.TTL = 10 * time.Second + } + if config.PressureCheckInterval <= 0 { + config.PressureCheckInterval = 10 * time.Second + } + if config.MinFlushInterval <= 0 { + config.MinFlushInterval = 5 * time.Second + } + if config.MaxFlushInterval < config.MinFlushInterval { + config.MaxFlushInterval = config.MinFlushInterval + } cache := &ClientInfoCache{ cache: make(map[types.ClientRefKey]*CachedClientInfo), - flushCh: make(chan struct{}, 1), stopCh: make(chan struct{}), config: config, client: client, writePressure: &WritePressure{}, metrics: &CacheMetrics{}, }Also applies to: 77-89
410-420: TTL and eviction metadata usage.Using UpdatedAt for eviction is okay; consider renaming to LastAccess if you decide to update it on reads later for a true LRU.
606-606: Follow-up: tests to cover failure and TTL paths.
- FlushToDB should re-mark entries dirty on error (after the refactor).
- TTL expiration should only $max checkpoints and not alter statuses/attached_docs.
- InvalidateAll should not deadlock and must clear entries.
Do you want me to draft table-driven tests for these scenarios (using a fake mongo.Client or an interface abstraction) and a small harness to simulate BulkWrite failures?
19-34: Remove unusedstringsimport andflushChfieldThe
stringspackage and theflushChchannel are never referenced elsewhere in the repository; removing them will clean up dead code and avoid confusion.• File:
server/backend/database/mongo/client_cache.go
– Remove thestringsimport (currently at line 21).
– Remove theflushCh chan struct{}field in theClientInfoCachestruct (line 113).
– Remove theflushCh: make(chan struct{}, 1),initialization inNewClientInfoCache(line 130).Suggested diff:
--- a/server/backend/database/mongo/client_cache.go +++ b/server/backend/database/mongo/client_cache.go @@ -19,7 +19,6 @@ import ( "context" "fmt" - "strings" "sync" "sync/atomic" "time" @@ -113,7 +112,6 @@ type ClientInfoCache struct { mu sync.RWMutex cache map[types.ClientRefKey]*CachedClientInfo - flushCh chan struct{} stopCh chan struct{} config *CacheConfig } @@ -128,7 +126,6 @@ func NewClientInfoCache(config *CacheConfig) *ClientInfoCache { cache: make(map[types.ClientRefKey]*CachedClientInfo), - flushCh: make(chan struct{}, 1), stopCh: make(chan struct{}), config: config, }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
server/backend/database/mongo/client_cache.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/backend/database/mongo/client_cache.go (6)
server/backend/database/client_info.go (6)
ClientInfo(66-92)ClientDocInfo(56-60)DocumentDetached(50-50)DocumentRemoved(51-51)DocumentAttached(49-49)DocumentAttaching(48-48)api/types/resource_ref_key.go (1)
ClientRefKey(35-38)server/backend/database/mongo/client.go (2)
Client(51-59)StatusKey(47-47)server/logging/logging.go (1)
DefaultLogger(85-90)server/backend/database/mongo/indexes.go (1)
ColClients(36-36)server/backend/database/database.go (1)
ErrClientNotFound(46-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: load-test
- GitHub Check: complex-test
- GitHub Check: bench
- GitHub Check: build
🔇 Additional comments (9)
server/backend/database/mongo/client_cache.go (9)
748-812: Metrics look solid.Atomic loads with derived percentages and cache usage ratio are clean and safe. Good separation between counters and logging.
814-885: Checkpoint update path is correct and safe.Deep-copying maps, applying $max logic, and marking Dirty with TTL refresh is correct for write-back semantics.
936-1031: Write-through status updates are coherent.You avoid unnecessary DB writes, then mirror the change into cache. Sequence reset rules for status transitions are applied consistently here.
441-460: Post-flush cleanup heuristic is reasonable.The “recently flushed and not dirty” removal with logging provides a pragmatic bound on cache size while minimizing churn. Works well with the adaptive flush loop.
481-481: Minor: adaptive flush logging improved.Errors are logged; combined with pressure-based interval, this is fine.
1-16: License header and package structure.Header and package naming look consistent with the repo.
569-594: BulkWrite result handling is good.Including the operation count and warning on partial modifications is useful for ops visibility.
1033-1053: DB lookup error wrapping is correct.Wrapping ErrNoDocuments with ErrClientNotFound preserves sentinel error semantics for errors.Is.
284-301: Set path is straightforward and safe.Deep-copy on insert, TTL assignment, and size check are correct. Pair this with the evictOldest refactor to avoid holding the lock over IO.
| // handleExpiredEntry processes an expired cache entry | ||
| // Status information is discarded, checkpoint information is flushed to DB using max operator | ||
| func (c *ClientInfoCache) handleExpiredEntry(refKey types.ClientRefKey, cached *CachedClientInfo) { | ||
| atomic.AddInt64(&c.metrics.TTLExpirations, 1) | ||
|
|
||
| // Create a copy of client info with only checkpoint information | ||
| checkpointInfo := &database.ClientInfo{ | ||
| ID: cached.ClientInfo.ID, | ||
| ProjectID: cached.ClientInfo.ProjectID, | ||
| Key: cached.ClientInfo.Key, | ||
| Status: cached.ClientInfo.Status, | ||
| Documents: make(map[types.ID]*database.ClientDocInfo), | ||
| Metadata: cached.ClientInfo.Metadata, | ||
| CreatedAt: cached.ClientInfo.CreatedAt, | ||
| UpdatedAt: cached.ClientInfo.UpdatedAt, | ||
| } | ||
|
|
||
| // Copy only checkpoint information (ServerSeq, ClientSeq) from documents | ||
| for docID, docInfo := range cached.ClientInfo.Documents { | ||
| checkpointInfo.Documents[docID] = &database.ClientDocInfo{ | ||
| ServerSeq: docInfo.ServerSeq, | ||
| ClientSeq: docInfo.ClientSeq, | ||
| Status: docInfo.Status, | ||
| } | ||
| } | ||
|
|
||
| // Flush checkpoint information to DB using max operator | ||
| if err := c.flushSingleToDB(refKey, checkpointInfo); err != nil { | ||
| atomic.AddInt64(&c.metrics.TTLFlushErrors, 1) | ||
| logging.DefaultLogger().Errorf("Failed to flush expired entry checkpoint to DB: %v", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
TTL-expiration flush should not mutate statuses or attached_docs.
handleExpiredEntry says “Status information is discarded,” but buildUpdateModel unconditionally $set’s per-document Status and attached_docs. TTL-driven flushes should only $max the checkpoint (ServerSeq/ClientSeq) to avoid regressing status in DB.
Introduce a checkpoint-only write model and use it from TTL paths:
func (c *ClientInfoCache) handleExpiredEntry(refKey types.ClientRefKey, cached *CachedClientInfo) {
atomic.AddInt64(&c.metrics.TTLExpirations, 1)
@@
- // Flush checkpoint information to DB using max operator
- if err := c.flushSingleToDB(refKey, checkpointInfo); err != nil {
+ // Flush only checkpoints ($max), do not touch statuses or attached_docs.
+ if err := c.flushSingleCheckpointToDB(refKey, checkpointInfo); err != nil {
atomic.AddInt64(&c.metrics.TTLFlushErrors, 1)
logging.DefaultLogger().Errorf("Failed to flush expired entry checkpoint to DB: %v", err)
}
}
// buildUpdateModel builds a MongoDB update model for the given client info
func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey,
clientInfo *database.ClientInfo) mongo.WriteModel {
@@
// Update attached_docs array with correct state
attachedDocs := clientInfo.AttachedDocuments()
updateDoc["$set"].(bson.M)["attached_docs"] = attachedDocs
return mongo.NewUpdateOneModel().
SetFilter(bson.M{
"project_id": refKey.ProjectID,
"_id": refKey.ClientID,
}).
SetUpdate(updateDoc)
}
+// buildCheckpointUpdateModel builds an update that only $max-es checkpoint fields.
+func (c *ClientInfoCache) buildCheckpointUpdateModel(refKey types.ClientRefKey,
+ clientInfo *database.ClientInfo) mongo.WriteModel {
+ updateDoc := bson.M{
+ "$max": bson.M{},
+ }
+ for docID, docInfo := range clientInfo.Documents {
+ docKey := clientDocInfoKey(docID, "server_seq")
+ updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq
+ docKey = clientDocInfoKey(docID, "client_seq")
+ updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq
+ }
+ return mongo.NewUpdateOneModel().
+ SetFilter(bson.M{
+ "project_id": refKey.ProjectID,
+ "_id": refKey.ClientID,
+ }).
+ SetUpdate(updateDoc)
+}
+
// flushSingleToDB flushes a single client info entry to the database
func (c *ClientInfoCache) flushSingleToDB(refKey types.ClientRefKey, clientInfo *database.ClientInfo) error {
update := c.buildUpdateModel(refKey, clientInfo)
_, err := c.client.collection(ColClients).BulkWrite(context.Background(), []mongo.WriteModel{update})
if err != nil {
return fmt.Errorf("flush single client info: %w", err)
}
return nil
}
+
+// flushSingleCheckpointToDB flushes only checkpoint fields ($max) for a single entry.
+func (c *ClientInfoCache) flushSingleCheckpointToDB(refKey types.ClientRefKey, clientInfo *database.ClientInfo) error {
+ update := c.buildCheckpointUpdateModel(refKey, clientInfo)
+ _, err := c.client.collection(ColClients).BulkWrite(context.Background(), []mongo.WriteModel{update})
+ if err != nil {
+ return fmt.Errorf("flush single client checkpoint: %w", err)
+ }
+ return nil
+}Also applies to: 608-641, 643-653
| func (c *ClientInfoCache) InvalidateAll() error { | ||
| c.mu.Lock() | ||
| defer c.mu.Unlock() | ||
|
|
||
| // Force flush all dirty entries before clearing | ||
| if err := c.FlushToDB(); err != nil { | ||
| return fmt.Errorf("failed to flush all dirty entries: %w", err) | ||
| } | ||
| c.cache = make(map[types.ClientRefKey]*CachedClientInfo) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Deadlock: InvalidateAll holds c.mu and re-enters via FlushToDB.
InvalidateAll acquires c.mu and calls FlushToDB, which also acquires c.mu. This is a guaranteed self-deadlock.
Apply this minimal fix to flush outside the lock:
func (c *ClientInfoCache) InvalidateAll() error {
- c.mu.Lock()
- defer c.mu.Unlock()
-
- // Force flush all dirty entries before clearing
- if err := c.FlushToDB(); err != nil {
- return fmt.Errorf("failed to flush all dirty entries: %w", err)
- }
- c.cache = make(map[types.ClientRefKey]*CachedClientInfo)
- return nil
+ // Flush outside the lock to avoid deadlock.
+ if err := c.FlushToDB(); err != nil {
+ return fmt.Errorf("failed to flush all dirty entries: %w", err)
+ }
+ // Now clear the cache safely.
+ c.mu.Lock()
+ c.cache = make(map[types.ClientRefKey]*CachedClientInfo)
+ c.mu.Unlock()
+ return nil
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (c *ClientInfoCache) InvalidateAll() error { | |
| c.mu.Lock() | |
| defer c.mu.Unlock() | |
| // Force flush all dirty entries before clearing | |
| if err := c.FlushToDB(); err != nil { | |
| return fmt.Errorf("failed to flush all dirty entries: %w", err) | |
| } | |
| c.cache = make(map[types.ClientRefKey]*CachedClientInfo) | |
| return nil | |
| } | |
| func (c *ClientInfoCache) InvalidateAll() error { | |
| // Flush outside the lock to avoid deadlock. | |
| if err := c.FlushToDB(); err != nil { | |
| return fmt.Errorf("failed to flush all dirty entries: %w", err) | |
| } | |
| // Now clear the cache safely. | |
| c.mu.Lock() | |
| c.cache = make(map[types.ClientRefKey]*CachedClientInfo) | |
| c.mu.Unlock() | |
| return nil | |
| } |
| c.mu.Lock() | ||
| dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo) | ||
|
|
||
| for refKey, cached := range c.cache { | ||
| if cached.Dirty { | ||
| // Take a deep copy snapshot to avoid data races with concurrent updates | ||
| // after releasing the lock. | ||
| dirtyClients[refKey] = cached.ClientInfo.DeepCopy() | ||
| cached.Dirty = false | ||
| cached.LastFlush = time.Now() | ||
| } | ||
| } | ||
| c.mu.Unlock() |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Data loss risk: Dirty cleared before DB success in FlushToDB.
You clear Dirty and set LastFlush while still holding the lock, before the BulkWrite completes. If the write fails, entries remain non-dirty and won’t be retried, leading to silent data loss.
Refactor to only clear Dirty and set LastFlush after a successful write:
func (c *ClientInfoCache) FlushToDB() error {
- c.mu.Lock()
- dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo)
-
- for refKey, cached := range c.cache {
- if cached.Dirty {
- // Take a deep copy snapshot to avoid data races with concurrent updates
- // after releasing the lock.
- dirtyClients[refKey] = cached.ClientInfo.DeepCopy()
- cached.Dirty = false
- cached.LastFlush = time.Now()
- }
- }
- c.mu.Unlock()
+ // 1) Snapshot dirty entries without mutating state.
+ c.mu.Lock()
+ dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo)
+ dirtyKeys := make([]types.ClientRefKey, 0)
+ for refKey, cached := range c.cache {
+ if cached.Dirty {
+ dirtyClients[refKey] = cached.ClientInfo.DeepCopy()
+ dirtyKeys = append(dirtyKeys, refKey)
+ }
+ }
+ c.mu.Unlock()
if len(dirtyClients) == 0 {
return nil
}
@@
// Execute bulk update
if len(updates) > 0 {
result, err := c.client.collection(ColClients).BulkWrite(context.Background(), updates)
if err != nil {
return fmt.Errorf("bulk write client info (%d operations): %w", len(updates), err)
}
if result.ModifiedCount != int64(len(updates)) {
logging.DefaultLogger().Warnf(
"Bulk write partial success: expected %d modifications, got %d",
len(updates), result.ModifiedCount,
)
}
}
- // Update last flush time
- c.pressureMu.Lock()
- c.writePressure.LastFlushTime = time.Now()
- c.pressureMu.Unlock()
+ // 2) Mark flushed entries clean after success.
+ now := time.Now()
+ c.mu.Lock()
+ for _, refKey := range dirtyKeys {
+ if entry, ok := c.cache[refKey]; ok {
+ entry.Dirty = false
+ entry.LastFlush = now
+ }
+ }
+ c.mu.Unlock()
+
+ // Update last flush time (pressure metrics).
+ c.pressureMu.Lock()
+ c.writePressure.LastFlushTime = now
+ c.pressureMu.Unlock()
// Cleanup cache after successful flush
if c.config.EnableFlushCleanup {
c.cleanupAfterFlush()
}
return nil
}Also applies to: 580-606
🤖 Prompt for AI Agents
In server/backend/database/mongo/client_cache.go around lines 555-567 (and
similarly 580-606), the code clears Cached.Dirty and updates LastFlush while
still holding the lock and before the BulkWrite completes, risking silent data
loss if the DB write fails; change the flow to (1) while holding the lock,
snapshot deep copies of dirty entries into a local map and DO NOT modify the
cached entries, then release the lock; (2) perform the BulkWrite using the
snapshot; (3) only if the BulkWrite succeeds, re-acquire the lock and for each
successfully-written refKey clear Dirty and set LastFlush on the actual cache
entries (and handle partial failures by only clearing those that succeeded);
ensure errors from the write cause no mutation of Dirty flags so failed entries
remain retried.
| // Check if update is needed by comparing with cache first | ||
| c.mu.Lock() | ||
| if cached, exists := c.cache[refKey]; exists { | ||
| if cached.ClientInfo.Documents != nil { | ||
| if docInfo, exists := cached.ClientInfo.Documents[docID]; exists { | ||
| if docInfo.Status == status { | ||
| // No change needed, skip database update | ||
| c.mu.Unlock() | ||
| return nil | ||
| } | ||
| } | ||
| } | ||
| } | ||
| c.mu.Unlock() | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Use errors.Is for ErrClientNotFound detection; remove brittle string matching.
strings.Contains(err.Error(), "client not found") is fragile. loadClientInfoFromDB wraps ErrClientNotFound; prefer errors.Is and simplify imports.
@@
-import (
- "context"
- "fmt"
- "strings"
- "sync"
+import (
+ "context"
+ "errors"
+ "fmt"
+ "sync"
"sync/atomic"
"time"
@@
- // Update in DB (Write-through)
+ // Update in DB (Write-through)
if err := c.updateDocumentStatusInDB(refKey, docID, status); err != nil {
return fmt.Errorf("update document status in DB: %w", err)
}
@@
- clientInfo, err := c.loadClientInfoFromDB(refKey)
+ clientInfo, err := c.loadClientInfoFromDB(refKey)
if err != nil {
- // If client not found in DB, this might be a new document attachment
- // In this case, we need to create a minimal client info structure
- if strings.Contains(err.Error(), "client not found") {
+ if errors.Is(err, database.ErrClientNotFound) {
return fmt.Errorf("cannot update document status for non-existent client %s: %w",
refKey.ClientID, err)
}
return fmt.Errorf("load client info from DB: %w", err)
}Note: Removing strings import here keeps imports clean.
Also applies to: 989-998, 19-26
🤖 Prompt for AI Agents
In server/backend/database/mongo/client_cache.go around lines 939-953 (and
similarly update sites at 989-998 and import area at 19-26), the code currently
detects a missing client by doing strings.Contains(err.Error(), "client not
found"); replace that brittle string match with errors.Is(err,
ErrClientNotFound) (ensure the package that defines ErrClientNotFound is
imported or referenced correctly), remove the now-unused strings import, and
simplify the error handling branch to treat ErrClientNotFound as the not-found
case while preserving other error propagation paths.
|
This PR leads to #1472 and following PRs. |
What this PR does / why we need it:
This PR introduces an in-memory caching system for Checkpoint (ClientInfo) to reduce MongoDB write overhead and improve response latency during presence-related operations such as pushpull, attach, and detach.
Motivation:
Previously, every pushpull synchronization triggered immediate writes to MongoDB. As the number of active clients and attached documents grew, the DB write frequency increased, leading to performance bottlenecks and higher latency.
Approach:
Implemented a ClientInfoCache layer above the MongoDB backend, using a read-through / write-back caching strategy.
Benefits:
Known limitations:
Which issue(s) this PR fixes:
#1381
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation:
Checklist:
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests