Checkpoint(ClientInfo) Caching System by mnnseong · Pull Request #1424 · yorkie-team/yorkie

mnnseong · 2025-08-01T09:49:15Z

What this PR does / why we need it:
This PR introduces an in-memory caching system for Checkpoint (ClientInfo) to reduce MongoDB write overhead and improve response latency during presence-related operations such as pushpull, attach, and detach.

Motivation:
Previously, every pushpull synchronization triggered immediate writes to MongoDB. As the number of active clients and attached documents grew, the DB write frequency increased, leading to performance bottlenecks and higher latency.
Approach:
Implemented a ClientInfoCache layer above the MongoDB backend, using a read-through / write-back caching strategy.
- Read-through: On cache miss, client info is loaded from DB and stored in memory.
- Write-back: Checkpoint updates are applied to cache first, marked as dirty, and flushed to DB in bulk.
- Write-through: Some operations may still use write-through when strict consistency is required (e.g., status changes).
- Each cached entry stores additional metadata (Dirty flag, LastFlush timestamp, TTL expiration).
- Adaptive flush mechanism: flush interval is adjusted dynamically based on write pressure calculated by the number of dirty entries.
- TTL cleanup: expired entries trigger checkpoint-only flush ($max operator) and removal from cache.
- Optional post-flush cleanup removes non-dirty entries to control memory usage.
Benefits:
- Reduced latency: Presence APIs (attach, detach, pushpull) now often hit the cache and skip DB I/O, lowering response time.
- Lower DB write volume: Multiple small updates are coalesced into batched bulk writes, reducing redundant updates and easing DB load.
- Resource efficiency: Less DB pressure translates into higher throughput, lower scaling cost.
- Consistency: DB remains the source of truth. Critical updates are written through immediately, and cache misses always fall back to DB queries, ensuring correctness across multiple servers.
Known limitations:
- We observed partial successes in bulk writes and performance drops under extreme load. As future work, we propose enhancing concurrency control (e.g., lock striping).
- Cache size is capped (default 20k entries), but eviction policy is still basic. Large-scale deployments will require further tuning and monitoring.

Which issue(s) this PR fixes:
#1381

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Additional documentation:

Checklist:

Added relevant tests or not required
Addressed and resolved all CodeRabbit review comments
Didn't break anything

Summary by CodeRabbit

New Features
- In-memory client-info cache (read-through/write-back), cache-aware read/write paths, public cache controls/metrics, and a housekeeping deactivation API.
Bug Fixes
- Reduced stale-data windows via targeted invalidation, deterministic ordering for attached-client lists, and reliable DB fallbacks on cache miss/errors.
Chores
- Adaptive TTL/eviction, pressure-aware flushes, periodic metrics/logging, and integrated shutdown/cleanup.
Tests
- New integration tests for cross-server routing, cache semantics, attachment behavior, and housekeeping deactivation.

coderabbitai · 2025-08-01T09:49:26Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a thread-safe, adaptive write-back in-memory ClientInfo cache to the MongoDB backend; integrates cache-first read/write flows with DB fallbacks, targeted invalidation, metrics, lifecycle APIs, a cache-bypass housekeeping deactivation API, and integration tests for cross-server routing and cache semantics. (50 words)

Changes

Cohort / File(s)	Change Summary
Client Info Cache Implementation `server/backend/database/mongo/client_cache.go`	New `ClientInfoCache` and supporting types (`CacheConfig`, `CacheMetrics`, `CachedClientInfo`, `WritePressure`); TTL, eviction, dirty-tracking, adaptive write-pressure flush scheduling, bulk/single flush to MongoDB, Set/Update/Invalidate/Flush APIs, metrics logging, Start/Close lifecycle, and copy-on-return semantics.
MongoDB Client Integration `server/backend/database/mongo/client.go`	Adds `clientInfoCache *ClientInfoCache` to `Client`; initializes in Dial; exposes `CloseClientInfoCache()`, `GetClientInfoCacheMetrics()`, `LogClientInfoCacheMetrics()`, `DeactivateClientForHousekeeping()`; incorporates cache-first logic across ActivateClient, TryAttaching, DeactivateClient, FindClientInfoByRefKey, UpdateClientInfoAfterPushPull, FindAttachedClientInfosByRefKey, UpdateDocInfoStatusToRemoved, PurgeDocument, IsDocumentAttached; preserves DB fallbacks and deterministic sorting.
Database Interface & Memory DB `server/backend/database/database.go`, `server/backend/database/memory/database.go`	Adds `DeactivateClientForHousekeeping(ctx, refKey)` to `Database` interface and implements it in the in-memory DB to perform direct DB-side deactivation that bypasses cache.
Clients / Housekeeping `server/clients/clients.go`, `server/clients/housekeeping.go`	Adds `DeactivateForHousekeeping` and `FindClientInfoForDeactivation`; housekeeping now uses `DeactivateForHousekeeping` to detach docs and deactivate clients without full activation.
Backend Shutdown `server/backend/backend.go`	`Backend.Shutdown` now calls `CloseClientInfoCache()` when DB is the Mongo client and logs any cache-close errors.
Tests: Integration Routing `test/integration/routing_test.go`	New integration tests validating client/cluster routing and cache semantics across two servers sharing a DB; includes attach-with-retry helper and MongoDB-specific subtests for activation, attachment, deactivation, and housekeeping.
Misc (imports/ordering) `server/backend/database/mongo/client.go` (and others)	Adds `sort` import and small wiring/comments about cache lifecycle differences; Dial now wires cache to client; Close notes separate cache shutdown.

Sequence Diagram(s)

sequenceDiagram
    participant API as Caller
    participant MongoClient as MongoClient (client.go)
    participant Cache as ClientInfoCache
    participant DB as MongoDB

    API->>MongoClient: Activate / TryAttach / Deactivate / Find / Update
    MongoClient->>Cache: Get(refKey)
    alt cache hit
        Cache-->>MongoClient: ClientInfo (cached)
        MongoClient->>Cache: Update / MarkDirty / Invalidate (as needed)
        MongoClient-->>API: return cached response
    else cache miss
        MongoClient->>DB: Query / Upsert ClientInfo
        DB-->>MongoClient: DB result
        MongoClient->>Cache: Set(refKey, ClientInfo)
        MongoClient-->>API: return DB response
    end

    Note over Cache,DB: Background flushes adapt to write-pressure (batched or single)
    Cache->>DB: FlushToDB (bulk or single)
    DB-->>Cache: ack
    API->>MongoClient: Shutdown
    MongoClient->>Cache: CloseClientInfoCache() -> final flush

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related issues

Introduce Checkpoint Cache for PushPull Optimization #1381 — Implements a write-back ClientInfo cache with TTL, eviction, and $max-style DB flush semantics, aligning with this PR’s ClientInfoCache design.

Possibly related PRs

Introduce a cache for document info for push pack #1333 — Overlaps on Mongo client cache wiring and Dial/Close integration.
Fix race condition between Attach/PushPull and Deactivation #1414 — Touches attachment/deactivation logic (TryAttaching/Deactivate) modified here.
Migrate MongoDB Go driver from v1.17 to v2.2 #1405 — Modifies MongoDB client struct/Dial plumbing similar to this PR.

Suggested labels

enhancement 🌟

Suggested reviewers

hackerwins
blurfx

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.2.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch cp-caching

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

server/backend/database/mongo/client.go (1)

875-877: Consider caching the activated client instead of invalidating.

After successfully activating a client in the database, the code invalidates the cache entry. This seems counterproductive as the client info is fresh and could benefit future reads.

Replace the invalidation with a cache update:
-	// Invalidate cache for this client
-	c.clientInfoCache.Invalidate(clientInfo.RefKey())
+	// Update cache with the activated client info
+	c.clientInfoCache.Set(clientInfo.RefKey(), &clientInfo)

server/backend/database/mongo/client_cache.go (1)

450-482: Optimize update model to avoid empty $max operator.

The code always includes a $max field even when it might be empty, which is unnecessary.

Only include $max when needed:

 func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey,
 	clientInfo *database.ClientInfo) mongo.WriteModel {
 	// Build the update document
 	updateDoc := bson.M{
 		"$set": bson.M{
 			"updated_at": clientInfo.UpdatedAt,
 		},
-		"$max": bson.M{},
 	}
 
+	maxUpdates := bson.M{}
 	// Add document-specific updates
 	for docID, docInfo := range clientInfo.Documents {
 		docKey := clientDocInfoKey(docID, "server_seq")
-		updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq
+		maxUpdates[docKey] = docInfo.ServerSeq
 
 		docKey = clientDocInfoKey(docID, "client_seq")
-		updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq
+		maxUpdates[docKey] = docInfo.ClientSeq
 
 		docKey = clientDocInfoKey(docID, StatusKey)
 		updateDoc["$set"].(bson.M)[docKey] = docInfo.Status
 	}
 
+	if len(maxUpdates) > 0 {
+		updateDoc["$max"] = maxUpdates
+	}
+
 	// Update attached_docs array with correct state
 	attachedDocs := clientInfo.AttachedDocuments()
 	updateDoc["$set"].(bson.M)["attached_docs"] = attachedDocs

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d75800 and 5903d4c.

📒 Files selected for processing (2)

server/backend/database/mongo/client.go (13 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...

Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.

Applied to files:

server/backend/database/mongo/client.go

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: load-test
GitHub Check: complex-test
GitHub Check: build

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (3)

server/backend/database/mongo/client.go (2)

1397-1414: LGTM! Selective cache invalidation implementation.

The selective invalidation logic is well-implemented with proper fallback to full invalidation when selective invalidation fails.

2216-2233: LGTM! Consistent selective invalidation pattern.

The same selective invalidation pattern is correctly applied in PurgeDocument, maintaining consistency with UpdateDocInfoStatusToRemoved.
server/backend/database/mongo/client_cache.go (1)
135-146: Fix O(n) lookup performance in GetByKey.

The current implementation iterates through all cache entries to find a client by key, which is inefficient for large caches.

Consider maintaining a secondary index for efficient lookups:
type ClientInfoCache struct {
	mu            sync.RWMutex
	cache         map[types.ClientRefKey]*CachedClientInfo
+	keyIndex      map[string]types.ClientRefKey // Maps "projectID:key" to ClientRefKey
	// ... other fields
}

func (c *ClientInfoCache) GetByKey(projectID types.ID, key string) *database.ClientInfo {
	c.mu.RLock()
	defer c.mu.RUnlock()
	
+	indexKey := string(projectID) + ":" + key
+	if refKey, exists := c.keyIndex[indexKey]; exists {
+		if cached, exists := c.cache[refKey]; exists && time.Now().Before(cached.ExpiresAt) {
+			return cached.ClientInfo
+		}
+	}
	return nil
}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5903d4c and b5c2e87.

📒 Files selected for processing (2)

server/backend/database/mongo/client.go (13 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧠 Learnings (3)

📚 Learning: in `pkg/document/change/id.go`, the `initialid` variable represents the initial state where nothing ...

Learnt from: chacha912
PR: yorkie-team/yorkie#1164
File: test/helper/helper.go:120-125
Timestamp: 2025-02-21T04:54:29.326Z
Learning: In `pkg/document/change/id.go`, the `InitialID` variable represents the initial state where nothing has been edited. Consider using a function like `GetInitialID()` instead of the package-level variable to enforce proper initialization and prevent shared mutable state.

Applied to files:

server/backend/database/mongo/client.go

📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...

Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.

Applied to files:

server/backend/database/mongo/client.go

📚 Learning: in `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional cont...

Learnt from: hackerwins
PR: yorkie-team/yorkie#1036
File: server/rpc/cluster_server.go:56-121
Timestamp: 2024-10-17T06:34:51.208Z
Learning: In `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional context in the `DetachDocument` method.

Applied to files:

server/backend/database/mongo/client.go

🧬 Code Graph Analysis (1)

server/backend/database/mongo/client.go (5)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/mongo/client_cache.go (2)

ClientInfoCache (86-95)

NewClientInfoCache (98-117)

server/backend/database/client_info.go (8)

ClientActivated (43-43)

ErrClientNotActivated (32-32)

DocumentAttached (49-49)

ClientDocInfo (56-60)

DocumentAttaching (48-48)

ErrAttachedDocumentExists (37-37)

ClientDeactivated (42-42)

ClientInfo (66-92)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

server/logging/logging.go (1)

DefaultLogger (85-90)

🔇 Additional comments (19)

server/backend/database/mongo/client.go (6)

54-58: LGTM! Clean struct field addition.

The new clientInfoCache field is properly added to the Client struct alongside existing cache fields.

119-131: LGTM! Proper cache initialization pattern.

The cache initialization correctly creates the client instance first, then initializes the cache with a reference to it. This avoids the double initialization issue mentioned in past reviews.

142-148: LGTM! Proper cache cleanup.

The cache cleanup is properly implemented in the Close method with appropriate error handling.

878-918: LGTM! Cache integration in TryAttaching looks correct.

The method properly checks cache first, validates state, updates cache, and flushes immediately for consistency.

963-1004: LGTM! Proper cache handling in DeactivateClient.

The method correctly validates client state from cache and flushes changes immediately for critical operations.

1051-1086: LGTM! Cache-first pattern in FindClientInfoByRefKey.

The method properly implements cache-first lookup with fallback to database and cache population.

server/backend/database/mongo/client_cache.go (13)

33-51: LGTM! Well-structured cache configuration.

The configuration struct provides comprehensive tuning parameters for cache behavior including TTL, flush intervals, and pressure thresholds.

53-65: LGTM! Reasonable default configuration values.

The default values provide a good balance between performance and resource usage. The 5-second base flush interval and 10-minute TTL are appropriate for most use cases.

67-84: LGTM! Comprehensive cache entry metadata.

The CachedClientInfo and WritePressure structs capture all necessary metadata for cache management and adaptive flushing.

97-117: LGTM! Proper cache initialization.

The constructor properly initializes all fields and starts background goroutines for adaptive flushing and cleanup.

171-201: LGTM! Fixed deadlock issue from past reviews.

The UpdateClientInfo method now correctly avoids the deadlock by implementing the logic inline rather than calling Set() recursively.

203-240: LGTM! Sophisticated merge logic for ClientInfo.

The merge logic properly handles sequence number updates with $max semantics and correctly resets sequences for detached/removed documents.

242-257: LGTM! Improved error handling for invalidation.

The Invalidate method now properly returns errors instead of silently ignoring them, addressing the concern from past reviews.

259-270: LGTM! Proper error handling in InvalidateAll.

The method correctly flushes all dirty entries before clearing and propagates any flush errors.

354-366: LGTM! Well-designed adaptive flush interval calculation.

The pressure-based interval calculation provides good responsiveness under high write load while maintaining efficiency during low activity.

395-420: LGTM! Comprehensive expired entry cleanup.

The cleanup logic properly handles dirty expired entries by flushing them before removal and includes appropriate logging.

443-482: LGTM! Efficient bulk flush implementation.

The bulk flush implementation efficiently batches updates and uses MongoDB's bulk write operations for optimal performance.

484-517: LGTM! Comprehensive MongoDB update model.

The update model correctly uses $max operations for sequence numbers and $set for other fields, maintaining consistency with MongoDB semantics.

531-537: LGTM! Proper cache shutdown.

The Close method correctly stops background goroutines and flushes remaining dirty entries.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

server/backend/database/mongo/client.go (2)
795-846: Remove metadata from upsert filter to prevent duplicate documents.

Including metadata in the filter can cause duplicate client documents if metadata changes between calls. This should be in the $set or $setOnInsert clause instead.

Apply this diff:
 res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{
     "project_id": projectID,
     "key":        key,
-    "metadata":   metadata,
 }, bson.M{
     "$set": bson.M{
         StatusKey:    database.ClientActivated,
         "updated_at": now,
+        "metadata":   metadata,
     },
 }, options.UpdateOne().SetUpsert(true))
1043-1080: Ensure flush reliability for critical push-pull operations.

The write-back strategy here relies on background flushing which could fail. Consider adding retry logic or monitoring for flush failures.

🧹 Nitpick comments (1)

server/backend/database/mongo/client_cache.go (1)
466-470: Consider re-enabling cleanup after resolving the underlying issue.

The cleanup is disabled to prevent "client not found" errors in housekeeping. This seems like a temporary workaround - consider tracking this as a TODO with an issue number.
 // Cleanup cache after successful flush
-// Temporarily disabled to prevent "client not found" errors in housekeeping
-// if c.config.EnableFlushCleanup {
-// 	c.cleanupAfterFlush()
-// }
+// TODO(#issue): Re-enable cleanup after fixing "client not found" errors in housekeeping
+// Temporarily disabled to prevent errors when housekeeping queries recently flushed clients
+// if c.config.EnableFlushCleanup {
+// 	c.cleanupAfterFlush()
+// }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5c2e87 and 06ecf54.

📒 Files selected for processing (3)

server/backend/backend.go (1 hunks)
server/backend/database/mongo/client.go (17 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: in `pkg/document/change/id.go`, the `initialid` variable represents the initial state where nothing ...

Learnt from: chacha912
PR: yorkie-team/yorkie#1164
File: test/helper/helper.go:120-125
Timestamp: 2025-02-21T04:54:29.326Z
Learning: In `pkg/document/change/id.go`, the `InitialID` variable represents the initial state where nothing has been edited. Consider using a function like `GetInitialID()` instead of the package-level variable to enforce proper initialization and prevent shared mutable state.

Applied to files:

server/backend/database/mongo/client.go

📚 Learning: in the `reactivateclients` function in `migrations/v0.5.6/document-detach.go`, when `totalcount` is ...

Learnt from: raararaara
PR: yorkie-team/yorkie#1062
File: migrations/v0.5.6/document-detach.go:166-167
Timestamp: 2024-11-14T05:10:21.629Z
Learning: In the `ReactivateClients` function in `migrations/v0.5.6/document-detach.go`, when `totalCount` is zero, the cursor will be empty, and the loop where `percentage` is calculated won't execute, so division by zero is not a concern.

Applied to files:

server/backend/database/mongo/client.go

📚 Learning: in `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional cont...

Learnt from: hackerwins
PR: yorkie-team/yorkie#1036
File: server/rpc/cluster_server.go:56-121
Timestamp: 2024-10-17T06:34:51.208Z
Learning: In `server/rpc/cluster_server.go`, it's acceptable to return errors directly without additional context in the `DetachDocument` method.

Applied to files:

server/backend/database/mongo/client.go

📚 Learning: go 1.22+ supports the integer range syntax (`for range n` and `for i := range n`), which is particul...

Learnt from: window9u
PR: yorkie-team/yorkie#1166
File: test/bench/webhook_bench_test.go:122-163
Timestamp: 2025-03-12T06:17:36.021Z
Learning: Go 1.22+ supports the integer range syntax (`for range n` and `for i := range n`), which is particularly useful for benchmarking code. This allows for more concise iteration over integer values compared to traditional three-clause for loops.

Applied to files:

server/backend/database/mongo/client.go

🧬 Code Graph Analysis (1)

server/backend/backend.go (2)

server/backend/database/mongo/client.go (1)

Client (51-59)

server/logging/logging.go (1)

DefaultLogger (85-90)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: load-test
GitHub Check: complex-test
GitHub Check: bench
GitHub Check: build

🔇 Additional comments (17)

server/backend/backend.go (1)

231-236: LGTM! Proper cache cleanup during shutdown.

The cache closure is correctly placed before database disconnection, and the error handling follows the existing pattern in the shutdown sequence.

server/backend/database/mongo/client.go (9)

55-59: LGTM! Clean cache field addition to Client struct.

The clientInfoCache field is properly placed alongside other cache fields.

120-131: LGTM! Correct single-instance initialization pattern.

The client instance is created once and properly passed to the cache constructor, avoiding the previous double-initialization issue.

135-156: LGTM! Well-structured cache lifecycle management.

The separation of cache closure from the main Close() method is appropriate, with clear documentation and defensive nil checking.

855-923: LGTM! Well-implemented cache-first approach for TryAttaching.

The method properly checks cache first, performs in-memory updates with deep copying to prevent corruption, and falls back to database operations when needed. Metrics recording is consistently applied.

1175-1216: LGTM! Efficient cache utilization for finding attached clients.

The method effectively uses cache-first strategy and ensures consistent ordering of results. The fallback to database is properly implemented.

1373-1390: LGTM! Optimized selective cache invalidation.

The selective invalidation of only attached clients is much more efficient than the previous full invalidation approach. The fallback ensures correctness.

2013-2044: LGTM! Efficient cache-first document attachment check.

The method properly utilizes the cache to avoid database queries when possible, with correct handling of the exclude client parameter.

2208-2225: LGTM! Consistent selective invalidation for document purge.

The invalidation logic matches the pattern used in UpdateDocInfoStatusToRemoved, maintaining consistency across the codebase.

2317-2331: LGTM! Clean metrics exposure methods.

The wrapper methods provide safe access to cache metrics with appropriate nil checking.

server/backend/database/mongo/client_cache.go (7)

50-135: LGTM! Well-structured cache configuration and initialization.

The configuration provides good tuning knobs with sensible defaults. The initialization properly starts background tasks.

137-179: LGTM! Proper use of deep copying for cache safety.

All getter methods correctly return deep copies to prevent external modifications from corrupting the cache.

181-225: LGTM! Correct cache entry management with proper dirty tracking.

The Set and UpdateClientInfo methods properly manage cache entries with appropriate dirty flag handling and size-based eviction.

227-265: LGTM! Correct implementation of MongoDB $max semantics.

The merge logic properly handles sequence number updates with $max behavior and correctly resets sequences for detached/removed documents.

267-295: LGTM! Improved error handling for cache invalidation.

The invalidation methods now properly return errors from flush operations, addressing the previous concern about silent failures.

360-427: LGTM! Smart adaptive flush mechanism with improved error handling.

The adaptive flush properly adjusts intervals based on write pressure and now logs flush errors as suggested in previous reviews.

530-693: LGTM! Comprehensive metrics tracking and reporting.

The metrics system provides detailed insights into cache performance with proper thread-safe access and informative logging output.

hackerwins-yorkie

Go Benchmark Analysis 📊

This is a comparison result between the previous(aeb1b83) and the current commit(5a3ee2e).

Significant Changes (≥20% difference)

Benchmark suite	Previous	Current	Change
BenchmarkDeletionConcurrency/concurrent_text_delete_range_100/ (ns/op)	757.10 ms	504.78 ms	🟢 -33.33%
BenchmarkDeletionConcurrency/concurrent_text_delete_range_1000/ (ns/op)	6.27 s	4.04 s	🟢 -35.56%
BenchmarkDeletionConcurrency/concurrent_tree_delete_range_100/ (ns/op)	772.82 ms	517.34 ms	🟢 -33.06%
BenchmarkDeletionConcurrency/concurrent_tree_delete_range_1000/ (ns/op)	6.84 s	4.87 s	🟢 -28.75%
BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_100/ (ns/op)	655.70 ms	434.95 ms	🟢 -33.67%
BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_1000/ (ns/op)	5.16 s	3.37 s	🟢 -34.61%
BenchmarkDeletionConcurrency/concurrent_text_edit_delete_all_1000/ (B/op)	347.88 MB	254.87 MB	🟢 -26.74%
BenchmarkDeletionConcurrency/concurrent_tree_edit_delete_all_100/ (ns/op)	668.00 ms	464.00 ms	🟢 -30.54%
BenchmarkDeletionConcurrency/concurrent_tree_edit_delete_all_1000/ (ns/op)	5.76 s	3.91 s	🟢 -32.05%
BenchmarkRPC/client_to_server/ (ns/op)	264.43 ms	175.80 ms	🟢 -33.52%
BenchmarkRPC/client_to_server/ (allocs/op)	186,886 allocs	144,093 allocs	🟢 -22.90%
BenchmarkRPC/client_to_client_via_server/ (ns/op)	199.82 ms	143.14 ms	🟢 -28.37%
BenchmarkTree/20000_vertices_from_protobuf/ (ns/op)	898.93 ms	717.70 ms	🟢 -20.16%
BenchmarkTree/30000_vertices_from_protobuf/ (ns/op)	2.01 s	1.58 s	🟢 -21.32%
BenchmarkVersionVector/clients_10/ (3_pushpull(ms))	4.00 ms	2.00 ms	🟢 -50.00%
BenchmarkVersionVector/clients_10/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%
BenchmarkVersionVector/clients_100/ (3_pushpull(ms))	4.00 ms	2.00 ms	🟢 -50.00%
BenchmarkVersionVector/clients_100/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%
BenchmarkVersionVector/clients_100/ (B/op)	77.37 MB	60.55 MB	🟢 -21.73%
BenchmarkVersionVector/clients_1000/ (3_pushpull(ms))	5.00 ms	3.00 ms	🟢 -40.00%
BenchmarkVersionVector/clients_1000/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%

Key Observations 🔍

The benchmark data shows significant improvements in various benchmarks, with notable reductions in operation timings ranging from about 20% to over 50%.
Overall, there is a trend towards performance enhancements, as seen in the majority of benchmarks, including document manipulation, synchronization, and webhook processing.
The improvements in data processing efficiency can lead to faster execution times and decreased resource consumption, indicating positive advancements in the system's performance.

Given the significant improvements in multiple benchmarks, the system seems to have undergone optimizations resulting in better performance across various operations.

Clock Analysis

Lamport vs VersionVector

#### BenchmarkVersionVector Test

##### Lamport (v0.5.2) 
| Metric                             | 10 clients      | 100 clients     | 1000 clients    |
| ---------------------------------- | --------------- | --------------- | --------------- |
| Total Operation Time               | 84.96 ms        | 793.94 ms       | 34.79 s         |
| Memory Allocations                 | 35.11 MB        | 219.92 MB       | 4.91 GB         |
| Number of Allocations (allocs/op)  | 69,271          | 1,246,728       | 81,485,288      |
|||||
| ChangePack Size                    | 138.0 B         | 137.0 B         | 141.0 B         |
| Snapshot Size                      | 379.0 B         | 3.08 KB         | 30.08 KB        |
| Push-Pull Time                     | 2.0 ms          | 1.5 ms          | 4.0 ms          |
|||||
| Attach Time                        | 4.5 ms          | 11.0 ms         | 31.0 ms         |
| ChangePack After Detach            | 138.0 B         | 140.0 B         | 141.0 B         |
| Snapshot After Detach              | 136.0 B         | 137.0 B         | 139.0 B         |
| Push-Pull After Detach             | 2.5 ms          | 5.0 ms          | 9.5 ms          |

##### Version Vector (current)
| Metric                             | 10 clients      | 100 clients     | 1000 clients    |
| ---------------------------------- | --------------- | --------------- | --------------- |
| Total Operation Time               | 104.73 ms       | 882.57 ms       | 11.01 s         |
| Memory Allocations                 | 6.44 MB         | 60.55 MB        | 1.42 GB         |
| Number of Allocations (allocs/op)  | 56,805          | 582,131         | 17,889,022      |
|||||
| ChangePack Size                    | 185.0 B         | 185.0 B         | 186.0 B         |
| Snapshot Size                      | 399.0 B         | 3.10 KB         | 30.10 KB        |
| Push-Pull Time                     | 2.0 ms          | 2.0 ms          | 3.0 ms          |
|||||
| Attach Time                        | 5.0 ms          | 6.0 ms          | 20.0 ms         |
| ChangePack After Detach            | 230.0 B         | 231.0 B         | 231.0 B         |
| Snapshot After Detach              | 156.0 B         | 156.0 B         | 156.0 B         |
| Push-Pull After Detach             | 3.0 ms          | 3.0 ms          | 3.0 ms          |

---

#### BenchmarkSyncConcurrency Test

| Metric                             | Clients          | Lamport (v0.5.2) | Version Vector (current) |
|-----------------------------------|------------------|------------------|---------------------------|
| Total Operation Time              | 1-100-10-10      | 7.48 s           | 7.63 s                  |
|                                   | 100-100-10-10    | 9.62 s           | 8.55 s                  |
|                                   | 300_100-10-10    | 14.77 s          | 10.37 s                 |
| Memory Allocations                | 1-100-10-10      | 1.15 GB          | 3.95 GB                 |
|                                   | 100-100-10-10    | 1.45 GB          | 4.03 GB                 |
|                                   | 300_100-10-10    | 2.19 GB          | 4.28 GB                 |
| Number of Allocations (allocs/op) | 1-100-10-10      | 17,107,766       | 106,227,255             |
|                                   | 100-100-10-10    | 20,084,480       | 107,190,889             |
|                                   | 300_100-10-10    | 30,359,215       | 109,596,723             |
---

#### Summary

- The Version Vector clock system shows improved performance compared to Lamport based on the benchmarks. It demonstrates lower total operation times, reduced memory allocations, and a decrease in the number of allocations per operation.

- The Version Vector clock system's push-pull times are consistently better across different client configurations compared to Lamport, indicating its efficiency in handling synchronization between distributed nodes.

- The Version Vector clock system, particularly in scenarios with higher client concurrency, showcases better scalability and resource utilization, as evidenced by the lower memory allocations and allocations per operation metrics.

Detailed Test Results

BenchmarkDeletionConcurrency

Benchmark suite	Previous	Current	Change
concurrent_text_delete_range_100/ (ns/op)	757.10 ms	504.78 ms	🟢 -33.33%
concurrent_text_delete_range_100/ (B/op)	63.61 MB	51.73 MB	🟢 -18.67%
concurrent_text_delete_range_100/ (allocs/op)	671,892 allocs	577,357 allocs	🟢 -14.07%
concurrent_text_delete_range_1000/ (ns/op)	6.27 s	4.04 s	🟢 -35.56%
concurrent_text_delete_range_1000/ (B/op)	365.95 MB	294.14 MB	🟢 -19.62%
concurrent_text_delete_range_1000/ (allocs/op)	6,301,300 allocs	5,503,534 allocs	🟢 -12.66%
concurrent_tree_delete_range_100/ (ns/op)	772.82 ms	517.34 ms	🟢 -33.06%
concurrent_tree_delete_range_100/ (B/op)	72.48 MB	60.46 MB	🟢 -16.58%
concurrent_tree_delete_range_100/ (allocs/op)	761,429 allocs	620,362 allocs	🟢 -18.53%
concurrent_tree_delete_range_1000/ (ns/op)	6.84 s	4.87 s	🟢 -28.75%
concurrent_tree_delete_range_1000/ (B/op)	894.85 MB	827.58 MB	🟢 -7.52%
concurrent_tree_delete_range_1000/ (allocs/op)	6,679,344 allocs	5,966,831 allocs	🟢 -10.67%
concurrent_text_edit_delete_all_100/ (ns/op)	655.70 ms	434.95 ms	🟢 -33.67%
concurrent_text_edit_delete_all_100/ (B/op)	53.97 MB	46.48 MB	🟢 -13.89%
concurrent_text_edit_delete_all_100/ (allocs/op)	568,989 allocs	487,134 allocs	🟢 -14.39%
concurrent_text_edit_delete_all_1000/ (ns/op)	5.16 s	3.37 s	🟢 -34.61%
concurrent_text_edit_delete_all_1000/ (B/op)	347.88 MB	254.87 MB	🟢 -26.74%
concurrent_text_edit_delete_all_1000/ (allocs/op)	5,242,863 allocs	4,461,782 allocs	🟢 -14.90%
concurrent_tree_edit_delete_all_100/ (ns/op)	668.00 ms	464.00 ms	🟢 -30.54%
concurrent_tree_edit_delete_all_100/ (B/op)	62.32 MB	51.06 MB	🟢 -18.07%
concurrent_tree_edit_delete_all_100/ (allocs/op)	615,778 allocs	561,938 allocs	🟢 -8.74%
concurrent_tree_edit_delete_all_1000/ (ns/op)	5.76 s	3.91 s	🟢 -32.05%
concurrent_tree_edit_delete_all_1000/ (B/op)	723.04 MB	644.14 MB	🟢 -10.91%
concurrent_tree_edit_delete_all_1000/ (allocs/op)	5,660,492 allocs	4,927,157 allocs	🟢 -12.96%

BenchmarkDocument

Benchmark suite	Previous	Current	Change
constructor_test/ (ns/op)	1663.00 ns	1608.00 ns	🟢 -3.31%
constructor_test/ (B/op)	1.59 KB	1.59 KB	⚪ 0%
constructor_test/ (allocs/op)	27 allocs	27 allocs	⚪ 0%
status_test/ (ns/op)	1250.00 ns	1186.00 ns	🟢 -5.12%
status_test/ (B/op)	1.56 KB	1.56 KB	⚪ 0%
status_test/ (allocs/op)	25 allocs	25 allocs	⚪ 0%
equals_test/ (ns/op)	8722.00 ns	8405.00 ns	🟢 -3.63%
equals_test/ (B/op)	8.33 KB	8.33 KB	⚪ 0%
equals_test/ (allocs/op)	139 allocs	139 allocs	⚪ 0%
nested_update_test/ (ns/op)	18450.00 ns	17989.00 ns	🟢 -2.50%
nested_update_test/ (B/op)	12.86 KB	12.86 KB	⚪ 0%
nested_update_test/ (allocs/op)	272 allocs	272 allocs	⚪ 0%
delete_test/ (ns/op)	24388.00 ns	23992.00 ns	🟢 -1.62%
delete_test/ (B/op)	16.37 KB	16.37 KB	⚪ 0%
delete_test/ (allocs/op)	352 allocs	352 allocs	⚪ 0%
object_test/ (ns/op)	9598.00 ns	9119.00 ns	🟢 -4.99%
object_test/ (B/op)	7.42 KB	7.42 KB	⚪ 0%
object_test/ (allocs/op)	124 allocs	124 allocs	⚪ 0%
array_test/ (ns/op)	30959.00 ns	29377.00 ns	🟢 -5.11%
array_test/ (B/op)	12.64 KB	12.64 KB	⚪ 0%
array_test/ (allocs/op)	283 allocs	283 allocs	⚪ 0%
text_test/ (ns/op)	34547.00 ns	33265.00 ns	🟢 -3.71%
text_test/ (B/op)	15.31 KB	15.31 KB	⚪ 0%
text_test/ (allocs/op)	503 allocs	503 allocs	⚪ 0%
text_composition_test/ (ns/op)	33060.00 ns	32385.00 ns	🟢 -2.04%
text_composition_test/ (B/op)	16.94 KB	16.94 KB	⚪ 0%
text_composition_test/ (allocs/op)	493 allocs	493 allocs	⚪ 0%
rich_text_test/ (ns/op)	90928.00 ns	88777.00 ns	🟢 -2.37%
rich_text_test/ (B/op)	38.48 KB	38.48 KB	⚪ 0%
rich_text_test/ (allocs/op)	1,185 allocs	1,185 allocs	⚪ 0%
counter_test/ (ns/op)	18838.00 ns	18269.00 ns	🟢 -3.02%
counter_test/ (B/op)	12.32 KB	12.31 KB	🟢 -0.02%
counter_test/ (allocs/op)	254 allocs	254 allocs	⚪ 0%
text_edit_gc_100/ (ns/op)	1.54 ms	1.47 ms	🟢 -4.47%
text_edit_gc_100/ (B/op)	808.43 KB	808.64 KB	🔴 +0.03%
text_edit_gc_100/ (allocs/op)	16,884 allocs	16,885 allocs	⚪ 0%
text_edit_gc_1000/ (ns/op)	59.20 ms	58.74 ms	🟢 -0.77%
text_edit_gc_1000/ (B/op)	46.27 MB	46.27 MB	⚪ 0%
text_edit_gc_1000/ (allocs/op)	181,584 allocs	181,597 allocs	⚪ 0%
text_split_gc_100/ (ns/op)	2.28 ms	2.18 ms	🟢 -4.33%
text_split_gc_100/ (B/op)	1.53 MB	1.53 MB	⚪ 0%
text_split_gc_100/ (allocs/op)	15,553 allocs	15,552 allocs	⚪ 0%
text_split_gc_1000/ (ns/op)	136.84 ms	133.19 ms	🟢 -2.67%
text_split_gc_1000/ (B/op)	137.28 MB	137.28 MB	⚪ 0%
text_split_gc_1000/ (allocs/op)	180,995 allocs	180,993 allocs	⚪ 0%
text_100/ (ns/op)	245920.00 ns	234855.00 ns	🟢 -4.50%
text_100/ (B/op)	113.26 KB	113.26 KB	⚪ 0%
text_100/ (allocs/op)	4,985 allocs	4,985 allocs	⚪ 0%
text_1000/ (ns/op)	2.61 ms	2.57 ms	🟢 -1.52%
text_1000/ (B/op)	1.08 MB	1.08 MB	⚪ 0%
text_1000/ (allocs/op)	49,088 allocs	49,088 allocs	⚪ 0%
array_1000/ (ns/op)	1.42 ms	1.35 ms	🟢 -4.45%
array_1000/ (B/op)	1.13 MB	1.13 MB	⚪ 0%
array_1000/ (allocs/op)	13,884 allocs	13,883 allocs	⚪ 0%
array_10000/ (ns/op)	14.72 ms	15.26 ms	🔴 +3.68%
array_10000/ (B/op)	10.29 MB	10.29 MB	⚪ 0%
array_10000/ (allocs/op)	140,734 allocs	140,734 allocs	⚪ 0%
array_gc_100/ (ns/op)	149378.00 ns	145222.00 ns	🟢 -2.78%
array_gc_100/ (B/op)	104.29 KB	104.28 KB	🟢 -0.02%
array_gc_100/ (allocs/op)	1,470 allocs	1,469 allocs	🟢 -0.07%
array_gc_1000/ (ns/op)	1.61 ms	1.55 ms	🟢 -3.73%
array_gc_1000/ (B/op)	1.18 MB	1.18 MB	🔴 +0.02%
array_gc_1000/ (allocs/op)	14,929 allocs	14,930 allocs	⚪ 0%
counter_1000/ (ns/op)	215926.00 ns	211844.00 ns	🟢 -1.89%
counter_1000/ (B/op)	194.39 KB	194.39 KB	⚪ 0%
counter_1000/ (allocs/op)	5,773 allocs	5,773 allocs	⚪ 0%
counter_10000/ (ns/op)	2.26 ms	2.22 ms	🟢 -1.66%
counter_10000/ (B/op)	2.23 MB	2.23 MB	⚪ 0%
counter_10000/ (allocs/op)	59,780 allocs	59,780 allocs	⚪ 0%
object_1000/ (ns/op)	1.61 ms	1.57 ms	🟢 -2.49%
object_1000/ (B/op)	1.48 MB	1.48 MB	🟢 -0.02%
object_1000/ (allocs/op)	11,927 allocs	11,926 allocs	⚪ 0%
object_10000/ (ns/op)	17.04 ms	17.68 ms	🔴 +3.72%
object_10000/ (B/op)	12.75 MB	12.75 MB	⚪ 0%
object_10000/ (allocs/op)	121,238 allocs	121,233 allocs	⚪ 0%
tree_100/ (ns/op)	783357.00 ns	761794.00 ns	🟢 -2.75%
tree_100/ (B/op)	524.17 KB	524.16 KB	⚪ 0%
tree_100/ (allocs/op)	4,722 allocs	4,722 allocs	⚪ 0%
tree_1000/ (ns/op)	52.99 ms	52.19 ms	🟢 -1.52%
tree_1000/ (B/op)	43.74 MB	43.74 MB	⚪ 0%
tree_1000/ (allocs/op)	46,131 allocs	46,131 allocs	⚪ 0%
tree_10000/ (ns/op)	6.74 s	7.01 s	🔴 +4.08%
tree_10000/ (B/op)	4.30 GB	4.30 GB	⚪ 0%
tree_10000/ (allocs/op)	460,206 allocs	460,190 allocs	⚪ 0%
tree_edit_gc_100/ (ns/op)	2.79 ms	2.73 ms	🟢 -1.93%
tree_edit_gc_100/ (B/op)	2.40 MB	2.40 MB	⚪ 0%
tree_edit_gc_100/ (allocs/op)	11,668 allocs	11,668 allocs	⚪ 0%
tree_edit_gc_1000/ (ns/op)	225.37 ms	219.76 ms	🟢 -2.49%
tree_edit_gc_1000/ (B/op)	213.95 MB	213.95 MB	⚪ 0%
tree_edit_gc_1000/ (allocs/op)	116,143 allocs	116,150 allocs	⚪ 0%
tree_split_gc_100/ (ns/op)	2.17 ms	2.10 ms	🟢 -3.20%
tree_split_gc_100/ (B/op)	1.61 MB	1.61 MB	⚪ 0%
tree_split_gc_100/ (allocs/op)	9,606 allocs	9,606 allocs	⚪ 0%
tree_split_gc_1000/ (ns/op)	159.50 ms	150.24 ms	🟢 -5.81%
tree_split_gc_1000/ (B/op)	149.85 MB	149.85 MB	⚪ 0%
tree_split_gc_1000/ (allocs/op)	107,758 allocs	107,760 allocs	⚪ 0%

BenchmarkDocumentDeletion

Benchmark suite	Previous	Current	Change
single_text_delete_all_10000/ (ns/op)	18.85 ms	19.42 ms	🔴 +3.06%
single_text_delete_all_10000/ (B/op)	10.52 MB	10.51 MB	🟢 -0.05%
single_text_delete_all_10000/ (allocs/op)	76,131 allocs	76,129 allocs	⚪ 0%
single_text_delete_all_100000/ (ns/op)	279.60 ms	317.53 ms	🔴 +13.57%
single_text_delete_all_100000/ (B/op)	104.71 MB	104.76 MB	🔴 +0.05%
single_text_delete_all_100000/ (allocs/op)	765,993 allocs	766,185 allocs	🔴 +0.03%
single_tree_delete_all_1000/ (ns/op)	53.44 ms	56.35 ms	🔴 +5.44%
single_tree_delete_all_1000/ (B/op)	44.66 MB	44.66 MB	⚪ 0%
single_tree_delete_all_1000/ (allocs/op)	58,202 allocs	58,203 allocs	⚪ 0%
single_text_delete_range_100/ (ns/op)	446712.00 ns	481346.00 ns	🔴 +7.75%
single_text_delete_range_100/ (B/op)	243.27 KB	243.28 KB	⚪ 0%
single_text_delete_range_100/ (allocs/op)	7,074 allocs	7,074 allocs	⚪ 0%
single_text_delete_range_1000/ (ns/op)	9.21 ms	9.72 ms	🔴 +5.50%
single_text_delete_range_1000/ (B/op)	6.37 MB	6.37 MB	⚪ 0%
single_text_delete_range_1000/ (allocs/op)	73,370 allocs	73,370 allocs	⚪ 0%
single_tree_delete_range_100/ (ns/op)	978320.00 ns	1.01 ms	🔴 +2.96%
single_tree_delete_range_100/ (B/op)	708.37 KB	708.43 KB	⚪ 0%
single_tree_delete_range_100/ (allocs/op)	6,392 allocs	6,392 allocs	⚪ 0%
single_tree_delete_range_1000/ (ns/op)	62.00 ms	64.66 ms	🔴 +4.29%
single_tree_delete_range_1000/ (B/op)	54.51 MB	54.51 MB	⚪ 0%
single_tree_delete_range_1000/ (allocs/op)	63,432 allocs	63,434 allocs	⚪ 0%

BenchmarkGetDocuments

Benchmark suite	Previous	Current	Change
without_root_presence_10/ (ns/op)	1.54 ms	1.53 ms	🟢 -0.48%
without_root_presence_10/ (B/op)	132.29 KB	133.98 KB	🔴 +1.28%
without_root_presence_10/ (allocs/op)	1,176 allocs	1,176 allocs	⚪ 0%
with_root_presence_10/ (ns/op)	4.36 ms	4.33 ms	🟢 -0.53%
with_root_presence_10/ (B/op)	627.62 KB	608.07 KB	🟢 -3.11%
with_root_presence_10/ (allocs/op)	5,774 allocs	5,761 allocs	🟢 -0.23%
without_root_presence_100/ (ns/op)	2.79 ms	2.82 ms	🔴 +0.81%
without_root_presence_100/ (B/op)	821.06 KB	834.08 KB	🔴 +1.58%
without_root_presence_100/ (allocs/op)	6,117 allocs	6,117 allocs	⚪ 0%
with_root_presence_100/ (ns/op)	42.20 ms	42.65 ms	🔴 +1.07%
with_root_presence_100/ (B/op)	7.36 MB	7.56 MB	🔴 +2.71%
with_root_presence_100/ (allocs/op)	66,935 allocs	66,869 allocs	🟢 -0.10%
without_root_presence_1000/ (ns/op)	15.05 ms	15.68 ms	🔴 +4.21%
without_root_presence_1000/ (B/op)	7.90 MB	7.98 MB	🔴 +1.08%
without_root_presence_1000/ (allocs/op)	54,838 allocs	54,851 allocs	🔴 +0.02%
with_root_presence_1000/ (ns/op)	397.87 ms	406.44 ms	🔴 +2.15%
with_root_presence_1000/ (B/op)	73.25 MB	73.21 MB	🟢 -0.06%
with_root_presence_1000/ (allocs/op)	665,185 allocs	664,791 allocs	🟢 -0.06%

BenchmarkRPC

Benchmark suite	Previous	Current	Change
client_to_server/ (ns/op)	264.43 ms	175.80 ms	🟢 -33.52%
client_to_server/ (B/op)	14.67 MB	11.23 MB	🟢 -23.40%
client_to_server/ (allocs/op)	186,886 allocs	144,093 allocs	🟢 -22.90%
client_to_client_via_server/ (ns/op)	199.82 ms	143.14 ms	🟢 -28.37%
client_to_client_via_server/ (B/op)	23.32 MB	21.01 MB	🟢 -9.93%
client_to_client_via_server/ (allocs/op)	298,676 allocs	254,197 allocs	🟢 -14.89%
attach_large_document/ (ns/op)	1.31 s	1.20 s	🟢 -8.12%
attach_large_document/ (B/op)	1.85 GB	1.88 GB	🔴 +1.36%
attach_large_document/ (allocs/op)	11,942 allocs	10,417 allocs	🟢 -12.77%
adminCli_to_server/ (ns/op)	600.28 ms	591.88 ms	🟢 -1.40%
adminCli_to_server/ (B/op)	34.89 MB	34.93 MB	🔴 +0.13%
adminCli_to_server/ (allocs/op)	402,709 allocs	402,851 allocs	🔴 +0.04%

BenchmarkLocker

Benchmark suite	Previous	Current	Change
(ns/op)	85.00 ns	83.48 ns	🟢 -1.79%
(B/op)	32.00 B	32.00 B	⚪ 0%
(allocs/op)	1 allocs	1 allocs	⚪ 0%

BenchmarkLockerParallel

Benchmark suite	Previous	Current	Change
(ns/op)	46.35 ns	46.44 ns	🔴 +0.19%
(B/op)	0.00 B	0.00 B	⚪ 0%
(allocs/op)	0 allocs	0 allocs	⚪ 0%

BenchmarkLockerMoreKeys

Benchmark suite	Previous	Current	Change
(ns/op)	188.50 ns	189.60 ns	🔴 +0.58%
(B/op)	30.00 B	30.00 B	⚪ 0%
(allocs/op)	0 allocs	0 allocs	⚪ 0%

BenchmarkRWLocker

Benchmark suite	Previous	Current	Change
RWLock_rate_2/ (ns/op)	49.77 ns	50.30 ns	🔴 +1.06%
RWLock_rate_2/ (B/op)	0.00 B	0.00 B	⚪ 0%
RWLock_rate_2/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
RWLock_rate_10/ (ns/op)	44.57 ns	44.76 ns	🔴 +0.43%
RWLock_rate_10/ (B/op)	0.00 B	0.00 B	⚪ 0%
RWLock_rate_10/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
RWLock_rate_100/ (ns/op)	60.12 ns	62.13 ns	🔴 +3.34%
RWLock_rate_100/ (B/op)	1.00 B	2.00 B	🔴 +100.00%
RWLock_rate_100/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
RWLock_rate_1000/ (ns/op)	89.66 ns	92.20 ns	🔴 +2.83%
RWLock_rate_1000/ (B/op)	8.00 B	8.00 B	⚪ 0%
RWLock_rate_1000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%

BenchmarkPresenceConcurrency

Benchmark suite	Previous	Current	Change
0-100-10/ (ns/op)	953.00 ms	867.79 ms	🟢 -8.94%
0-100-10/ (B/op)	405.06 MB	394.70 MB	🟢 -2.56%
0-100-10/ (allocs/op)	5,161,259 allocs	5,030,839 allocs	🟢 -2.53%
100-100-10/ (ns/op)	1.01 s	911.34 ms	🟢 -10.12%
100-100-10/ (B/op)	433.74 MB	420.86 MB	🟢 -2.97%
100-100-10/ (allocs/op)	5,649,452 allocs	5,516,167 allocs	🟢 -2.36%
300-100-10/ (ns/op)	1.06 s	974.87 ms	🟢 -8.41%
300-100-10/ (B/op)	495.06 MB	483.57 MB	🟢 -2.32%
300-100-10/ (allocs/op)	6,623,330 allocs	6,494,601 allocs	🟢 -1.94%

BenchmarkChange

Benchmark suite	Previous	Current	Change
Push_10_Changes/ (ns/op)	4.12 ms	3.68 ms	🟢 -10.61%
Push_10_Changes/ (B/op)	204.74 KB	216.52 KB	🔴 +5.75%
Push_10_Changes/ (allocs/op)	1,975 allocs	2,089 allocs	🔴 +5.77%
Push_100_Changes/ (ns/op)	17.28 ms	17.23 ms	🟢 -0.30%
Push_100_Changes/ (B/op)	1.32 MB	1.36 MB	🔴 +3.27%
Push_100_Changes/ (allocs/op)	10,624 allocs	10,944 allocs	🔴 +3.01%
Push_1000_Changes/ (ns/op)	148.46 ms	149.89 ms	🔴 +0.96%
Push_1000_Changes/ (B/op)	13.21 MB	13.36 MB	🔴 +1.18%
Push_1000_Changes/ (allocs/op)	99,898 allocs	100,591 allocs	🔴 +0.69%
Pull_10_Changes/ (ns/op)	2.40 ms	1.98 ms	🟢 -17.60%
Pull_10_Changes/ (B/op)	143.44 KB	167.91 KB	🔴 +17.06%
Pull_10_Changes/ (allocs/op)	1,571 allocs	1,842 allocs	🔴 +17.25%
Pull_100_Changes/ (ns/op)	3.47 ms	3.07 ms	🟢 -11.48%
Pull_100_Changes/ (B/op)	701.24 KB	781.28 KB	🔴 +11.41%
Pull_100_Changes/ (allocs/op)	6,129 allocs	7,004 allocs	🔴 +14.28%
Pull_1000_Changes/ (ns/op)	10.98 ms	10.90 ms	🟢 -0.80%
Pull_1000_Changes/ (B/op)	6.16 MB	6.25 MB	🔴 +1.61%
Pull_1000_Changes/ (allocs/op)	53,115 allocs	54,602 allocs	🔴 +2.80%

BenchmarkSnapshot

Benchmark suite	Previous	Current	Change
Push_3KB_snapshot/ (ns/op)	17.81 ms	17.61 ms	🟢 -1.11%
Push_3KB_snapshot/ (B/op)	1.51 MB	1.62 MB	🔴 +7.65%
Push_3KB_snapshot/ (allocs/op)	13,149 allocs	14,961 allocs	🔴 +13.78%
Push_30KB_snapshot/ (ns/op)	153.99 ms	154.38 ms	🔴 +0.25%
Push_30KB_snapshot/ (B/op)	18.84 MB	19.34 MB	🔴 +2.65%
Push_30KB_snapshot/ (allocs/op)	185,402 allocs	196,567 allocs	🔴 +6.02%
Pull_3KB_snapshot/ (ns/op)	4.13 ms	3.73 ms	🟢 -9.72%
Pull_3KB_snapshot/ (B/op)	1.16 MB	1.28 MB	🔴 +9.96%
Pull_3KB_snapshot/ (allocs/op)	15,285 allocs	16,730 allocs	🔴 +9.45%
Pull_30KB_snapshot/ (ns/op)	17.58 ms	15.00 ms	🟢 -14.66%
Pull_30KB_snapshot/ (B/op)	10.80 MB	10.96 MB	🔴 +1.51%
Pull_30KB_snapshot/ (allocs/op)	142,560 allocs	144,249 allocs	🔴 +1.18%

BenchmarkSplayTree

Benchmark suite	Previous	Current	Change
stress_test_100000/ (ns/op)	0.18 ns	0.17 ns	🟢 -4.92%
stress_test_100000/ (B/op)	0.00 B	0.00 B	⚪ 0%
stress_test_100000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
stress_test_200000/ (ns/op)	0.36 ns	0.35 ns	🟢 -1.40%
stress_test_200000/ (B/op)	0.00 B	0.00 B	⚪ 0%
stress_test_200000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
stress_test_300000/ (ns/op)	0.54 ns	0.53 ns	🟢 -2.43%
stress_test_300000/ (B/op)	0.00 B	0.00 B	⚪ 0%
stress_test_300000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
random_access_100000/ (ns/op)	0.01 ns	0.02 ns	🔴 +32.46%
random_access_100000/ (B/op)	0.00 B	0.00 B	⚪ 0%
random_access_100000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
random_access_200000/ (ns/op)	0.03 ns	0.03 ns	🟢 -8.00%
random_access_200000/ (B/op)	0.00 B	0.00 B	⚪ 0%
random_access_200000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
random_access_300000/ (ns/op)	0.05 ns	0.05 ns	🟢 -1.83%
random_access_300000/ (B/op)	0.00 B	0.00 B	⚪ 0%
random_access_300000/ (allocs/op)	0 allocs	0 allocs	⚪ 0%
editing_trace_bench/ (ns/op)	0.00 ns	0.00 ns	🔴 +19.93%
editing_trace_bench/ (B/op)	0.00 B	0.00 B	⚪ 0%
editing_trace_bench/ (allocs/op)	0 allocs	0 allocs	⚪ 0%

BenchmarkSync

Benchmark suite	Previous	Current	Change
memory_sync_10_test/ (ns/op)	6759.00 ns	6874.00 ns	🔴 +1.70%
memory_sync_10_test/ (B/op)	1.34 KB	1.34 KB	🟢 -0.15%
memory_sync_10_test/ (allocs/op)	35 allocs	35 allocs	⚪ 0%
memory_sync_100_test/ (ns/op)	54975.00 ns	55461.00 ns	🔴 +0.88%
memory_sync_100_test/ (B/op)	9.53 KB	9.47 KB	🟢 -0.61%
memory_sync_100_test/ (allocs/op)	269 allocs	267 allocs	🟢 -0.74%
memory_sync_1000_test/ (ns/op)	597216.00 ns	598385.00 ns	🔴 +0.20%
memory_sync_1000_test/ (B/op)	76.53 KB	75.93 KB	🟢 -0.78%
memory_sync_1000_test/ (allocs/op)	2,133 allocs	2,119 allocs	🟢 -0.66%
memory_sync_10000_test/ (ns/op)	7.59 ms	7.31 ms	🟢 -3.73%
memory_sync_10000_test/ (B/op)	762.26 KB	731.50 KB	🟢 -4.03%
memory_sync_10000_test/ (allocs/op)	20,551 allocs	20,257 allocs	🟢 -1.43%

BenchmarkSyncConcurrency

Benchmark suite	Previous	Current	Change
1-100-10/ (ns/op)	8.01 s	7.63 s	🟢 -4.71%
1-100-10/ (B/op)	4.33 GB	3.95 GB	🟢 -8.90%
1-100-10/ (allocs/op)	106,562,743 allocs	106,227,255 allocs	🟢 -0.31%
100-100-10/ (ns/op)	9.05 s	8.55 s	🟢 -5.49%
100-100-10/ (B/op)	4.43 GB	4.03 GB	🟢 -9.15%
100-100-10/ (allocs/op)	107,728,269 allocs	107,190,889 allocs	🟢 -0.50%
300_100-10/ (ns/op)	11.26 s	10.37 s	🟢 -7.91%
300_100-10/ (B/op)	4.67 GB	4.28 GB	🟢 -8.36%
300_100-10/ (allocs/op)	110,650,518 allocs	109,596,723 allocs	🟢 -0.95%

BenchmarkTextEditing

Benchmark suite	Previous	Current	Change
(ns/op)	5.64 s	5.24 s	🟢 -7.02%
(B/op)	3.88 GB	3.88 GB	⚪ 0%
(allocs/op)	20,100,376 allocs	20,100,549 allocs	⚪ 0%

BenchmarkTree

Benchmark suite	Previous	Current	Change
10000_vertices_to_protobuf/ (ns/op)	4.84 ms	4.55 ms	🟢 -5.93%
10000_vertices_to_protobuf/ (B/op)	6.68 MB	6.68 MB	⚪ 0%
10000_vertices_to_protobuf/ (allocs/op)	90,026 allocs	90,026 allocs	⚪ 0%
10000_vertices_from_protobuf/ (ns/op)	223.90 ms	188.89 ms	🟢 -15.64%
10000_vertices_from_protobuf/ (B/op)	442.15 MB	442.15 MB	⚪ 0%
10000_vertices_from_protobuf/ (allocs/op)	280,047 allocs	280,045 allocs	⚪ 0%
20000_vertices_to_protobuf/ (ns/op)	10.28 ms	9.24 ms	🟢 -10.19%
20000_vertices_to_protobuf/ (B/op)	13.53 MB	13.53 MB	⚪ 0%
20000_vertices_to_protobuf/ (allocs/op)	180,029 allocs	180,029 allocs	⚪ 0%
20000_vertices_from_protobuf/ (ns/op)	898.93 ms	717.70 ms	🟢 -20.16%
20000_vertices_from_protobuf/ (B/op)	1.70 GB	1.70 GB	⚪ 0%
20000_vertices_from_protobuf/ (allocs/op)	560,078 allocs	560,125 allocs	⚪ 0%
30000_vertices_to_protobuf/ (ns/op)	15.84 ms	14.52 ms	🟢 -8.30%
30000_vertices_to_protobuf/ (B/op)	19.94 MB	19.94 MB	⚪ 0%
30000_vertices_to_protobuf/ (allocs/op)	270,030 allocs	270,030 allocs	⚪ 0%
30000_vertices_from_protobuf/ (ns/op)	2.01 s	1.58 s	🟢 -21.32%
30000_vertices_from_protobuf/ (B/op)	3.75 GB	3.75 GB	⚪ 0%
30000_vertices_from_protobuf/ (allocs/op)	840,235 allocs	840,106 allocs	🟢 -0.02%

BenchmarkVersionVector

Benchmark suite	Previous	Current	Change
clients_10/ (ns/op)	128.24 ms	104.73 ms	🟢 -18.33%
clients_10/ (1_changepack(bytes))	185.00 B	185.00 B	⚪ 0%
clients_10/ (2_snapshot(bytes))	399.00 B	399.00 B	⚪ 0%
clients_10/ (3_pushpull(ms))	4.00 ms	2.00 ms	🟢 -50.00%
clients_10/ (4_attach(ms))	5.00 ms	5.00 ms	⚪ 0%
clients_10/ (5_changepack_after_detach(bytes))	230.00 B	230.00 B	⚪ 0%
clients_10/ (6_snapshot_after_detach(bytes))	156.00 B	156.00 B	⚪ 0%
clients_10/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%
clients_10/ (B/op)	9.08 MB	6.44 MB	🟢 -29.06%
clients_10/ (allocs/op)	66,128 allocs	56,805 allocs	🟢 -14.10%
clients_100/ (ns/op)	1.05 s	882.57 ms	🟢 -16.28%
clients_100/ (1_changepack(bytes))	185.00 B	185.00 B	⚪ 0%
clients_100/ (2_snapshot(bytes))	3.10 KB	3.10 KB	⚪ 0%
clients_100/ (3_pushpull(ms))	4.00 ms	2.00 ms	🟢 -50.00%
clients_100/ (4_attach(ms))	6.00 ms	6.00 ms	⚪ 0%
clients_100/ (5_changepack_after_detach(bytes))	231.00 B	231.00 B	⚪ 0%
clients_100/ (6_snapshot_after_detach(bytes))	156.00 B	156.00 B	⚪ 0%
clients_100/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%
clients_100/ (B/op)	77.37 MB	60.55 MB	🟢 -21.73%
clients_100/ (allocs/op)	650,781 allocs	582,131 allocs	🟢 -10.55%
clients_1000/ (ns/op)	12.26 s	11.01 s	🟢 -10.20%
clients_1000/ (1_changepack(bytes))	186.00 B	186.00 B	⚪ 0%
clients_1000/ (2_snapshot(bytes))	30.10 KB	30.10 KB	⚪ 0%
clients_1000/ (3_pushpull(ms))	5.00 ms	3.00 ms	🟢 -40.00%
clients_1000/ (4_attach(ms))	19.00 ms	20.00 ms	🔴 +5.26%
clients_1000/ (5_changepack_after_detach(bytes))	231.00 B	231.00 B	⚪ 0%
clients_1000/ (6_snapshot_after_detach(bytes))	156.00 B	156.00 B	⚪ 0%
clients_1000/ (7_pushpull_after_detach(ms))	5.00 ms	3.00 ms	🟢 -40.00%
clients_1000/ (B/op)	1.48 GB	1.42 GB	🟢 -4.02%
clients_1000/ (allocs/op)	18,228,809 allocs	17,889,022 allocs	🟢 -1.86%

BenchmarkWebhook

Benchmark suite	Previous	Current	Change
Send_10_Webhooks_to_10_Endpoints/ (ns/op)	12.74 ms	12.36 ms	🟢 -2.97%
Send_10_Webhooks_to_10_Endpoints/ (B/op)	808.37 KB	807.97 KB	🟢 -0.05%
Send_10_Webhooks_to_10_Endpoints/ (allocs/op)	10,116 allocs	10,114 allocs	🟢 -0.02%
Send_100_Webhooks_to_10_Endpoints/ (ns/op)	128.82 ms	124.16 ms	🟢 -3.61%
Send_100_Webhooks_to_10_Endpoints/ (B/op)	8.09 MB	8.08 MB	🟢 -0.05%
Send_100_Webhooks_to_10_Endpoints/ (allocs/op)	101,169 allocs	101,144 allocs	🟢 -0.02%
Send_10_Webhooks_to_100_Endpoints/ (ns/op)	130.98 ms	126.87 ms	🟢 -3.14%
Send_10_Webhooks_to_100_Endpoints/ (B/op)	8.28 MB	8.28 MB	🟢 -0.04%
Send_10_Webhooks_to_100_Endpoints/ (allocs/op)	102,534 allocs	102,524 allocs	⚪ 0%
Send_100_Webhooks_to_100_Endpoints/ (ns/op)	1.29 s	1.25 s	🟢 -3.37%
Send_100_Webhooks_to_100_Endpoints/ (B/op)	82.37 MB	82.34 MB	🟢 -0.03%
Send_100_Webhooks_to_100_Endpoints/ (allocs/op)	1,022,424 allocs	1,022,219 allocs	🟢 -0.02%
Send_10_Webhooks_to_1000_Endpoints/ (ns/op)	2.82 s	2.83 s	🔴 +0.31%
Send_10_Webhooks_to_1000_Endpoints/ (B/op)	209.68 MB	209.12 MB	🟢 -0.27%
Send_10_Webhooks_to_1000_Endpoints/ (allocs/op)	1,716,612 allocs	1,715,619 allocs	🟢 -0.06%

BenchmarkWebhookWithLimit

Benchmark suite	Previous	Current	Change
Send_10_Webhooks_to_10_Endpoints_with_limit/ (ns/op)	3.90 ms	3.84 ms	🟢 -1.44%
Send_10_Webhooks_to_10_Endpoints_with_limit/ (B/op)	306.37 KB	306.17 KB	🟢 -0.06%
Send_10_Webhooks_to_10_Endpoints_with_limit/ (allocs/op)	2,913 allocs	2,912 allocs	🟢 -0.03%
Send_100_Webhooks_to_10_Endpoints_with_limit/ (ns/op)	4.13 ms	4.15 ms	🔴 +0.42%
Send_100_Webhooks_to_10_Endpoints_with_limit/ (B/op)	428.82 KB	428.56 KB	🟢 -0.06%
Send_100_Webhooks_to_10_Endpoints_with_limit/ (allocs/op)	4,713 allocs	4,712 allocs	🟢 -0.02%
Send_10_Webhooks_to_100_Endpoints_with_limit/ (ns/op)	38.94 ms	38.63 ms	🟢 -0.79%
Send_10_Webhooks_to_100_Endpoints_with_limit/ (B/op)	3.08 MB	3.08 MB	🟢 -0.14%
Send_10_Webhooks_to_100_Endpoints_with_limit/ (allocs/op)	29,179 allocs	29,151 allocs	🟢 -0.10%
Send_100_Webhooks_to_100_Endpoints_with_limit/ (ns/op)	41.65 ms	42.35 ms	🔴 +1.66%
Send_100_Webhooks_to_100_Endpoints_with_limit/ (B/op)	4.37 MB	4.37 MB	🟢 -0.12%
Send_100_Webhooks_to_100_Endpoints_with_limit/ (allocs/op)	47,188 allocs	47,162 allocs	🟢 -0.06%
Send_10_Webhooks_to_1000_Endpoints_with_limit/ (ns/op)	373.38 ms	366.87 ms	🟢 -1.74%
Send_10_Webhooks_to_1000_Endpoints_with_limit/ (B/op)	44.51 MB	44.43 MB	🟢 -0.18%
Send_10_Webhooks_to_1000_Endpoints_with_limit/ (allocs/op)	380,231 allocs	379,933 allocs	🟢 -0.08%

codecov · 2025-08-07T10:44:23Z

Codecov Report

❌ Patch coverage is 33.42037% with 765 lines in your changes missing coverage. Please review.
✅ Project coverage is 38.54%. Comparing base (be2d265) to head (0482f5f).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
server/backend/database/mongo/client_cache.go	30.05%	529 Missing and 11 partials ⚠️
server/backend/database/mongo/client.go	50.83%	138 Missing and 9 partials ⚠️
server/clients/clients.go	0.00%	44 Missing ⚠️
server/backend/database/memory/database.go	0.00%	29 Missing ⚠️
server/backend/backend.go	0.00%	4 Missing ⚠️
server/clients/housekeeping.go	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1424      +/-   ##
==========================================
- Coverage   38.77%   38.54%   -0.23%     
==========================================
  Files         184      185       +1     
  Lines       31249    32447    +1198     
==========================================
+ Hits        12116    12507     +391     
- Misses      18159    18940     +781     
- Partials      974     1000      +26

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

hackerwins-yorkie

k6 Load Test 📊

This is a comparison result between the previous(aeb1b83) and the current commit(5a3ee2e).

Change

Metric	Previous	Current	Change
transaction_time.avg	6812.78	6641.67	🟢 -2.51%
http_req_duration.avg	85.20	62.95	🟢 -26.12%
http_req_duration.p(95)	378.06	286.55	🟢 -24.21%
iteration_duration.avg	6812.82	6641.70	🟢 -2.51%
data_received.count	1,097,509,209	1,164,465,978	🔴 +6.10%
data_sent.count	59,663,788	61,282,612	🔴 +2.71%

Result

Details

    script: test/k6/presence.ts

 scenarios: (100.00%) 1 scenario, 500 max VUs, 3m0s max duration (incl. graceful stop):
          * default: Up to 500 looping VUs for 2m30s over 3 stages (gracefulRampDown: 30s, gracefulStop: 30s)

█ THRESHOLDS

active_clients_time
✓ 'p(95)<500' p(95)=95ms

attach_documents_time
✓ 'p(95)<10000' p(95)=328.94ms

deactivate_clients_time
✓ 'p(95)<10000' p(95)=460ms

http_req_duration
✓ 'p(95)<10000' p(95)=286.54ms

pushpulls_time
✓ 'p(95)<1000' p(95)=304ms

transaction_success_rate
✓ 'rate>0.99' rate=100.00%

transaction_time
✓ 'p(95)<30000' p(95)=7.5s

█ TOTAL RESULTS

checks_total.......: 130272  836.639481/s
checks_succeeded...: 100.00% 130272 out of 130272
checks_failed......: 0.00%   0 out of 130272

✓ status is 200
✓ response has valid body

CUSTOM
active_clients....................: 8142    52.289968/s
active_clients_success_rate.......: 100.00% 8142 out of 8142
active_clients_time...............: avg=21ms     min=1ms      med=5ms     max=469ms p(90)=58ms     p(95)=95ms    
attach_documents..................: 8142    52.289968/s
attach_documents_success_rate.....: 100.00% 8142 out of 8142
attach_documents_time.............: avg=92.83ms  min=3ms      med=40ms    max=1.15s p(90)=250ms    p(95)=328.94ms
deactivate_clients................: 8142    52.289968/s
deactivate_clients_success_rate...: 100.00% 8142 out of 8142
deactivate_clients_time...........: avg=116.96ms min=3ms      med=42ms    max=1.18s p(90)=346ms    p(95)=460ms   
pushpulls.........................: 40710   261.449838/s
pushpulls_success_rate............: 100.00% 40710 out of 40710
pushpulls_time....................: avg=67.52ms  min=1ms      med=21ms    max=1.2s  p(90)=193ms    p(95)=304ms   
transaction_success_rate..........: 100.00% 8142 out of 8142
transaction_time..................: avg=6.64s    min=6.01s    med=6.64s   max=8.67s p(90)=7.28s    p(95)=7.5s    

HTTP
http_req_duration.................: avg=62.95ms  min=708.95µs med=14.73ms max=1.18s p(90)=183.55ms p(95)=286.54ms
  { expected_response:true }......: avg=62.95ms  min=708.95µs med=14.73ms max=1.18s p(90)=183.55ms p(95)=286.54ms
http_req_failed...................: 0.00%   0 out of 65136
http_reqs.........................: 65136   418.31974/s

EXECUTION
iteration_duration................: avg=6.64s    min=6.01s    med=6.64s   max=8.67s p(90)=7.28s    p(95)=7.5s    
iterations........................: 8142    52.289968/s
vus...............................: 1       min=1              max=500
vus_max...........................: 500     min=500            max=500

NETWORK
data_received.....................: 1.2 GB  7.5 MB/s
data_sent.........................: 61 MB   394 kB/s

Copilot

Pull Request Overview

This PR introduces an in-memory caching system for client information to optimize MongoDB write performance during push-pull operations. The caching layer aims to reduce database load and latency by serving frequently accessed client data from memory.

Key changes include:

Implementation of a comprehensive client info caching system with adaptive flushing
Integration of cache operations into existing client lifecycle methods
Addition of cache metrics and monitoring capabilities

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
server/backend/database/mongo/client_cache.go	New cache implementation with adaptive flushing, metrics tracking, and cache management
server/backend/database/mongo/client.go	Integration of cache operations into client lifecycle methods and database queries
server/backend/backend.go	Added cache cleanup in shutdown sequence

Copilot · 2025-08-08T00:49:47Z

 		if err == mongo.ErrNoDocuments {
 			return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound)
 		}
+		return nil, fmt.Errorf("decode client info: %w", err)


This error handling is incorrect. The code should return the original error from result.Err() when it's not mongo.ErrNoDocuments, but instead it's trying to decode and returning a decode error. The logic should handle result.Err() first before attempting to decode.

Copilot · 2025-08-08T00:49:47Z

+		return nil, fmt.Errorf("decode client info: %w", err)
+	}
+
+	if err := c.clientInfoCache.Set(refKey, &clientInfo); err != nil {


This line is unreachable because the function returns in the previous error handling block on line 1033. The decode logic and cache setting should only happen after successful result.Err() handling.

coderabbitai

Actionable comments posted: 2

🔭 Outside diff range comments (1)

server/backend/database/mongo/client.go (1)
802-818: Consider cache-first approach in ActivateClient

The method records a cache miss (line 804) but doesn't actually check the cache first. Consider checking if the client already exists in cache before performing the database operation.

Additionally, as mentioned in past reviews, consider using $max operator for updated_at to prevent timestamp regression in concurrent scenarios:
 res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{
     "project_id": projectID,
     "key":        key,
 }, bson.M{
     "$set": bson.M{
         StatusKey:    database.ClientActivated,
-        "updated_at": now,
         "metadata":   metadata,
     },
+    "$max": bson.M{
+        "updated_at": now,
+    },
 }, options.UpdateOne().SetUpsert(true))

♻️ Duplicate comments (1)

server/backend/database/mongo/client.go (1)

1024-1034: Fix incorrect error handling logic

The error handling is incorrect. When result.Err() is not mongo.ErrNoDocuments, the original error should be returned instead of attempting to decode (which is unreachable).

 clientInfo := database.ClientInfo{}
-if err := result.Decode(&clientInfo); err != nil {
-    if err == mongo.ErrNoDocuments {
-        return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound)
-    }
-    return nil, fmt.Errorf("decode client info: %w", err)
-}
+if result.Err() != nil {
+    if result.Err() == mongo.ErrNoDocuments {
+        return nil, fmt.Errorf("%s: %w", refKey, database.ErrClientNotFound)
+    }
+    return nil, fmt.Errorf("find client: %w", result.Err())
+}
+if err := result.Decode(&clientInfo); err != nil {
+    return nil, fmt.Errorf("decode client info: %w", err)
+}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06ecf54 and a6a79e4.

📒 Files selected for processing (2)

server/backend/database/mongo/client.go (16 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (2)

server/backend/database/mongo/client.go (9)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/doc_info.go (1)

DocInfo (27-60)

server/backend/database/mongo/changestore.go (1)

ChangeStore (41-44)

pkg/cmap/cmap.go (1)

Map (36-38)

api/types/id.go (1)

ID (35-35)

pkg/document/time/version_vector.go (1)

VersionVector (33-33)

server/backend/database/mongo/client_cache.go (3)

ClientInfoCache (99-110)

NewClientInfoCache (113-135)

CacheMetrics (34-48)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/client_info.go (6)

ClientActivated (43-43)

DocumentAttached (49-49)

ClientDocInfo (56-60)

DocumentAttaching (48-48)

ClientDeactivated (42-42)

ClientInfo (66-92)

server/backend/database/mongo/client_cache.go (5)

server/backend/database/client_info.go (4)

ClientInfo (66-92)

DocumentDetached (50-50)

DocumentRemoved (51-51)

ClientDocInfo (56-60)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/mongo/client.go (2)

Client (51-59)

StatusKey (47-47)

server/logging/logging.go (1)

DefaultLogger (85-90)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: complex-test
GitHub Check: load-test
GitHub Check: build
GitHub Check: bench

🔇 Additional comments (6)

server/backend/database/mongo/client_cache.go (3)

549-692: Well-implemented metrics system

The metrics implementation is comprehensive with proper synchronization, detailed tracking per operation type, and informative logging with calculated hit rates.

236-239: Use database package constants for document status

The code uses string literals "detached" and "removed" but should use the constants from the database package for consistency.
-			if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved {
+			if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved {
Note: Based on the relevant code snippets, these constants are defined in server/backend/database/mongo/client.go as DocumentDetached = "detached" and DocumentRemoved = "removed". Ensure they're properly accessible or consider moving them to the database package for shared use.

Likely an incorrect or invalid review comment.

50-79: Missing TTL field initialization in DefaultCacheConfig

The CacheConfig struct defines a TTL field that is used in the cache operations (e.g., line 214 in UpdateClientInfo), but it's never initialized in DefaultCacheConfig(). This will cause the zero value (0s) to be used, potentially causing immediate expiration.

Add the TTL field to the config:
 type CacheConfig struct {
 	// BaseFlushInterval is the base interval for flushing cache to DB
 	BaseFlushInterval time.Duration
 	// MaxFlushInterval is the maximum interval for flushing cache to DB
 	MaxFlushInterval time.Duration
 	// MinFlushInterval is the minimum interval for flushing cache to DB
 	MinFlushInterval time.Duration
 	// MaxCacheSize is the maximum number of cached entries
 	MaxCacheSize int
 	// WritePressureThreshold is the threshold for write pressure detection
 	WritePressureThreshold int
 	// PressureCheckInterval is the interval for checking write pressure
 	PressureCheckInterval time.Duration
 	// EnableFlushCleanup enables cache cleanup after flush
 	EnableFlushCleanup bool
+	// TTL is the time-to-live for cache entries
+	TTL time.Duration
 }

 func DefaultCacheConfig() *CacheConfig {
 	return &CacheConfig{
 		BaseFlushInterval:      10 * time.Second,
 		MaxFlushInterval:       60 * time.Second,
 		MinFlushInterval:       5 * time.Second,
 		MaxCacheSize:           20000,
 		WritePressureThreshold: 500,
 		PressureCheckInterval:  10 * time.Second,
 		EnableFlushCleanup:     true,
+		TTL:                    5 * time.Minute,
 	}
 }
Likely an incorrect or invalid review comment.
server/backend/database/mongo/client.go (3)

120-131: Proper Client initialization pattern

The initialization now correctly creates a single Client instance and assigns the cache, fixing the previous double-initialization issue.

1369-1386: Good implementation of selective cache invalidation

The selective cache invalidation has been properly implemented as suggested in past reviews, with appropriate fallback to full invalidation when needed.

2313-2327: Clean implementation of cache metrics methods

The cache metrics methods are well-implemented with proper nil checks for safety.

coderabbitai · 2025-08-11T07:06:07Z

+		docKey := clientDocInfoKey(docID, "server_seq")
+		updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq
+
+		docKey = clientDocInfoKey(docID, "client_seq")
+		updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq
+
+		docKey = clientDocInfoKey(docID, StatusKey)
+		updateDoc["$set"].(bson.M)[docKey] = docInfo.Status


⚠️ Potential issue

Missing clientDocInfoKey helper function

The code uses clientDocInfoKey function which is not defined in this file. This function is defined in client.go but creates an unnecessary coupling between files.

Either:

Import or define the function in this file:

// clientDocInfoKey returns the key for the client document info. func clientDocInfoKey(docID types.ID, prefix string) string { return fmt.Sprintf("documents.%s.%s", docID, prefix) }

Or make it a method on the cache struct to keep it encapsulated.

🤖 Prompt for AI Agents

In server/backend/database/mongo/client_cache.go around lines 487 to 494, the helper clientDocInfoKey used to build document map keys is missing from this file (it currently exists in client.go and causes cross-file coupling); add a local implementation or move it to the cache type as a method so this file is self-contained — implement a function or method that accepts (docID types.ID, prefix string) and returns the key string formatted as "documents.%s.%s" (use fmt.Sprintf and the correct types import), then replace calls to clientDocInfoKey with the new local function/method and remove any dependency on client.go for this helper.

coderabbitai

Actionable comments posted: 4

🔭 Outside diff range comments (2)

server/clients/housekeeping.go (1)
62-67: Count only actual deactivations; skip no-ops cleanly

With the new housekeeping path, the DB may return (nil, nil) when no state change was performed (not found or already deactivated). The current code increments deactivatedCount unconditionally, which can inflate metrics. Increment only when a deactivation actually occurred.

Apply this diff:
-		if _, err := DeactivateForHousekeeping(ctx, be, pair.Project.ToProject(), pair.Client.RefKey()); err != nil {
-			return database.DefaultProjectID, err
-		}
-
-		deactivatedCount++
+		if info, err := DeactivateForHousekeeping(ctx, be, pair.Project.ToProject(), pair.Client.RefKey()); err != nil {
+			return database.DefaultProjectID, err
+		} else if info != nil {
+			deactivatedCount++
+		}
server/backend/database/mongo/client.go (1)
804-845: Consolidate to a single atomic upsert and prevent updated_at regression

The current flow performs UpdateOne, followed by a conditional FindOneAndUpdate for created_at, and then an extra FindOne in the non-upsert case. This has two issues:

Multiple round-trips that can be collapsed.

Potential timestamp regression if concurrent calls write out-of-order (an older updated_at can overwrite a newer one).

Use a single FindOneAndUpdate with:

$set for status and metadata

$setOnInsert for created_at

$max for updated_at to avoid regressing to an older timestamp

ReturnDocument(After) to return the final state in one call.
 	now := gotime.Now()
-
-	c.clientInfoCache.recordActivateClientMiss()
-	res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{
-		"project_id": projectID,
-		"key":        key,
-	}, bson.M{
-		"$set": bson.M{
-			StatusKey:    database.ClientActivated,
-			"updated_at": now,
-			"metadata":   metadata,
-		},
-	}, options.UpdateOne().SetUpsert(true))
-	if err != nil {
-		return nil, fmt.Errorf("upsert client: %w", err)
-	}
-
-	var result *mongo.SingleResult
-	if res.UpsertedCount > 0 {
-		result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{
-			"project_id": projectID,
-			"_id":        res.UpsertedID,
-		}, bson.M{
-			"$set": bson.M{
-				"created_at": now,
-			},
-		})
-	} else {
-		result = c.collection(ColClients).FindOne(ctx, bson.M{
-			"project_id": projectID,
-			"key":        key,
-		})
-	}
+	c.clientInfoCache.recordActivateClientMiss()
+	result := c.collection(ColClients).FindOneAndUpdate(
+		ctx,
+		bson.M{
+			"project_id": projectID,
+			"key":        key,
+		},
+		bson.M{
+			"$set": bson.M{
+				StatusKey:  database.ClientActivated,
+				"metadata": metadata,
+			},
+			"$setOnInsert": bson.M{
+				"created_at": now,
+			},
+			"$max": bson.M{
+				"updated_at": now,
+			},
+		},
+		options.FindOneAndUpdate().
+			SetUpsert(true).
+			SetReturnDocument(options.After),
+	)
 
 	clientInfo := database.ClientInfo{}
 	if err = result.Decode(&clientInfo); err != nil {
 		return nil, fmt.Errorf("decode client info: %w", err)
 	}

♻️ Duplicate comments (2)

server/backend/database/mongo/client_cache.go (2)
488-488: Missing clientDocInfoKey helper function

The code uses clientDocInfoKey function which is not defined in this file. This function is referenced on lines 488, 491, 494, 878, 883, and 884 but is not implemented.

Based on the past review comments, this issue was already identified. Please add the missing function to make this file self-contained:
+// clientDocInfoKey returns the key for the client document info.
+func clientDocInfoKey(docID types.ID, prefix string) string {
+	return fmt.Sprintf("documents.%s.%s", docID, prefix)
+}
83-89: Missing ExpiresAt field in CachedClientInfo struct

The struct is missing the ExpiresAt field that was referenced in previous code versions and might be needed for TTL functionality mentioned in the configuration.

Add the missing field if TTL functionality is planned:
 type CachedClientInfo struct {
 	ClientInfo *database.ClientInfo
 	UpdatedAt  time.Time
 	Dirty      bool
 	LastFlush  time.Time
+	ExpiresAt  time.Time
 }

🧹 Nitpick comments (5)

server/backend/database/mongo/client.go (2)

1088-1117: Write-back only on UpdateClientInfoAfterPushPull: add resilience to flush failures

This switches to a write-back-only path for client updates. If FlushToDB() fails, dirty data can linger longer than desired.

Add retry/backoff around periodic flush failures.

Consider immediate flush on specific transitions (e.g., detach, deactivate) or when write pressure is high.

Ensure Backend.Shutdown always calls CloseClientInfoCache() before disconnecting Mongo to flush outstanding changes (verified elsewhere).

1213-1254: Cache-first attached-clients lookup is good; verify DB fallback field/index

The DB fallback filters by "attached_docs". Ensure:

The cache flusher maintains "attached_docs" alongside Documents.

An appropriate index exists on {project_id, status, attached_docs} to make this query efficient.

If not, consider updating indexes and the update model builder.
server/backend/database/mongo/client_cache.go (3)
674-743: Consider write-back consistency for UpdateCheckpoint

The UpdateCheckpoint method uses write-back strategy with $max operator logic, but there's no mechanism to ensure consistency if multiple concurrent updates occur between cache and database flushes. Consider adding optimistic concurrency control or documenting the expected usage pattern.

Consider adding a comment to document the concurrency expectations:
 // UpdateCheckpoint updates only the checkpoint (ServerSeq, ClientSeq) for a specific document
 // This uses Write-back strategy with max operator for consistency
+// Note: Concurrent updates between cache flushes may result in lost updates. 
+// Callers should ensure proper synchronization at the application level.
 func (c *ClientInfoCache) UpdateCheckpoint(refKey types.ClientRefKey, docID types.ID, cp change.Checkpoint) error {
745-774: Potential race condition in write-through operations

The UpdateStatus and UpdateDocumentStatus methods unlock the mutex before calling the DB update function, then re-acquire it. This creates a window where the cache entry could be modified by another goroutine between the DB update and cache update, potentially leading to inconsistent state.

Consider refactoring to maintain consistency:
 // UpdateStatus updates the client status using CAS Write-through strategy
 func (c *ClientInfoCache) UpdateStatus(refKey types.ClientRefKey, status string) error {
-	// Check if update is needed by comparing with cache first
-	c.mu.Lock()
-	if cached, exists := c.cache[refKey]; exists {
-		if cached.ClientInfo.Status == status {
-			// No change needed, skip database update
-			c.mu.Unlock()
-			return nil
-		}
-	}
-	c.mu.Unlock()
+	c.mu.Lock()
+	defer c.mu.Unlock()
+	
+	// Check if update is needed
+	if cached, exists := c.cache[refKey]; exists {
+		if cached.ClientInfo.Status == status {
+			return nil // No change needed
+		}
+	}
 
-	// Update in DB (Write-through)
+	// Temporarily unlock for DB operation
+	c.mu.Unlock()
 	if err := c.updateStatusInDB(refKey, status); err != nil {
+		c.mu.Lock() // Re-acquire for defer
 		return fmt.Errorf("update status in DB: %w", err)
 	}
+	c.mu.Lock()
 
-	// Then update cache
-	c.mu.Lock()
-	defer c.mu.Unlock()
-
+	// Update cache after successful DB write
 	if cached, exists := c.cache[refKey]; exists {
 		cached.ClientInfo.Status = status
 		cached.UpdatedAt = time.Now()
-		// Not dirty since already written to DB
 		cached.Dirty = false
 	}
 	return nil
 }
Also applies to: 776-827

429-473: Optimize bulk write performance

The FlushToDB method processes all dirty entries in a single bulk operation, which is good for performance. However, consider adding batch size limits to avoid potential memory issues or MongoDB limits with very large caches.

Consider adding batch size configuration:
 // CacheConfig defines configuration parameters for the cache
 type CacheConfig struct {
 	// ... existing fields ...
+	// MaxBatchSize limits the number of entries processed in a single bulk write
+	MaxBatchSize int
 }

 // DefaultCacheConfig returns default cache configuration
 func DefaultCacheConfig() *CacheConfig {
 	return &CacheConfig{
 		// ... existing values ...
+		MaxBatchSize: 1000,
 	}
 }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a6a79e4 and d8b6aa5.

📒 Files selected for processing (6)

server/backend/database/database.go (1 hunks)
server/backend/database/memory/database.go (1 hunks)
server/backend/database/mongo/client.go (16 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)
server/clients/clients.go (2 hunks)
server/clients/housekeeping.go (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (6)

server/clients/housekeeping.go (1)

server/clients/clients.go (1)

DeactivateForHousekeeping (96-144)

server/backend/database/database.go (2)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/client_info.go (1)

ClientInfo (66-92)

server/backend/database/memory/database.go (2)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/client_info.go (3)

ClientInfo (66-92)

ClientActivated (43-43)

ClientDeactivated (42-42)

server/clients/clients.go (5)

server/backend/backend.go (1)

Backend (47-77)

api/types/resource_ref_key.go (2)

ClientRefKey (35-38)

DocRefKey (24-27)

server/backend/database/client_info.go (3)

ClientInfo (66-92)

ClientDeactivated (42-42)

DocumentAttached (49-49)

server/documents/documents.go (1)

FindDocInfoByRefKey (382-388)

api/types/id.go (1)

ID (35-35)

server/backend/database/mongo/client.go (7)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/doc_info.go (1)

DocInfo (27-60)

pkg/cmap/cmap.go (1)

Map (36-38)

pkg/document/time/version_vector.go (1)

VersionVector (33-33)

server/backend/database/mongo/client_cache.go (3)

ClientInfoCache (101-111)

NewClientInfoCache (114-136)

CacheMetrics (36-50)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/client_info.go (6)

ClientActivated (43-43)

DocumentAttached (49-49)

ClientDocInfo (56-60)

DocumentAttaching (48-48)

ClientDeactivated (42-42)

ClientInfo (66-92)

server/backend/database/mongo/client_cache.go (5)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/mongo/client.go (2)

Client (51-59)

StatusKey (47-47)

server/logging/logging.go (1)

DefaultLogger (85-90)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

🔇 Additional comments (17)

server/backend/database/mongo/client.go (11)

120-132: Client initialization and cache wiring look correct

Creating the client instance first and then wiring the cache into the same instance avoids stale references and duplicate allocations. LGTM.

136-147: Ensure cache lifecycle is closed on shutdown paths

Close() intentionally does not close the clientInfoCache, relying on Backend.Shutdown to call CloseClientInfoCache() beforehand. Verify all shutdown paths (including tests/utilities that may call Client.Close directly) invoke CloseClientInfoCache() to avoid lost write-back flushes.

149-155: LGTM on explicit cache close API

Providing an explicit CloseClientInfoCache() improves lifecycle clarity and allows deterministic flush. Looks good.

855-886: Cache-first TryAttaching looks good; ensure DB state gets flushed reliably

The cache hit path updates only the in-memory entry and returns. This depends on the write-back flusher to persist changes. Make sure flush reliability (retry/backoff, flush on shutdown) is sufficient for this critical state transition, otherwise attaching state may be lost on crash.

932-1007: LGTM on cache-first DeactivateClient with DB fallback

Input validation and preconditions are preserved in both cache and DB paths. The $expr condition correctly rejects clients with any attaching/attached documents. Looks good.

1009-1048: Mongo returns nil on no-op; ensure memory backend matches

This method returns (nil, nil) when the client is not found or already deactivated, which is desirable for idempotent housekeeping. Please align the memory backend to match this behavior (see separate comment).

1053-1079: LGTM on cache-first FindClientInfoByRefKey

Straightforward cache read-through; on miss, populate cache. Error mapping remains consistent.

1411-1427: Selective cache invalidation is a solid improvement

Invalidating only clients attached to the affected document should significantly boost cache hit rates compared to blanket invalidation. LGTM.

2246-2262: LGTM: targeted invalidation on purge

Same as above: selective invalidation over full cache clear is the right choice here.

2051-2082: LGTM on cache-first IsDocumentAttached

Fast path via cache, with DB fallback that honors excludeClientID. Ensure "attached_docs" field is kept in sync by the cache flusher for correctness.

2355-2369: Metrics surfacing looks good

Exposing and logging cache metrics will help validate effectiveness and tune thresholds. LGTM.

server/backend/database/mongo/client_cache.go (6)

372-374: Improved error logging for adaptive flush

The error logging has been improved compared to the previous version that silently ignored errors.

205-226: Fixed deadlock issue in UpdateClientInfo

The method correctly avoids the deadlock by directly manipulating the cache instead of calling Set() while holding the lock.

269-283: Improved error handling for dirty entry flushes

Both Invalidate and InvalidateAll methods now properly return errors when flushing dirty entries fails, addressing the previous concern about silent error handling.

Also applies to: 285-296

612-672: Comprehensive metrics logging implementation

The metrics logging provides detailed insights into cache performance across different operations, which is excellent for monitoring and debugging cache effectiveness.

114-136: Well-structured cache initialization

The constructor properly initializes all components and starts background goroutines for adaptive flushing and metrics logging.

229-266: Add unit tests for sequence reset on detach/remove

There are no existing tests covering how mergeClientInfo handles DocumentDetached or DocumentRemoved, so we can’t be sure the reset-to-0 logic is behaving as intended. Please:

Review the intended semantics when a document’s status changes to detached/removed:

Should both ServerSeq and ClientSeq always reset to 0, even if newDocInfo has higher sequence numbers?

Or should you apply $max after resetting?

Add unit tests in *_test.go that exercise:

A document transitioning to detached/removed with higher newDocInfo sequences (verify you either drop or keep the new values per spec).

Normal merges (status other than detached/removed) to confirm your $max logic still applies.

Target tests against mergeClientInfo in server/backend/database/mongo/client_cache.go (lines 229–266).

coderabbitai · 2025-08-18T10:07:05Z

+	// DeactivateClientForHousekeeping deactivates the client for housekeeping purposes.
+	// This method bypasses cache and directly updates the database.
+	DeactivateClientForHousekeeping(ctx context.Context, refKey types.ClientRefKey) (*ClientInfo, error)
+


💡 Verification agent

🧩 Analysis chain

Clarify contract and align semantics across backends for housekeeping deactivation

Right now, Mongo returns (nil, nil) when the client is not found or already deactivated, while the memory backend can return a non-nil ClientInfo in the "already deactivated" case. Please document the intended contract here and align implementations accordingly.

I recommend standardizing on: return (nil, nil) if the client does not exist, belongs to another project, or is already deactivated. This enables callers (e.g., housekeeping) to treat nil as "no-op" consistently.

Apply this comment update to make the contract explicit:

- // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes. - // This method bypasses cache and directly updates the database. + // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes. + // This method bypasses any in-memory cache and directly updates the database. + // Returns (nil, nil) if the client does not exist, is not in the given project, + // or is already deactivated (i.e., no state change was performed). DeactivateClientForHousekeeping(ctx context.Context, refKey types.ClientRefKey) (*ClientInfo, error)

🏁 Script executed:

#!/bin/bash set -e echo "Searching for DeactivateClientForHousekeeping occurrences in Go files" rg -n "DeactivateClientForHousekeeping" -C2 --type=go

Length of output: 2279

🏁 Script executed:

#!/bin/bash set -e echo "Extracting memory DB implementation for DeactivateClientForHousekeeping:" sed -n '860,950p' server/backend/database/memory/database.go echo echo "Extracting Mongo client implementation for DeactivateClientForHousekeeping:" sed -n '1000,1100p' server/backend/database/mongo/client.go

Length of output: 6175

Clarify DeactivateClientForHousekeeping contract and align semantics across backends

The memory backend still returns a non-nil ClientInfo when the client is already deactivated, whereas the Mongo backend returns (nil, nil) in that case. Let’s standardize on returning (nil, nil) whenever no state change is needed (client not found, wrong project, or already deactivated).

• server/backend/database/database.go (interface): update doc comment
• server/backend/database/memory/database.go: change the “already deactivated” branch to return nil, nil

Suggested doc-comment diff in database.go:

- // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes. - // This method bypasses cache and directly updates the database. + // DeactivateClientForHousekeeping deactivates the client for housekeeping purposes. + // This method bypasses any in-memory cache and directly updates the database. + // Returns (nil, nil) if the client does not exist, is not in the given project, + // or is already deactivated (i.e., no state change was performed).

Then, in memory/database.go, replace:

if clientInfo.Status != database.ClientActivated { return clientInfo, nil // Already deactivated, return current state }

with:

if clientInfo.Status != database.ClientActivated { return nil, nil // Already deactivated, no-op }

🤖 Prompt for AI Agents

In server/backend/database/database.go around lines 189-192, update the DeactivateClientForHousekeeping doc comment to state that the method returns (nil, nil) when no state change is needed (client not found, wrong project, or already deactivated); then in server/backend/database/memory/database.go locate the branch that checks if clientInfo.Status != database.ClientActivated and change its return to nil, nil (instead of returning the current ClientInfo) so the memory backend semantics match Mongo and the interface doc.

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

server/backend/database/mongo/client.go (1)

810-835: Prevent timestamp regression and collapse to a single atomic upsert.

Concurrent ActivateClient calls may overwrite a newer updated_at with an older one. Also, you’re doing a second hop to set created_at only when inserted. Use FindOneAndUpdate with $max(updated_at) and $setOnInsert(created_at) to atomically return the updated doc and eliminate the follow-up query.

Apply this refactor:

-  c.clientInfoCache.recordActivateClientMiss()
-  res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{
-    "project_id": projectID,
-    "key":        key,
-  }, bson.M{
-    "$set": bson.M{
-      StatusKey:    database.ClientActivated,
-      "updated_at": now,
-      "metadata":   metadata,
-    },
-  }, options.UpdateOne().SetUpsert(true))
-  if err != nil {
-    return nil, fmt.Errorf("activate client of %s: %w", key, err)
-  }
-
-  var result *mongo.SingleResult
-  if res.UpsertedCount > 0 {
-    result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{
-      "project_id": projectID,
-      "_id":        res.UpsertedID,
-    }, bson.M{
-      "$set": bson.M{
-        "created_at": now,
-      },
-    })
-  } else {
-    result = c.collection(ColClients).FindOne(ctx, bson.M{
-      "project_id": projectID,
-      "key":        key,
-    })
-  }
+  c.clientInfoCache.recordActivateClientMiss()
+  result := c.collection(ColClients).FindOneAndUpdate(
+    ctx,
+    bson.M{
+      "project_id": projectID,
+      "key":        key,
+    },
+    bson.M{
+      "$set": bson.M{
+        StatusKey:  database.ClientActivated,
+        "metadata": metadata,       // keep behavior of updating metadata on activation
+      },
+      "$max": bson.M{
+        "updated_at": now,          // only move forward in time
+      },
+      "$setOnInsert": bson.M{
+        "created_at": now,          // set at creation only
+      },
+    },
+    options.FindOneAndUpdate().
+      SetUpsert(true).
+      SetReturnDocument(options.After),
+  )
+  if result.Err() != nil {
+    return nil, fmt.Errorf("activate client of %s: %w", key, result.Err())
+  }

This also reduces round-trips under contention.

🧹 Nitpick comments (4)

server/backend/database/mongo/client.go (4)
1103-1106: Redundant IsAttached() check.

You already validated membership via clientInfo.Documents[docInfo.ID]. Re-evaluating IsAttached without using its boolean is redundant and can be removed for clarity.
-  _, err := clientInfo.IsAttached(docInfo.ID)
-  if err != nil {
-    return err
-  }
+  if _, err := clientInfo.IsAttached(docInfo.ID); err != nil {
+    return err
+  }
Alternatively, rely solely on the earlier map check and drop this call if it doesn’t add semantics you need.

1108-1113: Write-back reliability for critical post-PushPull updates.

This path only updates the cache; durability depends on asynchronous flush. Consider adding retry/backoff around flush on dirty entries or a write-through option for this critical path, plus metrics/logging for failed flushes to avoid silent staleness under DB issues.

Would you like me to propose a minimal retry wrapper in ClientInfoCache.FlushToDB() and surface a counter for failed flush attempts in CacheMetrics?

Also applies to: 1126-1129

1250-1267: Sort DB fallback results for deterministic ordering.

The cache path returns a sorted slice; the DB fallback should match this to avoid nondeterministic order across executions.

Apply this after decoding:
   var clientInfos []*database.ClientInfo
   if err := cursor.All(ctx, &clientInfos); err != nil {
     return nil, fmt.Errorf("find attached clients of %s: %w", docRefKey, err)
   }
 
+  sort.Slice(clientInfos, func(i, j int) bool {
+    return string(clientInfos[i].ID) < string(clientInfos[j].ID)
+  })
+
   return clientInfos, nil
136-138: Add changeCache.Purge() to Client.Close for symmetry

The Shutdown method already calls CloseClientInfoCache() before DB.Close(), so the comment about closing the client-info cache is accurate. We still need to free the change cache in the Client.Close() method:

In server/backend/database/mongo/client.go, update func (c *Client) Close():
    if err := c.client.Disconnect(context.Background()); err != nil {
        return fmt.Errorf("close MongoDB client: %w", err)
    }

    c.docCache.Purge()
+   c.changeCache.Purge()
    c.vectorCache.Purge()

    return nil

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d8b6aa5 and 93a65db.

📒 Files selected for processing (2)

server/backend/database/memory/database.go (1 hunks)
server/backend/database/mongo/client.go (16 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

server/backend/database/memory/database.go

🧰 Additional context used

🧬 Code Graph Analysis (1)

server/backend/database/mongo/client.go (10)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/doc_info.go (1)

DocInfo (27-60)

server/backend/database/mongo/changestore.go (1)

ChangeStore (41-44)

pkg/cmap/cmap.go (1)

Map (36-38)

api/types/id.go (1)

ID (35-35)

pkg/document/time/version_vector.go (1)

VersionVector (33-33)

server/backend/database/mongo/client_cache.go (3)

ClientInfoCache (101-111)

NewClientInfoCache (114-136)

CacheMetrics (36-50)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/client_info.go (4)

ClientActivated (43-43)

DocumentAttached (49-49)

ClientDocInfo (56-60)

ClientInfo (66-92)

server/logging/logging.go (1)

DefaultLogger (85-90)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: bench
GitHub Check: build
GitHub Check: complex-test
GitHub Check: load-test

🔇 Additional comments (15)

server/backend/database/mongo/client.go (15)

24-24: Import of sort looks good.

Required for deterministic ordering in cache-returned results. No concerns.

55-58: ClientInfo cache field addition is appropriate.

The new clientInfoCache field is well-scoped on Client alongside existing caches.

120-131: Single Client initialization: LGTM.

This pattern avoids creating a throwaway Client and correctly wires the cache to the returned instance.

149-155: Dedicated CloseClientInfoCache(): LGTM.

Clear ownership and lifecycle of the cache are well-defined.

861-893: Cache-first TryAttaching flow: LGTM.

Precondition checks and in-cache transition to Attaching are clear and consistent with the write-back strategy.

894-915: DB fallback path and cache update: LGTM.

The conditional update ensures the doc isn’t already attached; caching the fresh result keeps cache coherent.

Also applies to: 924-927

938-1014: Cache-first DeactivateClient flow: LGTM.

Precondition validation and in-cache transition to Deactivated are sound.

1016-1055: Housekeeping path bypass: LGTM.

Direct DB update with permissive nil return for “no-op” is reasonable; updating the cache afterward maintains coherence.

1060-1083: Read-through FindClientInfoByRefKey: LGTM.

Hit/miss metrics and caching the DB result are correctly handled.

1425-1441: Selective invalidation on document removal: LGTM.

Targeting only affected clients with a safe fallback to full invalidation is a solid balance between correctness and performance.

2272-2288: Selective invalidation on purge: LGTM.

Same here—the selective-first with full fallback approach is appropriate.

2077-2091: Cache-first IsDocumentAttached: LGTM.

The early-return based on cache reduces DB load; skip logic for excludeClientID is correct.

2381-2395: Cache metrics API: LGTM.

Simple, useful hooks to bubble up cache health and performance.

1250-1261: attached_docs field is defined, indexed, and maintained

Confirmed that:

An index on “attached_docs” exists in server/backend/database/mongo/indexes.go (Keys: “attached_docs”).

Updates to the field are applied in server/backend/database/mongo/client_cache.go (lines ~498–501 via clientInfo.AttachedDocuments()).

No further action needed.

1227-1249: types.ID implements a String() method; the compile concern is unfounded
We’ve confirmed in api/types/id.go that type ID string is paired with
func (id *ID) String() string { 
    return string(*id) 
}
As a result, attachedClients[i].ID.String() will compile correctly. Casting to string(ID) remains a stylistic choice, not a necessity. Please ignore this suggestion.

Likely an incorrect or invalid review comment.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

server/backend/database/mongo/client.go (3)

810-851: Critical: upsert decode can fail and updated_at can regress under concurrency

Upsert branch uses FindOneAndUpdate without ReturnDocument=After, which can decode “before” state (or none), causing ErrNoDocuments after a successful upsert.
Plain $set of updated_at risks a slower concurrent call overwriting a newer timestamp. This can break deactivation thresholds.

Refactor to a single atomic FindOneAndUpdate with SetUpsert(true), ReturnDocument(After), use $max for updated_at, and $setOnInsert for created_at. Also set metadata through $set (or $setOnInsert if you want it immutable at creation).

Apply this diff:

-  c.clientInfoCache.recordActivateClientMiss()
-  res, err := c.collection(ColClients).UpdateOne(ctx, bson.M{
-    "project_id": projectID,
-    "key":        key,
-  }, bson.M{
-    "$set": bson.M{
-      StatusKey:    database.ClientActivated,
-      "updated_at": now,
-      "metadata":   metadata,
-    },
-  }, options.UpdateOne().SetUpsert(true))
-  if err != nil {
-    return nil, fmt.Errorf("activate client of %s: %w", key, err)
-  }
-
-  var result *mongo.SingleResult
-  if res.UpsertedCount > 0 {
-    result = c.collection(ColClients).FindOneAndUpdate(ctx, bson.M{
-      "project_id": projectID,
-      "_id":        res.UpsertedID,
-    }, bson.M{
-      "$set": bson.M{
-        "created_at": now,
-      },
-    })
-  } else {
-    result = c.collection(ColClients).FindOne(ctx, bson.M{
-      "project_id": projectID,
-      "key":        key,
-    })
-  }
+  result := c.collection(ColClients).FindOneAndUpdate(
+    ctx,
+    bson.M{
+      "project_id": projectID,
+      "key":        key,
+    },
+    bson.M{
+      "$set": bson.M{
+        StatusKey:  database.ClientActivated,
+        "metadata": metadata,
+      },
+      // Prevent timestamp regression on concurrent writes
+      "$max": bson.M{
+        "updated_at": now,
+      },
+      // Ensure created_at exists only on insert
+      "$setOnInsert": bson.M{
+        "created_at": now,
+      },
+    },
+    options.FindOneAndUpdate().
+      SetUpsert(true).
+      SetReturnDocument(options.After),
+  )

1060-1085: Bug: swallows non-ErrNoDocuments decode errors and caches a zero-value struct

If result.Decode fails with an error other than ErrNoDocuments, the function falls through, caches an empty ClientInfo, and returns it. Return the decode error instead and only cache on success.

Apply this diff:

   clientInfo := database.ClientInfo{}
   if err := result.Decode(&clientInfo); err != nil {
     if err == mongo.ErrNoDocuments {
       return nil, fmt.Errorf("find client of %s: %w", refKey, database.ErrClientNotFound)
     }
+    return nil, fmt.Errorf("find client of %s: %w", refKey, err)
   }

   if err := c.clientInfoCache.Set(refKey, &clientInfo); err != nil {
     return nil, fmt.Errorf("set client info in cache: %w", err)
   }

2076-2107: Bug: DB errors in IsDocumentAttached return true, masking failures

The fallback returns true for any non-ErrNoDocuments error, which can erroneously block compaction/purges and hide operational issues. Return the error instead.

Apply this diff:

   result := c.collection(ColClients).FindOne(ctx, filter)
   if result.Err() == mongo.ErrNoDocuments {
     return false, nil
   }
-
-  return true, nil
+  if result.Err() != nil {
+    return false, fmt.Errorf("check attachment of %s: %w", docRefKey, result.Err())
+  }
+  return true, nil

🧹 Nitpick comments (4)

server/backend/database/mongo/client.go (4)
136-147: Also purge changeCache in Close()

Minor resource hygiene: changeCache isn’t purged on shutdown while doc/vector caches are. Purge it to avoid lingering memory.

Apply this diff:
   if err := c.client.Disconnect(context.Background()); err != nil {
     return fmt.Errorf("close MongoDB client: %w", err)
   }

   c.docCache.Purge()
+  c.changeCache.Purge()
   c.vectorCache.Purge()

   return nil
810-810: Metric mislabel: recording a “miss” without a cache lookup

You record an ActivateClient miss but never check the cache. Either check cache and record hit/miss or drop the metric.

861-893: Cache write-back path: ensure timely persistence and retries

On cache hit, you mutate cached ClientInfo to DocumentAttaching and return without a guaranteed immediate DB write. Confirm UpdateClientInfo marks the entry dirty and triggers/adheres to bounded-latency flush with retries/backoff. If not, attach/deactivate races may occur or be lost on crash.

Would you like a follow-up patch to add retry/backoff and metrics for FlushToDB failures?

917-922: Nit: fix error message formatting

Remove the stray space before the colon in the error message.

Apply this diff:
-    return nil, fmt.Errorf("try attaching %s to %s : %w", docID, refKey.ClientID, err)
+    return nil, fmt.Errorf("try attaching %s to %s: %w", docID, refKey.ClientID, err)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 93a65db and 5cce717.

📒 Files selected for processing (4)

server/backend/database/memory/database.go (1 hunks)
server/backend/database/mongo/client.go (17 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)
server/clients/clients.go (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

server/backend/database/memory/database.go
server/clients/clients.go
server/backend/database/mongo/client_cache.go

🧰 Additional context used

🧬 Code Graph Analysis (1)

server/backend/database/mongo/client.go (4)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/mongo/client_cache.go (3)

ClientInfoCache (101-111)

NewClientInfoCache (114-136)

CacheMetrics (36-50)

server/backend/database/client_info.go (6)

ClientActivated (43-43)

DocumentAttached (49-49)

ClientDocInfo (56-60)

DocumentAttaching (48-48)

ClientDeactivated (42-42)

ClientInfo (66-92)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

🔇 Additional comments (9)

server/backend/database/mongo/client.go (9)

120-131: Good fix: single Client instance initialization

This addresses the double-initialization issue and ensures the cache references the correct Client instance. Clean and correct.

938-1014: LGTM: cache-first deactivate with correctness checks

Properly validates activation and no-attached-docs precondition in cache path, mirrors DB filter with $expr fallback, and updates cache on DB path. Looks solid.

1016-1055: LGTM: housekeeping deactivation bypasses cache safely

Direct DB update semantics are clear. Returning (nil, nil) on “not found/already deactivated” is acceptable for housekeeping flows and you still refresh the cache when a change occurs.

1094-1131: LGTM: safer error handling in DB-read path

Using result.Err() for existence check and populating the cache with provided clientInfo avoids stale reads. This addresses the earlier “wrap wrong error” issue.

1226-1247: Deterministic ordering for cached attached clients — good improvement

Sorting by client ID after a cache hit provides stable results. Reasonable tradeoff vs. pushing sort to callers.

1424-1441: Targeted cache invalidation is a solid optimization

Selective invalidation for clients attached to the removed doc, with a safe fallback to full invalidation on failure, is the right balance.

2271-2287: Targeted cache invalidation on purge — good

Matches the removed-doc path. Fallback to full invalidation is pragmatic.

2380-2394: LGTM: cache metrics accessors

Non-intrusive accessors with nil-guard are fine. Useful for observability.

149-155: Verified: clientInfoCache closed in Backend.Shutdown

The Backend.Shutdown method (server/backend/backend.go lines 229–235) already invokes mongoClient.CloseClientInfoCache() before disconnecting the DB. All shutdown paths (including Yorkie.Shutdown and tests) go through this, so no additional calls are required.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

server/backend/database/mongo/client.go (1)

2342-2358: LGTM! Consistent cache invalidation pattern.

The invalidation logic for document purging follows the same selective pattern as document removal, maintaining consistency.
server/backend/database/mongo/client_cache.go (2)
473-475: LGTM! Error handling improved for adaptive flush.

The error logging for flush failures has been properly implemented.

612-619: Add the missing clientDocInfoKey helper function.

The function is currently defined in client.go but should be local to this file to avoid cross-file coupling.
+// clientDocInfoKey returns the key for the client document info.
+func clientDocInfoKey(docID types.ID, prefix string) string {
+	return fmt.Sprintf("documents.%s.%s", docID, prefix)
+}
+
 	// Add document-specific updates
 	for docID, docInfo := range clientInfo.Documents {
 		docKey := clientDocInfoKey(docID, "server_seq")

🧹 Nitpick comments (7)

test/integration/routing_test.go (2)

40-114: Consider extracting test constants and improving documentation.

The test validates cross-server routing well, but could benefit from:

Extracting magic numbers into named constants
Adding documentation about the test scenario

 func TestClientRouting(t *testing.T) {
 	ctx := context.Background()
 
+	// Test constants
+	const (
+		testDelay = 150 * time.Millisecond // Delay to reduce cross-server write races
+		numUpdates = 2                      // Number of updates per client
+		maxRetries = 5                      // Maximum retry attempts for attachment
+	)
+
 	const clientKey = "C1-routing-test"
 	docKey := helper.TestDocKey(t)
 
+	// This test validates that a client can seamlessly move between servers
+	// while maintaining document state through a shared database backend.
+	// Test flow: Server A -> Server B -> Server A with the same client key
+
 	// Start two servers A and B sharing the same DB.
 	svrA := helper.TestServer()
 	assert.NoError(t, svrA.Start())
 	defer func() { _ = svrA.Shutdown(true) }()
 
 	svrB := helper.TestServer()
 	assert.NoError(t, svrB.Start())
 	defer func() { _ = svrB.Shutdown(true) }()
 
 	// t0: Connect to A with fixed client key
 	cliA, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey))
 	assert.NoError(t, err)
 	assert.NoError(t, cliA.Activate(ctx))
 
 	docA := document.New(docKey)
 	assert.NoError(t, cliA.Attach(ctx, docA))
 
-	for i := 0; i < 2; i++ {
+	for i := 0; i < numUpdates; i++ {
 		err := docA.Update(func(r *json.Object, p *presence.Presence) error {
 			r.SetString("k", "vA")
 			return nil
 		})
 		assert.NoError(t, err)
 		assert.NoError(t, cliA.Sync(ctx))
 	}
 	// Ensure the server has the latest state from A
 	assert.NoError(t, cliA.Detach(ctx, docA))
 	// Small delay to reduce cross-server write races
-	time.Sleep(150 * time.Millisecond)
+	time.Sleep(testDelay)
 
 	// t1: Connect to B with SAME client key
 	cliB, err := client.Dial(svrB.RPCAddr(), client.WithKey(clientKey))
 	assert.NoError(t, err)
 	assert.NoError(t, cliB.Activate(ctx))
 	docB := document.New(docKey)
-	assert.NoError(t, attachWithRetry(ctx, cliB, docB, 5))
+	assert.NoError(t, attachWithRetry(ctx, cliB, docB, maxRetries))
 	assert.NoError(t, cliB.Sync(ctx))
 	gotB := docB.Marshal()
 	assert.Contains(t, gotB, "\"k\":\"vA\"")
 
-	for i := 0; i < 2; i++ {
+	for i := 0; i < numUpdates; i++ {
 		err := docB.Update(func(r *json.Object, p *presence.Presence) error {
 			r.SetString("k", "vB")
 			return nil
 		})
 		assert.NoError(t, err)
 		assert.NoError(t, cliB.Sync(ctx))
 	}
 	assert.NoError(t, cliB.Detach(ctx, docB))
 	// Small delay to reduce cross-server write races
-	time.Sleep(150 * time.Millisecond)
+	time.Sleep(testDelay)
 
 	// t2: Reconnect to A with SAME client key
 	cliA2, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey))
 	assert.NoError(t, err)
 	assert.NoError(t, cliA2.Activate(ctx))
 	docA2 := document.New(docKey)
-	assert.NoError(t, attachWithRetry(ctx, cliA2, docA2, 5))
+	assert.NoError(t, attachWithRetry(ctx, cliA2, docA2, maxRetries))
 	assert.NoError(t, cliA2.Sync(ctx))
 	gotA2 := docA2.Marshal()
 	assert.Contains(t, gotA2, "\"k\":\"vB\"")
 
 	err = docA2.Update(func(r *json.Object, p *presence.Presence) error {
 		r.SetString("k", "vA2")
 		return nil
 	})
 	assert.NoError(t, err)
 	assert.NoError(t, cliA2.Sync(ctx))
 }

216-216: Reduce or extract hardcoded sleep duration.

The 4500ms sleep appears multiple times and could block test execution. Consider extracting this to a constant or using a polling approach.

+const cacheTTLWaitTime = 4500 * time.Millisecond // Wait for cache TTL to expire
+
 		// t2: Deactivate client on backend A
 		_, err = backendA.DB.DeactivateClient(ctx, clientInfoA.RefKey())
 		assert.NoError(t, err)
 		// Small delay to work ttl cleanup
-		time.Sleep(4500 * time.Millisecond)
+		time.Sleep(cacheTTLWaitTime)

Also applies to: 245-245, 263-263

server/backend/database/mongo/client.go (3)

1495-1511: Consider implementing more targeted cache invalidation.

While the current implementation attempts selective invalidation, consider maintaining a reverse index of document-to-client mappings in the cache for more efficient invalidation without the fallback to InvalidateAll().

The fallback to full cache invalidation could impact performance in high-traffic scenarios. Consider:

Maintaining a reverse mapping in the cache (document -> clients)

Using a more resilient error handling that doesn't require full invalidation

Adding metrics to track how often the fallback is triggered

2467-2514: Consider consolidating these helper methods.

The updateDocumentStatusInDBDirect and updateStatusInDBDirect methods duplicate logic that exists in the cache implementation. Consider reusing the cache's DB update methods or extracting common logic to avoid duplication.
-// updateDocumentStatusInDBDirect updates document status in database directly
-func (c *Client) updateDocumentStatusInDBDirect(refKey types.ClientRefKey, docID types.ID, status string) error {
-	updateDoc := bson.M{
-		"$set": bson.M{
-			"updated_at": gotime.Now(),
-		},
-	}
-
-	// Add document-specific updates
-	docKey := clientDocInfoKey(docID, StatusKey)
-	updateDoc["$set"].(bson.M)[docKey] = status
-
-	// Reset sequence numbers for detached/removed/attaching documents
-	if status == database.DocumentDetached || status == database.DocumentRemoved || status == database.DocumentAttaching {
-		serverSeqKey := clientDocInfoKey(docID, "server_seq")
-		clientSeqKey := clientDocInfoKey(docID, "client_seq")
-		updateDoc["$set"].(bson.M)[serverSeqKey] = int64(0)
-		updateDoc["$set"].(bson.M)[clientSeqKey] = uint32(0)
-	}
-
-	_, err := c.collection(ColClients).UpdateOne(
-		context.Background(),
-		bson.M{
-			"project_id": refKey.ProjectID,
-			"_id":        refKey.ClientID,
-		},
-		updateDoc,
-	)
-	return err
-}
+// updateDocumentStatusInDBDirect updates document status in database directly
+func (c *Client) updateDocumentStatusInDBDirect(refKey types.ClientRefKey, docID types.ID, status string) error {
+	// Delegate to cache's DB update method which has the same logic
+	return c.clientInfoCache.updateDocumentStatusInDB(refKey, docID, status)
+}
810-850: Verify metadata upsert semantics

I didn’t find any existing tests or code paths that relied on the old behavior of including metadata in the filter—no tests assert that differing metadata produces distinct client documents. That said, moving metadata into the $set does change semantics: now metadata changes update the existing client instead of inserting a new one.

• If you intend this as a deliberate behavior change, please add a note to the PR description (and any migration guide) clarifying that clients are now uniqued by (project_id, key) only, with metadata merged on each activation.
• Otherwise, consider restoring metadata in the filter if callers expect each unique metadata set to create its own client document.

server/backend/database/mongo/client_cache.go (2)

78-89: Review default cache configuration values.

The default TTL of 3 seconds seems quite aggressive and could lead to frequent cache misses. Consider increasing it based on typical client session durations and update patterns.
 func DefaultCacheConfig() *CacheConfig {
 	return &CacheConfig{
 		BaseFlushInterval:      10 * time.Second,
 		MaxFlushInterval:       60 * time.Second,
 		MinFlushInterval:       5 * time.Second,
 		MaxCacheSize:           20000,
 		WritePressureThreshold: 500,
 		PressureCheckInterval:  10 * time.Second,
 		EnableFlushCleanup:     true,
-		TTL:                    3 * time.Second,
+		TTL:                    30 * time.Second, // Increase TTL for better cache hit rates
 	}
 }
983-987: Consider removing or improving the warning log.

The warning log at line 986 could be noisy in production if this is an expected scenario for new document attachments.
-			logging.DefaultLogger().Warnf("Client not found in DB for document attachment, creating minimal cache entry: %v", err)
+			// Expected for new document attachments - client will be loaded when needed
+			logging.DefaultLogger().Debugf("Client not found in DB for document attachment: %v", err)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5cce717 and 8002813.

📒 Files selected for processing (3)

server/backend/database/mongo/client.go (17 hunks)
server/backend/database/mongo/client_cache.go (1 hunks)
test/integration/routing_test.go (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (3)

test/integration/routing_test.go (5)

server/rpc/connecthelper/status.go (1)

CodeOf (165-176)

test/helper/helper.go (13)

TestDocKey (344-373)

AdminUser (71-71)

AdminPassword (72-72)

UseDefaultProject (74-74)

ClientDeactivateThreshold (81-81)

ProjectCacheTTL (94-94)

AdminTokenDuration (80-80)

MongoConnectionURI (97-97)

MongoConnectionTimeout (98-98)

MongoPingTimeout (99-99)

HousekeepingInterval (75-75)

HousekeepingCandidatesLimitPerProject (76-76)

HousekeepingProjectFetchSize (77-77)

client/options.go (1)

WithKey (56-58)

server/profiling/prometheus/metrics.go (1)

NewMetrics (77-221)

server/backend/database/client_info.go (3)

ClientActivated (43-43)

ClientDeactivated (42-42)

DocumentAttaching (48-48)

server/backend/database/mongo/client_cache.go (5)

server/backend/database/client_info.go (6)

ClientInfo (66-92)

ClientDocInfo (56-60)

DocumentDetached (50-50)

DocumentRemoved (51-51)

DocumentAttached (49-49)

DocumentAttaching (48-48)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/mongo/client.go (2)

Client (51-59)

StatusKey (47-47)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

server/backend/database/mongo/client.go (4)

api/types/resource_ref_key.go (2)

DocRefKey (24-27)

ClientRefKey (35-38)

server/backend/database/mongo/client_cache.go (3)

ClientInfoCache (110-120)

NewClientInfoCache (123-146)

CacheMetrics (37-55)

server/backend/database/client_info.go (8)

ClientActivated (43-43)

DocumentAttached (49-49)

ClientDocInfo (56-60)

DocumentAttaching (48-48)

ClientDeactivated (42-42)

ClientInfo (66-92)

DocumentDetached (50-50)

DocumentRemoved (51-51)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

🔇 Additional comments (10)

test/integration/routing_test.go (1)

24-38: LGTM! Helper function handles retries appropriately.

The retry logic with exponential backoff is appropriate for handling temporary conflicts during attachment operations in distributed systems.

server/backend/database/mongo/client.go (6)

120-131: LGTM! Clean initialization of client with cache.

The client initialization now properly creates a single instance and assigns the cache with the correct reference.

862-898: LGTM! Efficient cache-first approach for TryAttaching.

The implementation correctly handles the cache-first path with proper validation and write-through updates to the database.

1029-1068: LGTM! Clean separation of housekeeping deactivation logic.

The dedicated housekeeping deactivation method bypasses cache checks appropriately and handles the "not found" case gracefully by returning nil.

1298-1318: LGTM! Efficient cache utilization for finding attached clients.

The implementation leverages cache effectively and includes proper sorting for deterministic results.

2147-2161: Cache-aware document attachment check is well-implemented.

The method efficiently checks cache first before falling back to database queries.

1119-1199: Error path on cache miss is correct; no stale cache entry to invalidate

The UpdateClientInfoAfterPushPull branch for cache misses only executes when c.clientInfoCache.Get(...) returns nil, so there is no existing cache entry that could go stale. It correctly returns ErrClientNotFound without calling Set, and thus requires no additional invalidation here. Invalidation of stale entries is already handled elsewhere via clientInfoCache.Invalidate calls.

server/backend/database/mongo/client_cache.go (3)

148-179: LGTM! Well-designed expired entry handling.

The background handling of expired entries with checkpoint preservation is a good design choice that balances performance with data consistency.

805-876: Well-implemented checkpoint update with write-back strategy.

The UpdateCheckpoint method correctly implements the write-back strategy with max operator semantics for consistency.

878-925: Efficient CAS write-through implementation for status updates.

The method correctly checks for changes before updating the database, avoiding unnecessary writes.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

test/integration/routing_test.go (4)

29-29: Replace deprecated connect.CodeOf with connecthelper.CodeOf.

The connect.CodeOf function is deprecated. Use connecthelper.CodeOf instead, which is already available in the codebase.

Add the import:

+	"github.com/yorkie-team/yorkie/server/rpc/connecthelper"

Update the code:

-			if connect.CodeOf(err) == connect.CodeFailedPrecondition {
+			if connecthelper.CodeOf(err) == connecthelper.CodeFailedPrecondition {

40-114: Improve test isolation and resource cleanup.

The test has good coverage but could benefit from:

Using defer for client cleanup immediately after creation
Detaching documents in deferred cleanup to ensure proper cleanup even on test failure

Apply these improvements:

 func TestClientRouting(t *testing.T) {
 	ctx := context.Background()
 
 	const clientKey = "C1-routing-test"
 	docKey := helper.TestDocKey(t)
 
 	// Start two servers A and B sharing the same DB.
 	svrA := helper.TestServer()
 	assert.NoError(t, svrA.Start())
 	defer func() { _ = svrA.Shutdown(true) }()
 
 	svrB := helper.TestServer()
 	assert.NoError(t, svrB.Start())
 	defer func() { _ = svrB.Shutdown(true) }()
 
 	// t0: Connect to A with fixed client key
 	cliA, err := client.Dial(svrA.RPCAddr(), client.WithKey(clientKey))
 	assert.NoError(t, err)
+	defer func() { _ = cliA.Deactivate(ctx) }()
 	assert.NoError(t, cliA.Activate(ctx))
 
 	docA := document.New(docKey)
 	assert.NoError(t, cliA.Attach(ctx, docA))
+	defer func() { _ = cliA.Detach(ctx, docA) }()

120-199: Extract backend configuration to reduce duplication.

The backend configuration is nearly identical for both backends A and B. Consider extracting it to a helper function to improve maintainability.

Create a helper function:

func createTestBackend(t *testing.T) (*backend.Backend, error) {
	met, err := prometheus.NewMetrics()
	if err != nil {
		return nil, err
	}

	return backend.New(
		&backend.Config{
			AdminUser:                   helper.AdminUser,
			AdminPassword:               helper.AdminPassword,
			UseDefaultProject:           helper.UseDefaultProject,
			ClientDeactivateThreshold:   helper.ClientDeactivateThreshold,
			SnapshotThreshold:           helper.SnapshotThreshold,
			SnapshotCacheSize:           helper.SnapshotCacheSize,
			AuthWebhookCacheSize:        helper.AuthWebhookSize,
			AuthWebhookCacheTTL:         helper.AuthWebhookCacheTTL.String(),
			AuthWebhookMaxWaitInterval:  helper.AuthWebhookMaxWaitInterval.String(),
			AuthWebhookMinWaitInterval:  helper.AuthWebhookMinWaitInterval.String(),
			AuthWebhookRequestTimeout:   helper.AuthWebhookRequestTimeout.String(),
			EventWebhookMaxWaitInterval: helper.EventWebhookMaxWaitInterval.String(),
			EventWebhookMinWaitInterval: helper.EventWebhookMinWaitInterval.String(),
			EventWebhookRequestTimeout:  helper.EventWebhookRequestTimeout.String(),
			ProjectCacheSize:            helper.ProjectCacheSize,
			ProjectCacheTTL:             helper.ProjectCacheTTL.String(),
			AdminTokenDuration:          helper.AdminTokenDuration,
		}, &mongo.Config{
			ConnectionURI:     helper.MongoConnectionURI,
			YorkieDatabase:    helper.TestDBName(),
			ConnectionTimeout: helper.MongoConnectionTimeout,
			PingTimeout:       helper.MongoPingTimeout,
		}, &housekeeping.Config{
			Interval:                  helper.HousekeepingInterval.String(),
			CandidatesLimitPerProject: helper.HousekeepingCandidatesLimitPerProject,
			ProjectFetchSize:          helper.HousekeepingProjectFetchSize,
		}, met, nil, nil)
}

Then update the test:

-	metA, err := prometheus.NewMetrics()
-	assert.NoError(t, err)
-
-	backendA, err := backend.New(
-		// ... long config ...
-	)
+	backendA, err := createTestBackend(t)
 	assert.NoError(t, err)
 	defer func() { _ = backendA.Shutdown() }()
 
-	metB, err := prometheus.NewMetrics()
-	assert.NoError(t, err)
-
-	backendB, err := backend.New(
-		// ... long config ...
-	)
+	backendB, err := createTestBackend(t)
 	assert.NoError(t, err)
 	defer func() { _ = backendB.Shutdown() }()

230-251: Consider adding assertion for document existence.

The test verifies that the document is in the DocumentAttaching status, but doesn't verify that backend B can actually find the document by key. This would provide more comprehensive coverage.

Add this assertion after line 250:

 		assert.NoError(t, err)
 		assert.Equal(t, database.DocumentAttaching, clientInfoB.Documents[docInfoA.ID].Status)
+
+		// Verify backend B can find the document
+		docInfoB, err := backendB.DB.FindDocInfo(ctx, clientInfoA.RefKey(), docKey)
+		assert.NoError(t, err)
+		assert.Equal(t, docInfoA.ID, docInfoB.ID)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a3dc337 and 3d5046e.

📒 Files selected for processing (2)

server/backend/database/mongo/client_cache.go (1 hunks)
test/integration/routing_test.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

server/backend/database/mongo/client_cache.go

🧰 Additional context used

🧬 Code graph analysis (1)

test/integration/routing_test.go (6)

server/backend/database/mongo/client.go (2)

Client (51-59)

Dial (62-132)

server/rpc/connecthelper/status.go (1)

CodeOf (165-176)

test/helper/helper.go (1)

TestDocKey (344-373)

client/options.go (1)

WithKey (56-58)

server/profiling/prometheus/metrics.go (1)

NewMetrics (77-221)

server/backend/database/client_info.go (3)

ClientActivated (43-43)

ClientDeactivated (42-42)

DocumentAttaching (48-48)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: load-test
GitHub Check: complex-test
GitHub Check: bench
GitHub Check: build

🔇 Additional comments (2)

test/integration/routing_test.go (2)

200-228: Good test structure with clear state transitions!

The test clearly documents the progression through different states (t0-t4) and validates cache behavior across backends. The assertion that the same client key results in the same client ID is particularly important for consistency.

253-275: Test effectively validates cross-backend housekeeping behavior.

The test properly verifies that housekeeping deactivation on one backend is visible to another backend after cache TTL, and that the client can be reactivated with the same ID.

coderabbitai · 2025-08-21T06:40:22Z

+		_, err = backendA.DB.DeactivateClient(ctx, clientInfoA.RefKey())
+		assert.NoError(t, err)
+		// Small delay to work ttl cleanup
+		time.Sleep(11 * time.Second)


🛠️ Refactor suggestion

Replace hard-coded cache TTL delays with configurable constants.

The tests use hard-coded 11 * time.Second delays in multiple places. This value appears to be related to cache TTL behavior but is not documented. Consider extracting this to a named constant with documentation.

Add a constant at the package level:

// cacheTTLWait is the time to wait for cache TTL expiration. // This should be slightly longer than the actual cache TTL to ensure reliable test behavior. const cacheTTLWait = 11 * time.Second

Then update the code:

- time.Sleep(11 * time.Second) + time.Sleep(cacheTTLWait)

Also applies to: 245-245, 263-263

🤖 Prompt for AI Agents

In test/integration/routing_test.go around lines 216, 245 and 263 the tests use hard-coded sleeps like "11 * time.Second"; extract that magic value into a documented package-level constant named cacheTTLWait (e.g. "// cacheTTLWait is the time to wait for cache TTL expiration. This should be slightly longer than the actual cache TTL to ensure reliable test behavior." const cacheTTLWait = 11 * time.Second) and replace the three occurrences of time.Sleep(11 * time.Second) with time.Sleep(cacheTTLWait); ensure the file imports time (or keeps it) and that the constant is placed at the top of the file so all tests can use it.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

server/backend/database/mongo/client_cache.go (1)
614-614: Missing clientDocInfoKey helper function.

The code uses clientDocInfoKey function which is not defined in this file, causing a compilation error.

Add the missing helper function:
+// clientDocInfoKey returns the key for the client document info.
+func clientDocInfoKey(docID types.ID, prefix string) string {
+	return fmt.Sprintf("documents.%s.%s", docID, prefix)
+}
+
// buildUpdateModel builds a MongoDB update model for the given client info
func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey,
	clientInfo *database.ClientInfo) mongo.WriteModel {

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3d5046e and bd40e7b.

📒 Files selected for processing (1)

server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

server/backend/database/mongo/client_cache.go (5)

server/backend/database/client_info.go (6)

ClientInfo (66-92)

ClientDocInfo (56-60)

DocumentDetached (50-50)

DocumentRemoved (51-51)

DocumentAttached (49-49)

DocumentAttaching (48-48)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/mongo/client.go (2)

Client (51-59)

StatusKey (47-47)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: load-test
GitHub Check: complex-test
GitHub Check: bench
GitHub Check: build

🔇 Additional comments (4)

server/backend/database/mongo/client_cache.go (4)

182-201: LGTM! Efficient cache retrieval with TTL handling.

The implementation correctly handles TTL expiration by checking the expiration time and scheduling background cleanup while returning a cache miss. The use of DeepCopy() prevents data races.

329-367: LGTM! Solid merge logic with MongoDB $max semantics.

The merge implementation correctly applies MongoDB's $max operator semantics for sequence numbers and handles special cases for detached/removed documents by resetting sequences. The logic is well-structured and matches the expected behavior.

553-599: LGTM! Well-designed bulk flush implementation.

The flush logic correctly snapshots dirty entries, releases locks during database operations, and includes proper cleanup. The bulk write approach is efficient for performance.

807-878: LGTM! Proper write-back strategy for checkpoint updates.

The checkpoint update implementation correctly uses write-back caching with MongoDB's $max operator semantics, handles cache misses by loading from DB, and maintains consistency.

coderabbitai · 2025-08-21T07:48:44Z

+	interval := c.config.MinFlushInterval +
+		time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio)
+
+	return interval


⚠️ Potential issue

Fix potential panic in flush interval calculation.

When MaxFlushInterval is shorter than MinFlushInterval, the subtraction c.config.MaxFlushInterval-c.config.MinFlushInterval will result in a negative duration, which can cause unexpected behavior.

Apply this diff to add validation:

// calculateFlushInterval calculates adaptive flush interval based on pressure func (c *ClientInfoCache) calculateFlushInterval() time.Duration { c.pressureMu.RLock() pressureLevel := c.writePressure.PressureLevel c.pressureMu.RUnlock() + // Validate configuration + if c.config.MaxFlushInterval < c.config.MinFlushInterval { + return c.config.MinFlushInterval + } + // Higher pressure = shorter interval ratio := 1.0 - pressureLevel interval := c.config.MinFlushInterval + time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio) return interval }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

interval := c.config.MinFlushInterval +

time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio)

return interval

// calculateFlushInterval calculates adaptive flush interval based on pressure

func (c *ClientInfoCache) calculateFlushInterval() time.Duration {

c.pressureMu.RLock()

pressureLevel := c.writePressure.PressureLevel

c.pressureMu.RUnlock()

// Validate configuration

if c.config.MaxFlushInterval < c.config.MinFlushInterval {

return c.config.MinFlushInterval

}

// Higher pressure = shorter interval

ratio := 1.0 - pressureLevel

interval := c.config.MinFlushInterval +

time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio)

return interval

}

🤖 Prompt for AI Agents

In server/backend/database/mongo/client_cache.go around lines 538 to 541, the flush interval calculation can panic or behave unexpectedly if MaxFlushInterval < MinFlushInterval; compute diff := c.config.MaxFlushInterval - c.config.MinFlushInterval, if diff < 0 set diff = 0 (or swap intervals) and clamp ratio to [0,1], then calculate interval := c.config.MinFlushInterval + time.Duration(float64(diff)*ratio) and return interval; optionally add a debug/warn log when the config is invalid.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

server/backend/database/mongo/client_cache.go (1)

536-542: Clamp invalid flush interval configs; avoid negative durations.

If MaxFlushInterval < MinFlushInterval (misconfig), interval becomes negative. Add a guard and clamp.

 func (c *ClientInfoCache) calculateFlushInterval() time.Duration {
@@
-  // Higher pressure = shorter interval
+  // Guard against invalid configuration.
+  if c.config.MaxFlushInterval < c.config.MinFlushInterval {
+    return c.config.MinFlushInterval
+  }
+  // Higher pressure = shorter interval
   ratio := 1.0 - pressureLevel
   interval := c.config.MinFlushInterval +
       time.Duration(float64(c.config.MaxFlushInterval-c.config.MinFlushInterval)*ratio)
   return interval
 }

🧹 Nitpick comments (8)

server/backend/database/mongo/client_cache.go (8)
204-212: Avoid DB operations while holding the cache lock in expiration paths.

handleExpiration and ttlCleanup call handleExpiredEntry (which performs DB IO) while c.mu is held, blocking all cache operations and risking priority inversion.

Refactor to capture-and-release before flushing:
 func (c *ClientInfoCache) handleExpiration(refKey types.ClientRefKey) {
-  c.mu.Lock()
-  defer c.mu.Unlock()
-
-  if cached, exists := c.cache[refKey]; exists && time.Now().After(cached.ExpiresAt) {
-    c.handleExpiredEntry(refKey, cached)
-    delete(c.cache, refKey)
-  }
+  var cached *CachedClientInfo
+  c.mu.Lock()
+  if e, exists := c.cache[refKey]; exists && time.Now().After(e.ExpiresAt) {
+    cached = e
+    delete(c.cache, refKey)
+  }
+  c.mu.Unlock()
+  if cached != nil {
+    c.handleExpiredEntry(refKey, cached)
+  }
 }
 
 func (c *ClientInfoCache) ttlCleanup() {
   ticker := time.NewTicker(c.config.TTL)
   defer ticker.Stop()
@@
-      c.mu.Lock()
-      now := time.Now()
-      for refKey, cached := range c.cache {
-        if now.After(cached.ExpiresAt) {
-          c.handleExpiredEntry(refKey, cached)
-          delete(c.cache, refKey)
-        }
-      }
-      c.mu.Unlock()
+      var toFlush map[types.ClientRefKey]*CachedClientInfo
+      now := time.Now()
+      c.mu.Lock()
+      for refKey, cached := range c.cache {
+        if now.After(cached.ExpiresAt) {
+          if toFlush == nil {
+            toFlush = make(map[types.ClientRefKey]*CachedClientInfo)
+          }
+          toFlush[refKey] = cached
+          delete(c.cache, refKey)
+        }
+      }
+      c.mu.Unlock()
+      for refKey, cached := range toFlush {
+        c.handleExpiredEntry(refKey, cached)
+      }
Do the same “capture, unlock, flush” pattern for handleBatchExpiration:
 func (c *ClientInfoCache) handleBatchExpiration(refKeys []types.ClientRefKey) {
-  c.mu.Lock()
-  defer c.mu.Unlock()
-
-  now := time.Now()
-  for _, refKey := range refKeys {
-    if cached, exists := c.cache[refKey]; exists && now.After(cached.ExpiresAt) {
-      c.handleExpiredEntry(refKey, cached)
-      delete(c.cache, refKey)
-    }
-  }
+  now := time.Now()
+  toFlush := make(map[types.ClientRefKey]*CachedClientInfo, len(refKeys))
+  c.mu.Lock()
+  for _, refKey := range refKeys {
+    if cached, exists := c.cache[refKey]; exists && now.After(cached.ExpiresAt) {
+      toFlush[refKey] = cached
+      delete(c.cache, refKey)
+    }
+  }
+  c.mu.Unlock()
+  for refKey, cached := range toFlush {
+    c.handleExpiredEntry(refKey, cached)
+  }
 }
Also applies to: 488-500, 266-278

338-352: Consistency: Reset seq on 'attaching' as well.

Elsewhere (UpdateDocumentStatus) you reset sequences for attaching, detached, and removed. mergeClientInfo resets only for detached/removed. Align behavior.
-			// Reset sequence numbers when document status changes
-			if newDocInfo.Status == database.DocumentDetached || newDocInfo.Status == database.DocumentRemoved {
+			// Reset sequence numbers when status transitions require it
+			if newDocInfo.Status == database.DocumentDetached ||
+				newDocInfo.Status == database.DocumentRemoved ||
+				newDocInfo.Status == database.DocumentAttaching {
 				existingDocInfo.ServerSeq = 0
 				existingDocInfo.ClientSeq = 0
 				existingDocInfo.Status = newDocInfo.Status
663-667: Make metrics logging idempotent; avoid duplicate goroutines.

StartMetricsLogging is called in the constructor and is also public. If called again, it spawns duplicate loggers. Guard with sync.Once.
 type ClientInfoCache struct {
 	mu            sync.RWMutex
 	cache         map[types.ClientRefKey]*CachedClientInfo
-	flushCh       chan struct{}
 	stopCh        chan struct{}
 	config        *CacheConfig
 	client        *Client
 	writePressure *WritePressure
 	pressureMu    sync.RWMutex
 	metrics       *CacheMetrics
+	metricsOnce   sync.Once
 }
@@
 	// Start metrics logging
-	cache.StartMetricsLogging()
+	cache.StartMetricsLogging()
@@
 func (c *ClientInfoCache) StartMetricsLogging() {
-	go c.periodicMetricsLogging()
+	c.metricsOnce.Do(func() {
+		go c.periodicMetricsLogging()
+	})
 }
Optional: if flushCh is not used, remove it entirely (see separate comment).

Also applies to: 110-120, 138-146

411-434: Avoid DB writes while holding c.mu in evictOldest.

evictOldest calls flushSingleToDB while Set holds c.mu. This blocks all writers during IO. Not a correctness bug, but it can stall cache ops under load.

Refactor to identify the oldest under lock, release the lock to flush, then re-lock and delete if still present. This two-phase approach prevents long critical sections. I can provide a concrete diff if you prefer this change now.

122-136: Validate configuration to prevent zero/negative tickers.

If TTL or PressureCheckInterval are zero or negative, time.NewTicker will panic. Add simple validation in the constructor.

Example change:
 func NewClientInfoCache(config *CacheConfig, client *Client) *ClientInfoCache {
 	if config == nil {
 		config = DefaultCacheConfig()
 	}
+	// Basic validation
+	if config.TTL <= 0 {
+		config.TTL = 10 * time.Second
+	}
+	if config.PressureCheckInterval <= 0 {
+		config.PressureCheckInterval = 10 * time.Second
+	}
+	if config.MinFlushInterval <= 0 {
+		config.MinFlushInterval = 5 * time.Second
+	}
+	if config.MaxFlushInterval < config.MinFlushInterval {
+		config.MaxFlushInterval = config.MinFlushInterval
+	}
 
 	cache := &ClientInfoCache{
 		cache:         make(map[types.ClientRefKey]*CachedClientInfo),
-		flushCh:       make(chan struct{}, 1),
 		stopCh:        make(chan struct{}),
 		config:        config,
 		client:        client,
 		writePressure: &WritePressure{},
 		metrics:       &CacheMetrics{},
 	}
Also applies to: 77-89

410-420: TTL and eviction metadata usage.

Using UpdatedAt for eviction is okay; consider renaming to LastAccess if you decide to update it on reads later for a true LRU.

606-606: Follow-up: tests to cover failure and TTL paths.

FlushToDB should re-mark entries dirty on error (after the refactor).

TTL expiration should only $max checkpoints and not alter statuses/attached_docs.

InvalidateAll should not deadlock and must clear entries.

Do you want me to draft table-driven tests for these scenarios (using a fake mongo.Client or an interface abstraction) and a small harness to simulate BulkWrite failures?

19-34: Remove unused strings import and flushCh field

The strings package and the flushCh channel are never referenced elsewhere in the repository; removing them will clean up dead code and avoid confusion.

• File: server/backend/database/mongo/client_cache.go
– Remove the strings import (currently at line 21).
– Remove the flushCh chan struct{} field in the ClientInfoCache struct (line 113).
– Remove the flushCh: make(chan struct{}, 1), initialization in NewClientInfoCache (line 130).

Suggested diff:
--- a/server/backend/database/mongo/client_cache.go
+++ b/server/backend/database/mongo/client_cache.go
@@ -19,7 +19,6 @@ import (
 	"context"
 	"fmt"
-	"strings"
 	"sync"
 	"sync/atomic"
 	"time"
@@ -113,7 +112,6 @@ type ClientInfoCache struct {
 	mu     sync.RWMutex
 	cache  map[types.ClientRefKey]*CachedClientInfo
-	flushCh chan struct{}
 	stopCh  chan struct{}
 	config  *CacheConfig
 }
@@ -128,7 +126,6 @@ func NewClientInfoCache(config *CacheConfig) *ClientInfoCache {
 	cache:  make(map[types.ClientRefKey]*CachedClientInfo),
-	flushCh: make(chan struct{}, 1),
 	stopCh:  make(chan struct{}),
 	config:  config,
 }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bd40e7b and 0482f5f.

📒 Files selected for processing (1)

server/backend/database/mongo/client_cache.go (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

server/backend/database/mongo/client_cache.go (6)

server/backend/database/client_info.go (6)

ClientInfo (66-92)

ClientDocInfo (56-60)

DocumentDetached (50-50)

DocumentRemoved (51-51)

DocumentAttached (49-49)

DocumentAttaching (48-48)

api/types/resource_ref_key.go (1)

ClientRefKey (35-38)

server/backend/database/mongo/client.go (2)

Client (51-59)

StatusKey (47-47)

server/logging/logging.go (1)

DefaultLogger (85-90)

server/backend/database/mongo/indexes.go (1)

ColClients (36-36)

server/backend/database/database.go (1)

ErrClientNotFound (46-46)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: load-test
GitHub Check: complex-test
GitHub Check: bench
GitHub Check: build

🔇 Additional comments (9)

server/backend/database/mongo/client_cache.go (9)

748-812: Metrics look solid.

Atomic loads with derived percentages and cache usage ratio are clean and safe. Good separation between counters and logging.

814-885: Checkpoint update path is correct and safe.

Deep-copying maps, applying $max logic, and marking Dirty with TTL refresh is correct for write-back semantics.

936-1031: Write-through status updates are coherent.

You avoid unnecessary DB writes, then mirror the change into cache. Sequence reset rules for status transitions are applied consistently here.

441-460: Post-flush cleanup heuristic is reasonable.

The “recently flushed and not dirty” removal with logging provides a pragmatic bound on cache size while minimizing churn. Works well with the adaptive flush loop.

481-481: Minor: adaptive flush logging improved.

Errors are logged; combined with pressure-based interval, this is fine.

1-16: License header and package structure.

Header and package naming look consistent with the repo.

569-594: BulkWrite result handling is good.

Including the operation count and warning on partial modifications is useful for ops visibility.

1033-1053: DB lookup error wrapping is correct.

Wrapping ErrNoDocuments with ErrClientNotFound preserves sentinel error semantics for errors.Is.

284-301: Set path is straightforward and safe.

Deep-copy on insert, TTL assignment, and size check are correct. Pair this with the evictOldest refactor to avoid holding the lock over IO.

coderabbitai · 2025-08-21T08:55:21Z

+// handleExpiredEntry processes an expired cache entry
+// Status information is discarded, checkpoint information is flushed to DB using max operator
+func (c *ClientInfoCache) handleExpiredEntry(refKey types.ClientRefKey, cached *CachedClientInfo) {
+	atomic.AddInt64(&c.metrics.TTLExpirations, 1)
+
+	// Create a copy of client info with only checkpoint information
+	checkpointInfo := &database.ClientInfo{
+		ID:        cached.ClientInfo.ID,
+		ProjectID: cached.ClientInfo.ProjectID,
+		Key:       cached.ClientInfo.Key,
+		Status:    cached.ClientInfo.Status,
+		Documents: make(map[types.ID]*database.ClientDocInfo),
+		Metadata:  cached.ClientInfo.Metadata,
+		CreatedAt: cached.ClientInfo.CreatedAt,
+		UpdatedAt: cached.ClientInfo.UpdatedAt,
+	}
+
+	// Copy only checkpoint information (ServerSeq, ClientSeq) from documents
+	for docID, docInfo := range cached.ClientInfo.Documents {
+		checkpointInfo.Documents[docID] = &database.ClientDocInfo{
+			ServerSeq: docInfo.ServerSeq,
+			ClientSeq: docInfo.ClientSeq,
+			Status:    docInfo.Status,
+		}
+	}
+
+	// Flush checkpoint information to DB using max operator
+	if err := c.flushSingleToDB(refKey, checkpointInfo); err != nil {
+		atomic.AddInt64(&c.metrics.TTLFlushErrors, 1)
+		logging.DefaultLogger().Errorf("Failed to flush expired entry checkpoint to DB: %v", err)
+	}
+}


🛠️ Refactor suggestion

TTL-expiration flush should not mutate statuses or attached_docs.

handleExpiredEntry says “Status information is discarded,” but buildUpdateModel unconditionally $set’s per-document Status and attached_docs. TTL-driven flushes should only $max the checkpoint (ServerSeq/ClientSeq) to avoid regressing status in DB.

Introduce a checkpoint-only write model and use it from TTL paths:

func (c *ClientInfoCache) handleExpiredEntry(refKey types.ClientRefKey, cached *CachedClientInfo) { atomic.AddInt64(&c.metrics.TTLExpirations, 1) @@ - // Flush checkpoint information to DB using max operator - if err := c.flushSingleToDB(refKey, checkpointInfo); err != nil { + // Flush only checkpoints ($max), do not touch statuses or attached_docs. + if err := c.flushSingleCheckpointToDB(refKey, checkpointInfo); err != nil { atomic.AddInt64(&c.metrics.TTLFlushErrors, 1) logging.DefaultLogger().Errorf("Failed to flush expired entry checkpoint to DB: %v", err) } } // buildUpdateModel builds a MongoDB update model for the given client info func (c *ClientInfoCache) buildUpdateModel(refKey types.ClientRefKey, clientInfo *database.ClientInfo) mongo.WriteModel { @@ // Update attached_docs array with correct state attachedDocs := clientInfo.AttachedDocuments() updateDoc["$set"].(bson.M)["attached_docs"] = attachedDocs return mongo.NewUpdateOneModel(). SetFilter(bson.M{ "project_id": refKey.ProjectID, "_id": refKey.ClientID, }). SetUpdate(updateDoc) } +// buildCheckpointUpdateModel builds an update that only $max-es checkpoint fields. +func (c *ClientInfoCache) buildCheckpointUpdateModel(refKey types.ClientRefKey, + clientInfo *database.ClientInfo) mongo.WriteModel { + updateDoc := bson.M{ + "$max": bson.M{}, + } + for docID, docInfo := range clientInfo.Documents { + docKey := clientDocInfoKey(docID, "server_seq") + updateDoc["$max"].(bson.M)[docKey] = docInfo.ServerSeq + docKey = clientDocInfoKey(docID, "client_seq") + updateDoc["$max"].(bson.M)[docKey] = docInfo.ClientSeq + } + return mongo.NewUpdateOneModel(). + SetFilter(bson.M{ + "project_id": refKey.ProjectID, + "_id": refKey.ClientID, + }). + SetUpdate(updateDoc) +} + // flushSingleToDB flushes a single client info entry to the database func (c *ClientInfoCache) flushSingleToDB(refKey types.ClientRefKey, clientInfo *database.ClientInfo) error { update := c.buildUpdateModel(refKey, clientInfo) _, err := c.client.collection(ColClients).BulkWrite(context.Background(), []mongo.WriteModel{update}) if err != nil { return fmt.Errorf("flush single client info: %w", err) } return nil } + +// flushSingleCheckpointToDB flushes only checkpoint fields ($max) for a single entry. +func (c *ClientInfoCache) flushSingleCheckpointToDB(refKey types.ClientRefKey, clientInfo *database.ClientInfo) error { + update := c.buildCheckpointUpdateModel(refKey, clientInfo) + _, err := c.client.collection(ColClients).BulkWrite(context.Background(), []mongo.WriteModel{update}) + if err != nil { + return fmt.Errorf("flush single client checkpoint: %w", err) + } + return nil +}

Also applies to: 608-641, 643-653

coderabbitai · 2025-08-21T08:55:21Z

+func (c *ClientInfoCache) InvalidateAll() error {
+	c.mu.Lock()
+	defer c.mu.Unlock()
+
+	// Force flush all dirty entries before clearing
+	if err := c.FlushToDB(); err != nil {
+		return fmt.Errorf("failed to flush all dirty entries: %w", err)
+	}
+	c.cache = make(map[types.ClientRefKey]*CachedClientInfo)
+	return nil
+}


🛠️ Refactor suggestion

⚠️ Potential issue

Deadlock: InvalidateAll holds c.mu and re-enters via FlushToDB.

InvalidateAll acquires c.mu and calls FlushToDB, which also acquires c.mu. This is a guaranteed self-deadlock.

Apply this minimal fix to flush outside the lock:

func (c *ClientInfoCache) InvalidateAll() error { - c.mu.Lock() - defer c.mu.Unlock() - - // Force flush all dirty entries before clearing - if err := c.FlushToDB(); err != nil { - return fmt.Errorf("failed to flush all dirty entries: %w", err) - } - c.cache = make(map[types.ClientRefKey]*CachedClientInfo) - return nil + // Flush outside the lock to avoid deadlock. + if err := c.FlushToDB(); err != nil { + return fmt.Errorf("failed to flush all dirty entries: %w", err) + } + // Now clear the cache safely. + c.mu.Lock() + c.cache = make(map[types.ClientRefKey]*CachedClientInfo) + c.mu.Unlock() + return nil }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (c *ClientInfoCache) InvalidateAll() error {

c.mu.Lock()

defer c.mu.Unlock()

// Force flush all dirty entries before clearing

if err := c.FlushToDB(); err != nil {

return fmt.Errorf("failed to flush all dirty entries: %w", err)

}

c.cache = make(map[types.ClientRefKey]*CachedClientInfo)

return nil

}

func (c *ClientInfoCache) InvalidateAll() error {

// Flush outside the lock to avoid deadlock.

if err := c.FlushToDB(); err != nil {

return fmt.Errorf("failed to flush all dirty entries: %w", err)

}

// Now clear the cache safely.

c.mu.Lock()

c.cache = make(map[types.ClientRefKey]*CachedClientInfo)

c.mu.Unlock()

return nil

}

coderabbitai · 2025-08-21T08:55:21Z

+	c.mu.Lock()
+	dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo)
+
+	for refKey, cached := range c.cache {
+		if cached.Dirty {
+			// Take a deep copy snapshot to avoid data races with concurrent updates
+			// after releasing the lock.
+			dirtyClients[refKey] = cached.ClientInfo.DeepCopy()
+			cached.Dirty = false
+			cached.LastFlush = time.Now()
+		}
+	}
+	c.mu.Unlock()


🛠️ Refactor suggestion

⚠️ Potential issue

Data loss risk: Dirty cleared before DB success in FlushToDB.

You clear Dirty and set LastFlush while still holding the lock, before the BulkWrite completes. If the write fails, entries remain non-dirty and won’t be retried, leading to silent data loss.

Refactor to only clear Dirty and set LastFlush after a successful write:

func (c *ClientInfoCache) FlushToDB() error { - c.mu.Lock() - dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo) - - for refKey, cached := range c.cache { - if cached.Dirty { - // Take a deep copy snapshot to avoid data races with concurrent updates - // after releasing the lock. - dirtyClients[refKey] = cached.ClientInfo.DeepCopy() - cached.Dirty = false - cached.LastFlush = time.Now() - } - } - c.mu.Unlock() + // 1) Snapshot dirty entries without mutating state. + c.mu.Lock() + dirtyClients := make(map[types.ClientRefKey]*database.ClientInfo) + dirtyKeys := make([]types.ClientRefKey, 0) + for refKey, cached := range c.cache { + if cached.Dirty { + dirtyClients[refKey] = cached.ClientInfo.DeepCopy() + dirtyKeys = append(dirtyKeys, refKey) + } + } + c.mu.Unlock() if len(dirtyClients) == 0 { return nil } @@ // Execute bulk update if len(updates) > 0 { result, err := c.client.collection(ColClients).BulkWrite(context.Background(), updates) if err != nil { return fmt.Errorf("bulk write client info (%d operations): %w", len(updates), err) } if result.ModifiedCount != int64(len(updates)) { logging.DefaultLogger().Warnf( "Bulk write partial success: expected %d modifications, got %d", len(updates), result.ModifiedCount, ) } } - // Update last flush time - c.pressureMu.Lock() - c.writePressure.LastFlushTime = time.Now() - c.pressureMu.Unlock() + // 2) Mark flushed entries clean after success. + now := time.Now() + c.mu.Lock() + for _, refKey := range dirtyKeys { + if entry, ok := c.cache[refKey]; ok { + entry.Dirty = false + entry.LastFlush = now + } + } + c.mu.Unlock() + + // Update last flush time (pressure metrics). + c.pressureMu.Lock() + c.writePressure.LastFlushTime = now + c.pressureMu.Unlock() // Cleanup cache after successful flush if c.config.EnableFlushCleanup { c.cleanupAfterFlush() } return nil }

Also applies to: 580-606

🤖 Prompt for AI Agents

In server/backend/database/mongo/client_cache.go around lines 555-567 (and similarly 580-606), the code clears Cached.Dirty and updates LastFlush while still holding the lock and before the BulkWrite completes, risking silent data loss if the DB write fails; change the flow to (1) while holding the lock, snapshot deep copies of dirty entries into a local map and DO NOT modify the cached entries, then release the lock; (2) perform the BulkWrite using the snapshot; (3) only if the BulkWrite succeeds, re-acquire the lock and for each successfully-written refKey clear Dirty and set LastFlush on the actual cache entries (and handle partial failures by only clearing those that succeeded); ensure errors from the write cause no mutation of Dirty flags so failed entries remain retried.

coderabbitai · 2025-08-21T08:55:21Z

+	// Check if update is needed by comparing with cache first
+	c.mu.Lock()
+	if cached, exists := c.cache[refKey]; exists {
+		if cached.ClientInfo.Documents != nil {
+			if docInfo, exists := cached.ClientInfo.Documents[docID]; exists {
+				if docInfo.Status == status {
+					// No change needed, skip database update
+					c.mu.Unlock()
+					return nil
+				}
+			}
+		}
+	}
+	c.mu.Unlock()
+


🛠️ Refactor suggestion

Use errors.Is for ErrClientNotFound detection; remove brittle string matching.

strings.Contains(err.Error(), "client not found") is fragile. loadClientInfoFromDB wraps ErrClientNotFound; prefer errors.Is and simplify imports.

@@ -import ( - "context" - "fmt" - "strings" - "sync" +import ( + "context" + "errors" + "fmt" + "sync" "sync/atomic" "time" @@ - // Update in DB (Write-through) + // Update in DB (Write-through) if err := c.updateDocumentStatusInDB(refKey, docID, status); err != nil { return fmt.Errorf("update document status in DB: %w", err) } @@ - clientInfo, err := c.loadClientInfoFromDB(refKey) + clientInfo, err := c.loadClientInfoFromDB(refKey) if err != nil { - // If client not found in DB, this might be a new document attachment - // In this case, we need to create a minimal client info structure - if strings.Contains(err.Error(), "client not found") { + if errors.Is(err, database.ErrClientNotFound) { return fmt.Errorf("cannot update document status for non-existent client %s: %w", refKey.ClientID, err) } return fmt.Errorf("load client info from DB: %w", err) }

Note: Removing strings import here keeps imports clean.

Also applies to: 989-998, 19-26

🤖 Prompt for AI Agents

In server/backend/database/mongo/client_cache.go around lines 939-953 (and similarly update sites at 989-998 and import area at 19-26), the code currently detects a missing client by doing strings.Contains(err.Error(), "client not found"); replace that brittle string match with errors.Is(err, ErrClientNotFound) (ensure the package that defines ErrClientNotFound is imported or referenced correctly), remove the now-unused strings import, and simplify the error handling branch to treat ErrClientNotFound as the not-found case while preserving other error propagation paths.

hackerwins · 2025-09-01T11:47:23Z

This PR leads to #1472 and following PRs.

Checkpoint(ClientInfo) Caching System

5903d4c

hackerwins marked this pull request as ready for review August 1, 2025 09:53

hackerwins marked this pull request as draft August 1, 2025 09:53

coderabbitai Bot reviewed Aug 1, 2025

View reviewed changes

mnnseong added 2 commits August 4, 2025 10:29

concurrency bug fix

b0fd6c8

resolve linter errors

b5c2e87

mnnseong marked this pull request as ready for review August 4, 2025 01:49

mnnseong marked this pull request as draft August 4, 2025 01:51

hackerwins marked this pull request as ready for review August 4, 2025 01:53

coderabbitai Bot reviewed Aug 4, 2025

View reviewed changes

Comment thread server/backend/database/mongo/client_cache.go

Comment thread server/backend/database/mongo/client.go Outdated

Comment thread server/backend/database/mongo/client.go Outdated

hackerwins mentioned this pull request Aug 4, 2025

Introduce Checkpoint Cache for PushPull Optimization #1381

Closed

3 tasks

Refactor cache structure and add state caching

06ecf54

coderabbitai Bot reviewed Aug 7, 2025

View reviewed changes

hackerwins-yorkie reviewed Aug 7, 2025

View reviewed changes

hackerwins requested a review from Copilot August 8, 2025 00:48

Copilot AI reviewed Aug 8, 2025

View reviewed changes

hackerwins requested review from hackerwins and kokodak August 8, 2025 01:39

Refactor metadata upsert logic and enable cleanup

a6a79e4

coderabbitai Bot reviewed Aug 11, 2025

View reviewed changes

write-through client state with CAS

d8b6aa5

coderabbitai Bot reviewed Aug 18, 2025

View reviewed changes

Merge branch 'main' into cp-caching

93a65db

coderabbitai Bot reviewed Aug 18, 2025

View reviewed changes

Comment thread server/backend/database/mongo/client.go

hackerwins marked this pull request as draft August 19, 2025 06:17

Revise code with coderabbit

5cce717

mnnseong marked this pull request as ready for review August 20, 2025 06:41

mnnseong marked this pull request as draft August 20, 2025 06:42

coderabbitai Bot reviewed Aug 20, 2025

View reviewed changes

Implement TTL & Routing Test

8002813

mnnseong marked this pull request as ready for review August 21, 2025 04:28

coderabbitai Bot reviewed Aug 21, 2025

View reviewed changes

mnnseong added 2 commits August 21, 2025 13:38

Resolve linter error

a3dc337

adjust TTL and flush interval

3d5046e

coderabbitai Bot reviewed Aug 21, 2025

View reviewed changes

Resolve data race

bd40e7b

coderabbitai Bot reviewed Aug 21, 2025

View reviewed changes

mnnseong added 2 commits August 21, 2025 17:09

trigger CI

6e420fd

Revise Code

0482f5f

coderabbitai Bot reviewed Aug 21, 2025

View reviewed changes

hackerwins marked this pull request as draft September 1, 2025 07:34

hackerwins closed this Sep 1, 2025

This was referenced Sep 1, 2025

Introduce client cache for faster client retrieval #1472

Merged

Ensure deactivation by adding skipCache to FindClientInfoByRefKey #1480

Merged

hackerwins deleted the cp-caching branch April 22, 2026 17:58

Uh oh!

Conversation

mnnseong commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

hackerwins-yorkie left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Go Benchmark Analysis 📊

Significant Changes (≥20% difference)

Key Observations 🔍

Clock Analysis

Detailed Test Results

Uh oh!

codecov Bot commented Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

hackerwins-yorkie left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

k6 Load Test 📊

Change

Result

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Aug 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 8, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 11, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

mnnseong commented Aug 1, 2025 •

edited

Loading

coderabbitai Bot commented Aug 1, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

hackerwins-yorkie left a comment •

edited

Loading

codecov Bot commented Aug 7, 2025 •

edited

Loading

hackerwins-yorkie left a comment •

edited

Loading