-
Notifications
You must be signed in to change notification settings - Fork 13k
fix(presence): update connections on heartbeat and remove them when stale #37551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Looks like this PR is ready to merge! 🎉 |
🦋 Changeset detectedLatest commit: c230094 The changes in this PR will be included in the next version bump. This PR includes changesets to release 41 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughA new PresenceReaper system is introduced to periodically identify and remove stale user connections from the database. Heartbeat messages are intercepted to trigger presence updates, preventing offline sessions from keeping users marked as online. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Client as DDP Client
participant HB as Heartbeat
participant Presence as Presence Service
participant Reaper as PresenceReaper
participant DB as UsersSessions Collection
User->>Client: Heartbeat messageReceived
Client->>Client: messageReceived()
Client->>Presence: updateConnection(uid, connectionId)
Presence->>DB: Update session timestamp
DB-->>Presence: Success
rect rgb(220, 240, 220)
Note over Reaper,DB: Periodic Reaper Run (60s interval)
Reaper->>DB: Query stale connections<br/>(updated before cutoffDate)
DB-->>Reaper: Cursor with user sessions
Reaper->>Reaper: processDocument()<br/>Filter stale vs. valid
Reaper->>Reaper: buildReaperPlan<br/>Mark for removal/offline
Reaper->>DB: bulkWrite()<br/>Remove stale connections
DB-->>Reaper: Batch update complete
Reaper->>Presence: onUpdate(userIds[])
Presence->>Presence: handleReaperUpdates()
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Areas requiring extra attention:
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
8c8ebb6 to
3c26ca0
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #37551 +/- ##
========================================
Coverage 67.69% 67.70%
========================================
Files 3457 3457
Lines 113831 113831
Branches 20908 20908
========================================
+ Hits 77061 77069 +8
+ Misses 34639 34633 -6
+ Partials 2131 2129 -2
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
c180d53 to
d90e850
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/models/src/models/UsersSessions.ts (1)
32-41: Optional status handling is good; consider a more explicit checkMaking
statusoptional and guarding the$setkeeps pings from overwriting the stored status while still refreshing_updatedAt, which matches the intended behavior.To make this future‑proof against any falsy-but-valid status values, you could prefer an explicit undefined check instead of a truthiness check:
- $set: { - ...(status && { 'connections.$.status': status }), - 'connections.$._updatedAt': new Date(), - }, + $set: { + ...(status !== undefined && { 'connections.$.status': status }), + 'connections.$._updatedAt': new Date(), + },apps/meteor/client/meteor/overrides/ddpOverREST.ts (1)
9-29: Fix unsound type predicate and make ping call explicitly fire-and-forgetThe
shouldBypasstype guard (line 9) claims to narrow to non-method messages when it returns true, but it also returnstruefor severalmethodmessages (login/resume, UserPresence:, stream-, and bypassMethods). This violates the type predicate's promise and is unsound.Additionally, the type narrowing isn't actually being used—the code immediately checks
if (message.msg === 'ping')after the predicate succeeds (line 34), indicating the type isn't relied upon.Change
shouldBypassto return a plain boolean:-const shouldBypass = (message: Meteor.IDDPMessage): message is Exclude<Meteor.IDDPMessage, Meteor.IDDPMethodMessage> => { +const shouldBypass = (message: Meteor.IDDPMessage): boolean => { if (message.msg !== 'method') { return true; } const { method, params } = message; // ... rest unchanged }For the ping side effect (lines 34–36), make the fire-and-forget intent explicit and avoid unhandled rejections:
if (message.msg === 'ping') { - sdk.call('UserPresence:ping'); + void sdk.call('UserPresence:ping').catch(() => {}); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (15)
apps/meteor/client/meteor/overrides/ddpOverREST.ts(2 hunks)apps/meteor/definition/externals/meteor/meteor.d.ts(1 hunks)apps/meteor/server/methods/userPresence.ts(2 hunks)ee/apps/ddp-streamer/src/DDPStreamer.ts(1 hunks)ee/apps/ddp-streamer/src/configureServer.ts(1 hunks)ee/packages/presence/src/Presence.ts(2 hunks)ee/packages/presence/src/lib/processConnectionStatus.ts(0 hunks)ee/packages/presence/src/lib/processStatus.ts(1 hunks)ee/packages/presence/src/processPresenceAndStatus.ts(1 hunks)ee/packages/presence/tests/lib/processConnectionStatus.test.ts(1 hunks)ee/packages/presence/tests/lib/processPresenceAndStatus.test.ts(1 hunks)ee/packages/presence/tests/lib/processStatus.test.ts(1 hunks)packages/core-services/src/types/IPresence.ts(1 hunks)packages/model-typings/src/models/IUsersSessionsModel.ts(1 hunks)packages/models/src/models/UsersSessions.ts(1 hunks)
💤 Files with no reviewable changes (1)
- ee/packages/presence/src/lib/processConnectionStatus.ts
🧰 Additional context used
🧬 Code graph analysis (8)
ee/packages/presence/tests/lib/processStatus.test.ts (1)
ee/packages/presence/src/lib/processStatus.ts (1)
processStatus(6-16)
ee/packages/presence/src/processPresenceAndStatus.ts (3)
packages/core-typings/src/IUserSession.ts (1)
IUserSessionConnection(3-9)ee/packages/presence/src/lib/processConnectionStatus.ts (1)
processConnectionStatus(6-14)ee/packages/presence/src/lib/processStatus.ts (1)
processStatus(6-16)
packages/model-typings/src/models/IUsersSessionsModel.ts (1)
packages/core-typings/src/IUserSession.ts (1)
IUserSession(11-14)
ee/apps/ddp-streamer/src/configureServer.ts (2)
ee/packages/presence/src/Presence.ts (1)
Presence(11-284)apps/meteor/server/methods/userPresence.ts (3)
userId(24-30)userId(31-37)userId(38-44)
ee/packages/presence/tests/lib/processPresenceAndStatus.test.ts (1)
ee/packages/presence/src/processPresenceAndStatus.ts (1)
processPresenceAndStatus(16-31)
packages/models/src/models/UsersSessions.ts (1)
packages/core-typings/src/IUserSession.ts (1)
IUserSession(11-14)
ee/apps/ddp-streamer/src/DDPStreamer.ts (3)
ee/apps/ddp-streamer/src/configureServer.ts (4)
server(10-10)userId(119-125)userId(126-132)userId(133-139)apps/meteor/server/methods/userPresence.ts (3)
userId(24-30)userId(31-37)userId(38-44)ee/packages/presence/src/Presence.ts (1)
Presence(11-284)
apps/meteor/server/methods/userPresence.ts (2)
ee/packages/presence/src/Presence.ts (1)
Presence(11-284)ee/apps/ddp-streamer/src/configureServer.ts (3)
userId(119-125)userId(126-132)userId(133-139)
🔇 Additional comments (13)
packages/model-typings/src/models/IUsersSessionsModel.ts (1)
8-8: LGTM! Optional status enables ping-based connection tracking.Making the
statusparameter optional allows the system to update connection timestamps (via ping handlers) without forcing a status change. This helps maintain accurate connection state for presence tracking and addresses the dangling session issue.apps/meteor/definition/externals/meteor/meteor.d.ts (1)
50-61: LGTM! Improved type safety for DDP messages.The refactoring separates ping and method messages into distinct interfaces (
IDDPPingMessageandIDDPMethodMessage), then creates a union type. This enables type guards and better type narrowing at call sites, improving type safety for DDP message handling.ee/packages/presence/tests/lib/processStatus.test.ts (1)
1-30: LGTM! Comprehensive test coverage for processStatus.The test suite thoroughly validates all combinations of connection status and default status across the four UserStatus values (ONLINE, BUSY, AWAY, OFFLINE). The tests confirm the expected precedence logic:
- OFFLINE connection status always takes precedence
- ONLINE default allows connection status to pass through
- Non-ONLINE defaults override the connection status
packages/core-services/src/types/IPresence.ts (1)
18-18: LGTM! Parameter reordering improves API consistency.The signature change reorders parameters to
(uid, session, status?)which is more logical (identifying the connection before specifying its status) and makes status optional. This enables ping handlers to update connection tracking without forcing a status change, which is key to fixing dangling sessions.Note: This is a breaking change, but the PR appears to update all call sites consistently.
ee/apps/ddp-streamer/src/configureServer.ts (1)
119-139: LGTM! Presence methods updated consistently.All three presence methods (
UserPresence:online,UserPresence:away, and the newUserPresence:ping) have been updated to use the new signature with parameters in the order(userId, session, status?). The newpingmethod omits the status parameter, allowing it to update connection tracking without changing the user's status, which is crucial for fixing dangling sessions.ee/packages/presence/src/Presence.ts (2)
7-7: LGTM! Import path updated for refactored module.The import has been updated to reflect the reorganization of presence utilities, with
processPresenceAndStatusnow in a dedicated module at the root of the presence package.
206-212: LGTM! Implementation correctly handles optional status.The
setConnectionStatusmethod now accepts an optionalstatusparameter and passes it through toUsersSessions.updateConnectionStatusById. Whenstatusis undefined (as in ping handlers), the connection timestamp is updated without changing the status, which helps maintain accurate connection state while preventing unwanted status changes.ee/packages/presence/src/lib/processStatus.ts (1)
1-16: LGTM! Status precedence logic is well-designed.The
processStatusfunction implements a clear precedence hierarchy for determining user status:
- OFFLINE connection status always takes precedence - respects actual disconnection
- ONLINE default allows connection status to pass through - shows real-time presence
- Non-ONLINE defaults override connection status - respects explicit user status settings (BUSY, AWAY)
This logic is intuitive and well-tested in the corresponding test file.
ee/apps/ddp-streamer/src/DDPStreamer.ts (1)
219-225: LGTM, but note the unawaited promise for error handling purposes.The new
PINGevent handler correctly guards against missinguserIdand successfully maintains connection state. The concern about concurrent ping events is partially mitigated:✅ Safe: MongoDB's atomic
updateOnewith the$positional operator inupdateConnectionStatusByIdprevents data corruption from concurrent updates to the same user's session document.
⚠️ Consideration:Presence.setConnectionStatus()is not awaited in the PING handler, which means errors in eitherupdateConnectionStatusByIdor the subsequentupdateUserPresencecall will be silently swallowed. WhileupdateUserPresenceperforms a read-compute-write sequence that could theoretically result in stale status being written if multiple calls interleave for the same user, this is acceptable for presence tracking since the system is eventually consistent by design. However, for proper error handling and observability, consider awaiting the call or attaching a.catch()handler to log unexpected failures.ee/packages/presence/tests/lib/processConnectionStatus.test.ts (1)
4-4: Import path update looks correctThe new relative path to
processConnectionStatusmatches thetests/lib→src/liblayout and keeps the existing test behavior intact.ee/packages/presence/src/processPresenceAndStatus.ts (1)
7-31: Presence derivation from fresh sessions looks soundFiltering by
_updatedAt <= 5 minutes, reducing withprocessConnectionStatusfromOFFLINE, and then deriving the finalstatusviaprocessStatusgives the expected behavior for empty, stale-only, and mixed-session inputs and cleanly separatesstatusConnectionfromstatus.ee/packages/presence/tests/lib/processPresenceAndStatus.test.ts (1)
6-286: Test suite gives comprehensive coverage of the new presence logicThe scenarios here (single vs multiple connections, no connections, stale vs recent updates, and different default statuses) exercise all key branches of
processPresenceAndStatus, including the 5‑minute freshness window and the precedence rules betweenstatusConnectionandstatus.apps/meteor/server/methods/userPresence.ts (1)
12-44: Server-side ping wiring and parameter reordering look consistentThe new
'UserPresence:ping'method and the reordered arguments forPresence.setConnectionStatus(userId, connection.id, status)align with the updatedsetConnectionStatus(uid, session, status?)signature and allow pings to refresh_updatedAtwithout mutating status, while still guarding for missinguserId/connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.changeset/spicy-nails-design.md (1)
1-10: Changeset metadata looks good; consider clarifying the summary.The affected packages and patch version bump are appropriate for this fix. However, the summary could be more specific to help users understand the mechanism. Currently it reads generically; consider mentioning that the fix refreshes connection activity via ping messages and excludes idle sessions.
Example revision:
-Fixes user status inaccuracy by refreshing active connections and filtering out the stale ones. +Fixes user status inaccuracy by refreshing connection activity on ping and filtering out idle sessions (no activity for 5+ minutes).This gives future readers clearer context about the fix's scope and timing strategy.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
.changeset/spicy-nails-design.md(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 📦 Build Packages
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
KevLehman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think diego has more context on how this should behave, so my comments are on code-side.
I don't see anything spooky on the implementation rather than the fact the connections would be always updated (since we're gonna update the updatedAt on every ping)
after that just run |
|
/backport 7.12.3 |
|
Sorry, I couldn't do that backport because of conflicts. Could you please solve them? you can do so by running the following commands: after that just run |
|
/backport 7.11.3 |
|
/backport 7.10.6 |
|
Sorry, I couldn't do that backport because of conflicts. Could you please solve them? you can do so by running the following commands: after that just run |
|
Sorry, I couldn't do that backport because of conflicts. Could you please solve them? you can do so by running the following commands: after that just run |
|
/patch |
|
Pull request #37883 added to Project: "Patch 7.13.2" |
|
/backport 7.12.3 |
|
Pull request #37884 added to Project: "Patch 7.12.3" |
|
/backport 7.11.3 |
|
/backport 7.10.6 |
|
Pull request #37885 added to Project: "Patch 7.11.3" |
|
Pull request #37886 added to Project: "Patch 7.10.6" |
Proposed changes (including videos or screenshots)
The
Presence.updateConnectionfunction is called when a heartbeat is received from a connection.Connections with an
_updatedAtfield older than 5 minutes are considered stale and removed periodically.Issue(s)
https://rocketchat.atlassian.net/browse/SUP-846
Steps to test or reproduce
Further comments
I wrote e2e tests to verify reactivity in a separate PR since it's a bit complicated: #37710
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.