@shash-hq shash-hq commented Jan 18, 2026

Description

This PR fixes a race condition in which trash recovery operations ran outside the main transaction session. If the transaction aborted after the trash operation succeeded but before the main operation finished (or vice versa), the main and trash collections could end up inconsistent, leaving orphaned records.

Changes

  • Modified BaseRaw.ts in @rocket.chat/models to propagate the session from deleteOne, deleteMany, and findOneAndDelete options to the corresponding trashCollection operations.
  • Resolved a circular dependency in BaseRaw.ts imports.

Verification

  • Verified session propagation locally with a targeted unit test (the test is left out of this PR to keep the change minimal).
  • Verified that upsert: true and session are correctly passed to trash.updateOne.
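
A minimal sketch of the session propagation described above, not the actual BaseRaw.ts code. FakeCollection, deleteOneWithTrash, and the simplified option types are hypothetical stand-ins for the MongoDB driver types; the fake only records which session each call received and does not emulate transactions.

```typescript
// Illustrative stand-ins; these are NOT the real driver or Rocket.Chat types.
type Session = { id: string };
type OpOptions = { session?: Session; upsert?: boolean };
type Doc = { _id: string; name?: string };
type IdFilter = { _id: string };

// Tiny in-memory collection that logs the session each call received.
class FakeCollection {
	readonly docs = new Map<string, Doc>();

	readonly calls: Array<{ op: string; session?: Session }> = [];

	async findOne(filter: IdFilter, options?: OpOptions): Promise<Doc | null> {
		this.calls.push({ op: 'findOne', session: options?.session });
		return this.docs.get(filter._id) ?? null;
	}

	async updateOne(filter: IdFilter, update: { $set: Doc }, options?: OpOptions): Promise<void> {
		this.calls.push({ op: 'updateOne', session: options?.session });
		if (this.docs.has(filter._id) || options?.upsert) {
			this.docs.set(filter._id, { ...this.docs.get(filter._id), ...update.$set });
		}
	}

	async deleteOne(filter: IdFilter, options?: OpOptions): Promise<{ deletedCount: number }> {
		this.calls.push({ op: 'deleteOne', session: options?.session });
		return { deletedCount: this.docs.delete(filter._id) ? 1 : 0 };
	}
}

// Copy the document to trash, then delete it, forwarding one session to every call.
async function deleteOneWithTrash(
	col: FakeCollection,
	trash: FakeCollection,
	filter: IdFilter,
	options?: OpOptions,
): Promise<{ deletedCount: number }> {
	const doc = await col.findOne(filter, { session: options?.session });
	if (!doc) {
		return { deletedCount: 0 };
	}
	// With a real driver, sharing the session puts the trash write in the same transaction as the delete.
	await trash.updateOne({ _id: doc._id }, { $set: doc }, { upsert: true, session: options?.session });
	return col.deleteOne({ _id: doc._id }, { session: options?.session });
}
```

With a real MongoDB transaction, an abort would then discard the trash write together with the delete, which is the inconsistency this PR targets.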

Closes #38179

Summary by CodeRabbit

  • Refactor
    • Update and delete operations now forward the MongoDB session to their trash-collection counterparts, so both run in the same transactional context.


@shash-hq shash-hq requested review from a team as code owners January 18, 2026 11:19

changeset-bot bot commented Jan 18, 2026

⚠️ No Changeset found

Latest commit: bc5981f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

dionisio-bot bot commented Jan 18, 2026

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label
  • This PR is missing the required milestone or project

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

coderabbitai bot commented Jan 18, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The PR modifies BaseRaw.ts to propagate MongoDB session objects through update, delete, and find operations, enabling transactional context for multi-document transactions. The getCollectionName utility is now defined locally, and the UpdaterImpl import source is updated.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Session Propagation (packages/models/src/models/BaseRaw.ts) | Added session: options?.session parameter to MongoDB operations including updateOne, deleteOne, findOneAndDelete, and related trash-collection operations. Refactored getCollectionName from import to local constant definition. Updated UpdaterImpl import path from parent module to explicit ../updater location. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A session flows through every op,
MongoDB's dance won't stop,
From trash to main, transactions gleam,
Rolling back the orphaned dream,
No race conditions dare compete!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues check | ⚠️ Warning | The PR addresses only session propagation but lacks rollback verification, error handling, logging, monitoring, and alerts required by issue #38179. | Implement rollback verification with deletedCount checks, preserve original errors alongside rollback errors, add error logging with metrics, trigger operator alerts, and ensure idempotent retries. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: propagating session to trash recovery operations to fix race conditions. |
| Out of Scope Changes check | ✅ Passed | All changes are in scope: session propagation directly addresses the race condition. However, the PR is significantly incomplete relative to the full requirements. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
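
For reference, the "rollback verification with deletedCount checks" and "preserve original errors alongside rollback errors" items from the failed check could look roughly like the sketch below. It is not part of this PR; RollbackError and deleteWithVerifiedRollback are hypothetical names, and the two callbacks stand in for the main-collection delete and the trash cleanup.

```typescript
type DeleteResult = { deletedCount: number };

// Hypothetical error type that keeps the original failure next to any rollback failure.
class RollbackError extends Error {
	constructor(message: string, readonly original: unknown, readonly rollback?: unknown) {
		super(message);
		this.name = 'RollbackError';
		// Keeps instanceof working when compiled to older targets.
		Object.setPrototypeOf(this, RollbackError.prototype);
	}
}

async function deleteWithVerifiedRollback(
	deleteMain: () => Promise<DeleteResult>,
	rollbackTrash: () => Promise<DeleteResult>,
): Promise<DeleteResult> {
	try {
		return await deleteMain();
	} catch (original) {
		// The main delete failed: try to remove the trash copy that was already written.
		const rolledBack = await rollbackTrash().catch((rollbackFailure: unknown) => {
			throw new RollbackError('delete failed and rollback threw', original, rollbackFailure);
		});
		// Verify the rollback removed exactly one record before reporting a clean failure.
		if (rolledBack.deletedCount !== 1) {
			throw new RollbackError('delete failed and rollback removed no trash record', original);
		}
		throw original;
	}
}
```

When the rollback succeeds, the caller sees the original error unchanged; a RollbackError only appears when the trash copy may still be orphaned, which is the case that would need logging or an alert.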

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/models/src/models/BaseRaw.ts (2)

327-341: Consider: Inconsistent error handling between deleteOne and findOneAndDelete.

Session propagation looks correct here. However, unlike findOneAndDelete (lines 369-374), this method lacks rollback logic if col.deleteOne fails after the trash record is inserted.

If the main collection's deleteOne throws (lines 338-340), the trash record remains orphaned. A session with a transaction mitigates this, because the transaction aborts on error, but callers that do not use transactions will still see the inconsistency.

For consistency with findOneAndDelete, consider wrapping the final deleteOne in a try/catch with rollback:

🔧 Potential rollback pattern (optional)
+		try {
 		if (options) {
-			return this.col.deleteOne(filter, options);
+			return await this.col.deleteOne(filter, options);
 		}
-		return this.col.deleteOne(filter);
+		return await this.col.deleteOne(filter);
+		} catch (e) {
+			await this.trash?.deleteOne({ _id } as Filter<TDeleted>, { session: options?.session });
+			throw e;
+		}

387-418: Session propagation is correct; same rollback consideration as deleteOne.

The session is properly propagated to:

  • find() cursor (line 387)
  • Each trash.updateOne() call (lines 402-409)
  • Final deleteMany() (lines 414-417)

However, similar to deleteOne, if the final deleteMany fails after trash records are inserted, there's no rollback. For bulk operations, this could leave multiple orphaned trash records.

When using transactions, the session will handle rollback. For non-transactional usage, consider adding try/catch with bulk rollback:

🔧 Potential rollback pattern
+		try {
 		if (options) {
 			return this.col.deleteMany({ _id: { $in: ids } } as unknown as Filter<T>, options);
 		}
 		return this.col.deleteMany({ _id: { $in: ids } } as unknown as Filter<T>);
+		} catch (e) {
+			await this.trash?.deleteMany({ _id: { $in: ids } } as unknown as Filter<TDeleted>, { session: options?.session });
+			throw e;
+		}
🤖 Fix all issues with AI agents
In `@apps/meteor/server/services/omnichannel/service.ts`:
- Around line 34-43: handlePresenceUpdate currently assumes user.roles is an
array and calls user.roles.some(...), which can throw if roles is undefined;
update the function (handlePresenceUpdate) to defensively handle
missing/non-array roles by checking Array.isArray(user.roles) (or defaulting to
an empty array) before calling .some, then only call
notifyAgentStatusChanged(user._id, user.status as UserStatus) when the
role-check passes.
🧹 Nitpick comments (4)
packages/models/src/models/BaseRaw.ts (1)

32-34: Minor: Unconventional ordering of const and import statements.

The const getCollectionName declaration on line 33 is placed between two import statements. While syntactically valid, it's more conventional to keep all imports together at the top, followed by local declarations.

That said, this resolves the circular dependency issue mentioned in the PR objectives, and the logic is correct.

♻️ Suggested reordering
-import { UpdaterImpl } from '../updater';
-const getCollectionName = (name: string): string => `rocketchat_${name}`;
-import type { Updater } from '../updater';
+import { UpdaterImpl } from '../updater';
+import type { Updater } from '../updater';
+
+const getCollectionName = (name: string): string => `rocketchat_${name}`;
apps/meteor/server/modules/listeners/listeners.module.ts (1)

184-209: Consider extracting common presence notification logic.

The batch handler duplicates the logic from the single presence.status handler (lines 159-182). Extracting this into a shared helper function would reduce duplication and ensure consistency.

♻️ Suggested refactor
// Extract helper function
const handleUserPresence = (user: { _id: string; username?: string; name?: string; status?: UserStatus; statusText?: string; roles?: string[] }) => {
  const { _id, username, name, status, statusText, roles } = user;
  if (!status || !username) {
    return;
  }

  notifications.notifyUserInThisInstance(_id, 'userData', {
    type: 'updated',
    id: _id,
    diff: {
      status,
      ...(statusText && { statusText }),
    },
    unset: {
      ...(!statusText && { statusText: 1 }),
    },
  });

  notifications.notifyLoggedInThisInstance('user-status', [_id, username, STATUS_MAP[status], statusText, name, roles]);

  if (_id) {
    notifications.sendPresence(_id, username, STATUS_MAP[status], statusText);
  }
};

// Then use in both handlers:
service.onEvent('presence.status', ({ user }) => handleUserPresence(user));
service.onEvent('presence.status.batch', (batch) => batch.forEach(({ user }) => handleUserPresence(user)));
ee/packages/presence/src/Presence.spec.ts (1)

83-100: Verify previousStatus preservation in debounce scenario.

The test verifies that the final status ('offline') is captured when the same user has multiple updates, but it doesn't verify the previousStatus value in the emitted batch. Based on the test setup, the second call passes 'online' as previousStatus, which would overwrite the first call's 'busy'.

Consider adding an assertion to confirm the expected previousStatus behavior:

💡 Suggested enhancement
     expect(batch[0].user.status).toBe('offline');
+    // Verify previousStatus from the last broadcast call is used
+    expect(batch[0].previousStatus).toBe('online');
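
The overwrite behavior this comment describes can be reproduced with a small stand-alone sketch. It assumes the batch is keyed by user id and that a later event replaces an earlier one; PresenceBatcher and PresenceEvent are hypothetical names, not the types in Presence.ts.

```typescript
type PresenceEvent = {
	user: { _id: string; status: string };
	previousStatus?: string;
};

// Collects events per user id; a later event for the same user replaces the earlier one.
class PresenceBatcher {
	private readonly pending = new Map<string, PresenceEvent>();

	add(event: PresenceEvent): void {
		this.pending.set(event.user._id, event);
	}

	// Returns the queued events and empties the queue.
	flush(): PresenceEvent[] {
		const batch = Array.from(this.pending.values());
		this.pending.clear();
		return batch;
	}
}

const batcher = new PresenceBatcher();
batcher.add({ user: { _id: 'u1', status: 'online' }, previousStatus: 'busy' });
batcher.add({ user: { _id: 'u1', status: 'offline' }, previousStatus: 'online' });
const flushed = batcher.flush();
```

Under this last-write-wins assumption, flushed holds a single entry with status 'offline' and previousStatus 'online'. If listeners need the status from before the whole debounce window ('busy' here), the batcher would have to keep the first previousStatus instead.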
ee/packages/presence/src/Presence.ts (1)

131-137: Clear presence batch on stop to avoid retaining user data.

stopped() clears timers but leaves any queued user payloads in memory. Clearing the batch on stop keeps teardown tidy, especially if the service is restarted in-process.

♻️ Proposed tweak
 		if (this.batchTimeout) {
 			clearTimeout(this.batchTimeout);
 			this.batchTimeout = undefined;
 		}
+		this.presenceBatch.clear();
 	}

Comment on lines 34 to 43
private async handlePresenceUpdate(user: Pick<IUser, '_id' | 'username' | 'status' | 'statusText' | 'name' | 'roles'>): Promise<void> {
	if (!user?._id) {
		return;
	}
	const hasRole = user.roles.some((role) => ['livechat-manager', 'livechat-monitor', 'livechat-agent'].includes(role));
	if (hasRole) {
		// TODO change `Livechat.notifyAgentStatusChanged` to a service call
		await notifyAgentStatusChanged(user._id, user.status as UserStatus);
	}
}

⚠️ Potential issue | 🟡 Minor

Add defensive check for user.roles being undefined.

If user.roles is undefined or not an array, calling .some() on line 38 will throw a TypeError. While the event payload type includes roles, runtime data might not always guarantee its presence.

🛡️ Suggested defensive fix
 private async handlePresenceUpdate(user: Pick<IUser, '_id' | 'username' | 'status' | 'statusText' | 'name' | 'roles'>): Promise<void> {
   if (!user?._id) {
     return;
   }
-  const hasRole = user.roles.some((role) => ['livechat-manager', 'livechat-monitor', 'livechat-agent'].includes(role));
+  const hasRole = user.roles?.some((role) => ['livechat-manager', 'livechat-monitor', 'livechat-agent'].includes(role));
   if (hasRole) {
     // TODO change `Livechat.notifyAgentStatusChanged` to a service call
     await notifyAgentStatusChanged(user._id, user.status as UserStatus);
   }
 }
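
As an alternative to optional chaining, the role check can be isolated in a helper that also rejects non-array values. This is a minimal sketch; hasLivechatRole and LIVECHAT_ROLES are hypothetical names, and only the role strings are taken from the code under review.

```typescript
const LIVECHAT_ROLES = ['livechat-manager', 'livechat-monitor', 'livechat-agent'];

// Returns true only when roles is an array containing at least one livechat role.
function hasLivechatRole(roles: unknown): boolean {
	return Array.isArray(roles) && roles.some((role) => typeof role === 'string' && LIVECHAT_ROLES.includes(role));
}
```

Optional chaining covers undefined and null, but user.roles?.some(...) would still throw if roles arrived as a plain string or object, because those values have no some method. The Array.isArray guard handles that case too.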


@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 8 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="ee/packages/presence/src/Presence.ts">

<violation number="1" location="ee/packages/presence/src/Presence.ts:341">
P1: Presence broadcasting now emits only `presence.status.batch`, breaking existing listeners still subscribed to `presence.status`.</violation>
</file>

return;
}

this.api?.broadcast('presence.status.batch', batch);

@cubic-dev-ai cubic-dev-ai bot Jan 18, 2026

P1: Presence broadcasting now emits only presence.status.batch, breaking existing listeners still subscribed to presence.status.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At ee/packages/presence/src/Presence.ts, line 341:

<comment>Presence broadcasting now emits only `presence.status.batch`, breaking existing listeners still subscribed to `presence.status`.</comment>

<file context>
@@ -287,21 +304,42 @@ export class Presence extends ServiceClass implements IPresence {
+				return;
+			}
+
+			this.api?.broadcast('presence.status.batch', batch);
+		}, 500);
 	}
</file context>
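
To make the P1 concern concrete, the sketch below shows a listener on presence.status receiving nothing when only the batch event is broadcast, and one possible compatibility shim that also emits the per-user event. TinyBus, broadcastPresence, and the alsoEmitLegacy flag are hypothetical; the real service API may differ, and the PR may prefer migrating the listeners instead.

```typescript
type PresencePayload = { user: { _id: string; status: string } };
type Handler = (payload: unknown) => void;

// Minimal event bus standing in for the service broadcast API.
class TinyBus {
	private readonly handlers = new Map<string, Handler[]>();

	on(event: string, handler: Handler): void {
		const list = this.handlers.get(event) ?? [];
		list.push(handler);
		this.handlers.set(event, list);
	}

	broadcast(event: string, payload: unknown): void {
		(this.handlers.get(event) ?? []).forEach((handler) => handler(payload));
	}
}

// Emits the batch event and, optionally, one legacy event per batched user.
function broadcastPresence(bus: TinyBus, batch: PresencePayload[], alsoEmitLegacy: boolean): void {
	bus.broadcast('presence.status.batch', batch);
	if (alsoEmitLegacy) {
		batch.forEach((item) => bus.broadcast('presence.status', item));
	}
}
```

The shim trades extra events for compatibility, so it would only make sense as a transition step while existing presence.status listeners are moved to the batch event.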

@shash-hq shash-hq force-pushed the fix/38179-trash-race-condition branch from 95713c6 to e969238 Compare January 18, 2026 11:51
@shash-hq shash-hq changed the title Fix: Trash recovery race condition (#38179) fix(models): propagate session to trash recovery operations (#38179) Jan 18, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Trash Recovery Race Condition - Failed Rollback Leaves Orphaned Records

2 participants