Skip to content

Conversation

@ggazzo
Copy link
Member

@ggazzo ggazzo commented Oct 1, 2025

As per FB-52, this PR fixes an issue where remote servers fail to join RC-originated rooms because membership events for local RC users were not being emitted at the correct time, preventing those events from being saved into our state.

When a room is created on RC and local users A and B join before a remote user is invited, their membership events were only emitted during the transaction flow. They were not persisted in a way that backfill or later federation queries could retrieve. As a result, remote homeservers (e.g., Synapse) receive an invite referencing users for whom they have no membership state, eventually failing with:

502 - Failed to make_join via any server

Fix: Membership events are now emitted immediately after processing invites/joins, outside of the PDU transaction-only path. This ensures the events are correctly stored in our state, allowing backfill to include them and enabling remote servers to obtain the complete membership graph required to successfully evaluate and join the room.

Release Notes

  • New Features

    • Added support for room creation events in federation event handling.
  • Improvements

    • Enhanced event emission for membership and room state changes.
    • Improved error handling and reporting for room information readiness.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 1, 2025

Walkthrough

The pull request extends the federation SDK to emit homeserver events (membership and room creation) through an EventEmitterService integrated into invite and room services. It introduces a new RoomInfoNotReadyError error type and extends event types to include room state and version information.

Changes

Cohort / File(s) Summary
Event Signatures & Exports
packages/federation-sdk/src/index.ts
Narrowed federation-core imports to EventStagingStore only. Extended HomeserverEventSignatures with new 'homeserver.matrix.room.create' mapping containing PduForType<'m.room.create'> event and EventID.
Event Emission Integration
packages/federation-sdk/src/services/invite.service.ts, packages/federation-sdk/src/services/room.service.ts
Injected EventEmitterService into both services. Added event emission at multiple state-change points (after local handling, after server broadcasting, after remote processing). Extended ProcessInviteEvent type with invite_room_state and room_version fields. Updated constructor signatures.
Event Processing Updates
packages/federation-sdk/src/services/staging-area.service.ts
Refactored roomId extraction to read directly from event. Added dedicated handling for m.room.create events to emit homeserver.matrix.room.create with event_id and full event payload.
Error Handling
packages/federation-sdk/src/services/state.service.ts
Introduced new RoomInfoNotReadyError class. Updated UnknownRoomError with proper name property. Replaced generic Error with RoomInfoNotReadyError in getRoomInformation when create-event mapping is missing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

  • Review the emit points in invite.service.ts and room.service.ts to verify events are emitted at correct lifecycle stages
  • Verify ProcessInviteEvent type extension is properly propagated to all callers
  • Confirm RoomInfoNotReadyError is thrown and handled appropriately in error paths
  • Validate staging-area.service refactoring correctly extracts roomId and handles room.create events

Possibly related PRs

Suggested reviewers

  • rodrigok
  • sampaiodiego

Poem

🐰 Events dance through federation's halls,

Room creation echoes through the calls,

Membership whispers, room-state flows,

Error handling—where proper care goes,

Emitters chirp the changes bright! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: emit member events outside pdu process' accurately captures the main purpose of the PR—to emit homeserver events outside the PDU processing flow across multiple services.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/emit-join

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link

codecov-commenter commented Oct 1, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.73%. Comparing base (5cc3127) to head (c17ab0a).

Files with missing lines Patch % Lines
...ages/federation-sdk/src/services/invite.service.ts 11.11% 16 Missing ⚠️
...ederation-sdk/src/services/staging-area.service.ts 0.00% 9 Missing ⚠️
...ckages/federation-sdk/src/services/room.service.ts 77.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
- Coverage   53.03%   52.73%   -0.30%     
==========================================
  Files          96       96              
  Lines       12571    12608      +37     
==========================================
- Hits         6667     6649      -18     
- Misses       5904     5959      +55     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ggazzo ggazzo force-pushed the feat/emit-join branch 2 times, most recently from 8205956 to 7238484 Compare November 13, 2025 21:41
@ggazzo ggazzo marked this pull request as ready for review November 14, 2025 16:22
@ggazzo ggazzo changed the title fix: emit events outside pdu process fix: emit member events outside pdu process Nov 14, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5cc3127 and c17ab0a.

📒 Files selected for processing (5)
  • packages/federation-sdk/src/index.ts (2 hunks)
  • packages/federation-sdk/src/services/invite.service.ts (5 hunks)
  • packages/federation-sdk/src/services/room.service.ts (4 hunks)
  • packages/federation-sdk/src/services/staging-area.service.ts (1 hunks)
  • packages/federation-sdk/src/services/state.service.ts (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: sampaiodiego
Repo: RocketChat/homeserver PR: 224
File: packages/federation-sdk/src/services/event-authorization.service.ts:261-268
Timestamp: 2025-10-10T22:18:31.655Z
Learning: In packages/federation-sdk/src/services/state.service.ts, the method `StateService.getLatestRoomState(roomId: string)` has return type `Promise<State>` and never returns undefined. If the state is not found, it throws an error with message "No state found for room ${roomId}" instead of returning undefined.
Learnt from: ricardogarim
Repo: RocketChat/homeserver PR: 184
File: packages/federation-sdk/src/services/event-authorization.service.ts:213-223
Timestamp: 2025-09-14T13:34:12.448Z
Learning: ACL events (m.room.server_acl) are not yet implemented in the homeserver codebase, so ACL-related code currently handles the simpler case where no ACL events exist in room state.
📚 Learning: 2025-10-10T22:18:31.655Z
Learnt from: sampaiodiego
Repo: RocketChat/homeserver PR: 224
File: packages/federation-sdk/src/services/event-authorization.service.ts:261-268
Timestamp: 2025-10-10T22:18:31.655Z
Learning: In packages/federation-sdk/src/services/state.service.ts, the method `StateService.getLatestRoomState(roomId: string)` has return type `Promise<State>` and never returns undefined. If the state is not found, it throws an error with message "No state found for room ${roomId}" instead of returning undefined.

Applied to files:

  • packages/federation-sdk/src/services/staging-area.service.ts
  • packages/federation-sdk/src/services/state.service.ts
  • packages/federation-sdk/src/services/room.service.ts
🧬 Code graph analysis (2)
packages/federation-sdk/src/services/staging-area.service.ts (2)
packages/room/src/manager/v3.ts (1)
  • eventId (17-29)
packages/room/src/manager/event-wrapper.ts (1)
  • event (129-138)
packages/federation-sdk/src/index.ts (1)
packages/room/src/types/_common.ts (2)
  • PduForType (26-26)
  • EventID (8-8)
🔇 Additional comments (13)
packages/federation-sdk/src/services/state.service.ts (2)

58-69: LGTM! Improved error categorization.

The new error classes properly set the name property for better stack traces and debugging. RoomInfoNotReadyError provides more specific error handling for room information readiness issues.


85-102: LGTM! More specific error handling.

Replacing the generic Error with RoomInfoNotReadyError improves error categorization and makes it easier for callers to handle room information readiness issues specifically.

packages/federation-sdk/src/services/staging-area.service.ts (2)

274-281: LGTM! Consistent event handling for room creation.

The new case for m.room.create events follows the established pattern and aligns with the new event signature added to HomeserverEventSignatures.


271-271: The review comment is incorrect.

The code at line 271 correctly destructures roomId directly from the event object. The EventStagingStore interface defines roomId as a direct property, making this extraction valid. The other usages of event.event.room_id in the file are accessing a different nested property from the Matrix event structure, not an inconsistency with line 271.

Likely an incorrect or invalid review comment.

packages/federation-sdk/src/index.ts (2)

4-4: LGTM! Cleaned up unused imports.

The import statement now only includes EventStagingStore, which is the only type from @rocket.chat/federation-core directly used in this file. The other types remain available through re-exports.


90-93: LGTM! Well-typed event signature for room creation.

The new event signature for homeserver.matrix.room.create is properly typed and consistent with other event signatures in the interface.

packages/federation-sdk/src/services/invite.service.ts (4)

15-22: LGTM! Proper dependencies added for event emission and error handling.

The imports correctly add EventEmitterService for the new emission pathway and RoomInfoNotReadyError to support the updated error handling from state service.


41-47: LGTM! EventEmitterService properly injected.

The dependency injection follows the established pattern and enables the dual emission pathway introduced in this PR.


148-151: LGTM! Consistent event emission across invite scenarios.

The membership event emissions are consistently placed after invite processing in both remote invite and host server scenarios, matching the pattern established in the local invite flow.

Also applies to: 247-250


113-116: No issues with the emission architecture—both emissions are intentional and in separate code paths.

The code contains two emitterService.emit('homeserver.matrix.membership') calls within the inviteUserToRoom method, but they are in mutually exclusive conditional branches:

  1. Lines 113-116: Executes when the invited user belongs to your server (local invite)
  2. Lines ~140-143: Executes when the invited user is from a remote server (federated invite)

Each path handles a different scenario and appropriately emits the membership event after its respective processing. This is not duplicate or problematic "dual emission"—it's proper architectural separation for two distinct invite workflows.

packages/federation-sdk/src/services/room.service.ts (3)

36-46: LGTM! Updated imports for error handling.

The imports correctly add RoomInfoNotReadyError to support the updated error handling from state service.


51-66: LGTM! EventEmitterService properly injected.

The dependency injection follows the established pattern and is consistent with the changes in InviteService.


888-902: Verify dual emission is intentional.

The code emits the same membership event through both eventEmitterService (line 888) and emitterService (line 899). Confirm this dual emission is intentional and required for the architecture, as it creates two separate event pathways.

Comment on lines +1282 to 1283
// TODO: check if all sendEventToAllServersInRoom should be followed by an emitter
void this.federationService.sendEventToAllServersInRoom(event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Track the TODO for consistent event emission pattern.

The TODO comment suggests that the pattern of emitting events after sendEventToAllServersInRoom may need to be applied consistently throughout the codebase. Consider creating an issue to track this refactoring work.

Do you want me to scan the codebase for other occurrences of sendEventToAllServersInRoom that might need the same emission pattern?

🤖 Prompt for AI Agents
In packages/federation-sdk/src/services/room.service.ts around lines 1282-1283,
the TODO notes inconsistency about whether sendEventToAllServersInRoom should be
followed by an emitter; create a tracked fix: open an issue in the repo (include
context, affected method, and proposed change to always emit after
sendEventToAllServersInRoom), replace the inline TODO with a brief reference to
that issue (e.g., "see issue #NNN") and, as part of the issue or a follow-up PR,
scan the codebase for all usages of sendEventToAllServersInRoom and update each
callsite to consistently emit the event (or document justified exceptions), and
add unit/integration tests to validate the emission behavior where applicable.

@ricardogarim ricardogarim self-requested a review November 21, 2025 13:15
@ggazzo ggazzo merged commit e39d5b1 into main Nov 21, 2025
3 checks passed
@ggazzo ggazzo deleted the feat/emit-join branch November 21, 2025 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants