Skip to content

Conversation

@rodrigok
Copy link
Member

@rodrigok rodrigok commented Oct 12, 2025

Issue: https://www.loom.com/share/8575e0dd5af34b55b9dc35892a889bf7?sid=dd0f10ca-4395-4db0-8f3d-ab2eaebf5d34

Fix demo: https://www.loom.com/share/fc30c45f8dc54762bb78d9cb4f26bd9a?sid=bece9811-eea6-40ca-a96b-8769a92c99a7

Summary by CodeRabbit

  • New Features
    • Federation now backfills missing messages/events in rooms, improving timeline completeness and continuity.
  • Bug Fixes
    • Event processing defers when prerequisites are missing, reducing out-of-order or skipped messages in federated rooms.
  • Documentation
    • Minor clarifications to inline comments.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 12, 2025

Walkthrough

Introduces FederationService.getMissingEvents and updates StagingAreaService to fetch/process missing events via FederationService when a latest event is present, signaling postponement with a new MissingEventsError. Adjusts constructor injection, adds logging, and tweaks minor log/comment text elsewhere without changing behavior.

Changes

Cohort / File(s) Summary of changes
Staging area flow and DI updates
packages/federation-sdk/src/services/staging-area.service.ts
Injects FederationService; adds MissingEventsError; alters processing to fetch missing events via FederationService when latestEvent exists; returns addedMissing from dependency stage; postpones current event by throwing MissingEventsError; expanded logging.
Federation API expansion
packages/federation-sdk/src/services/federation.service.ts
Adds public method getMissingEvents(domain, roomId, earliestEvents, latestEvents, limit, minDepth); posts to FederationEndpoints.getMissingEvents; returns { events: Pdu[] }; error logging and rethrow; imports EventID.
Listener log tweak
packages/federation-sdk/src/listeners/staging-area.listener.ts
Updates a debug message string from “Processing event {data}” to “Processing room {data}”.
Event service comment/TODO
packages/federation-sdk/src/services/event.service.ts
Adds TODO in checkIfEventsExists about returning only IDs; no functional changes.
State service comment
packages/federation-sdk/src/services/state.service.ts
Updates an inline comment regarding authorization check context; no logic changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Worker
  participant StagingAreaService as StagingAreaService
  participant DepStage as Dependency Stage
  participant FederationService as FederationService
  participant MissingEventService as MissingEventService
  participant Storage as Persistence

  Worker->>StagingAreaService: processEventForRoom(event)
  StagingAreaService->>DepStage: processDependencyStage(event)
  alt latestEvent exists
    DepStage->>FederationService: getMissingEvents(domain, roomId, earliest[], latest[], limit, minDepth)
    FederationService-->>DepStage: { events: Pdu[] }
    DepStage->>Storage: persist missing events
    note right of DepStage: addedMissing = true
    DepStage-->>StagingAreaService: addedMissing = true
    StagingAreaService-->>Worker: throw MissingEventsError (postpone)
  else no latestEvent
    DepStage->>MissingEventService: fetch missing events (legacy path)
    MissingEventService-->>DepStage: events (if any)
    DepStage->>Storage: persist (if any)
    DepStage-->>StagingAreaService: addedMissing (bool)
    opt addedMissing = true
      StagingAreaService-->>Worker: throw MissingEventsError (postpone)
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • sampaiodiego
  • ggazzo

Poem

In burrows of bytes I hop with cheer,
Fetching lost events that once weren’t here.
A pause, a postpone—then off I zoom,
With federated trails in every room.
Logs now whisper what’s coming soon—
Thump-thump! the code keeps perfect tune. 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title clearly identifies the main issue addressed by the changeset, namely fixing the loading of missed events, and aligns with the implemented updates to fetch and handle missing events properly. It is concise, specific, and reflects the core objective of the pull request without extraneous details. The “fix:” prefix correctly indicates a bug fix.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/load-missed-events

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 60.30%. Comparing base (f3104e5) to head (3286411).

Files with missing lines Patch % Lines
...kages/federation-sdk/src/services/state.service.ts 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #276   +/-   ##
=======================================
  Coverage   60.30%   60.30%           
=======================================
  Files          67       67           
  Lines        6673     6673           
=======================================
  Hits         4024     4024           
  Misses       2649     2649           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/federation-sdk/src/services/federation.service.ts (1)

178-180: Fix the error log message.

The catch block still logs 'getEvent failed', which makes debugging harder for getMissingEvents. Update the message so it reflects the failing operation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f3104e5 and 3286411.

📒 Files selected for processing (5)
  • packages/federation-sdk/src/listeners/staging-area.listener.ts (1 hunks)
  • packages/federation-sdk/src/services/event.service.ts (1 hunks)
  • packages/federation-sdk/src/services/federation.service.ts (2 hunks)
  • packages/federation-sdk/src/services/staging-area.service.ts (6 hunks)
  • packages/federation-sdk/src/services/state.service.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
packages/federation-sdk/src/services/federation.service.ts (3)
packages/federation-sdk/src/index.ts (3)
  • EventID (27-27)
  • Pdu (21-21)
  • FederationEndpoints (41-41)
packages/room/src/types/v3-11.ts (1)
  • Pdu (769-769)
packages/federation-sdk/src/specs/federation-api.ts (1)
  • FederationEndpoints (24-71)
🔇 Additional comments (1)
packages/federation-sdk/src/services/state.service.ts (1)

648-648: Comment update improves clarity.

The extra detail clarifies that both prior state and the stored event state are considered during authorization—helps future readers.

Comment on lines +195 to +213
const missingEvents = await this.federationService.getMissingEvents(
event.origin,
event.event.room_id,
[latestEvent._id],
[eventId],
10,
0,
);

this.logger.debug(
`Persisting ${missingEvents.events.length} fetched missing events`,
);

await this.eventService.processIncomingPDUs(
event.origin,
missingEvents.events,
);

addedMissing = missingEvents.events.length > 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle remote fetch failures without dropping the staged event.

If federationService.getMissingEvents throws (network error, 4xx/5xx, etc.), the exception bubbles up to processEventForRoom, falls into the generic else branch, logs an error, and the event is un-staged via eventService.markEventAsUnstaged(event). That means we permanently discard the event just because fetching its dependencies failed once.

We should avoid unstaging in this case—either catch the error here and return false (so processing is retried later) or rethrow a dedicated error handled like the other postponement cases.

🤖 Prompt for AI Agents
In packages/federation-sdk/src/services/staging-area.service.ts around lines 195
to 213, the call to this.federationService.getMissingEvents can throw and
currently bubbles up causing the event to be un-staged; wrap the
getMissingEvents call (and subsequent processing of fetched events) in a
try/catch, and on any error catch it, log a debug/error message including the
error, and return false so the staged event is left in place for retry instead
of calling eventService.markEventAsUnstaged(event); ensure normal success path
still processes and marks addedMissing as before.

Comment on lines +208 to +214
await this.eventService.processIncomingPDUs(
event.origin,
missingEvents.events,
);

addedMissing = missingEvents.events.length > 0;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent re-staging the current event via processIncomingPDUs.

When /get_missing_events includes the event we’re currently processing (allowed by spec), processIncomingPDUs does not find it in the persistent store and stages it again. That creates a duplicate entry for the same event and can cause an infinite loop. Before calling processIncomingPDUs, filter out the current eventId (and any other events we already staged) so we only enqueue genuinely missing ones.

🤖 Prompt for AI Agents
packages/federation-sdk/src/services/staging-area.service.ts lines ~208-214:
before calling this.eventService.processIncomingPDUs(event.origin,
missingEvents.events) filter missingEvents.events to remove the current eventId
(the event we are processing) and any event IDs already staged/persisted so we
only pass genuinely missing events; then call processIncomingPDUs with the
filtered list and compute addedMissing from the filtered length. Ensure you
check the actual event id field name used in this context and use the
staging/persistent store lookup to exclude already-staged IDs.

@ggazzo ggazzo merged commit 8daade8 into main Oct 12, 2025
3 checks passed
@ggazzo ggazzo deleted the fix/load-missed-events branch October 12, 2025 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants