feat: implement backfill endpoint #234
Walkthrough
Adds backfill support: two repository methods to locate/backfill events, a service method to assemble backfill PDUs (KeyRepository removed), DTOs for params/query/response, a new federation backfill route, and a DTO re-export. EventBaseDto origin field changed to an optional/hidden form.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant C as Federation Client
participant HS as Homeserver Controller
participant ES as EventService
participant ER as EventRepository
participant DB as MongoDB
Note over C,HS: GET /_matrix/federation/v1/backfill/:roomId?v=...&limit=...
C->>HS: HTTP GET Backfill
HS->>HS: Validate params (roomId, v[], limit)
HS->>ES: getBackfillEvents(roomId, eventIds, limit)
ES->>ER: findNewestEventForBackfill(roomId, eventIds) -- optional
ER-->>ES: newestEvent | null
ES->>ER: findEventsForBackfill(roomId, depth, originServerTs, limit)
ER->>DB: Query by depth and origin_server_ts (<=), sort desc, limit
DB-->>ER: Events (cursor)
ER-->>ES: EventStore cursor
ES->>ES: Map events -> PDUs, set origin & origin_server_ts
ES-->>HS: { origin, origin_server_ts, pdus }
HS-->>C: 200 OK (BackfillResponseDto)
alt Error
ES->>ES: Log error and rethrow
HS-->>C: 500 (BackfillErrorResponseDto)
end
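The flow in the diagram can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `StoredEvent`, `BackfillResponse`, the array-backed store, and the choice to return an empty `pdus` list when no anchor is known are hypothetical, not taken from the PR's code.

```typescript
// In-memory sketch of the backfill flow. All types and names are
// illustrative assumptions, not the PR's actual implementation.
interface StoredEvent {
  _id: string;
  event: { room_id: string; depth: number; origin_server_ts: number };
}

interface BackfillResponse {
  origin: string;
  origin_server_ts: number;
  pdus: StoredEvent["event"][];
}

// Newest-first ordering: depth is the primary key, timestamp breaks ties.
function newestFirst(a: StoredEvent, b: StoredEvent): number {
  return (
    b.event.depth - a.event.depth ||
    b.event.origin_server_ts - a.event.origin_server_ts
  );
}

function getBackfillEvents(
  store: StoredEvent[],
  roomId: string,
  eventIds: string[],
  limit: number,
  origin: string,
  now: number = Date.now(),
): BackfillResponse {
  const inRoom = store.filter((e) => e.event.room_id === roomId);

  // Step 1: pick the newest of the anchor events, if any is known.
  const anchors = inRoom
    .filter((e) => eventIds.includes(e._id))
    .sort(newestFirst);
  const newest = anchors[0];
  if (!newest) {
    // Assumption: unknown anchors yield an empty result.
    return { origin, origin_server_ts: now, pdus: [] };
  }

  // Step 2: keep events at or before the anchor (<=), newest first, up to limit.
  const pdus = inRoom
    .filter(
      (e) =>
        e.event.depth < newest.event.depth ||
        (e.event.depth === newest.event.depth &&
          e.event.origin_server_ts <= newest.event.origin_server_ts),
    )
    .sort(newestFirst)
    .slice(0, limit)
    .map((e) => e.event);

  // Step 3: wrap the PDUs with origin and origin_server_ts.
  return { origin, origin_server_ts: now, pdus };
}
```

The anchor itself satisfies the `<=` comparison, so it is part of the returned PDUs.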
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #234   +/-   ##
=======================================
  Coverage   81.74%   81.74%
=======================================
  Files          63       63
  Lines        4695     4695
=======================================
  Hits         3838     3838
  Misses        857      857
Actionable comments posted: 2
🧹 Nitpick comments (6)
packages/federation-sdk/src/repositories/event.repository.ts (2)
385-423: Depth/TS window can return unrelated branch events; consider DAG-anchored traversal
Selecting by depth/time alone may return events from different branches not reachable via prev_events from the anchors. Prefer a small, bounded graph walk from the provided eventIds following prev_events until limit is met, de-duping along the way. This better matches Matrix backfill intent.
Sketch:
    // Pseudocode
    const queue = [...new Set(eventIds)];
    const seen = new Set<EventID>(queue);
    const results: EventStore[] = [];
    while (queue.length && results.length < limit) {
      const batchIds = queue.splice(0, 25);
      const nodes = await this.collection
        .find({ _id: { $in: batchIds }, 'event.room_id': roomId })
        .toArray();
      for (const n of nodes) {
        for (const prev of n.event.prev_events ?? []) {
          if (!seen.has(prev)) {
            seen.add(prev);
            queue.push(prev as EventID);
          }
        }
        // push predecessors only (exclude anchors)
        if (!eventIds.includes(n._id) && results.length < limit) results.push(n);
      }
    }
    // Optionally sort by depth desc, ts desc before return.
This stays within limit (max 100) and returns only reachable history.
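A self-contained variant of the bounded walk can be tried against an in-memory graph. This is a sketch under assumptions: `EventNode`, the `Map`-based store, and `walkBackfill` are hypothetical names, and one event is fetched per step instead of in batches. Unlike the pseudocode, it keeps the anchors in the result, following the spec wording ("including the given event(s)") quoted later in this thread.

```typescript
// Illustrative in-memory model; EventNode and the Map store are assumptions.
interface EventNode {
  id: string;
  depth: number;
  prevEvents: string[];
}

// Breadth-first walk over prev_events, starting from the anchors and
// stopping once `limit` events have been collected.
function walkBackfill(
  store: Map<string, EventNode>,
  anchorIds: string[],
  limit: number,
): EventNode[] {
  const queue: string[] = Array.from(new Set(anchorIds));
  const seen = new Set<string>(queue);
  const results: EventNode[] = [];

  while (queue.length > 0 && results.length < limit) {
    const id = queue.shift();
    if (id === undefined) break;
    const node = store.get(id);
    if (!node) continue; // unknown event: nothing to follow
    results.push(node);
    for (const prev of node.prevEvents) {
      if (!seen.has(prev)) {
        seen.add(prev); // de-dupe so shared ancestors are visited once
        queue.push(prev);
      }
    }
  }

  // Newest first, as the pseudocode suggests.
  return results.sort((a, b) => b.depth - a.depth);
}
```

An event on a sibling branch that is not reachable from the anchors never enters the queue, which is the property the depth/timestamp window cannot guarantee.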
385-423: Indexing for backfill queries
To keep these queries efficient at scale, ensure a compound index exists:
- { 'event.room_id': 1, 'event.depth': 1, 'event.origin_server_ts': 1 }
Migration snippet (Mongo shell/Compass):
    db.getCollection('events').createIndex(
      { 'event.room_id': 1, 'event.depth': 1, 'event.origin_server_ts': 1 }
    );
packages/federation-sdk/src/services/event.service.ts (1)
831-846: Harden limit parsing and sanitize anchors
If limit is ever undefined/NaN at runtime, Math.min/Math.max yields NaN and Mongo .limit(NaN) will error. Also normalize anchors.
Apply this diff:
    -		const parsedLimit = Math.min(Math.max(1, limit), 100);
    +		const parsedLimit = Number.isFinite(limit) ? Math.min(Math.max(1, limit), 100) : 100;
    +		// Normalize anchors: trim, dedupe, drop empties
    +		const normalized = Array.from(
    +			new Set(eventIds.map(String).map((s) => s.trim()).filter(Boolean))
    +		) as EventID[];
    +		if (normalized.length === 0) {
    +			throw new Error('M_BAD_REQUEST');
    +		}
    -
    -		const events = await this.eventRepository.findEventsForBackfill(
    +		const events = await this.eventRepository.findEventsForBackfill(
     			roomId,
    -			eventIds,
    +			normalized,
     			parsedLimit,
     		);
packages/homeserver/src/controllers/federation/transactions.controller.ts (1)
95-113: Make limit optional with sensible default and normalize v
Docs imply optional limit; DTO currently makes it required. Default to 100 here and de-dupe/validate anchors.
Apply this diff:
    -		const limit = query.limit;
    +		const limit = Number.isFinite(query.limit) ? query.limit : 100;
     		const eventIdParam = query.v;
     		if (!eventIdParam) {
     			set.status = 400;
     			return {
     				errcode: 'M_BAD_REQUEST',
     				error: 'Event ID must be provided in v query parameter',
     			};
     		}
    -		const eventIds = Array.isArray(eventIdParam)
    -			? eventIdParam
    -			: [eventIdParam];
    +		const eventIds = Array.from(
    +			new Set(
    +				(Array.isArray(eventIdParam) ? eventIdParam : [eventIdParam])
    +					.map((s) => s.trim())
    +					.filter(Boolean)
    +			)
    +		);
packages/homeserver/src/dtos/federation/backfill.dto.ts (2)
8-11: Optional limit with default; tighten v constraints
Align with endpoint behavior: make limit optional (default 100) and ensure v isn’t empty.
Apply this diff:
    -export const BackfillQueryDto = t.Object({
    -	limit: t.Number({ minimum: 1, maximum: 100 }),
    -	v: t.Union([t.String(), t.Array(t.String())]),
    -});
    +export const BackfillQueryDto = t.Object({
    +	limit: t.Optional(t.Number({ minimum: 1, maximum: 100, default: 100 })),
    +	v: t.Union([
    +		t.String({ minLength: 1 }),
    +		t.Array(t.String({ minLength: 1 }), { minItems: 1 }),
    +	]),
    +});
13-17: Use integer for origin_server_ts
Federation timestamps are millisecond integers. Prefer Integer to ensure schema enforcement.
Apply this diff:
     export const BackfillResponseDto = t.Object({
     	origin: t.String(),
    -	origin_server_ts: t.Number(),
    +	origin_server_ts: t.Integer(),
     	pdus: t.Array(EventBaseDto),
     });
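The limit and anchor hardening suggested in the comments above for event.service.ts and transactions.controller.ts can be isolated into two pure helpers. This is a minimal sketch, assuming a default and maximum of 100 as in the diffs; `clampLimit` and `normalizeAnchors` are hypothetical names, not functions from the PR.

```typescript
// Hypothetical helpers illustrating the suggested hardening.
const DEFAULT_LIMIT = 100;

// Clamp to [1, max]; fall back to max when the input is missing or NaN.
// Without the guard, Math.min(Math.max(1, NaN), 100) evaluates to NaN.
function clampLimit(limit: unknown, max: number = DEFAULT_LIMIT): number {
  if (typeof limit !== "number" || !Number.isFinite(limit)) return max;
  return Math.min(Math.max(1, limit), max);
}

// Accept a single id or a list; trim, drop empties, and de-duplicate
// while preserving first-seen order.
function normalizeAnchors(v: string | string[] | undefined): string[] {
  const raw = v === undefined ? [] : Array.isArray(v) ? v : [v];
  return Array.from(
    new Set(raw.map((s) => s.trim()).filter((s) => s.length > 0)),
  );
}
```

A caller could treat an empty result from `normalizeAnchors` as the M_BAD_REQUEST case the review describes, keeping the validation decision in one place.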
📜 Review details
📒 Files selected for processing (6)
- packages/federation-sdk/src/repositories/event.repository.ts (2 hunks)
- packages/federation-sdk/src/services/event.service.ts (1 hunks)
- packages/homeserver/src/controllers/federation/transactions.controller.ts (2 hunks)
- packages/homeserver/src/dtos/common/event.dto.ts (0 hunks)
- packages/homeserver/src/dtos/federation/backfill.dto.ts (1 hunks)
- packages/homeserver/src/dtos/index.ts (1 hunks)
💤 Files with no reviewable changes (1)
- packages/homeserver/src/dtos/common/event.dto.ts
🧰 Additional context used
🧬 Code graph analysis (4)
packages/federation-sdk/src/repositories/event.repository.ts (3)
packages/room/src/types/_common.ts (1)
- EventID (8-8)
packages/room/src/state_resolution/definitions/definitions.ts (1)
- EventStore (96-98)
packages/core/src/models/event.model.ts (1)
- EventStore (21-25)
packages/federation-sdk/src/services/event.service.ts (2)
packages/room/src/types/_common.ts (1)
- EventID (8-8)
packages/room/src/types/v3-11.ts (1)
- Pdu (729-729)
packages/homeserver/src/dtos/federation/backfill.dto.ts (1)
packages/homeserver/src/dtos/common/event.dto.ts (1)
- EventBaseDto (19-35)
packages/homeserver/src/controllers/federation/transactions.controller.ts (2)
packages/room/src/types/_common.ts (1)
- EventID (8-8)
packages/homeserver/src/dtos/federation/backfill.dto.ts (4)
- BackfillParamsDto (4-6)
- BackfillQueryDto (8-11)
- BackfillResponseDto (13-17)
- BackfillErrorResponseDto (19-22)
🔇 Additional comments (1)
packages/homeserver/src/dtos/index.ts (1)
7-7: Re-export looks good
Backfill DTOs are now exposed via the public DTO index; no issues spotted.
packages/homeserver/src/controllers/federation/transactions.controller.ts
…orBackfill for improved clarity and functionality
…ify its functionality and limitations
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/homeserver/src/dtos/common/event.dto.ts (2)
19-23: Clarify the purpose of the HiddenOptional helper.
The HiddenOptional function currently returns the schema unchanged (no-op). If the intent is to hide the field from schema introspection or OpenAPI documentation, this implementation doesn't achieve that. Consider one of:
- Implementing the hiding logic (e.g., setting metadata that Elysia respects for documentation generation).
- Removing this helper if it's not needed yet.
- Adding a comment explaining the future intent if this is a placeholder.
If hiding from OpenAPI is the goal, you may need to explore Elysia's schema metadata options or transformers. Otherwise, this could be simplified to just t.Optional(t.String()) on line 35.
35-35: Consider restoring the field description.
The origin field previously had a description ('Origin server') that has been removed. While making the field optional aligns with the backfill feature, consider preserving the description for better API documentation:
    -origin: HiddenOptional(t.Optional(t.String())),
    +origin: HiddenOptional(t.Optional(t.String({ description: 'Origin server' }))),
📜 Review details
📒 Files selected for processing (3)
- packages/federation-sdk/src/repositories/event.repository.ts (2 hunks)
- packages/federation-sdk/src/services/event.service.ts (1 hunks)
- packages/homeserver/src/dtos/common/event.dto.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/federation-sdk/src/services/event.service.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-29T16:12:21.316Z
Learnt from: ricardogarim
PR: RocketChat/homeserver#234
File: packages/federation-sdk/src/repositories/event.repository.ts:407-423
Timestamp: 2025-09-29T16:12:21.316Z
Learning: The Matrix Federation API backfill endpoint should include the anchor events provided in the 'v' parameter along with preceding events, as explicitly stated in the Matrix specification: "including the given event(s)".
Applied to files:
packages/federation-sdk/src/repositories/event.repository.ts
🧬 Code graph analysis (1)
packages/federation-sdk/src/repositories/event.repository.ts (3)
packages/room/src/types/_common.ts (1)
- EventID (8-8)
packages/room/src/state_resolution/definitions/definitions.ts (1)
- EventStore (96-98)
packages/core/src/models/event.model.ts (1)
- EventStore (21-25)
🔇 Additional comments (3)
packages/federation-sdk/src/repositories/event.repository.ts (3)
2-2: LGTM: Import cleanup.
The removal of unused imports (EventBase, Filter, FindOptions, UpdateResult) improves code clarity and reduces clutter.
Also applies to: 9-9
385-401: LGTM: Correctly identifies the newest anchor event.
The query correctly finds the latest event among the provided IDs by sorting on depth (primary) and origin_server_ts (tiebreaker), which aligns with Matrix event ordering semantics.
403-424: LGTM: Backfill query correctly implements Matrix Federation spec.
The query correctly retrieves events preceding (and including at equal depth) the reference point. The use of $lte for events at the same depth is correct per the Matrix specification, which explicitly states backfill should include the anchor events.
Based on learnings.
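One way the "earlier depth, or same depth with timestamp at or before the anchor" condition could be expressed as a MongoDB query document is sketched below. The field paths come from the index suggestion earlier in this review; the `$or` shape, `BackfillQuery`, and `buildBackfillQuery` are assumptions for illustration and may differ from the repository's actual query.

```typescript
// Hypothetical query builder. The $or shape is an assumption about how the
// condition could be written, not the PR's confirmed query.
interface BackfillQuery {
  filter: Record<string, unknown>;
  sort: Record<string, 1 | -1>;
  limit: number;
}

function buildBackfillQuery(
  roomId: string,
  depth: number,
  originServerTs: number,
  limit: number,
): BackfillQuery {
  return {
    filter: {
      "event.room_id": roomId,
      $or: [
        // Strictly older events by depth.
        { "event.depth": { $lt: depth } },
        // Same depth: $lte keeps the anchor event itself in the result.
        {
          "event.depth": depth,
          "event.origin_server_ts": { $lte: originServerTs },
        },
      ],
    },
    // Newest first: depth is primary, timestamp is the tiebreaker.
    sort: { "event.depth": -1, "event.origin_server_ts": -1 },
    limit,
  };
}
```

Using `$lt` in the same-depth clause would drop the anchor, which is the behaviour the recorded learning argues against.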
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/homeserver/src/dtos/common/event.dto.ts (2)
19-23: Clarify the purpose of the HiddenOptional helper.
The HiddenOptional function is an identity function that returns its input unchanged. The name suggests it "hides" optional fields, but the implementation provides no such behavior at runtime.
If this is intended to:
- Exclude the field from OpenAPI/Swagger documentation: Document this intent with a comment and verify it works with your Elysia version.
- Mark fields for special processing: Explain the processing logic.
- Serve as documentation: Consider using a more descriptive JSDoc comment instead of a no-op wrapper.
Please clarify the intended behavior and add a comment explaining why this wrapper exists. If it has no functional purpose, consider removing it and using t.Optional() directly.
35-35: Consider restoring the field description.
The origin field previously had a description ('Origin server') that has been removed. This metadata is valuable for:
- Generated API documentation (OpenAPI/Swagger)
- Developer understanding when using this DTO
- IDE tooltips and autocomplete hints
If the HiddenOptional wrapper is intended to suppress the field from documentation, consider adding a code comment explaining this decision. Otherwise, restore the description:
    -origin: HiddenOptional(t.Optional(t.String())),
    +origin: HiddenOptional(t.Optional(t.String({ description: 'Origin server' }))),
📜 Review details
📒 Files selected for processing (1)
- packages/homeserver/src/dtos/common/event.dto.ts (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Code Quality Checks(lint, test, tsc)
🔇 Additional comments (2)
packages/homeserver/src/dtos/common/event.dto.ts (2)
1-1: Remove unused TSchema import and reconsider the origin field design.
Verification confirms:
- TSchema is unused: it appears only in the import statement and is never referenced in the file. Remove it.
- HiddenOptional is a no-op wrapper: this function simply returns its input unchanged (lines 21-23). The name suggests it should hide fields from documentation, but it provides no actual functionality.
- origin field may not belong in PDU schema: According to the Matrix specification, the origin field was removed from the PDU/event schema and exists only at the transaction level. Making it optional with HiddenOptional(t.Optional(t.String())) (line 35) may not align with the spec intent. Consider removing this field entirely from EventBaseDto rather than making it optional, or verify whether your implementation requires this deviation from the spec.
- Lost field description: the origin field previously had a description that was removed during this change.
    - import { TSchema, t } from 'elysia';
    + import { t } from 'elysia';
35-35: The original concern is incorrect: making origin optional in EventBaseDto does not break existing code.
After verification, the origin field in EventBaseDto (the API/wire format) is independent from how origin is accessed in the codebase. All internal code uses PersistentEventBase objects where .origin is a computed getter that extracts the domain from the sender field (see packages/room/src/manager/event-wrapper.ts:88-93), not from the origin field in the PDU.
Specifically:
- packages/room/src/manager/room-state.ts:126 accesses createEvent.origin, which calls the getter that extracts from sender, not the PDU field
- Authorization rules at packages/room/src/authorizartion-rules/rules.ts:97,727 similarly access the computed property
- Federation services that access event.origin work with raw PDU objects from API responses where origin may legitimately be optional (e.g., in backfill responses per Matrix spec)
The change correctly reflects that the origin field is optional in Matrix PDU wire format while maintaining backward compatibility with all existing code paths.
Likely an incorrect or invalid review comment.
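The distinction drawn here, a computed `.origin` getter versus an optional `origin` field on the wire, can be illustrated with a small wrapper. This is a sketch only: `RawPdu` and `EventWrapperSketch` are hypothetical names, and the real getter in event-wrapper.ts may parse the sender differently.

```typescript
// Illustrative only: a minimal wrapper whose `origin` is derived from
// `sender`, so an absent `origin` field on the raw PDU does not matter.
interface RawPdu {
  sender: string;
  origin?: string; // optional on the wire
}

class EventWrapperSketch {
  private readonly raw: RawPdu;

  constructor(raw: RawPdu) {
    this.raw = raw;
  }

  // Matrix user IDs look like "@localpart:server_name". The server name can
  // itself contain a colon (port), so split on the first colon only.
  get origin(): string {
    const idx = this.raw.sender.indexOf(":");
    if (idx === -1) {
      throw new Error(`invalid sender: ${this.raw.sender}`);
    }
    return this.raw.sender.slice(idx + 1);
  }
}
```

Because the getter never reads `raw.origin`, code paths that go through the wrapper behave the same whether or not the PDU carries that field.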