
Conversation

@ggazzo
Member

@ggazzo ggazzo commented Sep 15, 2025

Summary by CodeRabbit

  • New Features

    • Binary and multipart HTTP response handling with redirect support and richer response API.
    • Media access checks via Authorization header and a header-based authorization flow.
    • Remote media download method for fetching binary media from federation peers.
  • Changes

    • More descriptive invalid-signature errors.
    • Emitted message events now merge incoming content fields.
    • Homeserver now exposes federation media routes that explicitly return "not implemented" for local hosting.
  • Breaking Changes

    • Old homeserver media endpoints removed; media service simplified to remote-only downloads.
  • Refactor

    • New repositories and wiring added to map uploads and bridged rooms.
  • Tests

    • Added tests for binary data retrieval and multipart handling.

@coderabbitai
Contributor

coderabbitai bot commented Sep 15, 2025

Walkthrough

Refactors signature verification messaging; rewrites core fetch to stream responses into a Buffer with JSON/text/multipart helpers, redirects, timeout and aborts; adds Upload and MatrixBridgedRoom repositories and container wiring; extends event/media authorization and binary media requests; replaces media controller with federation stub routes; minor staging content merge.

Changes

Cohort / File(s) Summary of Changes
Auth verification refactor
packages/core/src/utils/authentication.ts
Reworks detached signature verification to explicit byte conversions and a nacl.verify call; improves thrown error message on invalid signature; public API unchanged.
Fetch utility overhaul
packages/core/src/utils/fetch.ts
Reimplements fetch<T> to stream into a Buffer (50MB cap), expose buffer()/json()/text()/multipart() helpers, detect multipart boundaries and redirects, use IncomingHttpHeaders, support abort signal and 20s timeout, and add MultipartResult type.
Container wiring for new repos
packages/federation-sdk/src/container.ts
Registers new Mongo collections (rocketchat_uploads, rocketchat_matrix_bridged_rooms) and singletons for UploadRepository and MatrixBridgedRoomRepository; minor service registration reorder.
New repositories
packages/federation-sdk/src/repositories/matrix-bridged-room.repository.ts, packages/federation-sdk/src/repositories/upload.repository.ts
Adds MatrixBridgedRoom and Upload types and singleton repositories with injection tokens (MatrixBridgedRoomCollection, UploadCollection) and lookup methods: findMatrixRoomId(rid) and findRocketChatRoomIdByMediaId(mediaId).
Event authorization extensions
packages/federation-sdk/src/services/event-authorization.service.ts
Injects UploadRepository and MatrixBridgedRoomRepository; adds canAccessMedia and canAccessMediaFromAuthorizationHeader; adjusts error logging param name; integrates media ACL/membership/visibility checks and header-based signature validation.
Federation request enhancements + tests
packages/federation-sdk/src/services/federation-request.service.ts, packages/federation-sdk/src/services/federation-request.service.spec.ts
makeSignedRequest may return `T` or a raw binary Buffer; adds requestBinaryData for fetching binary media payloads, covered by new tests.
Media service rewrite
packages/federation-sdk/src/services/media.service.ts
Removes MXC/token/local download logic; injects FederationRequestService; adds `downloadFromRemoteServer(serverName, mediaId): Promise<Buffer | null>` that tries the federation media endpoint and legacy client endpoints in turn.
Staging content merge tweak
packages/federation-sdk/src/services/staging-area.service.ts
In emitted m.room.message, spreads incoming event.event.content into the emitted content; explicit fields still override (see the small example after this table).
Homeserver media routes replacement
packages/homeserver/src/controllers/media.controller.ts, packages/homeserver/src/homeserver.module.ts, packages/homeserver/src/controllers/federation/media.controller.ts
Removes old media controller; updates module import to new federation media controller; adds federation mediaPlugin that returns 404/M_UNRECOGNIZED for media endpoints (stubbed homeserver behavior).
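
A small sketch of the staging-area merge order described above; the explicit field names and `plainTextBody` are illustrative, not taken from the diff:

// Sketch only: incoming content is spread first, explicit fields come last and win on key collisions.
const emittedContent = {
  ...event.event.content, // incoming fields (e.g. formatted_body, format, url) are carried through
  msgtype: 'm.text',
  body: plainTextBody,
};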

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant FedReq as FederationRequestService
  participant Fetch as fetch<T>()
  participant Remote as RemoteServer
  participant CDN as CDN

  Client->>FedReq: requestBinaryData(GET, server, /media/download, qs?)
  FedReq->>Fetch: makeSignedRequest(GET, domain, uri, qs)
  Fetch->>Remote: HTTP GET (signed)
  Remote-->>Fetch: 200 multipart/mixed (parts or Location)
  alt multipart with Location (redirect)
    Fetch-->>FedReq: { redirect }
    FedReq->>CDN: GET redirected URL (unsigned)
    CDN-->>FedReq: 200 binary (Buffer)
    FedReq-->>Client: Buffer
  else multipart with part content
    Fetch-->>FedReq: { content: Buffer }
    FedReq-->>Client: Buffer
  else JSON/text/binary
    Fetch-->>FedReq: json() / text() / buffer()
    FedReq-->>Client: parsed result or Buffer
  end
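In caller terms, those branches might look roughly like the sketch below; it assumes the helper surface from the walkthrough (multipart()/buffer(), MultipartResult with an optional redirect) and an unsigned follow-up fetch for the CDN redirect — names and signatures are illustrative, not the actual implementation.

// Sketch only: `fetch` here is the core utility described above, not the global fetch.
async function downloadViaFederation(url: URL, signedHeaders: Record<string, string>): Promise<Buffer> {
  const response = await fetch<Record<string, unknown>>(url, {
    method: 'GET',
    headers: signedHeaders,
  });

  const part = await response.multipart();
  if (part.redirect) {
    // CDN redirect: follow with a plain, unsigned GET and buffer the bytes
    const cdn = await fetch(new URL(part.redirect), { method: 'GET', headers: {} });
    return cdn.buffer();
  }

  // Otherwise the multipart part (or the raw body) is the binary payload
  return part.content.length > 0 ? part.content : response.buffer();
}
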
sequenceDiagram
  autonumber
  participant Caller
  participant Authz as EventAuthorizationService
  participant Validator as validateAuthorizationHeader
  participant UploadRepo as UploadRepository
  participant BridgeRepo as MatrixBridgedRoomRepository
  participant State as StateService

  Caller->>Authz: canAccessMediaFromAuthorizationHeader(mediaId, authHeader, method, uri, body?)
  Authz->>Validator: validateAuthorizationHeader(...)
  Validator-->>Authz: success | throw
  alt signature valid
    Authz->>UploadRepo: findRocketChatRoomIdByMediaId(mediaId)
    UploadRepo-->>Authz: rcRoomId | null
    alt rcRoomId found
      Authz->>BridgeRepo: findMatrixRoomId(rcRoomId)
      BridgeRepo-->>Authz: matrixRoomId | null
      alt matrixRoomId found
        Authz->>State: checkServerAcl / membership / visibility
        State-->>Authz: allow | deny
        Authz-->>Caller: { authorized: true } | { authorized: false }
      else missing matrix mapping
        Authz-->>Caller: { authorized: false }
      end
    else missing upload
      Authz-->>Caller: { authorized: false }
    end
  else signature invalid
    Authz-->>Caller: { authorized: false }
  end
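Expressed as code, the decision chain above might look roughly like this; the repository lookups and the `authorized` flag come from the walkthrough, while the state-service method name and free variables such as `uploadRepository` and `requestingServer` are assumptions.

// Sketch only: every missing link in the chain denies access.
async function canAccessMediaFromAuthorizationHeader(
  mediaId: string,
  authorizationHeader: string,
  method: string,
  uri: string,
  body?: unknown,
): Promise<{ authorized: boolean }> {
  try {
    await validateAuthorizationHeader(authorizationHeader, method, uri, body);
  } catch {
    return { authorized: false }; // invalid signature
  }

  const rcRoomId = await uploadRepository.findRocketChatRoomIdByMediaId(mediaId);
  if (!rcRoomId) return { authorized: false }; // unknown upload

  const matrixRoomId = await matrixBridgedRoomRepository.findMatrixRoomId(rcRoomId);
  if (!matrixRoomId) return { authorized: false }; // room not bridged

  // ACL / membership / visibility checks against room state; the requesting server
  // would be derived from the Authorization header in practice (method name assumed)
  const allowed = await stateService.checkMediaAccess(matrixRoomId, requestingServer);
  return { authorized: allowed };
}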

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • sampaiodiego

Poem

I thump my paw: old routes, adieu,
I hop through buffers, byte by byte.
Bridged rooms mapped and uploads true,
Signed headers keep the path polite.
Rabbit fetches, tunnels bright. 🐇📦✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "feat: adds canAccessMedia middleware and upload repo" is concise and directly describes two significant additions present in the changeset — the media access authorization (canAccessMedia) and the new Upload repository — so it accurately conveys the PR's primary intent. It is specific and not vague, making it useful for a teammate scanning history. The title does not list every refactor (e.g., media-service/controller moves or MatrixBridgedRoom additions), which is acceptable for a short PR title.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter

codecov-commenter commented Sep 15, 2025

Codecov Report

❌ Patch coverage is 55.17241% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.01%. Comparing base (7209530) to head (b995362).

Files with missing lines Patch % Lines
packages/core/src/utils/authentication.ts 0.00% 14 Missing ⚠️
...ion-sdk/src/services/federation-request.service.ts 72.72% 12 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                  @@
##           feat/acl-methods     #198      +/-   ##
====================================================
- Coverage             81.12%   80.01%   -1.12%     
====================================================
  Files                    62       58       -4     
  Lines                  4662     4578      -84     
====================================================
- Hits                   3782     3663     -119     
- Misses                  880      915      +35     

☔ View full report in Codecov by Sentry.

@ggazzo ggazzo force-pushed the refactor/media-service-2 branch from c2bc4ad to b580320 on September 15, 2025 20:56

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (9)
packages/federation-sdk/src/services/media.service.ts (4)

15-19: Return type can’t be null — tighten to Promise<Buffer>

The code either returns a Buffer or throws; it never returns null.

Apply this diff:

-  ): Promise<Buffer | null> {
+  ): Promise<Buffer> {

19-23: Revisit legacy client endpoints; prefer federation only or gate behind a flag

Since Matrix v1.11 introduced authenticated media and a dedicated federation endpoint, legacy client endpoints (/_matrix/media/{v3|r0}/download/...) are deprecated and often disabled (404) when authenticated media is enabled (default in Synapse). Consider removing these fallbacks or hiding them behind a config flag; optionally add /_matrix/client/v1/media/download/... only if you can supply an access token. (www2.matrix.org)

Minimal gating example:

-    const endpoints = [
-      `/_matrix/federation/v1/media/download/${mediaId}`,
-      `/_matrix/media/v3/download/${serverName}/${mediaId}`,
-      `/_matrix/media/r0/download/${serverName}/${mediaId}`,
-    ];
+    const useLegacyClientFallback =
+      (this.configService as any)?.allowLegacyClientMediaFallback === true;
+
+    const endpoints = [
+      `/_matrix/federation/v1/media/download/${mediaId}`,
+      ...(useLegacyClientFallback
+        ? [
+            `/_matrix/media/v3/download/${serverName}/${mediaId}`,
+            `/_matrix/media/r0/download/${serverName}/${mediaId}`,
+          ]
+        : []),
+    ];

25-33: Pass timeout_ms for federation download to match spec semantics

The federation media download endpoint supports timeout_ms (default 20s). Plumb it to avoid hanging when content isn’t yet available. Keep legacy endpoints unchanged. (spec.matrix.org)

Apply this diff:

-    for (const endpoint of endpoints) {
+    const timeoutMs =
+      (this.configService as any)?.mediaDownloadTimeoutMs ?? 20_000;
+    for (const endpoint of endpoints) {
       try {
-        // TODO: Stream remote file downloads instead of buffering the entire file in memory.
-        const response = await this.federationRequest.requestBinaryData(
-          'GET',
-          serverName,
-          endpoint,
-        );
+        // TODO: Stream remote file downloads instead of buffering the entire file in memory.
+        const response = await this.federationRequest.requestBinaryData(
+          'GET',
+          serverName,
+          endpoint,
+          endpoint.startsWith('/_matrix/federation/v1/media/download/')
+            ? { timeout_ms: String(timeoutMs) }
+            : undefined,
+        );

36-43: Preserve context in the final error

Include the last error message to ease debugging which endpoint failed and why.

Apply this diff:

-    for (const endpoint of endpoints) {
+    let lastErr: unknown;
+    for (const endpoint of endpoints) {
       try {
         // ...
         return response;
       } catch (err) {
+        lastErr = err;
         this.logger.debug(
           `Endpoint ${endpoint} failed: ${err instanceof Error ? err.message : String(err)}`,
         );
       }
     }
 
-    throw new Error(`Failed to download media ${mediaId} from ${serverName}`);
+    throw new Error(
+      `Failed to download media ${mediaId} from ${serverName}. Last error: ${
+        lastErr instanceof Error ? lastErr.message : String(lastErr)
+      }`,
+    );
packages/core/src/utils/fetch.ts (5)

157-169: content-type can be string[]; normalize before decoding helpers

This avoids false negatives in json/text/multipart detection.

Apply this diff:

-    const contentType = response.headers['content-type'] || '';
+    const rawCT = response.headers['content-type'];
+    const contentType = Array.isArray(rawCT) ? rawCT.join(';') : rawCT || '';

49-59: Recognize +json subtypes (e.g., application/problem+json)

Broaden JSON detection to common RFC-compliant subtypes.

Apply this diff:

-function handleJson<T>(contentType: string, body: Buffer): Promise<T | null> {
-  if (!contentType.includes('application/json')) {
+function handleJson<T>(contentType: string, body: Buffer): Promise<T | null> {
+  if (!/\bapplication\/(json|[\w.+-]+?\+json)\b/i.test(contentType)) {
     return Promise.resolve(null);
   }

61-67: Return text for common textual application types as well

Current logic returns empty string for application/json or xml. Slightly broaden detection.

Apply this diff:

-function handleText(contentType: string, body: Buffer): Promise<string> {
-  if (!contentType.includes('text/')) {
-    return Promise.resolve('');
-  }
-  return Promise.resolve(body.toString());
-}
+function handleText(contentType: string, body: Buffer): Promise<string> {
+  const isText = /\btext\/|\/(json|xml|svg|csv)\b/i.test(contentType);
+  return Promise.resolve(isText ? body.toString('utf8') : '');
+}

6-10: Type advertises headers in MultipartResult but they’re never populated

Either populate headers of the selected part or drop the field to avoid confusion and dead API surface.

Example (populate when returning the last part):

// inside parseMultipart after computing the last part:
const headersObj: Record<string, string> = {};
for (const line of headersStr.split(/\r?\n/)) {
  const m = line.match(/^\s*([^:]+):\s*(.+)\s*$/);
  if (m) headersObj[m[1].toLowerCase()] = m[2];
}
return { content: partBody, headers: headersObj };

149-153: Make timeout and size limits configurable via DI or env

You already left TODOs; wire them to config to avoid hard-coded policies in hot paths.
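
One way to wire that up, as a rough sketch (the env variable names and the FetchLimits shape are illustrative, not from the PR):

// Sketch: resolve limits once from env/config instead of hard-coding them in the request path.
type FetchLimits = { timeoutMs: number; maxResponseBytes: number };

const DEFAULT_LIMITS: FetchLimits = { timeoutMs: 20_000, maxResponseBytes: 50 * 1024 * 1024 };

function resolveFetchLimits(env: NodeJS.ProcessEnv = process.env): FetchLimits {
  return {
    timeoutMs: Number(env.FETCH_TIMEOUT_MS ?? DEFAULT_LIMITS.timeoutMs),
    maxResponseBytes: Number(env.FETCH_MAX_RESPONSE_BYTES ?? DEFAULT_LIMITS.maxResponseBytes),
  };
}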

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2bc4ad and b580320.

📒 Files selected for processing (7)
  • packages/core/src/utils/fetch.ts (1 hunks)
  • packages/federation-sdk/src/container.ts (2 hunks)
  • packages/federation-sdk/src/services/event-authorization.service.ts (4 hunks)
  • packages/federation-sdk/src/services/media.service.ts (1 hunks)
  • packages/federation-sdk/src/services/staging-area.service.ts (1 hunks)
  • packages/homeserver/src/controllers/media.controller.ts (0 hunks)
  • packages/homeserver/src/homeserver.module.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/homeserver/src/controllers/media.controller.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • packages/federation-sdk/src/services/staging-area.service.ts
  • packages/homeserver/src/homeserver.module.ts
  • packages/federation-sdk/src/services/event-authorization.service.ts
  • packages/federation-sdk/src/container.ts
🧰 Additional context used
🧬 Code graph analysis (2)
packages/core/src/utils/fetch.ts (2)
packages/core/src/index.ts (1)
  • fetch (89-89)
packages/federation-sdk/src/services/federation-request.service.ts (1)
  • request (142-166)
packages/federation-sdk/src/services/media.service.ts (1)
packages/federation-sdk/src/services/federation-request.service.ts (1)
  • singleton (26-215)
🔇 Additional comments (2)
packages/federation-sdk/src/services/media.service.ts (1)

12-13: Good DI: FederationRequestService injection is appropriate

This aligns MediaService with the new signed/binary federation requests path.

packages/core/src/utils/fetch.ts (1)

1-1: Verification blocked — file not found: packages/core/src/utils/fetch.ts

Repository search returned no match for that path; unable to inspect or confirm the header/SNI or header-shape issues. Confirm the file path or provide the file contents. If the file exists, ensure: normalize RequestInit.headers → OutgoingHttpHeaders (handle Headers / array / object), derive servername from the lowercase 'host' header with fallback to url.hostname for TLS SNI, pass the normalized headers to https.request, and default method to 'GET' when unset.
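
For reference, the normalization described above could be sketched as follows (assuming Node's https.request and the Fetch HeadersInit union; not verified against the actual file):

import type { OutgoingHttpHeaders } from 'node:http';

// Sketch: flatten Headers / tuple-array / plain-object inputs into lowercase keys for https.request.
function normalizeHeaders(init?: RequestInit['headers']): OutgoingHttpHeaders {
  const out: OutgoingHttpHeaders = {};
  if (!init) return out;
  if (init instanceof Headers) {
    init.forEach((value, key) => {
      out[key.toLowerCase()] = value;
    });
  } else if (Array.isArray(init)) {
    for (const [key, value] of init) out[key.toLowerCase()] = value;
  } else {
    for (const [key, value] of Object.entries(init)) out[key.toLowerCase()] = value;
  }
  return out;
}

// When issuing the request (illustrative):
// const headers = normalizeHeaders(options.headers);
// const servername = typeof headers.host === 'string' ? headers.host : url.hostname; // TLS SNI
// https.request({ method: options.method ?? 'GET', headers, servername, ... });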

Comment on lines +16 to +47
function parseMultipart(buffer: Buffer, boundary: string): MultipartResult {
const bufferStr = buffer.toString();

// check if the second part contains a Location header (CDN redirect)
// pattern: after first boundary and JSON part, look for Location header
const parts = bufferStr.split(`--${boundary}`);
if (parts.length >= 3) {
const secondPart = parts[2];
const locationMatch = secondPart.match(/\r?\nLocation:\s*(.+)\r?\n/i);

if (locationMatch) {
return {
content: Buffer.from(''),
redirect: locationMatch[1].trim(),
};
}
}

// find where the last part's content starts (after the last \r\n\r\n)
const lastHeaderEnd = buffer.lastIndexOf('\r\n\r\n');
if (lastHeaderEnd === -1) return { content: buffer };

const binaryStart = lastHeaderEnd + 4;
const closingBoundary = buffer.lastIndexOf(`\r\n--${boundary}`);

const content =
closingBoundary > binaryStart
? buffer.subarray(binaryStart, closingBoundary)
: buffer.subarray(binaryStart);

return { content };
}

⚠️ Potential issue

Multipart parsing via Buffer.toString risks corruption; use Buffer-safe boundary scanning

Converting the whole payload to string can mangle binary content and mis-detect boundaries. Parse by boundary using Buffer indices and only decode header sections.

Apply this diff to replace parseMultipart:

-function parseMultipart(buffer: Buffer, boundary: string): MultipartResult {
-  const bufferStr = buffer.toString();
-
-  // check if the second part contains a Location header (CDN redirect)
-  // pattern: after first boundary and JSON part, look for Location header
-  const parts = bufferStr.split(`--${boundary}`);
-  if (parts.length >= 3) {
-    const secondPart = parts[2];
-    const locationMatch = secondPart.match(/\r?\nLocation:\s*(.+)\r?\n/i);
-
-    if (locationMatch) {
-      return {
-        content: Buffer.from(''),
-        redirect: locationMatch[1].trim(),
-      };
-    }
-  }
-
-  // find where the last part's content starts (after the last \r\n\r\n)
-  const lastHeaderEnd = buffer.lastIndexOf('\r\n\r\n');
-  if (lastHeaderEnd === -1) return { content: buffer };
-
-  const binaryStart = lastHeaderEnd + 4;
-  const closingBoundary = buffer.lastIndexOf(`\r\n--${boundary}`);
-
-  const content =
-    closingBoundary > binaryStart
-      ? buffer.subarray(binaryStart, closingBoundary)
-      : buffer.subarray(binaryStart);
-
-  return { content };
-}
+function parseMultipart(buffer: Buffer, boundary: string): MultipartResult {
+  const delim = Buffer.from(`--${boundary}`);
+  const nextDelimPrefix = Buffer.from(`\r\n--${boundary}`);
+  const parts: { headers: Buffer; body: Buffer }[] = [];
+
+  let idx = buffer.indexOf(delim);
+  while (idx !== -1) {
+    let start = idx + delim.length;
+    if (buffer[start] === 13 && buffer[start + 1] === 10) start += 2; // \r\n
+    const headerEnd = buffer.indexOf('\r\n\r\n', start, 'ascii');
+    if (headerEnd === -1) break;
+    const headersBuf = buffer.subarray(start, headerEnd);
+    const nextIdx = buffer.indexOf(nextDelimPrefix, headerEnd + 4, 'ascii');
+    const bodyEnd = nextIdx === -1 ? buffer.length : nextIdx;
+    parts.push({ headers: headersBuf, body: buffer.subarray(headerEnd + 4, bodyEnd) });
+    idx = nextIdx === -1 ? -1 : nextIdx + 2; // skip \r\n
+  }
+
+  // CDN redirect convention: Location header in second part
+  if (parts[1]) {
+    const h = parts[1].headers.toString('ascii');
+    const m = h.match(/\bLocation:\s*(.+)\r?$/mi);
+    if (m) return { content: Buffer.alloc(0), redirect: m[1].trim() };
+  }
+
+  return parts.length ? { content: parts[parts.length - 1].body } : { content: buffer };
+}

I can also add tests covering binary bodies and Location redirects on request.


Comment on lines +118 to +126
res.on('data', (chunk) => {
total += chunk.length;
if (total > MAX_RESPONSE_BYTES) {
request.destroy(new Error('Response exceeds size limit'));
return;
}
chunks.push(chunk);
});
request.on('error', (err) => {
reject(err);


⚠️ Potential issue

Stop the response stream when size limit is exceeded

Destroying only the request may not halt the inbound response; explicitly destroy the response to stop I/O and free memory earlier.

Apply this diff:

-        res.on('data', (chunk) => {
+        res.on('data', (chunk) => {
           total += chunk.length;
           if (total > MAX_RESPONSE_BYTES) {
-            request.destroy(new Error('Response exceeds size limit'));
+            const err = new Error('Response exceeds size limit');
+            res.destroy(err);
+            request.destroy(err);
             return;
           }
           chunks.push(chunk);
         });
🤖 Prompt for AI Agents
In packages/core/src/utils/fetch.ts around lines 118 to 126, the code only
destroys the request when the response exceeds MAX_RESPONSE_BYTES which may not
stop the inbound response stream; update the data handler to also destroy the
response object (e.g., call res.destroy(new Error('Response exceeds size
limit'))) when the limit is exceeded (you can still call request.destroy(...)
for safety), so the response stream is terminated immediately and memory/IO is
freed.

Comment on lines +154 to +155
request.end(options.body);
});

⚠️ Potential issue

Guard unsupported RequestInit.body types

RequestInit.body can be Headers API types (ReadableStream, URLSearchParams, etc.). https.request only accepts string | Buffer. Without guarding, this can throw at runtime.

Apply this diff:

-      request.end(options.body);
+      const payload =
+        typeof options.body === 'string' || Buffer.isBuffer(options.body)
+          ? (options.body as string | Buffer)
+          : undefined;
+      if (options.body && !payload) {
+        request.destroy(new Error('Unsupported request body type'));
+        return;
+      }
+      request.end(payload);
🤖 Prompt for AI Agents
In packages/core/src/utils/fetch.ts around lines 154-155, the call
request.end(options.body) can throw because RequestInit.body may be
non-string/Buffer types (ReadableStream, URLSearchParams, FormData, Blob,
ArrayBuffer, etc.); update the code to guard and normalize body types before
calling request.end: if body is undefined/null call request.end() with no arg;
if body is a string or Buffer pass it through; if body is URLSearchParams call
body.toString(); if body is ArrayBuffer or Observable/TypedArray/Blob convert to
Buffer; if body is a ReadableStream or Node stream pipe it into the request
instead of using request.end; for unsupported types reject the promise with a
clear error. Ensure headers like content-length are adjusted accordingly when
you convert the body.

Comment on lines +27 to +35
// TODO: Stream remote file downloads instead of buffering the entire file in memory.
const response = await this.federationRequest.requestBinaryData(
  'GET',
  serverName,
  endpoint,
);

return response;
} catch (err) {

⚠️ Potential issue

Unbounded buffering risk on large media; add max-bytes guard or streaming

Current path buffers entire responses in memory (including redirects), which can OOM the process with large/attacker-controlled media. Add a configurable byte cap and enforce it in FederationRequestService (both for direct responses and redirect fetches); or stream to the caller with a controlled pipeline and backpressure.

Outside this file, extend FederationRequestService like below and thread a maxBytes from config:

// federation-request.service.ts (sketch)
async requestBinaryData(
  method: string,
  targetServer: string,
  endpoint: string,
  queryParams?: Record<string, string>,
  opts?: { maxBytes?: number },
): Promise<Buffer> {
  const res = await this.makeSignedRequest<unknown>({ method, domain: targetServer, uri: endpoint, queryString: queryParams ? new URLSearchParams(queryParams).toString() : '' }) as any;

  // If makeSignedRequest returns a node-fetch Response, prefer streaming with size checks.
}

// In handleRedirect(...)
const redirectResponse = await fetch(new URL(redirect), { method: 'GET', headers: {} });
// Check content-length if present
const len = Number(redirectResponse.headers.get('content-length') ?? '0');
if (opts?.maxBytes && len && len > opts.maxBytes) {
  throw new Error(`Media too large: ${len} > ${opts.maxBytes}`);
}
// Otherwise, stream and enforce maxBytes while accumulating.

Then call it here with something like:

const maxBytes = (this.configService as any)?.maxMediaDownloadBytes ?? 50 * 1024 * 1024;
await this.federationRequest.requestBinaryData('GET', serverName, endpoint, query, { maxBytes });

Want me to wire this across both services?

🤖 Prompt for AI Agents
In packages/federation-sdk/src/services/media.service.ts around lines 27 to 35,
the current code buffers entire remote responses into memory which can OOM on
large/attacker-controlled media; update the call to FederationRequestService to
pass a configurable maxBytes limit (read from config, e.g. maxMediaDownloadBytes
with a sensible default like 50MB), and modify
FederationRequestService.requestBinaryData to accept an opts object with
maxBytes, check Content-Length on initial/redirect responses and immediately
fail if it exceeds maxBytes, and otherwise read the body as a stream while
enforcing a running byte counter that throws when the cap is exceeded (apply the
same checks for redirect fetches). Ensure tests and call sites are updated to
supply or default the maxBytes value.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (8)
packages/homeserver/src/controllers/federation/media.controller.ts (8)

3-6: Centralize the not-implemented payload to remove duplication.

All handlers build the same 404 body inline. Add a tiny helper and reuse it across routes to reduce repetition and keep error text consistent.

Insert this helper above the plugin:

const notImplemented = ({ set }: { set: { status: number } }) => {
	set.status = 404;
	return {
		errcode: 'M_UNRECOGNIZED' as const,
		error: 'This endpoint is not implemented on homeserver side' as const,
	};
};

18-24: Replace inline 404 bodies with the shared helper; drop unnecessary async/blocks.

No awaits are used; returning the helper directly simplifies each handler.

-				async ({ set }) => {
-					set.status = 404;
-					return {
-						errcode: 'M_UNRECOGNIZED',
-						error: 'This endpoint is not implemented on homeserver side',
-					};
-				},
+				({ set }) => notImplemented({ set }),

Also applies to: 38-43, 63-68, 92-97, 126-131


74-77: Use integers and basic bounds for query types.

timeout_ms should be an integer and non‑negative to match typical API expectations.

-					query: t.Object({
-						allow_remote: t.Optional(t.Boolean()),
-						timeout_ms: t.Optional(t.Number()),
-					}),
+					query: t.Object({
+						allow_remote: t.Optional(t.Boolean()),
+						timeout_ms: t.Optional(t.Integer({ minimum: 0 })),
+					}),

104-111: Constrain numeric thumbnail params to integers; consider externalizing max values.

Width/height/timeout are integer knobs in practice. Hard‑coded maxima (800x600) may need tuning later; consider loading from config.

-					query: t.Object({
-						width: t.Optional(t.Number({ minimum: 1, maximum: 800 })),
-						height: t.Optional(t.Number({ minimum: 1, maximum: 600 })),
-						method: t.Optional(
-							t.Union([t.Literal('crop'), t.Literal('scale')]),
-						),
-						allow_remote: t.Optional(t.Boolean()),
-						timeout_ms: t.Optional(t.Number()),
-					}),
+					query: t.Object({
+						width: t.Optional(t.Integer({ minimum: 1, maximum: 800 })),
+						height: t.Optional(t.Integer({ minimum: 1, maximum: 600 })),
+						method: t.Optional(t.Union([t.Literal('crop'), t.Literal('scale')])),
+						allow_remote: t.Optional(t.Boolean()),
+						timeout_ms: t.Optional(t.Integer({ minimum: 0 })),
+					}),

If these limits are intentional, please confirm they align with the Matrix spec and client expectations.


35-58: Mark r0 as deprecated in docs for clarity.

Expose deprecation in generated OpenAPI to guide integrators toward v3.

 					detail: {
 						tags: ['Media'],
 						summary: 'Download media',
 						description: 'Download a file from the Matrix media repository',
+						deprecated: true,
 					},

60-87: Optional: add HEAD route for download parity (clients sometimes probe).

Some clients issue HEAD before GET to inspect headers. Even if unimplemented, a matching stub avoids a framework 404 differing from the JSON 404 shape.

Would you like me to draft the additional HEAD stubs mirroring these GET routes?


123-142: Consider returning a minimal 200 for /media/v3/config for better client UX.

Many clients hit /media/v3/config to size uploads/thumbnailing. A 200 with “no media repo here” semantics might be clearer than M_UNRECOGNIZED.

Proposed alternative (if acceptable):

-			.get(
-				'/media/v3/config',
-				({ set }) => notImplemented({ set }),
-				{
-					response: {
-						404: ErrorResponseSchema,
-					},
-					detail: {
-						tags: ['Media'],
-						summary: 'Get media configuration',
-						description: 'Get the media configuration for the homeserver',
-					},
-				},
-			),
+			.get(
+				'/media/v3/config',
+				() => ({ 'm.upload.size': 0 as const }),
+				{
+					response: {
+						200: t.Object({ 'm.upload.size': t.Integer({ minimum: 0 }) }),
+					},
+					detail: {
+						tags: ['Media'],
+						summary: 'Get media configuration',
+						description: 'This homeserver does not host a media repo; uploads disabled.',
+					},
+				},
+			),

Please confirm this aligns with client behavior you expect; otherwise keeping 404 is fine.


13-144: Add minimal tests to guard the contract and boost coverage.

  • One happy-path per route: 404 with M_UNRECOGNIZED and error string.
  • One validation test per route: invalid width/height/type rejected.

I can generate an Elysia supertest suite for these five routes; want me to add it?
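
A minimal contract test could look like the sketch below; the `mediaPlugin` export name, the mount path, and the use of Bun's test runner are assumptions to adapt to the repo's actual setup.

// Sketch only: exercises one stub route via Elysia's handle() without starting a server.
import { describe, expect, it } from 'bun:test';
import { mediaPlugin } from './media.controller'; // export name assumed

describe('federation media stub routes', () => {
  it('returns 404 with M_UNRECOGNIZED for media downloads', async () => {
    const res = await mediaPlugin.handle(
      new Request('http://localhost/_matrix/federation/v1/media/download/some-media-id'),
    );
    expect(res.status).toBe(404);
    expect(await res.json()).toMatchObject({ errcode: 'M_UNRECOGNIZED' });
  });
});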

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b580320 and b995362.

📒 Files selected for processing (2)
  • packages/homeserver/src/controllers/federation/media.controller.ts (1 hunks)
  • packages/homeserver/src/homeserver.module.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/homeserver/src/homeserver.module.ts
🔇 Additional comments (1)
packages/homeserver/src/controllers/federation/media.controller.ts (1)

16-18: No change required — route matches Matrix Federation API (added in v1.11). GET /_matrix/federation/v1/media/download/{mediaId} is a server–server endpoint; client media downloads remain under /_matrix/media/.... Document the federation endpoint to clarify it's an S2S route.

@ggazzo ggazzo closed this Sep 15, 2025
@ggazzo ggazzo deleted the refactor/media-service-2 branch September 15, 2025 21:16