-
Notifications
You must be signed in to change notification settings - Fork 13k
refactor(federation): create MentionPill structure internally #37148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Looks like this PR is ready to merge! 🎉 |
|
a6fae0b to
c6cf36f
Compare
WalkthroughRefactors Matrix federation message parsing to a batched mention-extraction/replacement pipeline with standardized Matrix-style HTML pills and updated quote handling. Adds sanitization-focused tests. Removes Matrix bot SDK and appservice dependencies. Cleans Docker and CI of Alpine-specific Matrix crypto build/caching and related steps. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Parser as MessageParsers
participant Utils as extractAnchors/extractMentions
participant Replacer as replaceMentions
participant HTML as createMentionHtml
Client->>Parser: toExternalMessageFormat(raw, formatted)
Parser->>Utils: parse anchors, find mention targets
Utils-->>Parser: mention mappings (userId→pill)
Parser->>HTML: build Matrix-style <a> pills
HTML-->>Parser: pill HTML
Parser->>Replacer: apply batched replacements
Replacer-->>Client: formatted_body (Matrix HTML) + body
sequenceDiagram
autonumber
actor Client
participant Parser as toExternalQuoteMessageFormat
participant Sanitize as strip/escape HTML
participant Reply as createReplyContent
Client->>Parser: format quoted message
Parser->>Sanitize: remove quote tags, sanitize content
Sanitize-->>Parser: safe text/HTML
Parser->>Reply: construct in-reply-to + formatted_body
Reply-->>Client: Matrix reply content
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
c6cf36f to
9e4968b
Compare
3c8aeb9 to
130e2c8
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release-7.11.0 #37148 +/- ##
===============================================
Coverage 66.34% 66.35%
===============================================
Files 3386 3386
Lines 115663 115619 -44
Branches 21358 21355 -3
===============================================
- Hits 76740 76720 -20
+ Misses 36319 36291 -28
- Partials 2604 2608 +4
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
29-31: HTML stripping is not a sanitizer; prior alert acknowledged
stripHtml(html, ['a'])preserves anchor tags entirely (including attributes). While you later only ingest anchors whose href containsmatrix.to, the function itself isn’t a sanitizer. Ensure this isn’t reused as a sanitizer elsewhere.
🧹 Nitpick comments (8)
packages/media-signaling/src/lib/services/webrtc/Processor.ts (1)
329-329: Consider using the serializeError utility for consistency.The PR introduces a new
serializeErrorutility (packages/media-signaling/src/lib/utils/serializeError.ts) designed to normalize error representations. However,onIceCandidateErrordirectly usesJSON.stringify(event), which may not capture all relevant properties ofRTCPeerConnectionIceErrorEventdue to non-enumerable properties.Apply this diff to use the serializeError utility:
+import { serializeError } from '../../utils/serializeError'; + private onIceCandidateError(event: RTCPeerConnectionIceErrorEvent) { if (this.stopped) { return; } this.config.logger?.debug('MediaCallWebRTCProcessor.onIceCandidateError'); this.config.logger?.error(event); - this.emitter.emit('internalError', { critical: false, error: 'ice-candidate-error', errorDetails: JSON.stringify(event) }); + this.emitter.emit('internalError', { critical: false, error: 'ice-candidate-error', errorDetails: serializeError(event) }); }packages/media-signaling/src/lib/utils/serializeError.ts (2)
12-18: Error spread may not capture all properties.The spread operator
{ ...error }on Error instances won't capture non-enumerable properties. While you explicitly addnameandmessage, other properties likestackare also non-enumerable and won't be included.Consider using
Object.getOwnPropertyNames()to capture all properties:if (typeof error === 'object') { if (error instanceof Error) { - return JSON.stringify({ - ...error, - name: error.name, - message: error.message, - }); + const errorData: Record<string, any> = { + name: error.name, + message: error.message, + stack: error.stack, + }; + for (const key of Object.getOwnPropertyNames(error)) { + if (key !== 'name' && key !== 'message' && key !== 'stack') { + errorData[key] = (error as any)[key]; + } + } + return JSON.stringify(errorData); }
1-37: Consider handling circular references.
JSON.stringifywill throw when encountering circular references. While the outer try-catch handles this, you could provide better error resilience by detecting or handling circular structures explicitly.Consider using a circular reference replacer:
export function serializeError(error: unknown): string | undefined { try { if (!error) { return undefined; } if (typeof error === 'string') { return error; } if (typeof error === 'object') { const seen = new WeakSet(); const replacer = (_key: string, value: any) => { if (typeof value === 'object' && value !== null) { if (seen.has(value)) { return '[Circular]'; } seen.add(value); } return value; }; if (error instanceof Error) { return JSON.stringify({ ...error, name: error.name, message: error.message, }, replacer); } const errorData: Record<string, any> = { ...error }; if ('name' in error) { errorData.name = error.name; } if ('message' in error) { errorData.message = error.message; } if (Object.keys(errorData).length > 0) { return JSON.stringify(errorData, replacer); } } } catch { // } return undefined; }ee/packages/federation-matrix/src/helpers/message.parsers.ts (3)
87-91: Avoid unconditional leading space when replacing with mention pillsPrepending a space can introduce double or leading spaces. Use the replace callback’s offset to insert a space only when needed.
-const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { - const matches = Array.from(message.matchAll(regex), ([match]) => createPill(match.trimStart())); - let i = 0; - return message.replace(regex, () => ` ${matches[i++]}`); -}; +const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { + const matches = Array.from(message.matchAll(regex), ([match]) => createPill(match.trimStart())); + let i = 0; + return message.replace(regex, (m, ...args) => { + const offset = args[args.length - 2] as number; + const needsSpace = offset > 0 && !/\s/.test(message[offset - 1]); + return `${needsSpace ? ' ' : ''}${matches[i++]}`; + }); +};
147-151: Dynamic RegExp is safe here but consider precompiling
tagcomes from a fixed allowlist (['mx-reply', 'blockquote']), so input isn’t attacker‑controlled; risk of ReDoS is minimal. You can silence tooling and improve readability by precompiling two static regexes.Example:
- MATRIX_QUOTE_TAGS.forEach((tag) => { - cleaned = cleaned.replace(new RegExp(`<${tag}[^>]*>.*?</${tag}>`, 'gis'), ''); - }); + cleaned = cleaned.replace(/<mx-reply[^>]*>.*?<\/mx-reply>/gis, '').replace(/<blockquote[^>]*>.*?<\/blockquote>/gis, '');
172-196: Quote builder uses two variants; simplify to avoid mixing bodies
reply1.bodyis not used whilereply2.bodyis returned. Keep a single construction to avoid confusion, and always pass plain text tobodyand HTML toformatted_body.- const markdownHtml = await marked.parse(message); - const withMentions = await toExternalMessageFormat({ message, externalRoomId, homeServerDomain }); - const withMentionsHtml = await marked.parse(withMentions); - - const reply1 = createReplyContent(externalRoomId, event, markdownHtml, withMentionsHtml); - const reply2 = createReplyContent(externalRoomId, event, message, withMentionsHtml); - - return { - message: reply2.body, - formattedMessage: reply1.formatted_body ?? '', - }; + const withMentionsHtml = await marked.parse(await toExternalMessageFormat({ message, externalRoomId, homeServerDomain })); + const reply = createReplyContent(externalRoomId, event, message, withMentionsHtml); + return { message: reply.body, formattedMessage: reply.formatted_body ?? '' };apps/meteor/client/components/GazzodownText.tsx (1)
69-76: Normalize team mention comparison as well (consistency)You normalize user mentions by stripping a leading '@'. Consider doing the same for teams to avoid mismatches like '@team' vs 'team'.
-const filterTeam = ({ name, type }: UserMention) => type === 'team' && name === mention; +const filterTeam = ({ name, type }: UserMention) => + type === 'team' && (name === mention || name === normalizedMention);apps/meteor/client/views/room/providers/ComposerPopupProvider.tsx (1)
156-156: Null-safe username normalization in getValue
ReplacegetValue: (item) => (item.username.startsWith('@') ? item.username.substring(1) : item.username),with
getValue: (item) => item.username?.replace(/^@/, '') ?? '',(or fallback to
item._idoritem.nameif an empty string isn’t acceptable)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (16)
apps/meteor/app/mentions/server/Mentions.ts(2 hunks)apps/meteor/client/components/GazzodownText.tsx(1 hunks)apps/meteor/client/views/room/providers/ComposerPopupProvider.tsx(1 hunks)ee/packages/federation-matrix/package.json(0 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.ts(5 hunks)ee/packages/media-calls/src/internal/agents/CallSignalProcessor.ts(2 hunks)packages/core-typings/src/mediaCalls/IMediaCall.ts(1 hunks)packages/media-signaling/src/definition/call/IClientMediaCall.ts(1 hunks)packages/media-signaling/src/definition/services/IServiceProcessor.ts(1 hunks)packages/media-signaling/src/definition/signals/client/error.ts(2 hunks)packages/media-signaling/src/lib/Call.ts(9 hunks)packages/media-signaling/src/lib/Session.ts(1 hunks)packages/media-signaling/src/lib/TransportWrapper.ts(1 hunks)packages/media-signaling/src/lib/services/webrtc/Processor.ts(1 hunks)packages/media-signaling/src/lib/utils/serializeError.ts(1 hunks)packages/models/src/models/MediaCalls.ts(2 hunks)
💤 Files with no reviewable changes (1)
- ee/packages/federation-matrix/package.json
🧰 Additional context used
🧬 Code graph analysis (3)
packages/media-signaling/src/lib/TransportWrapper.ts (1)
packages/media-signaling/src/definition/signals/client/error.ts (1)
ClientMediaSignalError(4-14)
ee/packages/media-calls/src/internal/agents/CallSignalProcessor.ts (3)
packages/media-signaling/src/definition/signals/client/error.ts (1)
ClientMediaSignalError(4-14)packages/media-signaling/src/definition/call/IClientMediaCall.ts (1)
CallHangupReason(29-41)ee/packages/media-calls/src/server/CallDirector.ts (1)
mediaCallDirector(416-416)
packages/media-signaling/src/lib/Call.ts (1)
packages/media-signaling/src/lib/utils/serializeError.ts (1)
serializeError(1-37)
🪛 ast-grep (0.39.5)
ee/packages/federation-matrix/src/helpers/message.parsers.ts
[warning] 147-147: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(<${tag}[^>]*>.*?</${tag}>, 'gis')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: 🔨 Test Storybook / Test Storybook
- GitHub Check: 🔨 Test Unit / Unit Tests
- GitHub Check: 🔎 Code Check / TypeScript
- GitHub Check: 🔎 Code Check / Code Lint
- GitHub Check: 📦 Meteor Build - coverage
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
🔇 Additional comments (8)
packages/media-signaling/src/definition/call/IClientMediaCall.ts (1)
39-39: LGTM!The new
input-errorhangup reason is well-documented and aligns with its usage in Session.ts for handling audio input track failures.packages/media-signaling/src/definition/services/IServiceProcessor.ts (1)
16-16: LGTM!Adding the optional
errorDetailsfield enhances error observability without breaking existing code. The field is properly typed and aligns with its usage across the signaling layer.packages/media-signaling/src/lib/Session.ts (1)
429-429: LGTM!Changing the hangup reason from
'service-error'to'input-error'provides more specific error classification for audio input failures, improving error diagnostics.packages/models/src/models/MediaCalls.ts (2)
85-85: LGTM!Recording the acceptance timestamp provides valuable call lifecycle tracking and aligns with the updated IMediaCall type definition.
101-101: LGTM!Recording the activation timestamp complements the acceptance timestamp and provides a complete picture of call state transitions.
packages/core-typings/src/mediaCalls/IMediaCall.ts (1)
54-57: LGTM!The new optional timestamp fields are well-documented and provide valuable insights into call lifecycle progression. They align with the implementation in MediaCalls.ts.
packages/media-signaling/src/definition/signals/client/error.ts (1)
12-13: LGTM!The new optional fields enhance error reporting capabilities while maintaining backward compatibility. The type definition and JSON schema are properly synchronized.
Also applies to: 46-53
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
155-170: Sanitization of markdown→HTML
marked.parseoutputs HTML. Confirm upstream sanitization guarantees formessage. If not guaranteed, consider sanitizing the output with your standard sanitizer before sending to Matrix.
130e2c8 to
079f418
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (3)
27-27: @here/@ALL distinction lost: pill text should preserve the original label.The current implementation always uses the ID as both href and text, which causes general mentions (@all/@here) to render as room IDs in the pill text. This prevents round-tripping the original label.
Past review flagged this critical issue with a suggested fix: add a
labelparameter tocreateMentionHtmland pass the original match text when creating general mention pills.Apply this diff to add label support:
-const createMentionHtml = (id: string): string => `<a href="${MATRIX_TO_URL}${id}">${id}</a>`; +const createMentionHtml = (id: string, label = id): string => `<a href="${MATRIX_TO_URL}${id}">${label}</a>`;Then update line 140 to preserve the match text:
- result = await replaceWithMentionPills(result, REGEX.general, () => createMentionHtml(externalRoomId)); + result = await replaceWithMentionPills(result, REGEX.general, (match) => createMentionHtml(externalRoomId, match));
31-45: @here collapsed to @ALL: preserve the pill text label.Line 44 always returns
@allfor room mentions, losing the distinction between @ALL and @here. The fix requires checking the pill's visible text and preserving it.Past review suggested:
Apply this diff to preserve @here:
- return { mention: '@all', realName: text }; + const m = text === '@here' ? '@here' : '@all'; + return { mention: m, realName: text };For backward compatibility with older messages where pill text was the room ID:
return { - mention: '@all', + mention: text === '@here' ? '@here' : '@all', realName: text, };
55-57: Room mention replacement needs to handle both @ALL and @here.Lines 55-57 only replace
@allfor room mentions. After fixingextractMentionsto preserve @here, this logic must handle both labels.Past review suggested adding backward-compat fallback:
Apply this diff:
- } else if (realName.startsWith('!')) { - result = result.replace(/(?<!\w)@all(?!\w)/, mention); + } else if (realName.startsWith('!') || realName.startsWith('#')) { + // backward-compat: older pills used roomId as label; try @here first, then @all + const index = result.indexOf('@here') !== -1 ? result.indexOf('@here') : result.indexOf('@all'); + if (index !== -1) { + const target = result.includes('@here') ? '@here' : '@all'; + result = result.replace(new RegExp(`(?<!\\w)${target}(?!\\w)`), mention); + } }
🧹 Nitpick comments (2)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (2)
25-25: LGTM: HTML stripping works correctly.The function correctly uses
sanitizeHtmlto strip tags while optionally preserving anchors.For more flexibility, consider directly mapping the keep array:
-const stripHtml = (html: string, keep: string[] = []): string => sanitizeHtml(html, { allowedTags: keep.includes('a') ? ['a'] : [] }); +const stripHtml = (html: string, keep: string[] = []): string => sanitizeHtml(html, { allowedTags: keep });
121-124: Replace the custom regex loop with a sanitize-html exclusiveFilter
RegExp risk is low (tags are fixed), but you can more efficiently strip<mx-reply>and<blockquote>using sanitize-html:cleaned = sanitizeHtml(formattedMessage, { allowedTags: sanitizeHtml.defaults.allowedTags.filter((t) => t !== 'blockquote'), exclusiveFilter: (frame) => frame.tag === 'mx-reply', });
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
ee/packages/federation-matrix/package.json(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.ts(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- ee/packages/federation-matrix/package.json
🧰 Additional context used
🪛 ast-grep (0.39.5)
ee/packages/federation-matrix/src/helpers/message.parsers.ts
[warning] 51-51: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp((?<!\\w)${realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}(?!\\w))
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 122-122: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(<${tag}[^>]*>.*?</${tag}>, 'gis')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (6)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (6)
147-172: LGTM: Quote reply structure is correct.The dual reply creation (reply1 with markdown HTML, reply2 with raw message) correctly populates both
bodyandformatted_bodyfields for Matrix reply format.
77-94: LGTM: Matrix reply structure follows spec correctly.The function correctly implements Matrix's reply format with
mx-replytags,m.in_reply_tometadata, and proper fallbacks for missing formatted_body.
29-29: LGTM: Anchor extraction is correct.The implementation correctly uses
matchAllto extract href and text from HTML anchors, handling both single and double quotes.
22-23: LGTM: HTML escaping is correct.The function correctly escapes all necessary HTML special characters.
62-66: Verify whitespace handling in mention pill replacement.Line 63 calls
trimStart()on matches, and line 65 prepends a space to the replacement. This assumes:
- Matches may have leading whitespace that should be removed
- A single space should always precede the pill
However, this could lead to incorrect spacing if:
- The original match had no leading space (space gets added)
- The original match had multiple spaces (all collapsed to one)
Please verify this logic handles all cases correctly:
// Test cases 'foo@all bar' // Should become 'foo <pill> bar' 'foo @all bar' // Should become 'foo <pill> bar' (original space preserved) 'foo @all bar' // Should become 'foo <pill> bar' (extra space removed?)Consider preserving the original leading whitespace:
-const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { - const matches = Array.from(message.matchAll(regex), ([match]) => createPill(match.trimStart())); - let i = 0; - return message.replace(regex, () => ` ${matches[i++]}`); +const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { + const matches = Array.from(message.matchAll(regex), ([match]) => ({ + pill: createPill(match.trimStart()), + leadingSpace: match.match(/^\s*/)?.[0] || '' + })); + let i = 0; + return message.replace(regex, () => `${matches[i].leadingSpace}${matches[i++].pill}`); };
68-75: LGTM: Quote prefix stripping is correct.The function correctly identifies and removes quote prefix lines (starting with '>'), returning the first non-quoted section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
27-27: Preserve @here vs @ALL label in pills to round‑trip correctlyGeneral mentions are collapsed to @ALL: pills are created with label = room id (Line 153), and extractMentions maps any room pill to '@ALL' (Lines 37-38). Keep the visible label ('@here' or '@ALL') in the pill text and round‑trip it back.
Apply these diffs:
-const createMentionHtml = (id: string): string => `<a href="${MATRIX_TO_URL}${id}">${id}</a>`; +const createMentionHtml = (id: string, label = id): string => `<a href="${MATRIX_TO_URL}${id}">${label}</a>`;- .map(({ href, text }) => { + .map(({ href, text }) => { const userMatch = href.match(/@([^:]+):(.+)/); if (!userMatch) { - return { mention: '@all', realName: text }; + const label = (text || '').trim(); + const m = label === '@here' ? '@here' : '@all'; + return { mention: m, realName: text }; }- result = await replaceWithMentionPills(result, REGEX.general, () => createMentionHtml(externalRoomId)); + result = await replaceWithMentionPills(result, REGEX.general, (match) => createMentionHtml(externalRoomId, match));Also applies to: 31-46, 153-156
🧹 Nitpick comments (6)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (5)
48-73: Avoid dynamic lookbehind regex; use indexOf with explicit word-boundary checksCurrent replaceMentions builds regex from variable input with lookbehinds, risking ReDoS and unnecessary complexity. Switch to a deterministic indexOf-based replacer with explicit boundary checks and handle general mentions via the same helper.
Based on static analysis hints
Apply this diff:
-const replaceMentions = (message: string, mentions: Array<{ mention: string; realName: string }>): string => { - if (!mentions.length) return message; - - let parsedMessage = ''; - let remaining = message; - - for (const { mention, realName } of mentions) { - const regex = new RegExp(`(?<!\\w)${realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}(?!\\w)`); - const position = remaining.search(regex); - - if (position !== -1) { - parsedMessage += remaining.slice(0, position) + mention; - remaining = remaining.slice(position + realName.length); - } else if (realName.startsWith('!')) { - const allRegex = /(?<!\w)@all(?!\w)/; - const allPosition = remaining.search(allRegex); - if (allPosition !== -1) { - parsedMessage += remaining.slice(0, allPosition) + mention; - remaining = remaining.slice(allPosition + 4); // length of '@all' - } - } - } - - parsedMessage += remaining; - return parsedMessage.trim(); -}; +const replaceMentions = (message: string, mentions: Array<{ mention: string; realName: string }>): string => { + if (!mentions.length) return message; + + // Replace longer realNames first to avoid partial overlaps + const sorted = [...mentions].sort((a, b) => b.realName.length - a.realName.length); + + const isWord = (ch: string) => /\w/.test(ch); + const replaceWord = (src: string, token: string, replacement: string): string => { + if (!token) return src; + let i = 0; + let out = ''; + while (i < src.length) { + const at = src.indexOf(token, i); + if (at === -1) { + out += src.slice(i); + break; + } + const before = at === 0 ? '' : src[at - 1]; + const after = at + token.length >= src.length ? '' : src[at + token.length]; + if (!isWord(before) && !isWord(after)) { + out += src.slice(i, at) + replacement; + i = at + token.length; + } else { + // Not a word-bounded match; keep it and continue + out += src.slice(i, at + token.length); + i = at + token.length; + } + } + return out; + }; + + let result = message; + for (const { mention, realName } of sorted) { + // Room-level mentions: prefer @here over @all if present + if (realName.startsWith('!') || realName.startsWith('#')) { + result = replaceWord(result, '@here', mention); + result = replaceWord(result, '@all', mention); + continue; + } + result = replaceWord(result, realName, mention); + } + return result.trim(); +};
25-26: Constrain allowed attributes/schemes for during sanitizationstripHtml allows tags but doesn’t restrict attributes or schemes here. Explicitly allow only href, only http/https, and optionally drop non-Matrix targets to reduce XSS/phishing surface.
-const stripHtml = (html: string, keep: string[] = []): string => sanitizeHtml(html, { allowedTags: keep.includes('a') ? ['a'] : [] }); +const stripHtml = (html: string, keep: string[] = []): string => + sanitizeHtml(html, { + allowedTags: keep.includes('a') ? ['a'] : [], + allowedAttributes: keep.includes('a') ? { a: ['href'] } : {}, + allowProtocolRelative: false, + allowedSchemes: ['http', 'https'], + transformTags: keep.includes('a') + ? { + a: (tagName, attribs) => ({ + tagName, + attribs: attribs.href?.startsWith(MATRIX_TO_URL) ? { href: attribs.href } : {}, // drop non-matrix.to hrefs + }), + } + : {}, + });
75-79: Avoid unconditional leading spaces when inserting pillsPrepending a space for every replacement can introduce odd spacing. Compute a leading space only when the previous character isn’t whitespace.
-const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { - const matches = Array.from(message.matchAll(regex), ([match]) => createPill(match.trimStart())); - let i = 0; - return message.replace(regex, () => ` ${matches[i++]}`); -}; +const replaceWithMentionPills = async (message: string, regex: RegExp, createPill: (match: string) => string): Promise<string> => { + const matches = Array.from(message.matchAll(regex), ([match]) => createPill(match.trimStart())); + let i = 0; + return message.replace(regex, (_m, ...args) => { + const offset = (args[args.length - 2] as number) ?? 0; + const needsSpace = offset > 0 && /\S/.test(message[offset - 1]); + return `${needsSpace ? ' ' : ''}${matches[i++]}`; + }); +};
174-180: Remove double parse and duplicate reply assembly in toExternalQuoteMessageFormatYou parse Markdown twice and build two reply payloads to return one pair. Build once using plain text for body and the already‑formatted HTML (from toExternalMessageFormat) for formatted_body.
- const markdownHtml = await marked.parse(message); - const withMentions = await toExternalMessageFormat({ message, externalRoomId, homeServerDomain }); - const withMentionsHtml = await marked.parse(withMentions); - - const reply1 = createReplyContent(externalRoomId, event, markdownHtml, withMentionsHtml); - const reply2 = createReplyContent(externalRoomId, event, message, withMentionsHtml); - - return { - message: reply2.body, - formattedMessage: reply1.formatted_body ?? '', - }; + const formattedWithMentions = await toExternalMessageFormat({ message, externalRoomId, homeServerDomain }); + const reply = createReplyContent(externalRoomId, event, message, formattedWithMentions); + return { + message: reply.body, + formattedMessage: reply.formatted_body ?? '', + };Also applies to: 181-184
135-137: Prefer sanitizer to drop quote tags instead of regexYou build a regex from tag names to strip /
. Since tags are known constants, it’s safe, but using sanitizeHtml to drop them avoids edge cases with nested/irregular markup.Based on static analysis hints
Example:
let cleaned = sanitizeHtml(formattedMessage, { allowedTags: ['a'], // drop mx-reply/blockquote implicitly allowedAttributes: { a: ['href'] }, allowedSchemes: ['http','https'], });ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts (1)
78-87: Future: add a round‑trip test to preserve @here vs @ALL (if/when implemented)If you adopt label‑preserving pills, add cases ensuring '@here' stays '@here' across toExternal → toInternal and in quotes.
Also applies to: 1064-1116, 1412-1444
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.ts(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
toInternalQuoteMessageFormat(121-141)
🪛 ast-grep (0.39.5)
ee/packages/federation-matrix/src/helpers/message.parsers.ts
[warning] 54-54: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp((?<!\\w)${realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}(?!\\w))
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 135-135: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(<${tag}[^>]*>.*?</${tag}>, 'gis')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
🔇 Additional comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts (1)
1020-1036: Good addition; closes the split‑script/onerror gapThis test meaningfully verifies sanitization against tricky split <script> and onerror/img payloads. LGTM.
6fc8f40 to
928973e
Compare
89ebe54 to
94445d8
Compare
c252629 to
47fd833
Compare
344b975 to
89612f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
31-46: @here is collapsed to @ALL; no distinction in extractMentions.The current implementation returns
@allfor any anchor that doesn't match a user pattern (line 37). This means@hereand@allmentions are indistinguishable after extraction, which can lead to incorrect behavior if@hereis expected to have different semantics (e.g., notifying only online users vs. all users).This concern was previously raised in past review comments.
🧹 Nitpick comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
134-137: Dynamic regex from constant array is safe but consider precompilation.The dynamic regex at line 136 is constructed from the
MATRIX_QUOTE_TAGSconstant array (not user input), so it's safe from injection. However, reconstructing the regex on every call is inefficient.Consider precompiling the regex patterns:
+const QUOTE_TAG_REGEXES = MATRIX_QUOTE_TAGS.map(tag => new RegExp(`<${tag}[^>]*>.*?</${tag}>`, 'gis')); + export const toInternalQuoteMessageFormat = async ({ homeServerDomain, formattedMessage, rawMessage, messageToReplyToUrl, senderExternalId, }: { messageToReplyToUrl: string; formattedMessage: string; rawMessage: string; homeServerDomain: string; senderExternalId: string; }): Promise<string> => { let cleaned = formattedMessage; - MATRIX_QUOTE_TAGS.forEach((tag) => { - cleaned = cleaned.replace(new RegExp(`<${tag}[^>]*>.*?</${tag}>`, 'gis'), ''); + QUOTE_TAG_REGEXES.forEach((regex) => { + cleaned = cleaned.replace(regex, ''); }); cleaned = stripHtml(cleaned, ['a']); return `[ ](${messageToReplyToUrl}) ${replaceMentions(stripQuotePrefix(rawMessage), extractMentions(cleaned, homeServerDomain, senderExternalId))}`; };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (7)
.github/actions/build-docker/action.yml(0 hunks).github/workflows/ci.yml(0 hunks)apps/meteor/.docker/Dockerfile.alpine(0 hunks)apps/meteor/package.json(0 hunks)ee/packages/federation-matrix/package.json(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.ts(5 hunks)
💤 Files with no reviewable changes (4)
- .github/actions/build-docker/action.yml
- apps/meteor/package.json
- apps/meteor/.docker/Dockerfile.alpine
- .github/workflows/ci.yml
🚧 Files skipped from review as they are similar to previous changes (1)
- ee/packages/federation-matrix/package.json
🧰 Additional context used
🧬 Code graph analysis (1)
ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
toInternalQuoteMessageFormat(121-141)
🪛 ast-grep (0.39.6)
ee/packages/federation-matrix/src/helpers/message.parsers.ts
[warning] 54-54: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp((?<!\\w)${realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}(?!\\w))
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 135-135: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(<${tag}[^>]*>.*?</${tag}>, 'gis')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
🔇 Additional comments (3)
ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts (1)
1020-1036: Excellent addition of XSS sanitization test.This test verifies that malicious HTML (including obfuscated script tags and event handlers) is properly sanitized in
toInternalQuoteMessageFormat. The test confirms that the sanitization layer strips dangerous content while preserving the plain text message, which is essential for security in federated environments.ee/packages/federation-matrix/src/helpers/message.parsers.ts (2)
75-79: Add tests for replaceWithMentionPills edge cases.No existing tests cover replaceWithMentionPills. Please add tests for:
- Multiple consecutive matches
- Matches at string boundaries
- Empty or whitespace-only matches
48-73: Verify realName length constraints No upstream limits found onrealNameprior to its use in the negative lookarounds; confirm or enforce a max length/sanitization to mitigate ReDoS risk.
5ce10a1 to
1a2264e
Compare
1a2264e to
3b3edce
Compare
8441b4f to
daa8f05
Compare
daa8f05 to
193fd19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (1)
31-46: @here is collapsed into @ALL; preserve the original label.This issue was previously flagged: when
hrefpoints to a room ID (doesn't match the user pattern at line 35), the function returns@allfor the mention regardless of whether the original pill text was@hereor@all. This causes@herementions to be lost.The suggested fix from the previous review:
- Modify
createMentionHtmlto accept an optional label parameter- Pass the match text when calling
replaceWithMentionPillsfor general mentions- Preserve
@herevs@allby checking thetextparameter at line 37-const createMentionHtml = (id: string): string => `<a href="${MATRIX_TO_URL}${id}">${id}</a>`; +const createMentionHtml = (id: string, label = id): string => `<a href="${MATRIX_TO_URL}${id}">${label}</a>`;And at line 37:
- return { mention: '@all', realName: text }; + const mention = text === '@here' ? '@here' : '@all'; + return { mention, realName: text };And at line 153 in
toExternalMessageFormat:- result = await replaceWithMentionPills(result, REGEX.general, () => createMentionHtml(externalRoomId)); + result = await replaceWithMentionPills(result, REGEX.general, (match) => createMentionHtml(externalRoomId, match));
🧹 Nitpick comments (2)
ee/packages/federation-matrix/src/helpers/message.parsers.ts (2)
48-73: ReDoS risk still flagged by static analysis.While the previous review marked the ReDoS concern as addressed, the static analysis tool continues to flag line 55 where a dynamic regex with negative lookbehind/lookahead is constructed from user-controlled
realName. AlthoughrealNameis escaped, the pattern complexity with lookarounds can still cause performance issues on certain inputs.Consider validating
realNamelength before constructing the regex or switching to a simpler replacement strategy without lookarounds:const replaceMentions = (message: string, mentions: Array<{ mention: string; realName: string }>): string => { if (!mentions.length) return message; let result = message; // Sort by length descending to avoid partial replacements const sorted = [...mentions].sort((a, b) => b.realName.length - a.realName.length); for (const { mention, realName } of sorted) { const escaped = realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); const regex = new RegExp(`\\b${escaped}\\b`, 'g'); result = result.replace(regex, mention); } return result.trim(); };This approach uses word boundaries (
\b) instead of lookarounds, which are more efficient and less prone to ReDoS.
160-185: LGTM! Properly formatted Matrix replies.The rewritten
toExternalQuoteMessageFormatcorrectly generates Matrix reply content with both plain text and HTML formatted bodies, including proper mention handling and reply metadata.Optional optimization: Lines 174-176 call
marked.parsemultiple times. If performance becomes a concern, consider caching or restructuring to reduce redundant parsing:const markdownHtml = await marked.parse(message); const withMentionsHtml = await marked.parse(await toExternalMessageFormat({ message, externalRoomId, homeServerDomain })); const reply1 = createReplyContent(externalRoomId, event, markdownHtml, withMentionsHtml); const reply2 = createReplyContent(externalRoomId, event, message, withMentionsHtml);This reduces the calls from 3 to 2 by reusing
withMentionsHtml.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (7)
.github/actions/build-docker/action.yml(0 hunks).github/workflows/ci.yml(1 hunks)apps/meteor/.docker/Dockerfile.alpine(0 hunks)apps/meteor/package.json(0 hunks)ee/packages/federation-matrix/package.json(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts(1 hunks)ee/packages/federation-matrix/src/helpers/message.parsers.ts(5 hunks)
💤 Files with no reviewable changes (3)
- apps/meteor/.docker/Dockerfile.alpine
- apps/meteor/package.json
- .github/actions/build-docker/action.yml
🚧 Files skipped from review as they are similar to previous changes (2)
- ee/packages/federation-matrix/src/helpers/message.parsers.spec.ts
- .github/workflows/ci.yml
🧰 Additional context used
🪛 ast-grep (0.39.6)
ee/packages/federation-matrix/src/helpers/message.parsers.ts
[warning] 54-54: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp((?<!\\w)${realName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}(?!\\w))
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 135-135: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(<${tag}[^>]*>.*?</${tag}>, 'gis')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 📦 Build Packages
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
🔇 Additional comments (8)
ee/packages/federation-matrix/package.json (2)
13-13: LGTM! Type version is more specific.The update to
@types/sanitize-htmlfrom^2to^2.13.0improves reproducibility and is compatible with the[email protected]dependency at line 52.
37-55: All references to @vector-im/matrix-bot-sdk removed
Search found no remaining imports, requires, or mentions in the codebase.ee/packages/federation-matrix/src/helpers/message.parsers.ts (6)
1-20: LGTM! Well-structured Matrix types and patterns.The new type definitions and regex constants provide a solid foundation for the batched mention pipeline. The patterns correctly handle Matrix.to URLs and various mention formats.
22-29: LGTM! Clean HTML utility functions.The
escapeHtml,stripHtml,createMentionHtml, andextractAnchorshelpers are well-implemented and provide clear, focused functionality for the mention pipeline.
75-107: LGTM! Helper functions are well-implemented.
replaceWithMentionPills,stripQuotePrefix, andcreateReplyContentprovide clean, focused functionality. The Matrix reply format increateReplyContentfollows the standard specification correctly.
109-119: LGTM! Simplified and clean.The refactored
toInternalMessageFormatdelegates to the new batched pipeline, making the logic clear and maintainable.
121-141: LGTM! Quote handling is correct.The refactored
toInternalQuoteMessageFormatproperly strips Matrix quote tags, preserves anchor tags, and applies mention replacement.Note on static analysis warning: Line 136 is flagged for ReDoS risk, but this is a false positive since
tagcomes from the constant arrayMATRIX_QUOTE_TAGS = ['mx-reply', 'blockquote'], not user input. The pattern is safe.
143-158: LGTM! Batched mention replacement pipeline.The refactored
toExternalMessageFormatcorrectly applies the batched replacement pipeline for general, external, and internal mentions, then converts to Matrix HTML format.
Proposed changes (including videos or screenshots)
Issue(s)
Steps to test or reproduce
Further comments
Summary by CodeRabbit
Bug Fixes
Refactor
Tests
Chores