feat: added logging to worker for link unfurling#48
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 16 minutes and 29 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds structured logging and a redirect-resolution helper to the link-unfurl worker; URL safety checks now log DNS/private-IP failures and errors; redirects are manually followed (limit 5) and validated; OG fetching uses resolved targets; job lifecycle and DB/update events are logged. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Job as Worker Job
participant Safe as isSafeUrl
participant Redirect as resolveRedirects
participant Fetch as OG Scraper (ogs)
participant DB as Database
participant Emit as EventEmitter
Job->>Safe: Validate initial URL
alt unsafe
Safe-->>Job: unsafe -> log warning
else safe
Safe-->>Job: safe
Job->>Redirect: HEAD (manual) follow up to 5
alt redirect invalid / exceeds limit
Redirect-->>Job: null -> log & abort
else resolvedUrl
Redirect-->>Job: resolvedUrl
Job->>Fetch: scrape OG from resolvedUrl
alt scrape error / !success
Fetch-->>Job: error -> log warning/error
else success
Fetch-->>Job: embeds
Job->>DB: update message embeds
DB-->>Job: rowsAffected
Job->>Emit: emit message:embeds:updated
Emit-->>Job: ack
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/worker/src/jobs/link-unfurl.ts`:
- Around line 46-47: The logs currently emit raw URLs (e.g., urlString,
fetchUrl) which can leak PII/tokens; create a small helper (e.g., sanitizeUrl or
redactQueryParams) that strips or masks the query string and use it wherever
URLs are logged (replace uses in the catch block logging logger.warn({ err, url:
urlString }, ...), and the other logging sites referencing fetchUrl and
urlString around lines 84-85, 101, 108-110, 123). Update all logger calls to
pass the sanitized URL value instead of the raw URL and keep the helper shared
across the file so every URL-bearing field is consistently redacted before
logging.
- Line 96: The fetch call currently uses redirect: "follow", which allows an
attacker to bypass isSafeUrl(fetchUrl) via redirect chains; change the fetch
option to redirect: "error" and then handle redirect responses explicitly: when
the fetch returns a 3xx status, read the Location header and run isSafeUrl() on
that target (and any subsequent candidate locations) before following or
rejecting the request. Update the code around the fetch invocation that uses
fetchUrl (and the surrounding logic that references isSafeUrl) to either reject
on redirects or validate Location headers and only perform a new validated fetch
when isSafeUrl returns true.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 245563e7-f2ad-4018-aa2e-03f43dc9c79d
📒 Files selected for processing (1)
apps/worker/src/jobs/link-unfurl.ts
safe HEAD requests
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/worker/src/jobs/link-unfurl.ts`:
- Around line 183-186: The early return when embeds.length === 0 leaves stale
data and skips emitting the required RealtimeMessageEmbedsUpdated event; instead
remove the early return in link-unfurl job logic and ensure you always update
the message.embeds column (set to [] when no embeds) and emit the
"message:embeds:updated" event with embeds: [] (matching
RealtimeMessageEmbedsUpdated) even when no embeds are produced; keep the logger
call (e.g., logger.info with job.id and messageId) but change it to note "No
embeds produced, clearing stored embeds" before performing the DB update and
event emission so clients (like use-message-sending.ts) receive the update to
clear their cache.
- Around line 80-111: resolveRedirects currently lets fetch() exceptions bubble
up; wrap the per-iteration fetch call in a try/catch inside resolveRedirects
(around the fetch in the for loop) so any network/DNS/timeout errors (including
AbortError) are caught, log a warning via logger.warn including the
startUrl/current URL and the caught error, and return null instead of throwing;
keep the existing redirect handling (location resolution, isSafeUrl check,
MAX_REDIRECTS logic) unchanged so callers like fetchOgEmbed receive null on
failure rather than an exception.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0a034836-47f3-48ee-8612-f6c71bf308ef
📒 Files selected for processing (1)
apps/worker/src/jobs/link-unfurl.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/worker/src/jobs/link-unfurl.ts`:
- Around line 101-104: The redirect handling can throw when constructing `next`
with `new URL(location, current)`, causing `resolveRedirects()` (called from
`fetchOgEmbed()`) to reject the whole `Promise.all()`; update
`resolveRedirects()` to guard against malformed Location headers by validating
or wrapping the `new URL(location, current)` creation in a try/catch and
returning null (or skipping that redirect) when it fails, so a single bad
redirect does not propagate an exception and instead yields a null/ignored
result for that URL.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2a4f5324-98bb-4527-8075-4265a2602388
📒 Files selected for processing (1)
apps/worker/src/jobs/link-unfurl.ts
Summary
This PR adds comprehensive structured logging and safer redirect handling to the link unfurling worker job, improving observability and protecting against SSRF/private-IP targets during OG metadata scraping.
Changes
apps/worker/src/jobs/link-unfurl.ts
Added structured logging via imported
loggeracross the file (info/warn/error) with contextual objects.URL safety validation (
isSafeUrl)Redirect resolution (
resolveRedirects)MAX_REDIRECTS = 5.HEADrequests withredirect: "manual"and a user-agent, follows up to MAX_REDIRECTS, resolves relative Location headers, validates each hop withisSafeUrl, logs and returns null for unsafe targets, missing Location, fetch failures, or too many redirects.OG scraping (
fetchOgEmbed)open-graph-scraper(OGS) using the resolved URL with a timeout and fetchOptions (user-agent,redirect: "error").siteNamewhen applicable.Job processing (
createLinkUnfurlProcessor)message:embeds:updatedvia the provided emitter and logs completion with embedCount.Code Quality / Notes
resolveRedirects.Confidence Score: 4/5
Well-scoped and improves safety and observability. Minor concerns: no tests added for redirect resolution, DNS edge cases, or proxy-rule behavior; consider adding tests and documenting redirect-resolution behavior.