@amhsirak amhsirak commented Nov 17, 2025

  • markdown
  • html
  • api
  • UI
  • telemetry
  • docs

Summary by CodeRabbit

  • New Features

    • Create "scrape" robots from a URL with selectable output formats (Markdown/HTML).
    • Tabbed Robot creation UI to choose Extract or Scrape modes.
    • View and download Markdown and HTML outputs from run results.
  • Updates

    • Robot metadata now tracks type, URL, and output formats.
    • Google Sheets and Airtable integrations hidden for scrape-type robots.
    • Recording URL can be updated and used for scrape workflows.


@amhsirak amhsirak marked this pull request as draft November 17, 2025 14:22

coderabbitai bot commented Nov 17, 2025

Walkthrough

Adds scrape-type robots: Playwright-based page fetch/cleanup, Turndown-based HTML→Markdown conversion, new scrape endpoints and UI, and propagation of requested output formats (markdown/html) through API, worker/scheduler, run serialization, sockets, and webhooks.

Changes

Cohort / File(s): Summary

  • Dependencies (package.json): Added turndown and joplin-turndown-plugin-gfm to dependencies and @types/turndown to devDependencies.
  • Markdown utilities (server/src/markdownify/markdown.ts, server/src/markdownify/scrape.ts, server/src/markdownify/test.ts): New Turndown-based parseMarkdown(html, baseUrl) with custom rules and link cleaning; Playwright-backed convertPageToMarkdown(url) / convertPageToHTML(url) that sanitize the DOM before conversion; small integration test script.
  • Server run API (server/src/api/record.ts): Extended run handlers to accept requestedFormats (markdown/html), carry conversion outputs into serializableOutput, and include optional markdown/html in responses, socket events, and webhook payloads.
  • Worker & scheduler (server/src/pgboss-worker.ts, server/src/workflow-management/scheduler/index.ts): Added a scrape execution path for recording_meta.type === 'scrape': validate the URL, run conversions per format, populate serializableOutput/binaryOutput, update run status, emit socket events, send webhooks, capture metrics, and ensure cleanup on success and failure.
  • Storage routes (server/src/routes/storage.ts): PUT /recordings/:id updates recording_meta.url when targetUrl is provided and rewrites workflow goto targets; added a POST /recordings/scrape endpoint to create scrape-type robots with format validation (duplicate insertion visible in diff).
  • Models & types, server + client (server/src/models/Robot.ts, src/context/globalInfo.tsx, src/components/robot/pages/* — Edit/Duplicate/Settings): Added optional `type?: 'extract' | 'scrape'`, `url?: string`, and `formats?: ('markdown' | 'html')[]` fields to RobotMeta.
  • Frontend API (src/api/storage.ts): Added createScrapeRobot(url, name?, formats = ['markdown']), which POSTs to /storage/recordings/scrape.
  • Frontend UI (src/components/robot/RecordingsTable.tsx, src/components/robot/pages/RobotCreate.tsx, src/components/robot/pages/RobotIntegrationPage.tsx, src/components/run/RunContent.tsx): RobotCreate gains Extract / Scrape tabs with a scrape form and format selection wired to createScrapeRobot; RecordingsTable passes a robotType prop and hides Retrain/Duplicate for scrape robots; IntegrationPage hides Google Sheets/Airtable for scrape robots; RunContent renders and downloads Markdown/HTML outputs when present.
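Several of these files share the same format-normalization logic: filter the requested formats down to 'markdown' | 'html', falling back to the robot's stored formats (default ['markdown']). A minimal sketch of that logic as a pure helper; the fallback-on-empty behavior is an assumption, since the diffed code can leave the list empty when every requested format is invalid:

```typescript
// Hypothetical helper mirroring the format normalization in the scrape paths.
// Unknown values are dropped; an empty result falls back to the robot's
// stored formats (default ['markdown']).
type ScrapeFormat = 'markdown' | 'html';

function resolveFormats(
  stored: ScrapeFormat[] | undefined,
  requested?: unknown
): ScrapeFormat[] {
  const base: ScrapeFormat[] =
    stored && stored.length > 0 ? stored : ['markdown'];
  if (!Array.isArray(requested) || requested.length === 0) return base;
  const valid = requested.filter(
    (f): f is ScrapeFormat => f === 'markdown' || f === 'html'
  );
  return valid.length > 0 ? valid : base;
}
```

The explicit fallback also guards the `formats.join(', ')` log messages against an empty array.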

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant FE as Frontend
    participant API as Backend (/storage)
    participant DB as Database

    User->>FE: Create scrape robot (url, name, formats)
    FE->>API: POST /recordings/scrape { url, name, formats }
    API->>API: Validate url & formats
    API->>DB: Insert Robot (meta.type = scrape, url, formats)
    DB-->>API: Created Robot
    API-->>FE: 201 Created (robot)
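The creation flow above hinges on the "Validate url & formats" step. A minimal sketch of that check, assuming the endpoint accepts only http(s) URLs and the two known formats (names and exact rules here are illustrative, not the PR's implementation):

```typescript
// Hypothetical request validation for POST /recordings/scrape.
// Returns an error message, or null when the request is acceptable.
function validateScrapeRequest(url: string, formats: string[]): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url); // throws on malformed input
  } catch {
    return 'Invalid URL';
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return 'URL must use http or https';
  }
  if (!formats.some((f) => f === 'markdown' || f === 'html')) {
    return 'Select at least one output format';
  }
  return null;
}
```

Rejecting non-http(s) schemes up front also keeps Playwright from being pointed at file: or other local URLs.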
sequenceDiagram
    participant Client
    participant Scheduler as scheduler/index.ts
    participant Worker as pgboss-worker.ts
    participant Scrape as markdownify/scrape.ts
    participant Markdown as markdownify/markdown.ts
    participant Socket as WebSocket
    participant Webhook as Webhook
    rect rgba(200,230,255,0.5)
      Client->>Scheduler: POST /robots/:id/runs (requestedFormats?)
      Scheduler->>Worker: start run (includes formats)
      Worker->>Socket: emit run-started
    end
    rect rgba(220,255,200,0.5)
      Worker->>Scrape: convertPageToMarkdown/convertPageToHTML(url)
      Scrape->>Scrape: Playwright fetch + sanitize DOM
      Scrape->>Markdown: parseMarkdown(cleanedHtml, baseUrl)
      Markdown-->>Scrape: markdown text
      Scrape-->>Worker: conversion results (markdown/html)
    end
    rect rgba(255,230,200,0.5)
      Worker->>DB: update run (serializableOutput, status: success/failed)
      Worker->>Socket: emit run-completed
      Worker->>Webhook: send run_completed payload (with markdown/html if present)
      Worker-->>Client: run result
    end
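The conversion results in the middle section flow into serializableOutput keyed by format, each value an array of { content } entries (the shape visible in the diffed worker code). A minimal sketch of that packing step with stubbed converters, assuming exactly that shape:

```typescript
// Sketch of how the worker packs conversion results, per the diff:
// one key per requested format, each an array of { content } entries.
// The converter map stands in for convertPageToMarkdown/convertPageToHTML.
type ScrapeFormat = 'markdown' | 'html';
type SerializableOutput = Partial<Record<ScrapeFormat, { content: string }[]>>;

async function buildOutput(
  formats: ScrapeFormat[],
  convert: Record<ScrapeFormat, (url: string) => Promise<string>>,
  url: string
): Promise<SerializableOutput> {
  const out: SerializableOutput = {};
  for (const f of formats) {
    // Conversions run sequentially, matching the diffed code's ordering.
    out[f] = [{ content: await convert[f](url) }];
  }
  return out;
}
```

Keeping each format an array leaves room for multi-page scrapes later without changing the run serialization contract.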

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to focus review on:
    • server/src/markdownify/markdown.ts — Turndown rules, link handling, URL cleaning, and error handling.
    • server/src/markdownify/scrape.ts — Playwright usage, DOM sanitization script, navigation timeouts, and browser closure on all paths.
    • server/src/pgboss-worker.ts and server/src/workflow-management/scheduler/index.ts — scrape branch integration with run lifecycle, socket/webhook payload composition, and correct serialization of outputs.
    • server/src/routes/storage.ts — validation, duplicated endpoint insertion in diff, and error handling for /recordings/scrape.
    • Frontend type changes (RobotMeta) and UI branching: ensure no regressions in existing robot flows.

Possibly related PRs

  • #485 — Overlaps with PUT /recordings/:id changes and goto/targetUrl handling in server/src/routes/storage.ts.
  • #610 — Related changes to run execution flow and webhook payloads; may overlap with worker/API webhook logic.
  • #754 — Overlaps on RobotMeta shape changes across client components (type/formats).

Suggested labels

Type: Feature, Type: Integration, Scope: UI/UX

Suggested reviewers

  • RohitR311

Poem

🐰 I hopped through tidy pages, nibbled scripts and tags away,
Playwright swept the windows, Turndown turned the mess to hay,
Scrape robots hum and gather markdown and HTML in a crate,
I stamp my paw and celebrate — outputs neat upon my plate! 🎋

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 19.05%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title 'feat: scrape [html + markdown]' clearly describes the main feature added in this PR: support for scraping web pages and converting them to HTML and Markdown.


@amhsirak amhsirak changed the title wip: markdownify wip: markdownify + plain text Nov 17, 2025
@amhsirak amhsirak changed the title wip: markdownify + plain text wip: markdownify Nov 19, 2025
@amhsirak amhsirak changed the title feat: markdownify feat: scrape [html + markdown] Nov 20, 2025
@amhsirak amhsirak marked this pull request as ready for review November 20, 2025 14:03

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/components/robot/pages/RobotEditPage.tsx (1)

723-734: Fix potential runtime error in getTargetUrl when what is missing

getTargetUrl currently does:

const lastPair =
  robot?.recording.workflow[robot?.recording.workflow.length - 1];
url = lastPair?.what.find((action) => action.action === "goto")?.args?.[0];

If lastPair exists but lastPair.what is undefined/null, lastPair?.what evaluates to undefined and the subsequent .find will throw. A safer version would guard both workflow length and what:

-  const lastPair =
-    robot?.recording.workflow[robot?.recording.workflow.length - 1];
-  url = lastPair?.what.find((action) => action.action === "goto")
-    ?.args?.[0];
+  const workflow = robot?.recording?.workflow;
+  const lastPair =
+    workflow && workflow.length > 0
+      ? workflow[workflow.length - 1]
+      : undefined;
+  const gotoAction = lastPair?.what?.find(
+    (action) => action.action === "goto"
+  );
+  url = gotoAction?.args?.[0];

Same pattern appears in RobotDuplicatePage.tsx; consider applying the same fix (or factoring this into a shared helper) to avoid similar crashes in both places.

Also applies to: 817-823

🧹 Nitpick comments (10)
src/components/run/RunContent.tsx (1)

727-735: Extract inline download logic to improve consistency.

The HTML download button contains inline download logic, while the Markdown download uses the extracted downloadMarkdown function. For consistency and maintainability, extract the HTML download logic to a separate function.

Add this function near the downloadMarkdown function (after line 413):

const downloadHTML = (content: string, filename: string) => {
  const blob = new Blob([content], { type: 'text/html;charset=utf-8;' });
  const url = URL.createObjectURL(blob);

  const link = document.createElement("a");
  link.href = url;
  link.setAttribute("download", filename);
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);

  setTimeout(() => {
    URL.revokeObjectURL(url);
  }, 100);
};

Then update the button onClick:

                     <Box sx={{ mt: 2 }}>
                       <Button
-                        onClick={() => {
-                          const blob = new Blob([htmlContent], { type: 'text/html;charset=utf-8;' });
-                          const url = URL.createObjectURL(blob);
-                          const link = document.createElement("a");
-                          link.href = url;
-                          link.download = "output.html";
-                          link.click();
-                          setTimeout(() => URL.revokeObjectURL(url), 100);
-                        }}
+                        onClick={() => downloadHTML(htmlContent, 'output.html')}
                         sx={{ color: '#FF00C3', textTransform: 'none' }}
                       >
                         Download
                       </Button>
                     </Box>
src/components/robot/pages/RobotIntegrationPage.tsx (1)

734-781: Add user feedback for disabled integrations.

While the buttons are correctly disabled for scrape robots with appropriate visual feedback, users receive no explanation for why these integrations are unavailable. Consider adding a tooltip or informational message.

Add a tooltip to inform users:

               <Button
                 variant="outlined"
                 disabled={isScrapeRobot}
+                title={isScrapeRobot ? "Google Sheets integration is not available for scrape-type robots" : ""}
                 onClick={() => {
                   if (isScrapeRobot) return;

Alternatively, add an informational alert above the integration buttons when isScrapeRobot is true:

{isScrapeRobot && (
  <Alert severity="info" sx={{ mb: 2, maxWidth: "900px" }}>
    Google Sheets and Airtable integrations are not available for scrape-type robots. 
    Use Webhooks or MCP for scrape robot integrations.
  </Alert>
)}
src/components/robot/pages/RobotDuplicatePage.tsx (1)

19-30: RobotMeta type narrowing and formats look consistent; consider centralizing the type

The additions of type?: 'extract' | 'scrape' and formats?: ('markdown' | 'html')[] are aligned with the new scrape/markdown flow and are type-safe here. Since the same RobotMeta shape now exists in multiple pages (e.g., RobotEditPage.tsx), consider moving this interface to a shared types module to avoid drift the next time metadata fields change.

server/src/pgboss-worker.ts (1)

192-303: Avoid double failure handling for scrape runs in processRunExecution

The scrape branch already updates the run to failed, emits run-completed (failed) events, and destroys the browser inside its catch. Re‑throwing the error here causes the outer catch (executionError) to run as well, which:

  • updates the run status/log again,
  • emits another run-completed failure event,
  • and sends a separate run_failed webhook.

This duplication makes failure behavior harder to reason about and can spam downstream consumers.

Consider either:

  • Not rethrowing from the scrape catch (return { success: false } after local handling), or
  • Adding a flag to skip the generic failure path when the scrape branch has already fully handled the error.

That keeps scrape failures single-sourced and easier to maintain.

server/src/workflow-management/scheduler/index.ts (1)

328-358: Simplify scrape failure handling to avoid duplicate failure events

On scrape failure you:

  • update the run to failed,
  • emit run-completed (failed) over both namespaces,
  • destroy the browser,
  • and then throw error;.

The outer catch at the bottom of executeRun will then:

  • update the run to failed again,
  • send run_failed webhooks,
  • and emit another run-completed failure event.

This can cause double failure events for listeners (and makes it harder to reason about what fired when), even if the second emit happens after the namespace has been cleaned up.

A cleaner pattern would be to:

  • keep all scrape-specific failure handling (including webhooks if needed) inside this branch and return false instead of rethrowing, or
  • tag the error as “already handled” and branch accordingly in the outer catch.

That way each scheduled scrape run failure results in exactly one set of notifications.
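One way to implement the "tag the error as already handled" option is a marker error class (hypothetical, not part of the PR) that the outer catch can check before running the generic failure path:

```typescript
// Hypothetical marker for errors whose notifications (status update, socket
// event, webhook) were already sent by the scrape branch.
class HandledRunError extends Error {
  readonly handled = true;
  constructor(cause: unknown) {
    super(cause instanceof Error ? cause.message : String(cause));
    this.name = 'HandledRunError';
  }
}

// The outer catch would call this before updating the run or emitting events.
function shouldRunGenericFailurePath(err: unknown): boolean {
  return !(err instanceof HandledRunError);
}
```

The scrape branch would then rethrow `new HandledRunError(error)` after its local cleanup, so each failure produces exactly one set of notifications.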

src/components/robot/RecordingsTable.tsx (1)

776-783: Consider defensive check for undefined robotType.

The conditional checks robotType !== 'scrape' without verifying that robotType is defined. If row.type is undefined (e.g., for older robots created before this field was added), the menu items will still render, which might not be the intended behavior.

Consider adding a defensive check:

-        {robotType !== 'scrape' && (
+        {robotType && robotType !== 'scrape' && (
           <MenuItem onClick={() => { handleRetrain(); handleClose(); }}>

-        {robotType !== 'scrape' && (
+        {robotType && robotType !== 'scrape' && (
           <MenuItem onClick={() => { handleDuplicate(); handleClose(); }}>

Or provide a default value in the row mapping (lines 252-273) to ensure type is always defined.

Also applies to: 795-800

src/components/robot/pages/RobotCreate.tsx (2)

402-402: Accessibility: FormControl label should use FormLabel component.

The label "Output Format (Select at least one)" is rendered as a plain <p> tag instead of using the Material-UI FormLabel component, which would provide better accessibility and consistent styling.

Apply this diff:

                 <FormControl component="fieldset" sx={{ width: '100%', textAlign: 'left' }}>
-                  <p>Output Format (Select at least one)</p>
+                  <FormLabel component="legend">Output Format (Select at least one)</FormLabel>
                   <FormControlLabel

381-399: Consider adding URL validation before form submission.

While the onClick handler validates the URL at line 441-443, there's no real-time feedback when the user is typing. Consider adding URL validation on the TextField itself or showing a helper text to guide the user.

Example pattern:

const [urlError, setUrlError] = useState('');

const validateUrl = (value: string) => {
  try {
    new URL(value);
    setUrlError('');
  } catch {
    setUrlError('Please enter a valid URL');
  }
};

Then add to the TextField:

error={!!urlError}
helperText={urlError}
onBlur={(e) => validateUrl(e.target.value)}
server/src/api/record.ts (1)

692-695: Consider error handling for conversion failures.

The markdown and HTML conversions are called independently, but if one fails while the other succeeds, the entire run fails and neither result is saved. Consider handling partial success scenarios where at least one format conversion succeeds.

Consider this approach:

const serializableOutput: any = {};
const errors: string[] = [];

// Markdown conversion
if (formats.includes('markdown')) {
    try {
        markdown = await convertPageToMarkdown(url);
        serializableOutput.markdown = [{ content: markdown }];
    } catch (err: any) {
        errors.push(`Markdown conversion failed: ${err.message}`);
    }
}

// HTML conversion
if (formats.includes('html')) {
    try {
        html = await convertPageToHTML(url);
        serializableOutput.html = [{ content: html }];
    } catch (err: any) {
        errors.push(`HTML conversion failed: ${err.message}`);
    }
}

if (Object.keys(serializableOutput).length === 0) {
    throw new Error(`All conversions failed: ${errors.join('; ')}`);
}

Also applies to: 698-701

server/src/markdownify/scrape.ts (1)

12-12: Very long timeout without justification or fallback.

Both functions use a 100-second (100000ms) timeout for page load, which is unusually long. This could cause runs to hang for extended periods if a page is slow or unresponsive. Consider using a shorter timeout with proper error handling.

Apply this diff to use a more reasonable timeout:

-  await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });
+  await page.goto(url, { waitUntil: "networkidle", timeout: 30000 });

Or make it configurable:

const SCRAPE_TIMEOUT = process.env.SCRAPE_TIMEOUT_MS 
  ? parseInt(process.env.SCRAPE_TIMEOUT_MS) 
  : 30000;

await page.goto(url, { waitUntil: "networkidle", timeout: SCRAPE_TIMEOUT });

Also applies to: 67-67

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 334fbbc and 6477fee.

📒 Files selected for processing (18)
  • package.json (3 hunks)
  • server/src/api/record.ts (11 hunks)
  • server/src/markdownify/markdown.ts (1 hunks)
  • server/src/markdownify/scrape.ts (1 hunks)
  • server/src/markdownify/test.ts (1 hunks)
  • server/src/models/Robot.ts (1 hunks)
  • server/src/pgboss-worker.ts (2 hunks)
  • server/src/routes/storage.ts (4 hunks)
  • server/src/workflow-management/scheduler/index.ts (3 hunks)
  • src/api/storage.ts (1 hunks)
  • src/components/robot/RecordingsTable.tsx (3 hunks)
  • src/components/robot/pages/RobotCreate.tsx (2 hunks)
  • src/components/robot/pages/RobotDuplicatePage.tsx (1 hunks)
  • src/components/robot/pages/RobotEditPage.tsx (2 hunks)
  • src/components/robot/pages/RobotIntegrationPage.tsx (2 hunks)
  • src/components/robot/pages/RobotSettingsPage.tsx (1 hunks)
  • src/components/run/RunContent.tsx (5 hunks)
  • src/context/globalInfo.tsx (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
server/src/markdownify/markdown.ts (1)
maxun-core/src/browserSide/scraper.js (1)
  • depth (956-956)
server/src/markdownify/scrape.ts (1)
server/src/markdownify/markdown.ts (1)
  • parseMarkdown (1-98)
server/src/pgboss-worker.ts (4)
maxun-core/src/interpret.ts (1)
  • run (1620-1656)
server/src/markdownify/scrape.ts (2)
  • convertPageToMarkdown (8-57)
  • convertPageToHTML (63-111)
server/src/routes/webhook.ts (1)
  • sendWebhook (404-434)
server/src/browser-management/controller.ts (1)
  • destroyRemoteBrowser (96-139)
server/src/routes/storage.ts (1)
server/src/routes/record.ts (1)
  • AuthenticatedRequest (25-27)
src/components/robot/pages/RobotCreate.tsx (2)
src/context/globalInfo.tsx (1)
  • useGlobalInfoStore (189-189)
src/api/storage.ts (1)
  • createScrapeRobot (31-59)
server/src/markdownify/test.ts (1)
server/src/markdownify/scrape.ts (1)
  • convertPageToMarkdown (8-57)
server/src/workflow-management/scheduler/index.ts (3)
server/src/markdownify/scrape.ts (2)
  • convertPageToMarkdown (8-57)
  • convertPageToHTML (63-111)
server/src/routes/webhook.ts (1)
  • sendWebhook (404-434)
server/src/browser-management/controller.ts (1)
  • destroyRemoteBrowser (96-139)
server/src/api/record.ts (4)
server/src/markdownify/scrape.ts (2)
  • convertPageToMarkdown (8-57)
  • convertPageToHTML (63-111)
server/src/routes/webhook.ts (1)
  • sendWebhook (404-434)
server/src/browser-management/controller.ts (1)
  • destroyRemoteBrowser (96-139)
server/src/workflow-management/scheduler/index.ts (1)
  • handleRunRecording (597-622)
🔇 Additional comments (7)
src/components/robot/pages/RobotSettingsPage.tsx (1)

19-21: LGTM! Type extensions are clean and backward-compatible.

The optional fields added to RobotMeta (type, url, formats) align with the broader scrape functionality introduced in this PR and maintain backward compatibility.

src/context/globalInfo.tsx (1)

30-32: LGTM! RobotMeta extensions are consistent.

The optional fields added to RobotMeta align with the same changes made across multiple files (RobotSettingsPage.tsx, Robot.ts model). The consistency ensures type safety throughout the application.

server/src/models/Robot.ts (1)

12-14: LGTM! Server-side type extensions align with frontend.

The optional fields added to the RobotMeta interface on the server side correctly mirror the frontend type definitions, ensuring type consistency across the full stack.

src/components/robot/pages/RobotIntegrationPage.tsx (1)

131-132: Good scrape robot detection.

The isScrapeRobot flag cleanly derives from the robot metadata and is used consistently to disable incompatible integrations.

package.json (1)

49-49: All dependencies are current and secure.

The three dependencies specified—turndown (^7.2.2), joplin-turndown-plugin-gfm (^1.0.12), and @types/turndown (^5.0.6)—match their latest available versions and have no known security vulnerabilities according to GitHub's security advisory database.

src/components/robot/pages/RobotEditPage.tsx (1)

19-30: RobotMeta type/formats update is sound and consistent

The constrained type?: 'extract' | 'scrape' and added formats?: ('markdown' | 'html')[] align with the backend model and other UI code. No runtime impact here; just a safer, more explicit shape.

server/src/api/record.ts (1)

1047-1047: Review comment is incorrect: waitForRunCompletion still returns raw JSON, not a model instance.

The function uses raw: true in the Sequelize query at line 1043, which means it returns raw JSON data, not a Run model instance. The return value at line 1050 is this raw JSON object, which is correctly passed to formatRunResponse that expects plain object properties. There is no breaking change here.

Likely an incorrect or invalid review comment.

Comment on lines 665 to 807
if (recording.recording_meta.type === 'scrape') {
    logger.log('info', `Executing scrape robot for API run ${id}`);

    let formats = recording.recording_meta.formats || ['markdown'];

    // Override if API request defines formats
    if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
        formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
    }

    await run.update({
        status: 'running',
        log: `Converting page to: ${formats.join(', ')}`
    });

    try {
        const url = recording.recording_meta.url;

        if (!url) {
            throw new Error('No URL specified for markdown robot');
        }

        let markdown = '';
        let html = '';
        const serializableOutput: any = {};

        // Markdown conversion
        if (formats.includes('markdown')) {
            markdown = await convertPageToMarkdown(url);
            serializableOutput.markdown = [{ content: markdown }];
        }

        // HTML conversion
        if (formats.includes('html')) {
            html = await convertPageToHTML(url);
            serializableOutput.html = [{ content: html }];
        }

        await run.update({
            status: 'success',
            finishedAt: new Date().toLocaleString(),
            log: `${formats.join(', ')} conversion completed successfully`,
            serializableOutput,
            binaryOutput: {},
        });

        logger.log('info', `Markdown robot execution completed for API run ${id}`);

        // Push success socket event
        try {
            const completionData = {
                runId: plainRun.runId,
                robotMetaId: plainRun.robotMetaId,
                robotName: recording.recording_meta.name,
                status: 'success',
                finishedAt: new Date().toLocaleString()
            };

            serverIo
                .of('/queued-run')
                .to(`user-${userId}`)
                .emit('run-completed', completionData);
        } catch (socketError: any) {
            logger.log(
                'warn',
                `Failed to send run-completed notification for markdown robot run ${id}: ${socketError.message}`
            );
        }

        // Build webhook payload
        const webhookPayload: any = {
            robot_id: plainRun.robotMetaId,
            run_id: plainRun.runId,
            robot_name: recording.recording_meta.name,
            status: 'success',
            started_at: plainRun.startedAt,
            finished_at: new Date().toLocaleString(),
            metadata: {
                browser_id: plainRun.browserId,
                user_id: userId,
            },
        };

        if (formats.includes('markdown')) webhookPayload.markdown = markdown;
        if (formats.includes('html')) webhookPayload.html = html;

        try {
            await sendWebhook(plainRun.robotMetaId, 'run_completed', webhookPayload);
            logger.log(
                'info',
                `Webhooks sent successfully for markdown robot API run ${plainRun.runId}`
            );
        } catch (webhookError: any) {
            logger.log(
                'warn',
                `Failed to send webhooks for markdown robot run ${plainRun.runId}: ${webhookError.message}`
            );
        }

        await destroyRemoteBrowser(plainRun.browserId, userId);

        return {
            success: true,
            interpretationInfo: run.toJSON()
        };
    } catch (error: any) {
        logger.log(
            'error',
            `${formats.join(', ')} conversion failed for API run ${id}: ${error.message}`
        );

        await run.update({
            status: 'failed',
            finishedAt: new Date().toLocaleString(),
            log: `${formats.join(', ')} conversion failed: ${error.message}`,
        });

        // Send failure socket event
        try {
            const failureData = {
                runId: plainRun.runId,
                robotMetaId: plainRun.robotMetaId,
                robotName: recording.recording_meta.name,
                status: 'failed',
                finishedAt: new Date().toLocaleString()
            };

            serverIo
                .of('/queued-run')
                .to(`user-${userId}`)
                .emit('run-completed', failureData);
        } catch (socketError: any) {
            logger.log(
                'warn',
                `Failed to send run-failed notification for markdown robot run ${id}: ${socketError.message}`
            );
        }

        await destroyRemoteBrowser(plainRun.browserId, userId);

        throw error;
    }
}

⚠️ Potential issue | 🟡 Minor

Inconsistent naming: logs reference "markdown robot" but supports multiple formats.

Throughout the scrape execution block, logging messages refer to "markdown robot" even though the code correctly handles both markdown and html formats based on the formats array. This creates confusion about the actual functionality.

Update log messages to be format-agnostic:

-            logger.log('info', `Executing scrape robot for API run ${id}`);
+            logger.log('info', `Executing scrape robot for API run ${id} with formats: ${formats.join(', ')}`);

             try {
                 const url = recording.recording_meta.url;

                 if (!url) {
-                    throw new Error('No URL specified for markdown robot');
+                    throw new Error('No URL specified for scrape robot');
                 }

-                logger.log('info', `Markdown robot execution completed for API run ${id}`);
+                logger.log('info', `Scrape robot execution completed for API run ${id}`);

                     logger.log(
                         'warn',
-                        `Failed to send run-completed notification for markdown robot run ${id}: ${socketError.message}`
+                        `Failed to send run-completed notification for scrape robot run ${id}: ${socketError.message}`
                     );

                 try {
                     await sendWebhook(plainRun.robotMetaId, 'run_completed', webhookPayload);
                     logger.log(
                         'info',
-                        `Webhooks sent successfully for markdown robot API run ${plainRun.runId}`
+                        `Webhooks sent successfully for scrape robot API run ${plainRun.runId}`
                     );
                 } catch (webhookError: any) {
                     logger.log(
                         'warn',
-                        `Failed to send webhooks for markdown robot run ${plainRun.runId}: ${webhookError.message}`
+                        `Failed to send webhooks for scrape robot run ${plainRun.runId}: ${webhookError.message}`
                     );
                 }
📝 Committable suggestion


Suggested change
if (recording.recording_meta.type === 'scrape') {
logger.log('info', `Executing scrape robot for API run ${id}`);
let formats = recording.recording_meta.formats || ['markdown'];
// Override if API request defines formats
if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
}
await run.update({
status: 'running',
log: `Converting page to: ${formats.join(', ')}`
});
try {
const url = recording.recording_meta.url;
if (!url) {
throw new Error('No URL specified for markdown robot');
}
let markdown = '';
let html = '';
const serializableOutput: any = {};
// Markdown conversion
if (formats.includes('markdown')) {
markdown = await convertPageToMarkdown(url);
serializableOutput.markdown = [{ content: markdown }];
}
// HTML conversion
if (formats.includes('html')) {
html = await convertPageToHTML(url);
serializableOutput.html = [{ content: html }];
}
await run.update({
status: 'success',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion completed successfully`,
serializableOutput,
binaryOutput: {},
});
logger.log('info', `Markdown robot execution completed for API run ${id}`);
// Push success socket event
try {
const completionData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'success',
finishedAt: new Date().toLocaleString()
};
serverIo
.of('/queued-run')
.to(`user-${userId}`)
.emit('run-completed', completionData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-completed notification for markdown robot run ${id}: ${socketError.message}`
);
}
// Build webhook payload
const webhookPayload: any = {
robot_id: plainRun.robotMetaId,
run_id: plainRun.runId,
robot_name: recording.recording_meta.name,
status: 'success',
started_at: plainRun.startedAt,
finished_at: new Date().toLocaleString(),
metadata: {
browser_id: plainRun.browserId,
user_id: userId,
},
};
if (formats.includes('markdown')) webhookPayload.markdown = markdown;
if (formats.includes('html')) webhookPayload.html = html;
try {
await sendWebhook(plainRun.robotMetaId, 'run_completed', webhookPayload);
logger.log(
'info',
`Webhooks sent successfully for markdown robot API run ${plainRun.runId}`
);
} catch (webhookError: any) {
logger.log(
'warn',
`Failed to send webhooks for markdown robot run ${plainRun.runId}: ${webhookError.message}`
);
}
await destroyRemoteBrowser(plainRun.browserId, userId);
return {
success: true,
interpretationInfo: run.toJSON()
};
} catch (error: any) {
logger.log(
'error',
`${formats.join(', ')} conversion failed for API run ${id}: ${error.message}`
);
await run.update({
status: 'failed',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion failed: ${error.message}`,
});
// Send failure socket event
try {
const failureData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'failed',
finishedAt: new Date().toLocaleString()
};
serverIo
.of('/queued-run')
.to(`user-${userId}`)
.emit('run-completed', failureData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-failed notification for markdown robot run ${id}: ${socketError.message}`
);
}
await destroyRemoteBrowser(plainRun.browserId, userId);
throw error;
}
}
if (recording.recording_meta.type === 'scrape') {
let formats = recording.recording_meta.formats || ['markdown'];
// Override if API request defines formats
if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
}
logger.log('info', `Executing scrape robot for API run ${id} with formats: ${formats.join(', ')}`);
await run.update({
status: 'running',
log: `Converting page to: ${formats.join(', ')}`
});
try {
const url = recording.recording_meta.url;
if (!url) {
throw new Error('No URL specified for scrape robot');
}
let markdown = '';
let html = '';
const serializableOutput: any = {};
// Markdown conversion
if (formats.includes('markdown')) {
markdown = await convertPageToMarkdown(url);
serializableOutput.markdown = [{ content: markdown }];
}
// HTML conversion
if (formats.includes('html')) {
html = await convertPageToHTML(url);
serializableOutput.html = [{ content: html }];
}
await run.update({
status: 'success',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion completed successfully`,
serializableOutput,
binaryOutput: {},
});
logger.log('info', `Scrape robot execution completed for API run ${id}`);
// Push success socket event
try {
const completionData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'success',
finishedAt: new Date().toLocaleString()
};
serverIo
.of('/queued-run')
.to(`user-${userId}`)
.emit('run-completed', completionData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-completed notification for scrape robot run ${id}: ${socketError.message}`
);
}
// Build webhook payload
const webhookPayload: any = {
robot_id: plainRun.robotMetaId,
run_id: plainRun.runId,
robot_name: recording.recording_meta.name,
status: 'success',
started_at: plainRun.startedAt,
finished_at: new Date().toLocaleString(),
metadata: {
browser_id: plainRun.browserId,
user_id: userId,
},
};
if (formats.includes('markdown')) webhookPayload.markdown = markdown;
if (formats.includes('html')) webhookPayload.html = html;
try {
await sendWebhook(plainRun.robotMetaId, 'run_completed', webhookPayload);
logger.log(
'info',
`Webhooks sent successfully for scrape robot API run ${plainRun.runId}`
);
} catch (webhookError: any) {
logger.log(
'warn',
`Failed to send webhooks for scrape robot run ${plainRun.runId}: ${webhookError.message}`
);
}
await destroyRemoteBrowser(plainRun.browserId, userId);
return {
success: true,
interpretationInfo: run.toJSON()
};
} catch (error: any) {
logger.log(
'error',
`${formats.join(', ')} conversion failed for API run ${id}: ${error.message}`
);
await run.update({
status: 'failed',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion failed: ${error.message}`,
});
// Send failure socket event
try {
const failureData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'failed',
finishedAt: new Date().toLocaleString()
};
serverIo
.of('/queued-run')
.to(`user-${userId}`)
.emit('run-completed', failureData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-failed notification for scrape robot run ${id}: ${socketError.message}`
);
}
await destroyRemoteBrowser(plainRun.browserId, userId);
throw error;
}
}

Comment on lines +670 to +673
// Override if API request defines formats
if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
}

⚠️ Potential issue | 🟡 Minor

Missing validation: requestedFormats should be validated before use.

When requestedFormats is provided, line 672 filters it to only include valid formats, but there's no check to ensure the resulting array isn't empty. If a caller passes formats: ['invalid'], the filtered array would be empty and no output would be generated, but the run would still succeed.

Apply this diff to add validation:

             // Override if API request defines formats
             if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
                 formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
+                if (formats.length === 0) {
+                    throw new Error('No valid formats specified. Supported formats: markdown, html');
+                }
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Override if API request defines formats
if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
}
// Override if API request defines formats
if (requestedFormats && Array.isArray(requestedFormats) && requestedFormats.length > 0) {
formats = requestedFormats.filter((f): f is 'markdown' | 'html' => ['markdown', 'html'].includes(f));
if (formats.length === 0) {
throw new Error('No valid formats specified. Supported formats: markdown, html');
}
}
🤖 Prompt for AI Agents
In server/src/api/record.ts around lines 670 to 673, the code filters
requestedFormats but does not validate the result; after filtering, check
whether the resulting formats array is non-empty and if it is empty return a 400
Bad Request (or throw a validation error) with a clear message like
"requestedFormats contains no supported formats; supported: markdown, html";
ensure you stop further processing when invalid so the run does not succeed with
no outputs.
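The suggested guard can also be factored into a small helper so the API route and workers validate identically. This is only a sketch under assumed names (`resolveFormats` does not exist in the codebase):

```typescript
type ScrapeFormat = "markdown" | "html";

const SUPPORTED_FORMATS: ScrapeFormat[] = ["markdown", "html"];

// Narrow an untrusted format list to the supported set; reject requests
// that leave no usable format instead of letting the run "succeed" empty.
export function resolveFormats(
  requested: unknown,
  fallback: ScrapeFormat[] = ["markdown"]
): ScrapeFormat[] {
  if (!Array.isArray(requested) || requested.length === 0) {
    return fallback;
  }
  const valid = requested.filter((f): f is ScrapeFormat =>
    SUPPORTED_FORMATS.includes(f as ScrapeFormat)
  );
  if (valid.length === 0) {
    throw new Error("No valid formats specified. Supported formats: markdown, html");
  }
  return valid;
}
```

With this shape, `formats: ['invalid']` rejects up front rather than producing an empty-but-successful run.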

Comment on lines +88 to +97
// Convert HTML → Markdown
try {
let out = await t.turndown(tidiedHtml);
out = fixBrokenLinks(out);
out = stripSkipLinks(out);
return out.trim();
} catch (err) {
console.error("HTML→Markdown failed", { err });
return "";
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Guard against tidyHtml returning null before passing to Turndown

tidyHtml currently does:

function tidyHtml(html: string): string {
  const $ = cheerio.load(html);
  // ...
  manuallyCleanedElements.forEach((tag) => $(tag).remove());
  return $("body").html();
}

But $("body").html() can be null, so tidiedHtml in parseMarkdown may be null even though the function is typed as returning string. That value is then passed into t.turndown(...), which can throw and push you into the catch path unnecessarily.

Two concrete improvements:

-function tidyHtml(html: string): string {
+function tidyHtml(html: string): string {
   const cheerio = require("cheerio");
   const $ = cheerio.load(html);
   // ...
-  manuallyCleanedElements.forEach((tag) => $(tag).remove());
-  return $("body").html();
+  manuallyCleanedElements.forEach((tag) => $(tag).remove());
+
+  const bodyHtml = $("body").html();
+  return bodyHtml || html;
}

and/or in parseMarkdown:

-  const tidiedHtml = tidyHtml(html);
+  const tidiedHtml = tidyHtml(html) || html;

This avoids spurious conversion failures and aligns the implementation with the declared return type.

Also applies to: 124-144


I'll verify the review comment by examining the actual code and confirming the Cheerio behavior.





🏁 Script executed:

#!/bin/bash
# Examine the markdown.ts file to see the parseMarkdown function and tidyHtml implementation
fd -e ts -n "markdown.ts" | head -5 | xargs -I {} cat "{}"

Length of output: 230


🏁 Script executed:

#!/bin/bash
# Search for tidyHtml function definition to confirm the implementation
rg -n "function tidyHtml" --type ts -A 15

Length of output: 920


🏁 Script executed:

#!/bin/bash
# Search for parseMarkdown to see how tidiedHtml is used
rg -n "const tidiedHtml" --type ts -B 5 -A 10

Length of output: 1063


🏁 Script executed:

#!/bin/bash
# Check if cheerio is in package.json and what version
fd -e json "package.json" | head -1 | xargs -I {} grep -A 50 '"dependencies"' {} | grep -i cheerio

Length of output: 0


🌐 Web query:

Cheerio .html() method return type null or string

💡 Result:

Cheerio's .html() returns string | null — it returns the HTML of the first matched element, or null if nothing is matched. [1][2]

Example (JS/TS):
const $ = cheerio.load('');
const v = $('div').html(); // v is string | null — here null since no <div> exists

Sources: Cheerio API/docs. [1] [2]

References
[1] Cheerio — Classes: Cheerio.html() signature. [2] Cheerio — CheerioAPI / docs.



---

🏁 Script executed:

#!/bin/bash
# Get the complete tidyHtml function (should be around lines 124-144)
sed -n '124,150p' server/src/markdownify/markdown.ts

Length of output: 501


🏁 Script executed:

#!/bin/bash
# Also check the full parseMarkdown function around line 88-97
sed -n '1,100p' server/src/markdownify/markdown.ts | tail -30

Length of output: 600


Guard against tidyHtml returning null before passing to Turndown

tidyHtml at line 124 declares a return type of string, but currently returns $("body").html() directly. Cheerio's .html() method returns string | null, so if the body element is missing or empty, the function returns null despite the type annotation.

This null value is then passed to t.turndown(tidiedHtml) at line 91, which expects a string. While the try/catch will handle the resulting error, this creates unnecessary failures and obscures actual issues.

Two concrete improvements:

At line 124–144 in tidyHtml:

function tidyHtml(html: string): string {
  const cheerio = require("cheerio");
  const $ = cheerio.load(html);
  // ...
  manuallyCleanedElements.forEach((tag) => $(tag).remove());
  
  const bodyHtml = $("body").html();
  return bodyHtml || html;
}

Alternatively at line 12 in parseMarkdown:

const tidiedHtml = tidyHtml(html) || html;

This aligns the implementation with the declared return type and prevents unnecessary conversion failures.

🤖 Prompt for AI Agents
In server/src/markdownify/markdown.ts around lines 88–97 (and tidyHtml
implementation around lines 124–144), tidyHtml can return null because cheerio's
$("body").html() is string | null; update tidyHtml to coerce a non-null string
(e.g., const bodyHtml = $("body").html(); return bodyHtml || html;) or,
alternatively, change the caller at line ~88 to use const tidiedHtml =
tidyHtml(html) || html before passing to t.turndown; ensure the value passed to
turndown is always a string to prevent runtime errors.
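The guard can be illustrated without cheerio; `extractBody` below is a regex stand-in for `$("body").html()` that mirrors its `string | null` return (a sketch, not the repo's implementation — real cheerio also wraps fragments in an implicit body):

```typescript
// Stand-in for tidyHtml's final step: pull out <body> inner HTML, returning
// null when no body exists — the same `string | null` shape cheerio gives.
function extractBody(html: string): string | null {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return match ? match[1] : null;
}

// The guarded version the review recommends: never hand null to Turndown.
export function tidyHtmlSafe(html: string): string {
  return extractBody(html) ?? html;
}
```

The `?? html` fallback keeps the declared `string` return honest and avoids routing recoverable pages through the catch path.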

Comment on lines +8 to +57
export async function convertPageToMarkdown(url: string): Promise<string> {
const browser = await chromium.launch();
const page = await browser.newPage();

await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });

await page.addInitScript(() => {
const selectors = [
"script",
"style",
"link[rel='stylesheet']",
"noscript",
"meta",
"svg",
"img",
"picture",
"source",
"video",
"audio",
"iframe",
"object",
"embed"
];

selectors.forEach(sel => {
document.querySelectorAll(sel).forEach(e => e.remove());
});

// Remove inline event handlers (onclick, onload…)
const all = document.querySelectorAll("*");
all.forEach(el => {
[...el.attributes].forEach(attr => {
if (attr.name.startsWith("on")) {
el.removeAttribute(attr.name);
}
});
});
});

// Re-extract HTML after cleanup
const cleanedHtml = await page.evaluate(() => {
return document.documentElement.outerHTML;
});

await browser.close();

// Convert cleaned HTML → Markdown
const markdown = await parseMarkdown(cleanedHtml, url);
return markdown;
}

⚠️ Potential issue | 🟠 Major

Missing resource cleanup: browser should be closed in a finally block.

If an error occurs during page navigation or evaluation, the browser instance won't be closed, leading to resource leaks. Both functions should wrap browser operations in try-finally blocks.

Apply this diff to both functions:

 export async function convertPageToMarkdown(url: string): Promise<string> {
   const browser = await chromium.launch();
-  const page = await browser.newPage();
+  try {
+    const page = await browser.newPage();

-  await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });
+    await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });

-  await page.addInitScript(() => {
-    // ... cleanup script
-  });
+    // ... rest of the logic

-  const cleanedHtml = await page.evaluate(() => {
-    return document.documentElement.outerHTML;
-  });
+    const cleanedHtml = await page.evaluate(() => {
+      return document.documentElement.outerHTML;
+    });

-  await browser.close();
+    const markdown = await parseMarkdown(cleanedHtml, url);
+    return markdown;
+  } finally {
+    await browser.close();
+  }
+}

Also applies to: 63-111

🤖 Prompt for AI Agents
In server/src/markdownify/scrape.ts around lines 8-57 (and similarly at 63-111),
the browser is launched and operations are performed but browser.close() is only
called on the happy path; wrap the Playwright flow in a try/finally so the
browser is closed regardless of errors. Specifically, after launching the
browser (and creating the page), move the navigation/evaluation logic into a try
block and put await browser.close() in a finally block (guarding for a possibly
undefined browser), optionally closing the page if created; rethrow the error
after cleanup so behavior is preserved.
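The pattern generalizes to a small wrapper that owns the browser lifecycle, which both converters could share. `Browser` here is a minimal stand-in for Playwright's interface so the sketch stays self-contained; it is not the actual fix:

```typescript
// Guarantee close() runs on every path, success or throw.
interface Browser {
  close(): Promise<void>;
}

export async function withBrowser<T>(
  launch: () => Promise<Browser>,
  work: (browser: Browser) => Promise<T>
): Promise<T> {
  const browser = await launch();
  try {
    return await work(browser);
  } finally {
    await browser.close();
  }
}
```

A converter then becomes `withBrowser(() => chromium.launch(), async (b) => { ... })`, and a navigation timeout can no longer leak a browser process.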

Comment on lines +14 to +45
await page.addInitScript(() => {
const selectors = [
"script",
"style",
"link[rel='stylesheet']",
"noscript",
"meta",
"svg",
"img",
"picture",
"source",
"video",
"audio",
"iframe",
"object",
"embed"
];

selectors.forEach(sel => {
document.querySelectorAll(sel).forEach(e => e.remove());
});

// Remove inline event handlers (onclick, onload…)
const all = document.querySelectorAll("*");
all.forEach(el => {
[...el.attributes].forEach(attr => {
if (attr.name.startsWith("on")) {
el.removeAttribute(attr.name);
}
});
});
});

🛠️ Refactor suggestion | 🟠 Major

Significant code duplication: cleanup logic is identical in both functions.

The element removal and event handler cleanup code (lines 14-45 and 69-100) is duplicated verbatim between convertPageToMarkdown and convertPageToHTML. This violates the DRY principle and makes maintenance harder.

Extract the cleanup logic into a shared helper function:

async function cleanPage(page: Page): Promise<string> {
  await page.addInitScript(() => {
    const selectors = [
      "script",
      "style",
      "link[rel='stylesheet']",
      "noscript",
      "meta",
      "svg",
      "img",
      "picture",
      "source",
      "video",
      "audio",
      "iframe",
      "object",
      "embed"
    ];

    selectors.forEach(sel => {
      document.querySelectorAll(sel).forEach(e => e.remove());
    });

    // Remove inline event handlers
    const all = document.querySelectorAll("*");
    all.forEach(el => {
      [...el.attributes].forEach(attr => {
        if (attr.name.startsWith("on")) {
          el.removeAttribute(attr.name);
        }
      });
    });
  });

  return await page.evaluate(() => {
    return document.documentElement.outerHTML;
  });
}

export async function convertPageToMarkdown(url: string): Promise<string> {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });
  const cleanedHtml = await cleanPage(page);
  await browser.close();

  const markdown = await parseMarkdown(cleanedHtml, url);
  return markdown;
}

export async function convertPageToHTML(url: string): Promise<string> {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });
  const cleanedHtml = await cleanPage(page);
  await browser.close();

  return cleanedHtml;
}

Also applies to: 69-100

🤖 Prompt for AI Agents
In server/src/markdownify/scrape.ts around lines 14-45 and 69-100, the DOM
cleanup logic is duplicated; extract it into a single async helper (e.g.,
cleanPage(page: Page): Promise<string>) that calls page.addInitScript with the
selector removal and inline event-attribute cleanup, then returns await
page.evaluate(() => document.documentElement.outerHTML); replace the duplicated
blocks in convertPageToMarkdown and convertPageToHTML with a call to this
helper, keep launching/closing the browser and page handling the same, and add
any necessary imports/types for Playwright's Page.

⚠️ Potential issue | 🔴 Critical

Critical issue: addInitScript runs AFTER initial page load, so cleanup doesn't affect the fetched content.

The addInitScript is called after page.goto, which means it only affects subsequent navigations, not the current page that was just loaded. The cleanup script never runs on the initial page load, so all scripts, styles, images, etc. are still present in the extracted HTML.

Move the cleanup to execute on the loaded page using evaluate or call addInitScript before goto:

 export async function convertPageToMarkdown(url: string): Promise<string> {
   const browser = await chromium.launch();
   const page = await browser.newPage();

+  await page.addInitScript(() => {
+    // Cleanup script here...
+  });
+
   await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });

-  await page.addInitScript(() => {
-    // Cleanup script here...
-  });
-
-  // Re-extract HTML after cleanup
-  const cleanedHtml = await page.evaluate(() => {
+  const cleanedHtml = await page.evaluate(() => {
+    // Run cleanup directly in evaluate
+    const selectors = ["script", "style", ...];
+    selectors.forEach(sel => {
+      document.querySelectorAll(sel).forEach(e => e.remove());
+    });
+    
+    const all = document.querySelectorAll("*");
+    all.forEach(el => {
+      [...el.attributes].forEach(attr => {
+        if (attr.name.startsWith("on")) {
+          el.removeAttribute(attr.name);
+        }
+      });
+    });
+    
     return document.documentElement.outerHTML;
   });

Also applies to: 69-100

🤖 Prompt for AI Agents
In server/src/markdownify/scrape.ts around lines 14 to 45 (and similarly for
lines ~69-100), the cleanup script is being registered with page.addInitScript
after page.goto so it doesn't run on the initial navigation; move the cleanup to
run on the loaded page by either registering the addInitScript before calling
page.goto OR running the same DOM-removal logic immediately after navigation
using page.evaluate; update both places to ensure the selectors and inline-event
removal execute on the current document (call addInitScript before any
navigation or replace the post-load addInitScript with page.evaluate after await
page.goto).
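The timing difference is easy to demonstrate with a mock of Playwright's init-script semantics (a simplified model, not Playwright itself):

```typescript
// Simplified model: scripts registered via addInitScript run at the start
// of each *subsequent* navigation, so registering one after goto() leaves
// the already-loaded page untouched.
class MockPage {
  private initScripts: Array<() => void> = [];
  public log: string[] = [];

  async addInitScript(fn: () => void): Promise<void> {
    this.initScripts.push(fn);
  }

  async goto(url: string): Promise<void> {
    this.initScripts.forEach((fn) => fn()); // init scripts fire on navigation
    this.log.push(`loaded:${url}`);
  }

  async evaluate<T>(fn: () => T): Promise<T> {
    return fn(); // runs against the current page, whatever its state
  }
}
```

Under this model, a cleanup script registered after `goto` never touches the page that was just fetched — which is why moving the removal logic into `evaluate` (or registering the init script before navigating) is required.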

Comment on lines 211 to 359
if (recording.recording_meta.type === 'scrape') {
logger.log('info', `Executing scrape robot for scheduled run ${id}`);

const formats = recording.recording_meta.formats || ['markdown'];

await run.update({
status: 'running',
log: `Converting page to: ${formats.join(', ')}`
});

try {
const runStartedData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'running',
startedAt: plainRun.startedAt
};

serverIo.of('/queued-run').to(`user-${userId}`).emit('run-started', runStartedData);
logger.log(
'info',
`Scrape robot run started notification sent for run: ${plainRun.runId} to user-${userId}`
);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-started notification for scrape robot run ${plainRun.runId}: ${socketError.message}`
);
}

try {
const url = recording.recording_meta.url;

if (!url) {
throw new Error('No URL specified for scrape robot');
}

let markdown = '';
let html = '';
const serializableOutput: any = {};

// Markdown conversion
if (formats.includes('markdown')) {
markdown = await convertPageToMarkdown(url);
serializableOutput.markdown = [{ content: markdown }];
}

// HTML conversion
if (formats.includes('html')) {
html = await convertPageToHTML(url);
serializableOutput.html = [{ content: html }];
}

await run.update({
status: 'success',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion completed successfully`,
serializableOutput,
binaryOutput: {},
});

logger.log('info', `Scrape robot execution completed for scheduled run ${id}`);

// Run-completed socket notifications
try {
const completionData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'success',
finishedAt: new Date().toLocaleString()
};

serverIo.of(plainRun.browserId).emit('run-completed', completionData);
serverIo.of('/queued-run').to(`user-${userId}`).emit('run-completed', completionData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-completed notification for scrape robot run ${id}: ${socketError.message}`
);
}

// Webhook payload
const webhookPayload: any = {
robot_id: plainRun.robotMetaId,
run_id: plainRun.runId,
robot_name: recording.recording_meta.name,
status: 'success',
started_at: plainRun.startedAt,
finished_at: new Date().toLocaleString(),
metadata: {
browser_id: plainRun.browserId,
user_id: userId,
}
};

if (formats.includes('markdown')) webhookPayload.markdown = markdown;
if (formats.includes('html')) webhookPayload.html = html;

try {
await sendWebhook(plainRun.robotMetaId, 'run_completed', webhookPayload);
logger.log(
'info',
`Webhooks sent successfully for scrape robot scheduled run ${plainRun.runId}`
);
} catch (webhookError: any) {
logger.log(
'warn',
`Failed to send webhooks for scrape robot run ${plainRun.runId}: ${webhookError.message}`
);
}

await destroyRemoteBrowser(plainRun.browserId, userId);

return true;

} catch (error: any) {
logger.log('error', `${formats.join(', ')} conversion failed for scheduled run ${id}: ${error.message}`);

await run.update({
status: 'failed',
finishedAt: new Date().toLocaleString(),
log: `${formats.join(', ')} conversion failed: ${error.message}`,
});

try {
const failureData = {
runId: plainRun.runId,
robotMetaId: plainRun.robotMetaId,
robotName: recording.recording_meta.name,
status: 'failed',
finishedAt: new Date().toLocaleString()
};

serverIo.of(plainRun.browserId).emit('run-completed', failureData);
serverIo.of('/queued-run').to(`user-${userId}`).emit('run-completed', failureData);
} catch (socketError: any) {
logger.log(
'warn',
`Failed to send run-failed notification for scrape robot run ${id}: ${socketError.message}`
);
}

await destroyRemoteBrowser(plainRun.browserId, userId);

throw error;
}
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Scrape path bypasses proxy/remote browser config and duplicates logic

The new scrape branch works functionally, but a few aspects are worth tightening:

  1. Proxy and browser config are ignored

    Earlier in this file you resolve proxy settings and create a remote browser for the run, but convertPageToMarkdown/HTML each launch their own Playwright browser with default options. That means:

    • user-specific proxy settings are not applied for scrape runs;
    • stealth/plugins or other browser configuration used for normal runs are bypassed.

    For users who rely on proxies to reach target sites, scrape runs will likely fail even though extract-type robots would succeed.

    Consider either:

    • passing proxy options into convertPageToMarkdown/HTML, or
    • refactoring them to operate on the already-initialized remote browser/page rather than spinning up a separate browser.
  2. Logic is duplicated with server/src/pgboss-worker.ts

    The scrape success path (formats handling, serializableOutput shape, socket notifications, webhook payload structure) is almost identical to the one added in pgboss-worker.ts. Extracting a shared helper (e.g., runScrapeConversion(recording, formats, run, userId, browserId, context)) would keep behavior consistent and make future changes (like the proxy handling above) less error-prone.


I'll verify the key claims in the review comment by examining the implementation and comparing with pgboss-worker.ts.


---

🏁 Script executed:

#!/bin/bash
# Find implementations of convertPageToMarkdown and convertPageToHTML
rg -n "convertPageToMarkdown|convertPageToHTML" -A 10 -B 2 --type=ts

Length of output: 9526


🏁 Script executed:

#!/bin/bash
# Check how proxy options are set up earlier in scheduler/index.ts
rg -n "proxy" server/src/workflow-management/scheduler/index.ts -B 3 -A 3 --type=ts

Length of output: 850


🏁 Script executed:

#!/bin/bash
# Extract the scrape success path logic from scheduler/index.ts and compare with pgboss-worker
sed -n '211,359p' server/src/workflow-management/scheduler/index.ts > /tmp/scheduler_scrape.ts
cat /tmp/scheduler_scrape.ts | head -150

Length of output: 4956


🏁 Script executed:

#!/bin/bash
# Look for similar scrape handling in pgboss-worker.ts
rg -n "convertPageToMarkdown|convertPageToHTML|scrape" server/src/pgboss-worker.ts -B 2 -A 15 --type=ts | head -100

Length of output: 3827


I need to verify one more claim about the failure handling doubling. Let me check the outer catch block.


---

🏁 Script executed:

#!/bin/bash
# Look for executeRun function and its outer catch block to verify double failure handling
rg -n "executeRun|catch.*error" server/src/workflow-management/scheduler/index.ts -B 2 -A 5 | tail -80

Length of output: 2075


🏁 Script executed:

#!/bin/bash
# Check the full function structure around executeRun to understand error propagation
sed -n '130,220p' server/src/workflow-management/scheduler/index.ts

Length of output: 2960


Scrape path bypasses proxy/remote browser config and duplicates logic

The new scrape branch works functionally, but has three real concerns worth addressing:

  1. Proxy and browser config are ignored

    The proxy config is retrieved and configured earlier in this file (lines 37-47), but convertPageToMarkdown and convertPageToHTML launch their own Playwright browser instances with default options. This means:

    • user-specific proxy settings are not applied for scrape runs
    • stealth plugins or other browser configuration used for normal runs are bypassed

    Users who rely on proxies to reach target sites will see scrape runs fail where extract-type robots would succeed.

    Consider passing proxyOptions into these functions or refactoring them to operate on the already-initialized remote browser instead.

  2. Logic is duplicated across multiple files

    The scrape success path (formats handling, serializableOutput shape, socket notifications, webhook payload structure) is nearly identical in:

    • server/src/workflow-management/scheduler/index.ts (lines 254–270)
    • server/src/pgboss-worker.ts (lines 214–232)
    • server/src/api/record.ts (lines 691–709)

    Extracting a shared helper would keep behavior consistent and reduce maintenance risk.

  3. Double failure status updates on error

    When conversion fails, the inner catch (line 328) updates run.status to 'failed' and throws, which is then caught by the outer handler (line 508), updating the status again. This causes duplicate notifications and database writes.

🤖 Prompt for AI Agents
In server/src/workflow-management/scheduler/index.ts around lines 211–359 the
new scrape branch bypasses user proxy/remote-browser config, duplicates
result/notification/webhook logic found in other files, and performs a redundant
run.status update on conversion error; fix by 1) changing the markdown/html
conversion calls to accept and use the existing proxy/browser options (or accept
a browser instance) so they reuse the remote browser/proxy/stealth config
initialized earlier (pass proxyOptions or browser handle into
convertPageToMarkdown/convertPageToHTML), 2) extract the shared success/failure
handling (building serializableOutput, socket notifications, webhook payload and
run.update calls) into a single helper used by scheduler, pgBoss worker and
api/record to avoid duplication, and 3) remove the inner catch’s run.update that
sets status to 'failed' (or convert it to only prepare failure data without
persisting) so that the outer error handler is the single place that updates run
status and emits failure notifications to prevent double updates.
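A possible shape for that shared helper, with the converters injected so proxy-aware implementations can be passed in later. Names and signatures are assumptions, not the repo's actual API:

```typescript
type ScrapeFormat = "markdown" | "html";

interface Converters {
  toMarkdown(url: string): Promise<string>;
  toHtml(url: string): Promise<string>;
}

// One conversion step shared by the API route, pg-boss worker, and
// scheduler, producing the serializableOutput shape all three persist.
export async function runScrapeConversion(
  url: string | undefined,
  formats: ScrapeFormat[],
  converters: Converters
): Promise<Record<string, { content: string }[]>> {
  if (!url) {
    throw new Error("No URL specified for scrape robot");
  }
  const serializableOutput: Record<string, { content: string }[]> = {};
  if (formats.includes("markdown")) {
    serializableOutput.markdown = [{ content: await converters.toMarkdown(url) }];
  }
  if (formats.includes("html")) {
    serializableOutput.html = [{ content: await converters.toHtml(url) }];
  }
  return serializableOutput;
}
```

Each call site would then only own its run-status updates, socket events, and webhooks, and a proxy-aware converter pair can be substituted without touching three files.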

Comment on lines +31 to +59
export const createScrapeRobot = async (
url: string,
name?: string,
formats: string[] = ['markdown']
): Promise<any> => {
try {
const response = await axios.post(
`${apiUrl}/storage/recordings/scrape`,
{
url,
name,
formats,
},
{
headers: { 'Content-Type': 'application/json' },
withCredentials: true,
}
);

if (response.status === 201) {
return response.data;
} else {
throw new Error('Failed to create markdown robot');
}
} catch (error: any) {
console.error('Error creating markdown robot:', error);
return null;
}
};

⚠️ Potential issue | 🟡 Minor

Fix error message and consider error handling consistency.

The error message on line 53 references "markdown robot" but should reference "scrape robot" for consistency with the function name and broader terminology.

Apply this diff:

     } else {
-      throw new Error('Failed to create markdown robot');
+      throw new Error('Failed to create scrape robot');
     }
   } catch (error: any) {
-    console.error('Error creating markdown robot:', error);
+    console.error('Error creating scrape robot:', error);
     return null;
   }

Additionally, consider whether returning null on error is the best approach. Other functions in this file (e.g., getStoredRecordings, getStoredRecording) also return null on error, which is consistent, but this pattern forces callers to check for null. Consider whether throwing errors would provide better error propagation.

🤖 Prompt for AI Agents
In src/api/storage.ts around lines 31 to 59, the error messages refer to
"markdown robot" but the function creates a "scrape robot" — update the thrown
Error and the console.error message to reference "scrape robot" instead;
additionally decide on consistent error handling: either keep returning null on
error to match getStoredRecordings/getStoredRecording (no behavioral change), or
switch to rethrowing the caught error (or throw a new Error with context) so
callers can handle exceptions — implement one approach consistently and update
the function signature/JSdoc/comments accordingly.
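If the throwing approach is chosen, the function might look like this. The HTTP client is injected to keep the sketch self-contained; the endpoint and response shape mirror the snippet above but remain assumptions about the real API:

```typescript
// Throwing variant of createScrapeRobot: surface the cause to the caller
// instead of collapsing every failure into null.
export async function createScrapeRobot(
  post: (path: string, body: unknown) => Promise<{ status: number; data: unknown }>,
  url: string,
  name?: string,
  formats: string[] = ["markdown"]
): Promise<unknown> {
  const response = await post("/storage/recordings/scrape", { url, name, formats });
  if (response.status !== 201) {
    throw new Error(`Failed to create scrape robot (status ${response.status})`);
  }
  return response.data;
}
```

Callers such as the RobotCreate form can then show the real failure reason in `notify('error', ...)` rather than a generic message.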

notify('success', `${scrapeRobotName} created successfully!`);
navigate('/robots');
} else {
notify('error', 'Failed to create markdown robot');

⚠️ Potential issue | 🟡 Minor

Inconsistent error message: references "markdown robot" instead of "scrape robot".

The error message at line 463 says "Failed to create markdown robot" but the Scrape tab supports both markdown and HTML formats. The message should be more generic.

Apply this diff:

-                    notify('error', 'Failed to create markdown robot');
+                    notify('error', 'Failed to create scrape robot');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-                    notify('error', 'Failed to create markdown robot');
+                    notify('error', 'Failed to create scrape robot');
🤖 Prompt for AI Agents
In src/components/robot/pages/RobotCreate.tsx around line 463, the notify error
message incorrectly says "Failed to create markdown robot"; update this to a
generic message that covers both markdown and HTML (e.g., "Failed to create
scrape robot" or "Failed to create robot") so the Scrape tab error is accurate;
replace the string in the notify('error', ...) call and run/adjust any tests or
snapshots that assert the old message.

            <ListItemIcon>
              <Refresh fontSize="small" />
            </ListItemIcon>
            <ListItemText>Retrain</ListItemText>

🛠️ Refactor suggestion | 🟠 Major

Use translation keys instead of hardcoded English strings.

The menu item labels were changed from translated strings (using t(...)) to hardcoded English text. This breaks internationalization support that exists throughout the rest of the component.

Apply this diff to restore i18n support:

-            <ListItemText>Retrain</ListItemText>
+            <ListItemText>{t('recordingtable.options.retrain')}</ListItemText>

         <MenuItem onClick={() => { handleEdit(); handleClose(); }}>
           <ListItemIcon><Edit fontSize="small" /></ListItemIcon>
-          <ListItemText>Edit</ListItemText>
+          <ListItemText>{t('recordingtable.options.edit')}</ListItemText>
         </MenuItem>

         <MenuItem onClick={() => { handleDelete(); handleClose(); }}>
           <ListItemIcon><DeleteForever fontSize="small" /></ListItemIcon>
-          <ListItemText>Delete</ListItemText>
+          <ListItemText>{t('recordingtable.options.delete')}</ListItemText>
         </MenuItem>

         {robotType !== 'scrape' && (
           <MenuItem onClick={() => { handleDuplicate(); handleClose(); }}>
             <ListItemIcon><ContentCopy fontSize="small" /></ListItemIcon>
-            <ListItemText>Duplicate</ListItemText>
+            <ListItemText>{t('recordingtable.options.duplicate')}</ListItemText>
           </MenuItem>
         )}

Also applies to: 787-787, 792-792, 798-798

🤖 Prompt for AI Agents
In src/components/robot/RecordingsTable.tsx around lines 781 (and similarly at
lines 787, 792, 798), several MenuItem labels were changed to hardcoded English
strings like "Retrain", breaking i18n; replace those hardcoded strings with the
translation function calls (e.g. t('your.translation.key') for each label),
using the existing translation keys used elsewhere in the component (or add
appropriate new keys) so each MenuItemText uses t(...) instead of a literal
string, and ensure the component imports/has access to the t function from
react-i18next as used elsewhere in this file.
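The value of the `t(...)` calls the review restores can be shown with a minimal lookup sketch: the same key resolves per locale, which is exactly what hardcoded strings break. The resource table and the German strings below are invented for illustration; the keys mirror the ones suggested in the diff above:

```typescript
// Hypothetical resource bundles; in the real app these live in the i18n config.
const resources: Record<string, Record<string, string>> = {
  en: { 'recordingtable.options.retrain': 'Retrain' },
  de: { 'recordingtable.options.retrain': 'Neu trainieren' },
};

// Simplified stand-in for react-i18next's t(): look up the key for a locale,
// falling back to the key itself when no translation exists.
function makeT(locale: string) {
  return (key: string): string => resources[locale]?.[key] ?? key;
}

const t = makeT('de');
t('recordingtable.options.retrain'); // resolves to the German label
```

With `<ListItemText>{t('recordingtable.options.retrain')}</ListItemText>`, switching the locale switches the label; with a literal `Retrain`, it cannot.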

Comment on lines +68 to +86
  useEffect(() => {
    setMarkdownContent('');
    setHtmlContent('');

    if (row.serializableOutput?.markdown && Array.isArray(row.serializableOutput.markdown)) {
      const markdownData = row.serializableOutput.markdown[0];
      if (markdownData?.content) {
        setMarkdownContent(markdownData.content);
      }
    }

    if (row.serializableOutput?.html && Array.isArray(row.serializableOutput.html)) {
      const htmlData = row.serializableOutput.html[0];
      if (htmlData?.content) {
        setHtmlContent(htmlData.content);
      }
    }
  }, [row.serializableOutput]);


🛠️ Refactor suggestion | 🟠 Major

Add safety checks for array access.

The code assumes the markdown and html arrays have at least one element. If an array is empty, `[0]` yields undefined, and only the optional chaining on the following line keeps the `.content` read from throwing; an explicit length check makes the guard intentional rather than incidental.

Apply this diff to add safety checks:

   useEffect(() => {
     setMarkdownContent('');
     setHtmlContent('');
 
-    if (row.serializableOutput?.markdown && Array.isArray(row.serializableOutput.markdown)) {
+    if (row.serializableOutput?.markdown && Array.isArray(row.serializableOutput.markdown) && row.serializableOutput.markdown.length > 0) {
       const markdownData = row.serializableOutput.markdown[0];
       if (markdownData?.content) {
         setMarkdownContent(markdownData.content);
       }
     }
 
-    if (row.serializableOutput?.html && Array.isArray(row.serializableOutput.html)) {
+    if (row.serializableOutput?.html && Array.isArray(row.serializableOutput.html) && row.serializableOutput.html.length > 0) {
       const htmlData = row.serializableOutput.html[0];
       if (htmlData?.content) {
         setHtmlContent(htmlData.content);
       }
     }
   }, [row.serializableOutput]);
📝 Committable suggestion


Suggested change

  useEffect(() => {
    setMarkdownContent('');
    setHtmlContent('');

-    if (row.serializableOutput?.markdown && Array.isArray(row.serializableOutput.markdown)) {
+    if (row.serializableOutput?.markdown && Array.isArray(row.serializableOutput.markdown) && row.serializableOutput.markdown.length > 0) {
      const markdownData = row.serializableOutput.markdown[0];
      if (markdownData?.content) {
        setMarkdownContent(markdownData.content);
      }
    }

-    if (row.serializableOutput?.html && Array.isArray(row.serializableOutput.html)) {
+    if (row.serializableOutput?.html && Array.isArray(row.serializableOutput.html) && row.serializableOutput.html.length > 0) {
      const htmlData = row.serializableOutput.html[0];
      if (htmlData?.content) {
        setHtmlContent(htmlData.content);
      }
    }
  }, [row.serializableOutput]);
🤖 Prompt for AI Agents
In src/components/run/RunContent.tsx around lines 68 to 86 the effect assumes
row.serializableOutput.markdown and .html arrays contain at least one element
and accesses [0] directly; update both branches to first check that the array
exists and has length > 0 (or use .find to locate the first element with a
content property) before reading [0] or .content, and only call
setMarkdownContent/setHtmlContent when a valid item with .content is found to
avoid undefined access when arrays are empty.
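The `.find`-based variant the prompt mentions can be factored into a small helper; `OutputItem` is an assumed shape standing in for whatever type `serializableOutput` entries actually have:

```typescript
type OutputItem = { content?: string };

// Return the content of the first item that actually has a string `content`,
// or '' when the input is not an array, is empty, or has no such item.
function firstContent(items: unknown): string {
  if (!Array.isArray(items) || items.length === 0) return '';
  const hit = (items as OutputItem[]).find((i) => typeof i?.content === 'string');
  return hit?.content ?? '';
}

firstContent(undefined);                  // '' — no array at all
firstContent([]);                         // '' — empty array, no [0] access
firstContent([{}, { content: 'md' }]);    // 'md' — skips items without content
```

Inside the effect this collapses both branches to `setMarkdownContent(firstContent(row.serializableOutput?.markdown))` and the html equivalent.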

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
src/components/robot/pages/RobotCreate.tsx (1)

464-464: Inconsistent error message: references "markdown robot" instead of "scrape robot".

The error message says "Failed to create markdown robot" but the Scrape tab supports both markdown and HTML formats. The message should be generic.

Apply this diff:

-                    notify('error', 'Failed to create markdown robot');
+                    notify('error', 'Failed to create scrape robot');
🧹 Nitpick comments (2)
src/components/robot/pages/RobotCreate.tsx (2)

403-403: Consider using FormLabel for consistency.

The label uses a plain <p> tag instead of Material-UI's FormLabel component, which would be more consistent with the FormControl usage.

Apply this diff:

-                <FormControl component="fieldset" sx={{ width: '100%', textAlign: 'left' }}>
-                  <p>Output Format (Select at least one)</p>
+                <FormControl component="fieldset" sx={{ width: '100%', textAlign: 'left' }}>
+                  <FormLabel component="legend">Output Format (Select at least one)</FormLabel>

264-271: Consider extracting duplicated button styles.

The button styles are duplicated between the Extract and Scrape tabs. Consider extracting to a shared constant for maintainability.

Example refactor:

const actionButtonStyles = {
  bgcolor: '#ff00c3',
  py: 1.4,
  fontSize: '1rem',
  textTransform: 'none',
  maxWidth: 700,
  borderRadius: 2
};

// Then use: sx={actionButtonStyles}

Also applies to: 468-475

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 467ffe3 and a1b2117.

📒 Files selected for processing (1)
  • src/components/robot/pages/RobotCreate.tsx (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/components/robot/pages/RobotCreate.tsx (2)
src/context/globalInfo.tsx (1)
  • useGlobalInfoStore (189-189)
src/api/storage.ts (1)
  • createScrapeRobot (31-59)
🔇 Additional comments (1)
src/components/robot/pages/RobotCreate.tsx (1)

442-445: Backend validates URL format properly—no action needed.

The frontend validation (line 442-445) only checks for non-empty input, but the backend endpoint at server/src/routes/storage.ts:456-459 validates URL format using the new URL() constructor, which properly parses and rejects invalid URLs. This backend validation is called before any processing occurs, adequately addressing malformed or invalid URL inputs.
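The validation pattern described above, using the `new URL()` constructor as the real check, can be sketched as follows; the helper name and the http/https restriction are illustrative additions, not the exact code in server/src/routes/storage.ts:

```typescript
// new URL() throws a TypeError on malformed input, so a try/catch gives a far
// stricter check than the frontend's non-empty test.
function isValidHttpUrl(input: string): boolean {
  try {
    const url = new URL(input);
    // Also reject well-formed URLs with schemes a scrape robot cannot fetch.
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}

isValidHttpUrl('https://example.com/page'); // true
isValidHttpUrl('not a url');                // false — constructor throws
isValidHttpUrl('ftp://example.com');        // false — parses, but wrong scheme
```

Since the backend runs this style of check before any processing, the lenient frontend check only affects how early the user sees the error, not what the server accepts.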

@amhsirak amhsirak merged commit a1515c2 into develop Nov 20, 2025
1 check passed