Skip to content

Conversation

vikalluru
Copy link

@vikalluru vikalluru commented Jun 16, 2025

Renders PlotlyJS HTML plots in the Chat message frontend

Overview

This feature enhances the NeMo Agent Toolkit UI to automatically detect and render PlotlyJS HTML plots within chat messages, providing users with seamless visualization capabilities directly in the conversation interface.

Background

This feature was developed as part of the Predictive Maintenance workflow on NAT. In agentic workflows, AI agents frequently generate data visualizations using the PlotlyJS library to help users understand complex datasets, trends, and insights. Previously, these plots were only accessible as HTML file paths mentioned in chat responses, requiring users to manually navigate to separate files to view the visualizations.

Technical Implementation

The agentic workflow generates plots using the PlotlyJS library and surfaces the HTML file paths to the user via the main LLM, which appears in the chat message. This PR introduces the following key changes:

  • HTML File Detection: Automatically detects HTML file paths in assistant messages using pattern matching
  • Plot Rendering: Renders PlotlyJS plots directly within the chat interface using a new HtmlFileRenderer component
  • Content Separation: Cleanly separates plot rendering from regular message content to avoid duplication
  • Backward Compatibility: Maintains all existing chat functionality while adding the new visualization capabilities

Key Features

  • ✅ Automatic detection of HTML file references in chat messages
  • ✅ Inline rendering of PlotlyJS plots without leaving the chat interface
  • ✅ Support for both file-based and inline HTML content
  • ✅ Seamless integration with existing chat message components
  • ✅ No impact on existing chat functionality or performance

Use Cases

  • Predictive Maintenance: Visualize equipment health trends, failure predictions, and maintenance schedules
  • Data Analysis: Display charts, graphs, and statistical visualizations generated by AI agents
  • Workflow Insights: Present complex data relationships in an easily digestible visual format

Files Changed

  • components/Chat/ChatMessage.tsx: Enhanced to detect and render HTML plots
  • components/Chat/HtmlFileRenderer.tsx: New component for rendering HTML content
  • utils/app/htmlFileDetector.ts: Utility functions for detecting and parsing HTML file links
  • jest.config.js: Updated to handle ES module dependencies
  • __mocks__/: Added mocks for rehype-raw, remark-gfm, and remark-math plugins

Unit Testing

Test Coverage Summary

This PR includes comprehensive unit tests to ensure reliability and maintainability of the HTML plot rendering feature. All tests pass successfully with robust coverage of core functionality and edge cases.

Metric Count Status
Total Test Suites 14 ✅ All Passing
Total Tests 315 ✅ All Passing
New Test Files 2 htmlFileDetector.test.ts, ChatMessage.test.tsx
New Mock Files 3 rehype-raw.js, remark-gfm.js, remark-math.js
Test Execution Time ~4.5s Fast

Jest Configuration Updates

To support the markdown rendering libraries used in the chat interface, we added mock implementations for ES modules that were incompatible with Jest's CommonJS environment:

// jest.config.js additions
moduleNameMapper: {
  '^rehype-raw$': '<rootDir>/__mocks__/rehype-raw.js',
  '^remark-gfm$': '<rootDir>/__mocks__/remark-gfm.js',
  '^remark-math$': '<rootDir>/__mocks__/remark-math.js',
}

These mocks provide no-op transformers that are sufficient for component testing, as we're testing component behavior rather than markdown parsing specifics.

Example Test Output

PASS  __tests__/utils/app/htmlFileDetector.test.ts
PASS  __tests__/components/Chat/ChatMessage.test.tsx
PASS  __tests__/components/Chat/HtmlFileRenderer.test.tsx

Test Suites: 14 passed, 14 total
Tests:       315 passed, 315 total
Snapshots:   0 total
Time:        4.486 s

Testing

  • Automated testing: 315 unit tests covering all functionality
  • Manual testing: Various PlotlyJS plot types and edge cases
  • Backward compatibility: Verification with existing chat messages
  • Cross-browser compatibility: Testing across major browsers

Example Images:

test_prompt_1 test_prompt_2 test_prompt_3 test_prompt_4

Summary by CodeRabbit

  • New Features

    • Assistant messages now detect HTML file links and render them below the message instead of showing raw links.
    • Inline HTML content is supported with show/hide controls, open-in-browser, and copy-to-clipboard for file path or content.
    • Robust loading with existence checks, timeouts, and user-friendly error messages.
    • Secure backend endpoint serves HTML files from permitted locations.
  • Tests

    • Added comprehensive tests for message rendering, HTML detection/cleanup, and the HTML viewer.
    • Improved test stability via configuration and environment mocks.

Signed-off-by: Vineeth Kalluru <[email protected]>
Signed-off-by: Vineeth Kalluru <[email protected]>
@vikalluru vikalluru force-pushed the vikalluru/plot-renderer branch from 59c0287 to 6b5ece9 Compare September 26, 2025 15:21
Copy link

copy-pr-bot bot commented Sep 26, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link

coderabbitai bot commented Sep 26, 2025

Walkthrough

Adds HTML file rendering support in chat: detects HTML links/inline blocks in assistant messages, removes them from displayed text, and renders them via a new HtmlFileRenderer component. Introduces a Next.js API to serve HTML files, adds detection/cleanup utilities, corresponding unit tests, Jest mocks/mapping for remark/rehype plugins, and config updates.

Changes

Cohort / File(s) Summary
Jest plugin mocks
__mocks__/rehype-raw.js, __mocks__/remark-gfm.js, __mocks__/remark-math.js
Add CommonJS no-op mocks for rehype-raw, remark-gfm, and remark-math; default exports aliased for compatibility.
Chat component updates
components/Chat/ChatMessage.tsx, components/Chat/HtmlFileRenderer.tsx
Integrate HTML link detection into assistant message rendering; remove links from content and render HtmlFileRenderer instances; add HtmlFileRenderer component with loading, error, copy, and toggle behavior for file and inline HTML.
HTML loading API
pages/api/load-html-file.ts
New GET endpoint to validate, authorize, read, and return .html file content with appropriate status codes.
HTML link utilities
utils/app/htmlFileDetector.ts
Add HtmlFileLink interface; implement detectHtmlFileLinks and removeHtmlFileLinksFromContent with multiple pattern matchers and overlap/duplicate handling.
Tests: Chat + utils
__tests__/components/Chat/ChatMessage.test.tsx, __tests__/components/Chat/HtmlFileRenderer.test.tsx, __tests__/utils/app/htmlFileDetector.test.ts
Add unit tests covering ChatMessage HTML handling, HtmlFileRenderer states/actions, and htmlFileDetector detection/cleanup across edge cases.
Jest config
jest.config.js
Map remark/rehype modules to mocks; broaden transformIgnorePatterns for unified processing.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant ChatUI as ChatMessage (React)
  participant Detector as htmlFileDetector
  participant Renderer as HtmlFileRenderer

  User->>ChatUI: View assistant message
  ChatUI->>Detector: detectHtmlFileLinks(content)
  Detector-->>ChatUI: HtmlFileLink[]
  ChatUI->>ChatUI: removeHtmlFileLinksFromContent(content)
  ChatUI-->>User: Render cleaned markdown
  loop For each HtmlFileLink
    ChatUI->>Renderer: Render HtmlFileRenderer(props)
  end
Loading
sequenceDiagram
  autonumber
  actor User
  participant Renderer as HtmlFileRenderer
  participant API as /api/load-html-file
  participant FS as Filesystem (server)

  Note over Renderer: Initial render
  Renderer->>Renderer: normalize path / detect inline HTML
  User->>Renderer: Click "Show"
  alt Inline HTML
    Renderer-->>User: Display iframe via srcDoc
  else External .html file
    Renderer->>API: GET ?filePath=...
    API->>FS: Validate + read .html
    alt Exists and readable
      FS-->>API: HTML content
      API-->>Renderer: 200 text/html
      Renderer-->>User: Display iframe via srcDoc
    else Not found / invalid / error
      API-->>Renderer: 4xx/5xx
      Renderer-->>User: Show error guidance
    end
  end

  User->>Renderer: Copy path/content
  Renderer-->>User: Clipboard confirmation
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

In burrows of bytes, I twitch with delight,
Found shiny .html, glowing so bright—
I nibble the links, clean crumbs from the chat,
Then pop up a window: “Behold, look at that!”
With tests as my carrots and mocks as my hay,
I hop through the pipeline—hip hop, hooray! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Renders PlotlyJS HTML plots in the Chat message frontend" accurately captures the main change in the changeset. The core functionality introduced is the detection and rendering of HTML plots (specifically PlotlyJS-generated) within chat messages through a new HtmlFileRenderer component, HTML file detection utilities, and integration into the ChatMessage component. While the changeset also includes supporting infrastructure like Jest mocks, API endpoints, and tests, these are all in service of the primary feature: enabling HTML plot visualization in the chat interface. The title is concise, clear, and specific enough for a teammate to understand the primary purpose of this change.
Docstring Coverage ✅ Passed Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

  • Public repositories are always opted into early access features.
  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@vikalluru vikalluru changed the title Vikalluru/plot renderer Renders PlotlyJS HTML plots in the Chat message frontend Sep 26, 2025
@vikalluru vikalluru marked this pull request as ready for review September 29, 2025 18:21
Signed-off-by: Vineeth Kalluru <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (3)
pages/api/load-html-file.ts (1)

5-11: Consider adding rate limiting for DOS protection.

The endpoint has no rate limiting, making it vulnerable to abuse where attackers could repeatedly request large HTML files to exhaust server resources.

Consider adding rate limiting middleware or using a solution like next-rate-limit:

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 30 // limit each IP to 30 requests per minute
});
components/Chat/ChatMessage.tsx (1)

217-241: Consider reducing duplication between prepareContent and prepareContentWithoutHtmlLinks.

The two functions share identical logic except for lines 234-236 (removal of HTML file links). This duplication makes maintenance harder—changes to formatting or processing must be applied in both places.

Refactor to eliminate duplication:

-    const prepareContent = ({
+    const prepareContentCore = ({
       message = {} as Message,
       responseContent = true,
       intermediateStepsContent = false,
       role = 'assistant',
+      removeHtmlLinks = false,
     } = {}) => {
       const { content = '', intermediateSteps = [] } = message;
 
       if (role === 'user') return content.trim();
 
       let result = '';
       if (intermediateStepsContent) {
         result += generateContentIntermediate(intermediateSteps);
       }
 
       if (responseContent) {
-        result += result ? `\n\n${content}` : content;
+        const processedContent = removeHtmlLinks 
+          ? removeHtmlFileLinksFromContent(content) 
+          : content;
+        result += result ? `\n\n${processedContent}` : processedContent;
       }
 
       // fixing malformed html and removing extra spaces to avoid markdown issues
       return fixMalformedHtml(result)?.trim()?.replace(/\n\s+/, '\n ');
     };

+    const prepareContent = (options = {}) => prepareContentCore(options);
-    // Prepare content with HTML file links removed to avoid duplicate display
-    const prepareContentWithoutHtmlLinks = ({
-      message = {} as Message,
-      responseContent = true,
-      intermediateStepsContent = false,
-      role = 'assistant',
-    } = {}) => {
-      const { content = '', intermediateSteps = [] } = message;
-
-      if (role === 'user') return content.trim();
-
-      let result = '';
-      if (intermediateStepsContent) {
-        result += generateContentIntermediate(intermediateSteps);
-      }
-
-      if (responseContent) {
-        // Remove HTML file links from content
-        const cleanContent = removeHtmlFileLinksFromContent(content);
-        result += result ? `\n\n${cleanContent}` : cleanContent;
-      }
-
-      // fixing malformed html and removing extra spaces to avoid markdown issues
-      return fixMalformedHtml(result)?.trim()?.replace(/\n\s+/, '\n ');
-    };
+    const prepareContentWithoutHtmlLinks = (options = {}) => 
+      prepareContentCore({ ...options, removeHtmlLinks: true });
utils/app/htmlFileDetector.ts (1)

67-71: Consider nested tag edge case in title extraction.

The regex patterns assume <title> and <h1> content doesn't contain nested HTML tags. For example, <title>My <em>Title</em></title> would only capture "My ". This is likely acceptable for typical PlotlyJS HTML files, but be aware of this limitation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fc982f6 and af4d0c5.

📒 Files selected for processing (11)
  • __mocks__/rehype-raw.js (1 hunks)
  • __mocks__/remark-gfm.js (1 hunks)
  • __mocks__/remark-math.js (1 hunks)
  • __tests__/components/Chat/ChatMessage.test.tsx (1 hunks)
  • __tests__/components/Chat/HtmlFileRenderer.test.tsx (1 hunks)
  • __tests__/utils/app/htmlFileDetector.test.ts (1 hunks)
  • components/Chat/ChatMessage.tsx (5 hunks)
  • components/Chat/HtmlFileRenderer.tsx (1 hunks)
  • jest.config.js (1 hunks)
  • pages/api/load-html-file.ts (1 hunks)
  • utils/app/htmlFileDetector.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
components/Chat/ChatMessage.tsx (4)
utils/app/htmlFileDetector.ts (3)
  • HtmlFileLink (1-7)
  • detectHtmlFileLinks (20-32)
  • removeHtmlFileLinksFromContent (227-261)
types/chat.ts (1)
  • Message (1-10)
utils/app/helper.ts (2)
  • generateContentIntermediate (260-303)
  • fixMalformedHtml (327-337)
components/Chat/HtmlFileRenderer.tsx (1)
  • HtmlFileRenderer (12-321)
__tests__/utils/app/htmlFileDetector.test.ts (1)
utils/app/htmlFileDetector.ts (2)
  • detectHtmlFileLinks (20-32)
  • removeHtmlFileLinksFromContent (227-261)
utils/app/htmlFileDetector.ts (1)
pages/api/load-html-file.ts (1)
  • handler (5-59)
__tests__/components/Chat/HtmlFileRenderer.test.tsx (1)
components/Chat/HtmlFileRenderer.tsx (1)
  • HtmlFileRenderer (12-321)
__tests__/components/Chat/ChatMessage.test.tsx (2)
types/chat.ts (1)
  • Message (1-10)
components/Chat/ChatMessage.tsx (1)
  • ChatMessage (45-447)
🔇 Additional comments (25)
__mocks__/remark-math.js (1)

1-16: LGTM! Mock implementation is appropriate.

The no-op transformer correctly bypasses LaTeX math parsing during tests while maintaining the expected plugin interface. The default export alias on line 16 ensures compatibility with both CommonJS and ESM import patterns.

components/Chat/HtmlFileRenderer.tsx (2)

206-263: Good: Comprehensive UI with proper visual hierarchy and accessibility.

The component header provides clear status indicators, appropriate action buttons, and good visual feedback. The gradient background, badges, and button states enhance UX.


304-304: The review comment incorrectly assumes PlotlyJS usage.

After verifying the codebase, PlotlyJS is not used anywhere in this application. The HtmlFileRenderer component is a generic HTML renderer that displays arbitrary HTML content via the srcDoc attribute, not specifically Plotly charts. The codebase uses recharts and react-force-graph-2d for visualizations instead.

The sandbox attribute "allow-scripts allow-same-origin" is appropriate for rendering generic HTML content. The concerns about PlotlyJS-specific features (zoom/pan interactions, modebar, export functionality) are not applicable here.

Likely an incorrect or invalid review comment.

jest.config.js (2)

21-23: LGTM! Mock mappings support ESM compatibility.

The new moduleNameMapper entries correctly route rehype-raw, remark-gfm, and remark-math to their respective CommonJS-compatible mocks, ensuring tests run without ESM import issues in Jest's CommonJS environment.


26-26: LGTM! Comprehensive transformIgnorePatterns for unified ecosystem.

The expanded pattern now covers the full range of unist/mdast/hast utilities and their dependencies, preventing Jest from attempting to transform these ESM-only packages. This is the correct approach for handling the unified markdown/HTML processing ecosystem.

__mocks__/rehype-raw.js (1)

1-16: LGTM! Mock implementation follows established patterns.

The no-op transformer appropriately bypasses HTML parsing during tests while maintaining the expected plugin interface. The implementation is consistent with the other mocks in this PR (remark-math, remark-gfm).

__mocks__/remark-gfm.js (1)

1-17: LGTM!

The mock implementation is appropriate for isolating component behavior in tests. The no-op transformer and dual export pattern correctly handle both CommonJS and default imports.

__tests__/utils/app/htmlFileDetector.test.ts (1)

1-140: LGTM!

The test suite provides comprehensive coverage of HTML file detection and cleanup functionality. The tests validate:

  • Multiple detection patterns (HTML anchors, Markdown links, file:// URLs, inline HTML)
  • Edge cases (duplicates, special characters, non-HTML links)
  • Content cleanup while preserving surrounding text

The assertions correctly verify both detection (link objects with proper fields) and removal (cleaned content).

__tests__/components/Chat/HtmlFileRenderer.test.tsx (4)

1-21: LGTM!

Test setup with mocked fetch and clipboard APIs is appropriate for isolating component behavior.


23-68: LGTM!

Tests correctly verify loading states, successful rendering with iframe srcDoc, non-existent file handling, and error scenarios. The use of waitFor properly handles async operations.


83-146: LGTM!

UI interaction tests correctly verify toggle, copy-to-clipboard for both file paths and inline HTML content, and proper alert messages.


70-81: Verified: inline HTML correctly skips fetch.

The component implementation confirms that checkFileExists returns early when isInlineHtml is true (lines 42-45), setting fileExists without calling fetch. The test assertion is correct.

components/Chat/ChatMessage.tsx (4)

36-37: LGTM!

Imports for the new HtmlFileRenderer component and HTML file detection utilities are correctly added.


188-192: LGTM!

HTML file detection is appropriately scoped to assistant messages only, avoiding unnecessary processing for user messages.


354-385: LGTM!

Correctly using prepareContentWithoutHtmlLinks for assistant messages to remove HTML file links from displayed content, preventing duplicate display alongside the HtmlFileRenderer.


387-400: LGTM!

The rendering logic correctly:

  • Conditionally renders HtmlFileRenderer only when htmlFileLinks is non-empty
  • Maps over detected links with proper key generation
  • Passes all necessary props (filePath, title, isInlineHtml, htmlContent)
__tests__/components/Chat/ChatMessage.test.tsx (3)

1-36: LGTM!

Test setup correctly mocks HtmlFileRenderer and BotAvatar, and provides a proper HomeContext wrapper for rendering ChatMessage components.


38-98: LGTM!

Integration tests correctly verify:

  • HtmlFileRenderer renders for assistant messages with HTML links
  • HTML links are removed from displayed content
  • No renderer for user messages (even with links)
  • No renderer for assistant messages without links
  • Inline HTML content is handled properly

100-112: LGTM!

Test correctly verifies that multiple HTML files in a single message result in multiple HtmlFileRenderer instances.

utils/app/htmlFileDetector.ts (6)

1-14: LGTM! Well-structured type definitions.

The interfaces and type alias provide clear contracts and good type safety for the detection logic.


20-32: LGTM! Well-structured detection flow.

The function properly orchestrates inline HTML detection, pattern matching, and deduplication. The use of processedRanges prevents overlapping matches effectively.


37-62: LGTM! Inline HTML detection handles complete blocks correctly.

The regex pattern correctly identifies complete <html>...</html> blocks with non-greedy matching. The synthetic filePath generation ensures unique identifiers for each inline block.


76-109: LGTM! Pattern matching with proper overlap handling.

The pattern/handler architecture provides flexibility while the overlap detection ensures each content region is processed only once.


171-221: LGTM! Helper functions are well-implemented.

The overlap detection logic is correct, and the path/title extraction functions provide good defensive programming and user-friendly formatting.


114-166: Fix character class in plain file path pattern.

Line 159: The character class [\w\/\-\.] has the hyphen between \/ and \., which could be interpreted as a range rather than a literal hyphen. This may cause unexpected matching behavior.

Apply this diff to fix the character class:

-      regex: /\s(\/[\w\/\-\.]+\.html)\b/g,
+      regex: /\s(\/[\w/.\-]+\.html)\b/g,

Alternatively, escape the forward slash (though unnecessary in character classes) and place hyphen at the end or escape it:

-      regex: /\s(\/[\w\/\-\.]+\.html)\b/g,
+      regex: /\s(\/[\w/\.\-]+\.html)\b/g,

Likely an incorrect or invalid review comment.

Comment on lines +52 to +60
// We'll abort quickly to avoid downloading large files
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 3000); // 3 second timeout

const response = await fetch(`/api/load-html-file?path=${encodeURIComponent(cleanPath)}`, {
signal: controller.signal
});

clearTimeout(timeoutId);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Abort controller timeout cleanup missing.

If the fetch completes before the 3-second timeout (line 54), the timeout is cleared (line 60). However, if the component unmounts while the fetch is pending, the timeout continues and setFileExists is called on an unmounted component.

Apply this diff to clean up on unmount:

   const checkFileExists = async () => {
     // For inline HTML, file always "exists"
     if (isInlineHtml && inlineHtmlContent) {
       setFileExists(true);
       return;
     }
     
     try {
       const cleanPath = cleanFilePath(filePath);
       console.log('Checking if HTML file exists:', cleanPath);
       
       // Use a simple GET request to check if file exists
       // We'll abort quickly to avoid downloading large files
       const controller = new AbortController();
       const timeoutId = setTimeout(() => controller.abort(), 3000); // 3 second timeout
       
-      const response = await fetch(`/api/load-html-file?path=${encodeURIComponent(cleanPath)}`, {
-        signal: controller.signal
-      });
-      
-      clearTimeout(timeoutId);
+      try {
+        const response = await fetch(`/api/load-html-file?path=${encodeURIComponent(cleanPath)}`, {
+          signal: controller.signal
+        });
+        
+        clearTimeout(timeoutId);
+        // ... rest of logic
+      } catch (err) {
+        clearTimeout(timeoutId);
+        throw err;
+      }

Better yet, return a cleanup function from the useEffect that calls controller.abort().

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In components/Chat/HtmlFileRenderer.tsx around lines 52-60, the AbortController
timeout is not cleaned up if the component unmounts while fetch is pending,
risking setFileExists being called on an unmounted component; fix by returning a
cleanup function from the useEffect that calls controller.abort() and
clearTimeout(timeoutId) so the pending fetch is aborted and the timer is
cancelled on unmount, ensuring no state updates occur after unmount.

Comment on lines +115 to +118
useEffect(() => {
// First check if file exists
checkFileExists();
}, [filePath, isInlineHtml, inlineHtmlContent]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing dependency in useEffect causes stale closures.

The effect on line 115-118 calls checkFileExists() but doesn't include it in the dependency array. This violates the exhaustive-deps rule and can cause stale closures or missed updates.

Apply this diff:

   useEffect(() => {
     // First check if file exists
     checkFileExists();
-  }, [filePath, isInlineHtml, inlineHtmlContent]);
+  }, [filePath, isInlineHtml, inlineHtmlContent, checkFileExists]);

Alternatively, wrap checkFileExists in useCallback to stabilize its reference:

const checkFileExists = useCallback(async () => {
  // ... existing implementation
}, [filePath, isInlineHtml, inlineHtmlContent]);
🤖 Prompt for AI Agents
In components/Chat/HtmlFileRenderer.tsx around lines 115 to 118, the useEffect
calls checkFileExists() but does not include checkFileExists in the dependency
array which can lead to stale closures; fix by either adding checkFileExists to
the effect dependencies or stabilize its identity by wrapping it in useCallback
with the proper dependencies (filePath, isInlineHtml, inlineHtmlContent) and
then include that stable function in the useEffect deps so the effect runs
correctly when inputs change.

Comment on lines +120 to +125
useEffect(() => {
// Only load content if file exists and component is expanded
if (isExpanded && fileExists) {
loadHtmlContent();
}
}, [isExpanded, fileExists]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing dependency in useEffect causes stale closures.

Similar issue: loadHtmlContent() is not in the dependency array, which can cause the effect to reference stale versions of the function.

Apply this diff:

   useEffect(() => {
     // Only load content if file exists and component is expanded
     if (isExpanded && fileExists) {
       loadHtmlContent();
     }
-  }, [isExpanded, fileExists]);
+  }, [isExpanded, fileExists, loadHtmlContent]);

And wrap loadHtmlContent in useCallback:

const loadHtmlContent = useCallback(async () => {
  // ... existing implementation
}, [isExpanded, htmlContent, error, fileExists, filePath, isInlineHtml, inlineHtmlContent]);
🤖 Prompt for AI Agents
In components/Chat/HtmlFileRenderer.tsx around lines 120 to 125, the useEffect
calls loadHtmlContent but doesn't include it in the dependency array which can
cause stale closures; wrap loadHtmlContent in useCallback and include the proper
dependencies and then add loadHtmlContent to the useEffect dependencies.
Specifically, refactor loadHtmlContent into useCallback(async () => { /*
existing implementation */ }, [isExpanded, htmlContent, error, fileExists,
filePath, isInlineHtml, inlineHtmlContent]) and update the useEffect dependency
array to [isExpanded, fileExists, loadHtmlContent].

Comment on lines +149 to +162
const openInSystemBrowser = () => {
if (isInlineHtml) {
// For inline HTML, create a blob URL and try to open it
try {
const blob = new Blob([inlineHtmlContent || ''], { type: 'text/html' });
const url = URL.createObjectURL(blob);
window.open(url, '_blank');
// Clean up the URL after a delay
setTimeout(() => URL.revokeObjectURL(url), 10000);
} catch (error) {
console.error('Error opening inline HTML:', error);
alert('Unable to open inline HTML content in new window.');
}
return;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Blob URL memory leak potential.

The blob URL created on line 154 is only revoked after 10 seconds (line 157). If the user closes the new window immediately, the blob remains in memory. Additionally, if this function is called multiple times, multiple blob URLs accumulate.

Consider tracking blob URLs and revoking them sooner:

const [activeBlobUrls, setActiveBlobUrls] = useState<string[]>([]);

const openInSystemBrowser = () => {
  if (isInlineHtml) {
    try {
      const blob = new Blob([inlineHtmlContent || ''], { type: 'text/html' });
      const url = URL.createObjectURL(blob);
      setActiveBlobUrls(prev => [...prev, url]);
      window.open(url, '_blank');
      // Clean up after a shorter delay or on unmount
      setTimeout(() => {
        URL.revokeObjectURL(url);
        setActiveBlobUrls(prev => prev.filter(u => u !== url));
      }, 5000);
    } catch (error) {
      // ...
    }
  }
};

// Cleanup on unmount
useEffect(() => {
  return () => {
    activeBlobUrls.forEach(url => URL.revokeObjectURL(url));
  };
}, [activeBlobUrls]);
🤖 Prompt for AI Agents
components/Chat/HtmlFileRenderer.tsx around lines 149-162: the current
implementation creates a blob URL and only revokes it after a fixed 10s timeout,
which can leak memory if windows are closed immediately or the function is
called repeatedly; fix by tracking created blob URLs (use a ref or state array)
and revoking each URL as soon as its opened window is closed (capture the window
handle from window.open and poll/attach unload to detect closure) or revoke
after a much shorter delay if detection fails, and ensure any remaining URLs are
revoked in a cleanup on unmount to prevent accumulation.

Comment on lines +301 to +308
<iframe
srcDoc={htmlContent}
className="w-full border-0"
sandbox="allow-scripts allow-same-origin"
title={title || (isInlineHtml ? 'Inline HTML Content' : 'HTML Content')}
style={{ height: '600px', minHeight: '500px' }}
/>
</div>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential XSS vulnerability with unsanitized HTML in iframe srcDoc.

The htmlContent is rendered directly in an iframe via srcDoc without sanitization. While the sandbox attribute provides some protection, a sophisticated XSS payload could still escape or exploit browser vulnerabilities, especially if the HTML comes from untrusted sources.

Consider adding HTML sanitization:

import DOMPurify from 'isomorphic-dompurify';

// In the render section:
<iframe
  srcDoc={DOMPurify.sanitize(htmlContent, { 
    ALLOWED_TAGS: ['html', 'head', 'body', 'script', 'div', 'svg', 'style', ...],
    ALLOWED_ATTR: ['class', 'id', 'style', ...]
  })}
  // ...
/>

However, for PlotlyJS plots, you may need to allow <script> tags, which limits sanitization effectiveness. Document this risk and ensure HTML files only come from trusted sources.

Comment on lines +25 to +36
// Security check: prevent directory traversal attacks
const resolvedPath = path.resolve(filePath);
const workspaceRoot = process.cwd();

// Check if the file is within the workspace or in expected output directories
const isInWorkspace = resolvedPath.startsWith(workspaceRoot) ||
resolvedPath.includes('/output_data/') ||
resolvedPath.includes('/plots/');

if (!isInWorkspace) {
return res.status(403).json({ error: 'Access to this file path is not allowed' });
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Path traversal vulnerability and weak access control.

The current implementation has multiple security issues:

  1. Path traversal not prevented: path.resolve() alone doesn't stop traversal attacks. An attacker can supply ../../../etc/passwd.html which resolves to an absolute path outside the workspace.

  2. Weak boundary check (lines 30-32): includes('/output_data/') matches anywhere in the path, not just allowed directories. An attacker could use /tmp/output_data/../../../etc/passwd.html.

  3. Race condition: Checking startsWith(workspaceRoot) AFTER allowing paths with /output_data/ creates inconsistent security boundaries.

Apply this diff to fix the path traversal vulnerability:

-    // Security check: prevent directory traversal attacks
-    const resolvedPath = path.resolve(filePath);
-    const workspaceRoot = process.cwd();
-    
-    // Check if the file is within the workspace or in expected output directories
-    const isInWorkspace = resolvedPath.startsWith(workspaceRoot) || 
-                         resolvedPath.includes('/output_data/') ||
-                         resolvedPath.includes('/plots/');
-    
-    if (!isInWorkspace) {
-      return res.status(403).json({ error: 'Access to this file path is not allowed' });
-    }
+    // Security check: prevent directory traversal attacks
+    const workspaceRoot = process.cwd();
+    const allowedDirs = [
+      path.join(workspaceRoot, 'output_data'),
+      path.join(workspaceRoot, 'plots'),
+      workspaceRoot
+    ];
+    
+    const resolvedPath = path.resolve(workspaceRoot, filePath);
+    const isInAllowedDir = allowedDirs.some(dir => resolvedPath.startsWith(dir + path.sep));
+    
+    if (!isInAllowedDir) {
+      return res.status(403).json({ error: 'Access to this file path is not allowed' });
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Security check: prevent directory traversal attacks
const resolvedPath = path.resolve(filePath);
const workspaceRoot = process.cwd();
// Check if the file is within the workspace or in expected output directories
const isInWorkspace = resolvedPath.startsWith(workspaceRoot) ||
resolvedPath.includes('/output_data/') ||
resolvedPath.includes('/plots/');
if (!isInWorkspace) {
return res.status(403).json({ error: 'Access to this file path is not allowed' });
}
// Security check: prevent directory traversal attacks
const workspaceRoot = process.cwd();
const allowedDirs = [
path.join(workspaceRoot, 'output_data'),
path.join(workspaceRoot, 'plots'),
workspaceRoot
];
const resolvedPath = path.resolve(workspaceRoot, filePath);
const isInAllowedDir = allowedDirs.some(dir => resolvedPath.startsWith(dir + path.sep));
if (!isInAllowedDir) {
return res.status(403).json({ error: 'Access to this file path is not allowed' });
}
🤖 Prompt for AI Agents
pages/api/load-html-file.ts lines 25-36: the current path checks allow traversal
and misuse of includes() and create a race condition; instead compute
resolvedPath = path.resolve(filePath) and compute rel =
path.relative(workspaceRoot, resolvedPath) ensuring rel does not start with '..'
and is not equal to '..' (use path.sep normalization) to guarantee the file is
inside workspaceRoot, and separately pre-resolve absolute allowed output
directories (e.g., path.resolve(workspaceRoot, 'output_data'),
path.resolve(workspaceRoot, 'plots')) and check that resolvedPath startsWith one
of those canonical allowed dirs (no includes()); perform these checks before any
filesystem access and, to mitigate TOCTOU, open the file using a safe syscall
(or fs.promises.open) and validate the file descriptor's real path (fs.realpath)
against allowed canonical paths before reading.

Comment on lines +38 to +44
// Check if file exists
if (!fs.existsSync(resolvedPath)) {
return res.status(404).json({ error: 'File not found' });
}

// Read the HTML file
const htmlContent = fs.readFileSync(resolvedPath, 'utf-8');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Major: Use async file system operations to avoid blocking.

Synchronous fs.existsSync() and fs.readFileSync() block the Node.js event loop, degrading performance under load. Next.js API routes should use async operations.

Apply this diff:

-    // Check if file exists
-    if (!fs.existsSync(resolvedPath)) {
-      return res.status(404).json({ error: 'File not found' });
-    }
-
-    // Read the HTML file
-    const htmlContent = fs.readFileSync(resolvedPath, 'utf-8');
+    // Check if file exists and read it
+    let htmlContent: string;
+    try {
+      htmlContent = await fs.promises.readFile(resolvedPath, 'utf-8');
+    } catch (err: any) {
+      if (err.code === 'ENOENT') {
+        return res.status(404).json({ error: 'File not found' });
+      }
+      throw err;
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Check if file exists
if (!fs.existsSync(resolvedPath)) {
return res.status(404).json({ error: 'File not found' });
}
// Read the HTML file
const htmlContent = fs.readFileSync(resolvedPath, 'utf-8');
// Check if file exists and read it
let htmlContent: string;
try {
htmlContent = await fs.promises.readFile(resolvedPath, 'utf-8');
} catch (err: any) {
if (err.code === 'ENOENT') {
return res.status(404).json({ error: 'File not found' });
}
throw err;
}
🤖 Prompt for AI Agents
In pages/api/load-html-file.ts around lines 38 to 44, the code uses blocking
fs.existsSync and fs.readFileSync; replace them with non-blocking async
equivalents: use fs.promises.stat or fs.promises.access (with try/catch) to
check existence and permissions, and use await
fs.promises.readFile(resolvedPath, 'utf-8') to read the file; ensure the
surrounding handler is async, wrap the operations in try/catch to return 404 or
500 as appropriate, and avoid using synchronous APIs to prevent blocking the
event loop.

Comment on lines +46 to +48
// Set appropriate headers
res.setHeader('Content-Type', 'text/html');
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Add security headers to prevent XSS and clickjacking.

HTML content from files should be served with Content Security Policy (CSP) and X-Frame-Options headers to mitigate XSS risks, especially since this content will be rendered in iframes.

Add these headers:

     // Set appropriate headers
     res.setHeader('Content-Type', 'text/html');
     res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
+    res.setHeader('X-Content-Type-Options', 'nosniff');
+    res.setHeader('Content-Security-Policy', "default-src 'self' 'unsafe-inline' 'unsafe-eval'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly;");

Note: The CSP allows unsafe-inline and unsafe-eval for PlotlyJS functionality - adjust based on your exact security requirements.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Set appropriate headers
res.setHeader('Content-Type', 'text/html');
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
// Set appropriate headers
res.setHeader('Content-Type', 'text/html');
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
res.setHeader('X-Content-Type-Options', 'nosniff');
res.setHeader('Content-Security-Policy', "default-src 'self' 'unsafe-inline' 'unsafe-eval'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly;");
🤖 Prompt for AI Agents
In pages/api/load-html-file.ts around lines 46 to 48, the response only sets
Content-Type and Cache-Control but lacks security headers; add a
Content-Security-Policy header that restricts sources while allowing
'unsafe-inline' and 'unsafe-eval' for PlotlyJS (e.g. default-src 'none';
script-src 'self' 'unsafe-inline' 'unsafe-eval' https:; style-src 'self'
'unsafe-inline' https:; img-src 'self' data: https:; connect-src 'self' https:;
font-src 'self' https:), and add X-Frame-Options set to SAMEORIGIN (or DENY if
not intended for iframes) to prevent clickjacking; set these headers on the
response before sending the HTML.

Comment on lines +227 to +261
export function removeHtmlFileLinksFromContent(content: string): string {
let cleanContent = content;

// Remove HTML anchor tags to HTML files
cleanContent = cleanContent.replace(
/<a\s+href=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*>([^<]+)<\/a>/gi,
''
);

// Remove HTML img tags that reference HTML files
cleanContent = cleanContent.replace(/<img\s+[^>]*src=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*\/?>/gi, '');

// Remove any complete HTML blocks
cleanContent = cleanContent.replace(/<html[^>]*>[\s\S]*?<\/html>/gi, '');

// Remove markdown links to HTML files
cleanContent = cleanContent.replace(/\[([^\]]+)\]\((file:\/\/[^)]+\.html)\)/g, '');

// Remove standalone file:// URLs
cleanContent = cleanContent.replace(/\s(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
cleanContent = cleanContent.replace(/^(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/gm, '');

// Remove file paths in various quote styles
cleanContent = cleanContent.replace(/`([^`]+\.html)`/g, '');
cleanContent = cleanContent.replace(/"([^"]+\.html)"/g, '');
cleanContent = cleanContent.replace(/'([^']+\.html)'/g, '');

// Remove plain file paths
cleanContent = cleanContent.replace(/\s(\/[\w\/\-\.]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');

// Clean up extra newlines
cleanContent = cleanContent.replace(/\n\s*\n\s*\n+/g, '\n\n');

return cleanContent.trim();
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix character class consistency (line 255) and verify removal patterns.

Line 255 has the same character class issue as line 159 in the detection patterns. Additionally, ensure that the removal patterns stay synchronized with the detection patterns to prevent display inconsistencies.

Apply this diff to fix the character class:

-  cleanContent = cleanContent.replace(/\s(\/[\w\/\-\.]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
+  cleanContent = cleanContent.replace(/\s(\/[\w/.\-]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');

Otherwise, the comprehensive removal logic correctly handles various formats and cleans up extra newlines effectively.

Consider extracting the regex patterns into shared constants used by both detectHtmlFileLinks and removeHtmlFileLinksFromContent to ensure they stay synchronized. This would prevent potential bugs where detection and removal patterns diverge.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function removeHtmlFileLinksFromContent(content: string): string {
let cleanContent = content;
// Remove HTML anchor tags to HTML files
cleanContent = cleanContent.replace(
/<a\s+href=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*>([^<]+)<\/a>/gi,
''
);
// Remove HTML img tags that reference HTML files
cleanContent = cleanContent.replace(/<img\s+[^>]*src=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*\/?>/gi, '');
// Remove any complete HTML blocks
cleanContent = cleanContent.replace(/<html[^>]*>[\s\S]*?<\/html>/gi, '');
// Remove markdown links to HTML files
cleanContent = cleanContent.replace(/\[([^\]]+)\]\((file:\/\/[^)]+\.html)\)/g, '');
// Remove standalone file:// URLs
cleanContent = cleanContent.replace(/\s(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
cleanContent = cleanContent.replace(/^(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/gm, '');
// Remove file paths in various quote styles
cleanContent = cleanContent.replace(/`([^`]+\.html)`/g, '');
cleanContent = cleanContent.replace(/"([^"]+\.html)"/g, '');
cleanContent = cleanContent.replace(/'([^']+\.html)'/g, '');
// Remove plain file paths
cleanContent = cleanContent.replace(/\s(\/[\w\/\-\.]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
// Clean up extra newlines
cleanContent = cleanContent.replace(/\n\s*\n\s*\n+/g, '\n\n');
return cleanContent.trim();
}
export function removeHtmlFileLinksFromContent(content: string): string {
let cleanContent = content;
// Remove HTML anchor tags to HTML files
cleanContent = cleanContent.replace(
/<a\s+href=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*>([^<]+)<\/a>/gi,
''
);
// Remove HTML img tags that reference HTML files
cleanContent = cleanContent.replace(/<img\s+[^>]*src=["']?(file:\/\/[^"'\s>]+\.html)["']?[^>]*\/?>/gi, '');
// Remove any complete HTML blocks
cleanContent = cleanContent.replace(/<html[^>]*>[\s\S]*?<\/html>/gi, '');
// Remove markdown links to HTML files
cleanContent = cleanContent.replace(/\[([^\]]+)\]\((file:\/\/[^)]+\.html)\)/g, '');
// Remove standalone file:// URLs
cleanContent = cleanContent.replace(/\s(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
cleanContent = cleanContent.replace(/^(file:\/\/[^\s"'<>`\)]+\.html)\s*(?=[.,!?;:\n]|$)/gm, '');
// Remove file paths in various quote styles
cleanContent = cleanContent.replace(/`([^`]+\.html)`/g, '');
cleanContent = cleanContent.replace(/"([^"]+\.html)"/g, '');
cleanContent = cleanContent.replace(/'([^']+\.html)'/g, '');
// Remove plain file paths
cleanContent = cleanContent.replace(/\s(\/[\w/.\-]+\.html)\s*(?=[.,!?;:\n]|$)/g, '');
// Clean up extra newlines
cleanContent = cleanContent.replace(/\n\s*\n\s*\n+/g, '\n\n');
return cleanContent.trim();
}
🤖 Prompt for AI Agents
In utils/app/htmlFileDetector.ts around lines 227 to 261, the removal regex at
line 255 uses a different character class than the corresponding detection
pattern (same issue as at line 159) causing mismatched behavior; update the
offending regex to use the exact same character class/character set as the
detection pattern (make the file:// and path character classes identical, e.g.
mirror the allowed characters like [^\s"'<>`\)]+ used in detection), ensure all
other removal patterns match their detection counterparts, and optionally
refactor by extracting the shared regex patterns into constants used by both
detectHtmlFileLinks and removeHtmlFileLinksFromContent so they stay
synchronized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant