-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: better, faster, smarter capture list #675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughThis update refactors backend DOM snapshotting to conditionally handle DOM streaming and screenshot modes with improved lifecycle management and cleanup. The frontend recorder and browser components gain group-aware DOM element highlighting and selection with XPath-based selectors. Extraction helpers unify and enhance XPath support and simplify list data extraction, improving robustness and flexibility in DOM mode. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant BrowserWindow (Frontend)
participant DOMBrowserRenderer
participant clientSelectorGenerator
participant RemoteBrowser (Backend)
User->>BrowserWindow: Initiate element selection in DOM mode
BrowserWindow->>DOMBrowserRenderer: Render highlights, pass cached selectors
DOMBrowserRenderer->>clientSelectorGenerator: Generate highlight data (with group info)
clientSelectorGenerator-->>DOMBrowserRenderer: Return selector, group info, rects
DOMBrowserRenderer-->>BrowserWindow: On highlight/select, send group info
BrowserWindow->>RemoteBrowser: Request DOM snapshot (debounced)
RemoteBrowser-->>BrowserWindow: Return raw snapshot (no resource embedding)
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🔭 Outside diff range comments (1)
src/helpers/clientSelectorGenerator.ts (1)
532-987: Significant code duplication in getElementInformation method.The two branches (lines 539-848 and 850-980) contain nearly identical code for extracting element information. The only difference is the initial element selection method.
Extract the common element information extraction logic into a separate method:
+private extractElementInfo(element: HTMLElement): ElementInfo { + const ownerDocument = element.ownerDocument; + const frameElement = ownerDocument?.defaultView?.frameElement; + const isIframeContent = Boolean(frameElement); + const isFrameContent = frameElement?.tagName === "FRAME"; + + const containingShadowRoot = element.getRootNode() as ShadowRoot; + const isShadowRoot = containingShadowRoot instanceof ShadowRoot; + + let info: ElementInfo = { + tagName: element?.tagName ?? "", + isIframeContent, + isFrameContent, + isShadowRoot, + }; + + // ... rest of the extraction logic ... + + return info; +} public getElementInformation = ( iframeDoc: Document, coordinates: Coordinates, listSelector: string, getList: boolean ) => { try { let element: HTMLElement | null = null; if (!getList || listSelector !== "") { // Use existing getDeepestElementFromPoint logic const el = getDeepestElementFromPoint(coordinates.x, coordinates.y); if (el) { const { parentElement } = el; element = parentElement?.tagName === "A" ? parentElement : el; } } else { element = this.findGroupedContainerAtPoint( coordinates.x, coordinates.y, iframeDoc ); } if (!element) return null; return this.extractElementInfo(element); } catch (error) { const { message, stack } = error as Error; console.error("Error while retrieving selector:", message); console.error("Stack:", stack); } };
♻️ Duplicate comments (1)
src/helpers/clientSelectorGenerator.ts (1)
988-1326: Code duplication in getRect method similar to getElementInformation.This method also contains duplicated logic between the two branches. Consider applying the same refactoring pattern to extract common rectangle calculation logic.
🧹 Nitpick comments (9)
src/components/recorder/RightSidePanel.tsx (1)
250-261: Remove or properly configure debug logging.These console.log statements appear to be debug logging that should not be in production code. Consider using a proper logging framework with configurable log levels or removing these logs entirely.
If logging is needed for production debugging, use a logging utility:
- Object.entries(fields).forEach(([key, field]) => { - if (field.selectorObj?.selector) { - const isFieldXPath = - field.selectorObj.selector.startsWith("//") || - field.selectorObj.selector.startsWith("/"); - console.log( - `Field "${key}" selector:`, - field.selectorObj.selector, - `(XPath: ${isFieldXPath})` - ); - } - }); + if (process.env.NODE_ENV === 'development') { + Object.entries(fields).forEach(([key, field]) => { + if (field.selectorObj?.selector) { + const isFieldXPath = + field.selectorObj.selector.startsWith("//") || + field.selectorObj.selector.startsWith("/"); + console.log( + `Field "${key}" selector:`, + field.selectorObj.selector, + `(XPath: ${isFieldXPath})` + ); + } + }); + }src/helpers/clientListExtractor.ts (2)
91-100: Consider extracting XPath detection logic to reduce duplication.The XPath detection pattern
selector.startsWith("//") || selector.startsWith("/") || selector.startsWith("./")is repeated multiple times. Consider extracting this into a private helper method.Add a helper method:
+ private isXPathSelector(selector: string): boolean { + return selector.startsWith("//") || selector.startsWith("/") || selector.startsWith("./"); + }Then use it throughout the method:
- if ( - selector.startsWith("//") || - selector.startsWith("/") || - selector.startsWith("./") - ) { + if (this.isXPathSelector(selector)) {Also applies to: 123-131, 148-156, 164-177, 186-197
331-355: Good enhancements to text extraction with data attribute fallbacks.The logic for checking common data attributes is well thought out.
Apply optional chaining as suggested by static analysis:
- if (dataValue && dataValue.trim()) { + if (dataValue?.trim()) {server/src/browser-management/classes/RemoteBrowser.ts (1)
445-466: Simple and effective debounce mechanism for DOM snapshots.The 8-second quiet period approach is a good simplification from the previous complex network monitoring.
Consider adding a size limit to prevent excessive memory usage in edge cases:
this.pendingNetworkRequests.push(url); + // Limit array size to prevent memory issues + if (this.pendingNetworkRequests.length > 1000) { + this.pendingNetworkRequests = this.pendingNetworkRequests.slice(-500); + }src/helpers/clientSelectorGenerator.ts (5)
203-208: Add safeguard against potential circular references in parent traversal.While circular parent references are rare in the DOM, the depth calculation could theoretically loop infinitely. The current limit of 20 is good, but consider adding an additional check for circular references.
// Calculate element depth let depth = 0; let parent = element.parentElement; +const visited = new Set<Element>(); while (parent && depth < 20) { + if (visited.has(parent)) break; + visited.add(parent); depth++; parent = parent.parentElement; }
479-495: Document the table cell transformation behavior.The automatic transformation of table cells (TD/TH) to rows (TR) in list mode without a selector is a significant behavior that should be documented.
// In list mode without selector, transform table cells to rows and prioritize grouped elements if (this.getList === true && this.listSelector === "") { + // Transform table cells to their parent rows for better list extraction + // This allows selecting entire rows when clicking on individual cells const transformedElements: HTMLElement[] = []; elementsAtPoint.forEach((element) => { if (element.tagName === "TD" || element.tagName === "TH") {
2365-2755: Well-implemented XPath generation, but consider breaking down this large method.The XPath selector generation is robust with good handling of classes and positions. However, at nearly 400 lines, this method would benefit from being split into smaller, focused methods.
Consider extracting:
- XPath generation logic into a separate class/module
- Shadow DOM path handling into its own method
- The two main branches (with/without listSelector) into separate methods
2850-2887: Good fallback strategy for XPath evaluation.The fallback XPath evaluation handles common patterns well. However, the warning message uses emoji which might not be appropriate for all environments.
-console.warn("⚠️ Could not parse XPath pattern:", xpath); +console.warn("Could not parse XPath pattern:", xpath);and
-console.error("❌ Fallback XPath evaluation also failed:", error); +console.error("Fallback XPath evaluation also failed:", error);
1-3577: Consider splitting this large file into smaller, focused modules.At 3577 lines, this file has grown quite large and handles multiple responsibilities:
- Element fingerprinting and grouping
- XPath selector generation
- Element information extraction
- Rectangle calculations
- Child selector generation
Consider refactoring into separate modules:
ElementGroupAnalyzer- fingerprinting and grouping logicXPathSelectorGenerator- XPath generation and evaluationElementInfoExtractor- element information extractionDOMTraversalHelper- iframe/shadow DOM traversal utilitiesThis would improve maintainability, testability, and make the codebase easier to understand.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
server/src/browser-management/classes/RemoteBrowser.ts(11 hunks)src/components/browser/BrowserWindow.tsx(14 hunks)src/components/recorder/DOMBrowserRenderer.tsx(11 hunks)src/components/recorder/RightSidePanel.tsx(3 hunks)src/helpers/clientListExtractor.ts(10 hunks)src/helpers/clientSelectorGenerator.ts(15 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
src/components/recorder/RightSidePanel.tsx (2)
src/helpers/clientListExtractor.ts (1)
clientListExtractor(559-559)src/helpers/clientSelectorGenerator.ts (1)
clientSelectorGenerator(3576-3576)
src/components/recorder/DOMBrowserRenderer.tsx (1)
src/helpers/clientSelectorGenerator.ts (1)
clientSelectorGenerator(3576-3576)
src/helpers/clientListExtractor.ts (1)
maxun-core/src/browserSide/scraper.js (9)
selector(28-28)nextElement(451-451)part(219-219)attrValue(558-558)dataAttr(559-559)style(563-563)bgImage(564-564)matches(566-566)containers(753-753)
src/helpers/clientSelectorGenerator.ts (2)
maxun-core/src/browserSide/scraper.js (10)
depth(587-587)allElements(692-692)current(638-638)parentElement(545-545)position(734-734)index(625-625)selector(28-28)sibling(626-626)part(219-219)lists(954-954)src/context/browserActions.tsx (1)
ActionType(9-9)
🪛 Biome (1.9.4)
src/helpers/clientListExtractor.ts
[error] 348-348: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
🔇 Additional comments (13)
src/components/recorder/RightSidePanel.tsx (1)
272-280: Good improvements for user experience and state management.The warning notification when no data is extracted helps users identify selector issues, and the cleanup call ensures cached group data is properly cleared after list capture completes.
Also applies to: 530-530
src/components/recorder/DOMBrowserRenderer.tsx (2)
212-223: Good performance optimization with early returns.The early returns when not in capture mode prevent unnecessary DOM traversal and highlight calculations, improving performance during normal browsing.
Also applies to: 356-359
250-279: Clean and well-structured highlight logic.The simplified highlighting logic with clear conditions for groups, list selectors, and pagination modes makes the code much more maintainable and easier to understand.
src/components/browser/BrowserWindow.tsx (1)
1216-1297: Excellent implementation of group element highlighting.The visual distinction between individual elements (solid border) and group elements (dashed border with labels) provides clear feedback to users. The real-time coordinate mapping ensures accurate highlighting positions.
src/helpers/clientListExtractor.ts (3)
24-49: Well-implemented XPath evaluation method.The error handling and owner document retrieval logic are properly implemented.
51-84: Solid implementation for evaluating multiple XPath results.Proper use of XPath snapshot API with element type filtering.
419-500: Excellent refactoring of the extraction logic.The simplified approach using indexed XPath selectors is much cleaner than the previous table-specific logic. Good error handling and informative logging.
server/src/browser-management/classes/RemoteBrowser.ts (5)
619-620: Good optimization moving rrweb injection to context level.This ensures the script is available across all pages without repeated injection.
706-710: Good defensive check for closed pages.Prevents potential errors when trying to extract data from closed pages.
1600-1604: Proper cleanup of network request timeout.Good addition to prevent dangling timeouts when stopping DOM streaming.
1623-1652: Comprehensive error handling in shutdown sequence.The try-catch wrapper ensures cleanup continues even if individual operations fail, which is crucial for proper resource management.
1538-1542: Simplified snapshot capture without resource embedding.The removal of complex resource processing aligns well with the PR's goal of simplifying the capture mechanism.
src/helpers/clientSelectorGenerator.ts (1)
3492-3496: Verify cleanup usage across all consuming componentsWe found that
clientSelectorGenerator.cleanup()is invoked in the following location, which correctly prevents stale data in that component:
- src/components/recorder/RightSidePanel.tsx (called inside the
useEffectcleanup)Please review other parts of the app that import or use
clientSelectorGeneratorand ensurecleanup()is called appropriately (e.g., in unmount handlers or mode-change effects) to avoid potential memory leaks.
| 5 | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid hardcoding the extraction limit.
The extraction limit is hardcoded to 5, which reduces flexibility. This should be configurable based on user settings or use a named constant.
Consider using the limit from the list step configuration:
+ const listStep = browserSteps.find(step => step.id === currentListId);
+ const limit = listStep?.limit || 10; // Default to 10 if not set
+
const extractedData = clientListExtractor.extractListData(
iframeDoc,
listSelector,
fields,
- 5
+ limit
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 5 | |
| ); | |
| const listStep = browserSteps.find(step => step.id === currentListId); | |
| const limit = listStep?.limit || 10; // Default to 10 if not set | |
| const extractedData = clientListExtractor.extractListData( | |
| iframeDoc, | |
| listSelector, | |
| fields, | |
| limit | |
| ); |
🤖 Prompt for AI Agents
In src/components/recorder/RightSidePanel.tsx around lines 267 to 268, the
extraction limit is hardcoded to 5, which limits flexibility. Replace the
hardcoded value with a configurable parameter, such as retrieving the limit from
the list step configuration or defining a named constant that can be adjusted
based on user settings. This will make the extraction limit adaptable without
code changes.
| selector: string; | ||
| elementInfo: ElementInfo | null; | ||
| childSelectors?: string[]; | ||
| groupInfo?: any; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve type safety by defining proper types for groupInfo.
The groupInfo parameter uses any type, which reduces type safety and makes the code harder to maintain.
Define a proper interface for groupInfo:
+interface GroupInfo {
+ isGroupElement: boolean;
+ groupSize: number;
+ groupElements: HTMLElement[];
+ groupFingerprint?: ElementFingerprint;
+}
interface RRWebDOMBrowserRendererProps {
// ... other props
onHighlight?: (data: {
rect: DOMRect;
selector: string;
elementInfo: ElementInfo | null;
childSelectors?: string[];
- groupInfo?: any;
+ groupInfo?: GroupInfo;
}) => void;
onElementSelect?: (data: {
rect: DOMRect;
selector: string;
elementInfo: ElementInfo | null;
childSelectors?: string[];
- groupInfo?: any;
+ groupInfo?: GroupInfo;
}) => void;Also applies to: 117-117
🤖 Prompt for AI Agents
In src/components/recorder/DOMBrowserRenderer.tsx at lines 110 and 117, the
groupInfo parameter is typed as any, which reduces type safety. Define a proper
TypeScript interface describing the expected structure of groupInfo and replace
the any type with this interface to improve maintainability and type safety.
| // Get the group info for the current highlight | ||
| const highlighterData = | ||
| clientSelectorGenerator.generateDataForHighlighter( | ||
| { x: iframeX, y: iframeY }, | ||
| iframeDoc, | ||
| true, | ||
| cachedChildSelectors | ||
| ); | ||
|
|
||
| onElementSelect({ | ||
| rect: currentHighlight.rect, | ||
| selector: currentHighlight.selector, | ||
| elementInfo: currentHighlight.elementInfo, | ||
| childSelectors: currentHighlight.childSelectors || [], | ||
| childSelectors: | ||
| cachedChildSelectors.length > 0 | ||
| ? cachedChildSelectors | ||
| : highlighterData?.childSelectors || [], | ||
| groupInfo: highlighterData?.groupInfo, | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove redundant selector generation call.
The code retrieves highlighter data again to get groupInfo, but this information should already be available from the currentHighlight state that was set during the last mouse move.
Use the existing highlight data:
if (currentHighlight && onElementSelect) {
- // Get the group info for the current highlight
- const highlighterData =
- clientSelectorGenerator.generateDataForHighlighter(
- { x: iframeX, y: iframeY },
- iframeDoc,
- true,
- cachedChildSelectors
- );
-
onElementSelect({
rect: currentHighlight.rect,
selector: currentHighlight.selector,
elementInfo: currentHighlight.elementInfo,
childSelectors:
cachedChildSelectors.length > 0
? cachedChildSelectors
- : highlighterData?.childSelectors || [],
- groupInfo: highlighterData?.groupInfo,
+ : currentHighlight.childSelectors || [],
+ groupInfo: currentHighlight.groupInfo,
});
}To support this, update the currentHighlight state interface to include groupInfo.
🤖 Prompt for AI Agents
In src/components/recorder/DOMBrowserRenderer.tsx around lines 398 to 416,
remove the redundant call to clientSelectorGenerator.generateDataForHighlighter
since groupInfo is already present in currentHighlight. Update the
currentHighlight state interface to include groupInfo if it does not already,
then use currentHighlight.groupInfo directly in the onElementSelect call instead
of calling generateDataForHighlighter again.
| let cleanedSelector = highlighterData.selector; | ||
|
|
||
| setListSelector(cleanedSelector); | ||
| notify( | ||
| `info`, | ||
| t( | ||
| "browser_window.attribute_modal.notifications.group_select_success", | ||
| { | ||
| count: highlighterData.groupInfo.groupSize, | ||
| } | ||
| ) || | ||
| `Selected group with ${highlighterData.groupInfo.groupSize} similar elements` | ||
| ); | ||
| setCurrentListId(Date.now()); | ||
| setFields({}); | ||
|
|
||
| socket?.emit("setGetList", { getList: true }); | ||
| socket?.emit("listSelector", { selector: cleanedSelector }); | ||
|
|
||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Extract XPath cleaning logic to a utility function.
The selector cleaning logic for removing index patterns is duplicated in multiple places.
Create a utility function:
const cleanSelector = (selector: string): string => {
// Remove index patterns for grouping
if (selector.includes("[") && selector.match(/\[\d+\]/)) {
return selector.replace(/\[\d+\]/g, "");
}
// Clean XPath selectors
if (selector.includes("/")) {
const xpathParts = selector.split("/").filter(part => part);
const cleanedParts = xpathParts.map(part => part.replace(/\[\d+\]/g, ""));
if (cleanedParts.length > 0) {
return "//" + cleanedParts.join("/");
}
}
return selector;
};Then use it in both locations:
- let cleanedSelector = highlighterData.selector;
+ const cleanedSelector = cleanSelector(highlighterData.selector);Also applies to: 988-999
🤖 Prompt for AI Agents
In src/components/browser/BrowserWindow.tsx around lines 738 to 758 and also
lines 988 to 999, the logic for cleaning selectors by removing index patterns is
duplicated. Extract this logic into a reusable utility function named
cleanSelector that removes numeric index patterns from selectors and XPath
strings as described. Replace the existing inline cleaning code in both
locations with calls to this new cleanSelector function to avoid duplication and
improve maintainability.
| useEffect(() => { | ||
| document.addEventListener("mousemove", onMouseMove, false); | ||
| if (socket) { | ||
| socket.off("highlighter", highlighterHandler); | ||
| socket.on("highlighter", highlighterHandler); | ||
| } | ||
| return () => { | ||
| document.removeEventListener("mousemove", onMouseMove); | ||
| if (socket) { | ||
| socket.off("highlighter", highlighterHandler); | ||
| } | ||
| }; | ||
| }, [socket, highlighterHandler, getList, listSelector]); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate useEffect hook.
This useEffect hook is an exact duplicate of the one at lines 647-660. This will cause event listeners to be registered twice, leading to potential performance issues and unexpected behavior.
Remove this duplicate block entirely.
🤖 Prompt for AI Agents
In src/components/browser/BrowserWindow.tsx around lines 662 to 675, there is a
duplicate useEffect hook that repeats the same event listener and socket setup
logic as the one at lines 647 to 660. This causes duplicate event listeners and
socket handlers. Remove the entire useEffect block from lines 662 to 675 to
eliminate the duplication and prevent multiple registrations.
| const handleClick = (e: React.MouseEvent<HTMLDivElement>) => { | ||
| if (highlighterData) { | ||
| let shouldProcessClick = false; | ||
|
|
||
| if (!isDOMMode && canvasRef?.current) { | ||
| const canvasRect = canvasRef.current.getBoundingClientRect(); | ||
| const clickX = e.clientX - canvasRect.left; | ||
| const clickY = e.clientY - canvasRect.top; | ||
| const highlightRect = highlighterData.rect; | ||
| const mappedRect = | ||
| coordinateMapper.mapBrowserRectToCanvas(highlightRect); | ||
|
|
||
| shouldProcessClick = | ||
| clickX >= mappedRect.left && | ||
| clickX <= mappedRect.right && | ||
| clickY >= mappedRect.top && | ||
| clickY <= mappedRect.bottom; | ||
| } else { | ||
| shouldProcessClick = true; | ||
| } | ||
|
|
||
| if (shouldProcessClick) { | ||
| const options = getAttributeOptions( | ||
| highlighterData.elementInfo?.tagName || "", | ||
| highlighterData.elementInfo | ||
| ); | ||
|
|
||
| if (getText === true) { | ||
| if (options.length === 1) { | ||
| const attribute = options[0].value; | ||
| const data = | ||
| attribute === "href" | ||
| ? highlighterData.elementInfo?.url || "" | ||
| : attribute === "src" | ||
| ? highlighterData.elementInfo?.imageUrl || "" | ||
| : highlighterData.elementInfo?.innerText || ""; | ||
|
|
||
| addTextStep( | ||
| "", | ||
| data, | ||
| { | ||
| selector: highlighterData.selector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute, | ||
| }, | ||
| currentTextActionId || `text-${crypto.randomUUID()}` | ||
| ); | ||
| const attribute = options[0].value; | ||
| const data = | ||
| attribute === "href" | ||
| ? highlighterData.elementInfo?.url || "" | ||
| : attribute === "src" | ||
| ? highlighterData.elementInfo?.imageUrl || "" | ||
| : highlighterData.elementInfo?.innerText || ""; | ||
|
|
||
| addTextStep( | ||
| "", | ||
| data, | ||
| { | ||
| selector: highlighterData.selector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute, | ||
| }, | ||
| currentTextActionId || `text-${crypto.randomUUID()}` | ||
| ); | ||
| } else { | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| selector: highlighterData.selector, | ||
| info: highlighterData.elementInfo, | ||
| }); | ||
| setShowAttributeModal(true); | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| selector: highlighterData.selector, | ||
| info: highlighterData.elementInfo, | ||
| }); | ||
| setShowAttributeModal(true); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if (paginationMode && getList) { | ||
| if (paginationMode && getList) { | ||
| if ( | ||
| paginationType !== "" && | ||
| paginationType !== "scrollDown" && | ||
| paginationType !== "scrollUp" && | ||
| paginationType !== "none" | ||
| paginationType !== "" && | ||
| paginationType !== "scrollDown" && | ||
| paginationType !== "scrollUp" && | ||
| paginationType !== "none" | ||
| ) { | ||
| setPaginationSelector(highlighterData.selector); | ||
| notify( | ||
| `info`, | ||
| t( | ||
| "browser_window.attribute_modal.notifications.pagination_select_success" | ||
| ) | ||
| ); | ||
| addListStep( | ||
| listSelector!, | ||
| fields, | ||
| currentListId || 0, | ||
| currentListActionId || `list-${crypto.randomUUID()}`, | ||
| { type: paginationType, selector: highlighterData.selector } | ||
| ); | ||
| socket?.emit("setPaginationMode", { pagination: false }); | ||
| setPaginationSelector(highlighterData.selector); | ||
| notify( | ||
| `info`, | ||
| t( | ||
| "browser_window.attribute_modal.notifications.pagination_select_success" | ||
| ) | ||
| ); | ||
| addListStep( | ||
| listSelector!, | ||
| fields, | ||
| currentListId || 0, | ||
| currentListActionId || `list-${crypto.randomUUID()}`, | ||
| { type: paginationType, selector: highlighterData.selector } | ||
| ); | ||
| socket?.emit("setPaginationMode", { pagination: false }); | ||
| } | ||
| return; | ||
| } | ||
| } | ||
|
|
||
| if (getList === true && !listSelector) { | ||
| if (getList === true && !listSelector) { | ||
| let cleanedSelector = highlighterData.selector; | ||
| if (cleanedSelector.includes("nth-child")) { | ||
| cleanedSelector = cleanedSelector.replace(/:nth-child\(\d+\)/g, ""); | ||
| if ( | ||
| cleanedSelector.includes("[") && | ||
| cleanedSelector.match(/\[\d+\]/) | ||
| ) { | ||
| cleanedSelector = cleanedSelector.replace(/\[\d+\]/g, ""); | ||
| } | ||
|
|
||
| setListSelector(cleanedSelector); | ||
| notify( | ||
| `info`, | ||
| t("browser_window.attribute_modal.notifications.list_select_success") | ||
| `info`, | ||
| t( | ||
| "browser_window.attribute_modal.notifications.list_select_success" | ||
| ) | ||
| ); | ||
| setCurrentListId(Date.now()); | ||
| setFields({}); | ||
| } else if (getList === true && listSelector && currentListId) { | ||
| const attribute = options[0].value; | ||
| const data = | ||
| attribute === "href" | ||
| ? highlighterData.elementInfo?.url || "" | ||
| : attribute === "src" | ||
| ? highlighterData.elementInfo?.imageUrl || "" | ||
| : highlighterData.elementInfo?.innerText || ""; | ||
|
|
||
| socket?.emit("setGetList", { getList: true }); | ||
| socket?.emit("listSelector", { selector: cleanedSelector }); | ||
| } else if (getList === true && listSelector && currentListId) { | ||
| if (options.length === 1) { | ||
| const attribute = options[0].value; | ||
| let currentSelector = highlighterData.selector; | ||
|
|
||
| if (currentSelector.includes(">")) { | ||
| const [firstPart, ...restParts] = currentSelector | ||
| .split(">") | ||
| .map((p) => p.trim()); | ||
| const listSelectorRightPart = listSelector | ||
| .split(">") | ||
| .pop() | ||
| ?.trim() | ||
| .replace(/:nth-child\(\d+\)/g, ""); | ||
|
|
||
| if ( | ||
| firstPart.includes("nth-child") && | ||
| firstPart.replace(/:nth-child\(\d+\)/g, "") === | ||
| listSelectorRightPart | ||
| ) { | ||
| currentSelector = `${firstPart.replace( | ||
| /:nth-child\(\d+\)/g, | ||
| "" | ||
| )} > ${restParts.join(" > ")}`; | ||
| } | ||
| } | ||
| let currentSelector = highlighterData.selector; | ||
|
|
||
| if (currentSelector.includes("/")) { | ||
| const xpathParts = currentSelector | ||
| .split("/") | ||
| .filter((part) => part); | ||
| const cleanedParts = xpathParts.map((part) => { | ||
| return part.replace(/\[\d+\]/g, ""); | ||
| }); | ||
|
|
||
| const data = | ||
| attribute === "href" | ||
| ? highlighterData.elementInfo?.url || "" | ||
| : attribute === "src" | ||
| ? highlighterData.elementInfo?.imageUrl || "" | ||
| : highlighterData.elementInfo?.innerText || ""; | ||
|
|
||
| const newField: TextStep = { | ||
| id: Date.now(), | ||
| type: "text", | ||
| label: `Label ${Object.keys(fields).length + 1}`, | ||
| data: data, | ||
| selectorObj: { | ||
| selector: currentSelector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute, | ||
| }, | ||
| }; | ||
|
|
||
| const updatedFields = { | ||
| ...fields, | ||
| [newField.id]: newField, | ||
| }; | ||
|
|
||
| setFields(updatedFields); | ||
|
|
||
| if (listSelector) { | ||
| addListStep( | ||
| listSelector, | ||
| updatedFields, | ||
| currentListId, | ||
| currentListActionId || `list-${crypto.randomUUID()}`, | ||
| { type: "", selector: paginationSelector } | ||
| ); | ||
| if (cleanedParts.length > 0) { | ||
| currentSelector = "//" + cleanedParts.join("/"); | ||
| } | ||
| } | ||
|
|
||
| const newField: TextStep = { | ||
| id: Date.now(), | ||
| type: "text", | ||
| label: `Label ${Object.keys(fields).length + 1}`, | ||
| data: data, | ||
| selectorObj: { | ||
| selector: currentSelector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute, | ||
| }, | ||
| }; | ||
|
|
||
| const updatedFields = { | ||
| ...fields, | ||
| [newField.id]: newField, | ||
| }; | ||
|
|
||
| setFields(updatedFields); | ||
|
|
||
| if (listSelector) { | ||
| addListStep( | ||
| listSelector, | ||
| updatedFields, | ||
| currentListId, | ||
| currentListActionId || `list-${crypto.randomUUID()}`, | ||
| { type: "", selector: paginationSelector } | ||
| ); | ||
| } | ||
| } else { | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| selector: highlighterData.selector, | ||
| info: highlighterData.elementInfo, | ||
| }); | ||
| setShowAttributeModal(true); | ||
| } | ||
| } | ||
| }, | ||
| [ | ||
| getText, | ||
| getList, | ||
| listSelector, | ||
| paginationMode, | ||
| paginationType, | ||
| fields, | ||
| currentListId, | ||
| currentTextActionId, | ||
| currentListActionId, | ||
| addTextStep, | ||
| addListStep, | ||
| notify, | ||
| socket, | ||
| t, | ||
| paginationSelector, | ||
| ] | ||
| ); | ||
|
|
||
|
|
||
| const handleClick = (e: React.MouseEvent<HTMLDivElement>) => { | ||
| if (highlighterData && canvasRef?.current) { | ||
| const canvasRect = canvasRef.current.getBoundingClientRect(); | ||
| const clickX = e.clientX - canvasRect.left; | ||
| const clickY = e.clientY - canvasRect.top; | ||
|
|
||
| const highlightRect = highlighterData.rect; | ||
|
|
||
| const mappedRect = coordinateMapper.mapBrowserRectToCanvas(highlightRect); | ||
| if ( | ||
| clickX >= mappedRect.left && | ||
| clickX <= mappedRect.right && | ||
| clickY >= mappedRect.top && | ||
| clickY <= mappedRect.bottom | ||
| ) { | ||
|
|
||
| const options = getAttributeOptions(highlighterData.elementInfo?.tagName || '', highlighterData.elementInfo); | ||
|
|
||
| if (getText === true) { | ||
| if (options.length === 1) { | ||
| // Directly use the available attribute if only one option is present | ||
| const attribute = options[0].value; | ||
| const data = attribute === 'href' ? highlighterData.elementInfo?.url || '' : | ||
| attribute === 'src' ? highlighterData.elementInfo?.imageUrl || '' : | ||
| highlighterData.elementInfo?.innerText || ''; | ||
|
|
||
| addTextStep('', data, { | ||
| selector: highlighterData.selector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute, | ||
| }, currentTextActionId || `text-${crypto.randomUUID()}`); | ||
| } else { | ||
| // Show the modal if there are multiple options | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| selector: highlighterData.selector, | ||
| info: highlighterData.elementInfo, | ||
| }); | ||
| setShowAttributeModal(true); | ||
| } | ||
| } | ||
|
|
||
| if (paginationMode && getList) { | ||
| // Only allow selection in pagination mode if type is not empty, 'scrollDown', or 'scrollUp' | ||
| if (paginationType !== '' && paginationType !== 'scrollDown' && paginationType !== 'scrollUp' && paginationType !== 'none') { | ||
| setPaginationSelector(highlighterData.selector); | ||
| notify(`info`, t('browser_window.attribute_modal.notifications.pagination_select_success')); | ||
| addListStep(listSelector!, fields, currentListId || 0, currentListActionId || `list-${crypto.randomUUID()}`, { type: paginationType, selector: highlighterData.selector }); | ||
| socket?.emit('setPaginationMode', { pagination: false }); | ||
| } | ||
| return; | ||
| } | ||
|
|
||
| if (getList === true && !listSelector) { | ||
| let cleanedSelector = highlighterData.selector; | ||
| if (cleanedSelector.includes('nth-child')) { | ||
| cleanedSelector = cleanedSelector.replace(/:nth-child\(\d+\)/g, ''); | ||
| } | ||
|
|
||
| setListSelector(cleanedSelector); | ||
| notify(`info`, t('browser_window.attribute_modal.notifications.list_select_success')); | ||
| setCurrentListId(Date.now()); | ||
| setFields({}); | ||
| } else if (getList === true && listSelector && currentListId) { | ||
| const attribute = options[0].value; | ||
| const data = attribute === 'href' ? highlighterData.elementInfo?.url || '' : | ||
| attribute === 'src' ? highlighterData.elementInfo?.imageUrl || '' : | ||
| highlighterData.elementInfo?.innerText || ''; | ||
| // Add fields to the list | ||
| if (options.length === 1) { | ||
| const attribute = options[0].value; | ||
| let currentSelector = highlighterData.selector; | ||
|
|
||
| if (currentSelector.includes('>')) { | ||
| const [firstPart, ...restParts] = currentSelector.split('>').map(p => p.trim()); | ||
| const listSelectorRightPart = listSelector.split('>').pop()?.trim().replace(/:nth-child\(\d+\)/g, ''); | ||
|
|
||
| if (firstPart.includes('nth-child') && | ||
| firstPart.replace(/:nth-child\(\d+\)/g, '') === listSelectorRightPart) { | ||
| currentSelector = `${firstPart.replace(/:nth-child\(\d+\)/g, '')} > ${restParts.join(' > ')}`; | ||
| } | ||
| } | ||
|
|
||
| const newField: TextStep = { | ||
| id: Date.now(), | ||
| type: 'text', | ||
| label: `Label ${Object.keys(fields).length + 1}`, | ||
| data: data, | ||
| selectorObj: { | ||
| selector: currentSelector, | ||
| tag: highlighterData.elementInfo?.tagName, | ||
| shadow: highlighterData.elementInfo?.isShadowRoot, | ||
| attribute | ||
| } | ||
| }; | ||
|
|
||
| const updatedFields = { | ||
| ...fields, | ||
| [newField.id]: newField | ||
| }; | ||
|
|
||
| setFields(updatedFields); | ||
|
|
||
| if (listSelector) { | ||
| addListStep( | ||
| listSelector, | ||
| updatedFields, | ||
| currentListId, | ||
| currentListActionId || `list-${crypto.randomUUID()}`, | ||
| { type: '', selector: paginationSelector } | ||
| ); | ||
| } | ||
|
|
||
| } else { | ||
| setAttributeOptions(options); | ||
| setSelectedElement({ | ||
| selector: highlighterData.selector, | ||
| info: highlighterData.elementInfo | ||
| }); | ||
| setShowAttributeModal(true); | ||
| } | ||
| } | ||
| }); | ||
| setShowAttributeModal(true); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Significant code duplication with handleDOMElementSelection.
The handleClick function duplicates most of the logic from handleDOMElementSelection, including text/list selection, attribute handling, and pagination logic. This violates the DRY principle and makes maintenance difficult.
Since handleDOMElementSelection is already designed to handle element selection in DOM mode, consider:
- Remove handleClick entirely and use handleDOMElementSelection for both modes
- Or extract the common logic into reusable functions
For option 1, update the onClick handler:
- const handleClick = (e: React.MouseEvent<HTMLDivElement>) => {
- // ... entire function
- };
return (
- <div onClick={handleClick} style={{ width: browserWidth }} id="browser-window">
+ <div onClick={(e) => {
+ if (highlighterData && isDOMMode) {
+ handleDOMElementSelection(highlighterData);
+ }
+ }} style={{ width: browserWidth }} id="browser-window">Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/components/browser/BrowserWindow.tsx between lines 874 and 1041, the
handleClick function duplicates much of the logic found in
handleDOMElementSelection, causing maintenance issues. To fix this, remove the
handleClick function entirely and update the onClick handler to use
handleDOMElementSelection for both DOM and non-DOM modes. Alternatively, extract
the shared logic into reusable helper functions and call those from both
handlers to avoid duplication.
| if (selector.startsWith("//") || selector.startsWith("/")) { | ||
| return this.evaluateXPathAll(rootElement, selector); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent XPath detection pattern compared to queryElement.
The XPath detection in queryElementAll only checks for // and / prefixes but misses ./ (relative XPath). This is inconsistent with queryElement which checks all three patterns.
Update all XPath detection patterns to include ./:
- if (selector.startsWith("//") || selector.startsWith("/")) {
+ if (selector.startsWith("//") || selector.startsWith("/") || selector.startsWith("./")) {Also applies to: 242-242, 262-262, 270-270, 290-290
🤖 Prompt for AI Agents
In src/helpers/clientListExtractor.ts at lines 215-216, 242, 262, 270, and 290,
the XPath detection logic in queryElementAll only checks for selectors starting
with '//' or '/', missing the './' prefix for relative XPath expressions. To fix
this inconsistency with queryElement, update the conditions to also check if the
selector starts with './' in all these locations, ensuring all three XPath
patterns ('//', '/', './') are detected properly.
| // Helper method for single XPath evaluation | ||
| private evaluateXPathSingle = ( | ||
| document: Document, | ||
| xpath: string | ||
| ): Element | null => { | ||
| try { | ||
| const result = document.evaluate( | ||
| xpath, | ||
| document, | ||
| null, | ||
| XPathResult.FIRST_ORDERED_NODE_TYPE, | ||
| null | ||
| ); | ||
|
|
||
| if (element) { | ||
| const value = this.extractValue(element, attribute); | ||
| if (value !== null && value !== "") { | ||
| record[label] = value; | ||
| } else { | ||
| console.warn( | ||
| `❌ No value for ${label} in container ${containerIndex + 1}` | ||
| ); | ||
| record[label] = ""; | ||
| } | ||
| } else { | ||
| console.warn( | ||
| `❌ Element not found for ${label} with selector:`, | ||
| selector | ||
| ); | ||
| record[label] = ""; | ||
| } | ||
| } | ||
| const element = result.singleNodeValue as Element | null; | ||
|
|
||
| if (Object.values(record).some((value) => value !== "")) { | ||
| nonTableData.push(record); | ||
| } | ||
| } | ||
| if (!element) { | ||
| console.warn(`❌ XPath found no element for: ${xpath}`); | ||
| } | ||
|
|
||
| // Combine and limit results | ||
| const extractedData = [...tableData, ...nonTableData].slice(0, limit); | ||
|
|
||
| return extractedData; | ||
| return element; | ||
| } catch (error) { | ||
| console.error("Error in client-side extractListData:", error); | ||
| return []; | ||
| console.error("❌ XPath evaluation failed:", xpath, error); | ||
| return null; | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove redundant evaluateXPathSingle method.
This method is nearly identical to the existing evaluateXPath method. You can reuse evaluateXPath by passing document as the root element.
Replace the usage on line 464:
- element = this.evaluateXPathSingle(iframeDocument, indexedSelector);
+ element = this.evaluateXPath(iframeDocument, indexedSelector);Then remove the evaluateXPathSingle method entirely (lines 531-556).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Helper method for single XPath evaluation | |
| private evaluateXPathSingle = ( | |
| document: Document, | |
| xpath: string | |
| ): Element | null => { | |
| try { | |
| const result = document.evaluate( | |
| xpath, | |
| document, | |
| null, | |
| XPathResult.FIRST_ORDERED_NODE_TYPE, | |
| null | |
| ); | |
| if (element) { | |
| const value = this.extractValue(element, attribute); | |
| if (value !== null && value !== "") { | |
| record[label] = value; | |
| } else { | |
| console.warn( | |
| `❌ No value for ${label} in container ${containerIndex + 1}` | |
| ); | |
| record[label] = ""; | |
| } | |
| } else { | |
| console.warn( | |
| `❌ Element not found for ${label} with selector:`, | |
| selector | |
| ); | |
| record[label] = ""; | |
| } | |
| } | |
| const element = result.singleNodeValue as Element | null; | |
| if (Object.values(record).some((value) => value !== "")) { | |
| nonTableData.push(record); | |
| } | |
| } | |
| if (!element) { | |
| console.warn(`❌ XPath found no element for: ${xpath}`); | |
| } | |
| // Combine and limit results | |
| const extractedData = [...tableData, ...nonTableData].slice(0, limit); | |
| return extractedData; | |
| return element; | |
| } catch (error) { | |
| console.error("Error in client-side extractListData:", error); | |
| return []; | |
| console.error("❌ XPath evaluation failed:", xpath, error); | |
| return null; | |
| } | |
| }; | |
| // (The entire evaluateXPathSingle method has been removed) |
🤖 Prompt for AI Agents
In src/helpers/clientListExtractor.ts from lines 531 to 556, remove the
redundant evaluateXPathSingle method as it duplicates the existing evaluateXPath
method. Instead, update the code at line 464 to call evaluateXPath with the
document as the root element parameter. After replacing all usages, delete the
entire evaluateXPathSingle method to avoid duplication and simplify the code.
| const matched = document.evaluate( | ||
| xpath, | ||
| document, | ||
| null, | ||
| XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, | ||
| null | ||
| ); | ||
| const matchedSet = new Set<HTMLElement>(); | ||
| for (let i = 0; i < matched.snapshotLength; i++) { | ||
| matchedSet.add(matched.snapshotItem(i) as HTMLElement); | ||
| } | ||
|
|
||
| return xpath; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete XPath validation logic.
The post-validation creates a Set of matched elements but doesn't verify that it matches exactly the group elements.
// 6. Post-validate that XPath matches all elements
const matched = document.evaluate(
xpath,
document,
null,
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
null
);
const matchedSet = new Set<HTMLElement>();
for (let i = 0; i < matched.snapshotLength; i++) {
matchedSet.add(matched.snapshotItem(i) as HTMLElement);
}
+
+// Verify the XPath matches all group elements and only group elements
+const groupSet = new Set(elements);
+if (matchedSet.size !== groupSet.size) {
+ console.warn(`XPath matched ${matchedSet.size} elements but group has ${groupSet.size}`);
+ return xpath; // Still return but with warning
+}
+
+for (const element of elements) {
+ if (!matchedSet.has(element)) {
+ console.warn("XPath does not match all group elements");
+ break;
+ }
+}
return xpath;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const matched = document.evaluate( | |
| xpath, | |
| document, | |
| null, | |
| XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, | |
| null | |
| ); | |
| const matchedSet = new Set<HTMLElement>(); | |
| for (let i = 0; i < matched.snapshotLength; i++) { | |
| matchedSet.add(matched.snapshotItem(i) as HTMLElement); | |
| } | |
| return xpath; | |
| } | |
| const matched = document.evaluate( | |
| xpath, | |
| document, | |
| null, | |
| XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, | |
| null | |
| ); | |
| const matchedSet = new Set<HTMLElement>(); | |
| for (let i = 0; i < matched.snapshotLength; i++) { | |
| matchedSet.add(matched.snapshotItem(i) as HTMLElement); | |
| } | |
| // Verify the XPath matches all group elements and only group elements | |
| const groupSet = new Set(elements); | |
| if (matchedSet.size !== groupSet.size) { | |
| console.warn( | |
| `XPath matched ${matchedSet.size} elements but group has ${groupSet.size}` | |
| ); | |
| return xpath; // Still return but with warning | |
| } | |
| for (const element of elements) { | |
| if (!matchedSet.has(element)) { | |
| console.warn("XPath does not match all group elements"); | |
| break; | |
| } | |
| } | |
| return xpath; | |
| } |
🤖 Prompt for AI Agents
In src/helpers/clientSelectorGenerator.ts around lines 3417 to 3430, the code
collects matched elements into a Set but does not verify if these matched
elements exactly correspond to the intended group elements. To fix this, add
logic to compare the matchedSet with the expected group elements to ensure they
are identical before returning the xpath. If they do not match exactly, handle
the discrepancy appropriately, such as returning null or throwing an error.
| elementFingerprints.forEach((fingerprint, element) => { | ||
| if (processedElements.has(element)) return; | ||
|
|
||
| const currentGroup = [element]; | ||
| processedElements.add(element); | ||
|
|
||
| // Find similar elements | ||
| elementFingerprints.forEach((otherFingerprint, otherElement) => { | ||
| if (processedElements.has(otherElement)) return; | ||
|
|
||
| const similarity = this.calculateSimilarity( | ||
| fingerprint, | ||
| otherFingerprint | ||
| ); | ||
| if (similarity >= this.groupingConfig.similarityThreshold) { | ||
| currentGroup.push(otherElement); | ||
| processedElements.add(otherElement); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Performance concern: O(n²) complexity in element grouping.
The nested iteration over all elements to find similar ones could be slow on pages with many elements. For 1000 elements, this results in ~500,000 comparisons.
Consider implementing a more efficient grouping algorithm:
- Hash-based grouping: First group elements by tag name and normalized classes to reduce comparison space
- Locality-sensitive hashing: Use fingerprint signatures to create hash buckets
- Early termination: Stop searching once enough group members are found
Example optimization:
-elementFingerprints.forEach((fingerprint, element) => {
- if (processedElements.has(element)) return;
-
- const currentGroup = [element];
- processedElements.add(element);
-
- // Find similar elements
- elementFingerprints.forEach((otherFingerprint, otherElement) => {
- if (processedElements.has(otherElement)) return;
-
- const similarity = this.calculateSimilarity(
- fingerprint,
- otherFingerprint
- );
- if (similarity >= this.groupingConfig.similarityThreshold) {
- currentGroup.push(otherElement);
- processedElements.add(otherElement);
- }
- });
+// Group by tag name first to reduce comparison space
+const elementsByTag = new Map<string, Array<[HTMLElement, ElementFingerprint]>>();
+elementFingerprints.forEach((fingerprint, element) => {
+ const tag = fingerprint.tagName;
+ if (!elementsByTag.has(tag)) {
+ elementsByTag.set(tag, []);
+ }
+ elementsByTag.get(tag)!.push([element, fingerprint]);
+});
+
+// Process each tag group separately
+elementsByTag.forEach((tagElements) => {
+ tagElements.forEach(([element, fingerprint]) => {
+ if (processedElements.has(element)) return;
+
+ const currentGroup = [element];
+ processedElements.add(element);
+
+ // Only compare within same tag group
+ for (const [otherElement, otherFingerprint] of tagElements) {
+ if (processedElements.has(otherElement)) continue;
+
+ const similarity = this.calculateSimilarity(fingerprint, otherFingerprint);
+ if (similarity >= this.groupingConfig.similarityThreshold) {
+ currentGroup.push(otherElement);
+ processedElements.add(otherElement);
+ }
+ }
+ // ... rest of group processing
+ });
+});Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/helpers/clientSelectorGenerator.ts around lines 340 to 358, the current
nested loops cause O(n²) complexity by comparing every element fingerprint with
all others, leading to performance issues on large pages. To fix this, first
group elements by tag name and normalized class names to limit comparisons
within smaller groups, then apply similarity checks only inside these groups.
Optionally, implement locality-sensitive hashing on fingerprints to bucket
similar elements and add early termination in the inner loop once a sufficient
group size is reached, reducing unnecessary comparisons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
src/components/browser/BrowserWindow.tsx (4)
150-150: Still using inline type definitions despite previous feedback.The past review comments recommended defining proper interfaces for complex types like
highlighterDataandgroupInfo, but this change continues to use inline type definitions. This makes the code harder to maintain and reuse.Also applies to: 172-176
662-675: Remove duplicate useEffect hook.This useEffect hook is an exact duplicate of the one at lines 647-660. This will cause event listeners to be registered twice, leading to potential performance issues and unexpected behavior.
Remove this duplicate block entirely.
874-1041: Significant code duplication with handleDOMElementSelection.The handleClick function duplicates most of the logic from handleDOMElementSelection, including text/list selection, attribute handling, and pagination logic. This violates the DRY principle and makes maintenance difficult.
Since handleDOMElementSelection is already designed to handle element selection in DOM mode, consider:
- Remove handleClick entirely and use handleDOMElementSelection for both modes
- Or extract the common logic into reusable functions
988-999: Extract XPath cleaning logic to a utility function.The selector cleaning logic for removing index patterns is duplicated in multiple places.
Create a utility function:
const cleanSelector = (selector: string): string => { // Remove index patterns for grouping if (selector.includes("[") && selector.match(/\[\d+\]/)) { return selector.replace(/\[\d+\]/g, ""); } // Clean XPath selectors if (selector.includes("/")) { const xpathParts = selector.split("/").filter(part => part); const cleanedParts = xpathParts.map(part => part.replace(/\[\d+\]/g, "")); if (cleanedParts.length > 0) { return "//" + cleanedParts.join("/"); } } return selector; };
🧹 Nitpick comments (2)
src/components/browser/BrowserWindow.tsx (2)
488-513: Complex but correct group element handling.The logic for mapping group element coordinates is comprehensive and appears correct. However, this coordinate mapping logic is quite complex and could benefit from extraction to a helper function for better maintainability.
Consider extracting the coordinate mapping logic:
const mapGroupElementCoordinates = (groupElements: HTMLElement[], iframeRect: DOMRect) => { return groupElements.map(element => { const elementRect = element.getBoundingClientRect(); return { element, rect: new DOMRect( elementRect.x + iframeRect.left - IFRAME_BODY_PADDING, elementRect.y + iframeRect.top - IFRAME_BODY_PADDING, elementRect.width, elementRect.height ), }; }); };
691-871: Consider breaking down this large function.The
handleDOMElementSelectionfunction is quite comprehensive but at ~180 lines, it's quite large and handles multiple distinct concerns (pagination, group selection, list selection, text selection). Consider breaking it down into smaller, more focused functions for better maintainability and testability.Consider extracting separate handlers:
const handlePaginationSelection = (highlighterData, paginationType) => { ... }; const handleGroupListSelection = (highlighterData) => { ... }; const handleListFieldSelection = (highlighterData) => { ... }; const handleTextSelection = (highlighterData) => { ... };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/components/browser/BrowserWindow.tsx(14 hunks)
🔇 Additional comments (4)
src/components/browser/BrowserWindow.tsx (4)
315-329: Good implementation of child selector caching.The logic properly clears cached selectors and regenerates them when the list selector changes. The conditional check for
currentSnapshotand iframe document ensures the operation only runs when the DOM is available.
442-445: Good performance optimization.Adding an early return when neither
getTextnorgetListis active prevents unnecessary processing of highlighter data when no capture modes are enabled.
1216-1296: Excellent group highlighting implementation.The enhanced DOM mode highlighting provides clear visual feedback for both individual elements and group selections. The use of dashed borders for group elements and numbered labels helps users understand they're selecting a list. The conditional rendering logic properly handles different selection states.
1310-1310: Good prop addition for cached selector optimization.Passing
cachedChildSelectorsto theDOMBrowserRendererallows the renderer to leverage the cached selector data, which should improve performance by avoiding repeated selector calculations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
src/components/browser/BrowserWindow.tsx (4)
150-150: Define proper interfaces for complex types.The inline type definition for highlighterData makes the code harder to maintain. Consider extracting it to a proper interface as suggested in the previous review.
656-669: Remove duplicate useEffect hook.This useEffect hook duplicates the one at lines 641-654, causing event listeners to be registered twice.
868-1035: Significant code duplication with handleDOMElementSelection.The handleClick function duplicates most of the logic from handleDOMElementSelection. Consider removing it and using handleDOMElementSelection for both modes as suggested in the previous review.
982-993: Extract XPath cleaning logic to a utility function.The selector cleaning logic is duplicated here and in handleDOMElementSelection. Extract it to a reusable utility function as suggested in the previous review.
🧹 Nitpick comments (4)
server/src/browser-management/classes/RemoteBrowser.ts (1)
445-465: Solid debounced network monitoring implementation.The network request tracking and timeout-based snapshot triggering is well implemented. Consider adding a maximum limit to
pendingNetworkRequestsarray to prevent potential memory issues if many requests accumulate.this.pendingNetworkRequests.push(url); + + // Prevent memory leaks by limiting tracked requests + if (this.pendingNetworkRequests.length > 100) { + this.pendingNetworkRequests = this.pendingNetworkRequests.slice(-50); + }maxun-core/src/browserSide/scraper.js (3)
697-703: Use optional chaining for cleaner code.Consider using optional chaining for better readability.
Apply this diff to use optional chaining:
for (const attr of dataAttributes) { const dataValue = element.getAttribute(attr); - if (dataValue && dataValue.trim()) { + if (dataValue?.trim()) { textContent = dataValue.trim(); break; } }
936-960: Consider escaping regex special characters in pattern matching.The pattern matching logic might fail if
listSelectorcontains regex special characters. Consider escaping the pattern before using it in replace operations.const createIndexedXPath = ( childSelector, listSelector, containerIndex ) => { + // Escape regex special characters + const escapeRegex = (str) => str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); + // Check if the child selector contains the list selector pattern if (childSelector.includes(listSelector.replace("//", ""))) { // Replace the list selector part with indexed version const listPattern = listSelector.replace("//", ""); + const escapedPattern = escapeRegex(listPattern); const indexedListSelector = `(${listSelector})[${containerIndex}]`; const indexedSelector = childSelector.replace( - `//${listPattern}`, + `//${escapedPattern}`, indexedListSelector );
425-1263: Consider modularizing the scrapeList function for better maintainability.The
scrapeListfunction is quite large (300+ lines). Consider extracting the XPath-based extraction logic (lines 1000-1047) and CSS-based table extraction logic (lines 1076-1224) into separate helper functions to improve readability and testability.Example structure:
const extractXPathData = (containers, fields, limit, document) => { // XPath extraction logic }; const extractCSSTableData = (containers, containerFields, limit) => { // CSS table extraction logic }; window.scrapeList = async function ({ listSelector, fields, limit = 10 }) { // ... setup code ... if (isXPathSelector(listSelector)) { return extractXPathData(containers, fields, limit, document); } // ... CSS-based extraction ... };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
maxun-core/src/browserSide/scraper.js(2 hunks)server/src/browser-management/classes/RemoteBrowser.ts(13 hunks)src/components/browser/BrowserWindow.tsx(17 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
maxun-core/src/browserSide/scraper.js
[error] 616-616: Unnecessary continue statement
Unsafe fix: Delete the unnecessary continue statement
(lint/correctness/noUnnecessaryContinue)
[error] 699-699: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
🔇 Additional comments (13)
src/components/browser/BrowserWindow.tsx (4)
308-324: LGTM! Well-implemented child selector caching.The logic correctly populates cached child selectors when a list selector is set, with proper null checks and state clearing.
428-547: Well-structured group element handling in domHighlighterHandler.The implementation correctly:
- Guards against unnecessary processing when no actions are active
- Maps group element rectangles to absolute coordinates
- Maintains backward compatibility for non-group elements
- Simplifies the highlighter data setting logic
Good refactoring!
1210-1291: Excellent implementation of group element highlighting!The visual distinction between individual elements (solid border) and group elements (dashed border with labels) provides clear user feedback. The bounds checking ensures highlights stay within viewport boundaries.
1304-1304: Good optimization passing cached child selectors.Passing
cachedChildSelectorsto the child component avoids redundant computation and improves performance.server/src/browser-management/classes/RemoteBrowser.ts (8)
201-203: LGTM! Simplified network monitoring approach.The replacement of complex network resource caching with a simple debounced mechanism using timeout and request tracking is a good simplification that should improve maintainability.
253-253: Good simplification of snapshot processing.Returning raw snapshots without complex resource embedding aligns with the simplified approach and should improve performance while maintaining API compatibility.
590-594: Excellent use of rrweb's built-in capabilities.Leveraging rrweb's
inlineImagesandcollectFontsoptions instead of custom resource processing is a much cleaner approach that delegates complexity to the well-tested library.
610-618: Robust error handling for navigation scenarios.The enhanced error handling properly distinguishes between expected navigation/closure events and actual errors, preventing unnecessary error emissions during normal operation.
652-655: Good cleanup practices.Proper cleanup of the network request timeout prevents memory leaks and ensures no dangling timers remain active after stopping DOM streaming.
706-709: Good defensive programming.Adding the page closure check prevents runtime errors and provides graceful degradation when attempting to extract data from a closed page.
1536-1536: Excellent robustness improvements.The additional page closure checks, enhanced error handling in
switchOff, and proper DOM/screenshot mode handling inchangeTabsignificantly improve the reliability of the browser management system.Also applies to: 1675-1703, 1825-1825, 1837-1843
619-619: Bundle File Path Verified – No Action NeededThe rrweb bundle is present at
server/src/browser-management/classes/rrweb-bundle.js, and injecting it at the context level is both correct and efficient. Approving these changes.maxun-core/src/browserSide/scraper.js (1)
963-1047: Well-architected unified scraping implementation!The separation of XPath and CSS selector handling is clean and maintains backward compatibility while adding powerful new capabilities. The indexed XPath approach for container-based extraction is particularly elegant.
| // For each event, include userId to make sure events are handled for the correct browser | ||
| logger.log("debug", `Registering editor events for user: ${this.userId}`); | ||
|
|
||
| this.socket.on(`captureDirectScreenshot:${this.userId}`, async (settings) => { | ||
| logger.debug(`Direct screenshot capture requested for user ${this.userId}`); | ||
| this.socket.on( | ||
| `captureDirectScreenshot:${this.userId}`, | ||
| async (settings) => { | ||
| logger.debug( | ||
| `Direct screenshot capture requested for user ${this.userId}` | ||
| ); | ||
| await this.captureDirectScreenshot(settings); | ||
| }); | ||
| } | ||
| ); | ||
|
|
||
| // For backward compatibility | ||
| this.socket.on('captureDirectScreenshot', async (settings) => { | ||
| await this.captureDirectScreenshot(settings); | ||
| }); | ||
|
|
||
| // Listen for specific events for this user | ||
| this.socket.on(`rerender:${this.userId}`, async () => { | ||
| logger.debug(`Rerender event received for user ${this.userId}`); | ||
| await this.makeAndEmitScreenshot(); | ||
| }); | ||
|
|
||
| // For backward compatibility, also listen to the general event | ||
| this.socket.on('rerender', async () => { | ||
| logger.debug(`General rerender event received, checking if for user ${this.userId}`); | ||
| await this.makeAndEmitScreenshot(); | ||
| }); | ||
|
|
||
| this.socket.on(`settings:${this.userId}`, (settings) => { | ||
| this.interpreterSettings = settings; | ||
| logger.debug(`Settings updated for user ${this.userId}`); | ||
| }); | ||
|
|
||
| this.socket.on(`changeTab:${this.userId}`, async (tabIndex) => { | ||
| logger.debug(`Tab change to ${tabIndex} requested for user ${this.userId}`); | ||
| await this.changeTab(tabIndex); | ||
| }); | ||
|
|
||
| this.socket.on(`addTab:${this.userId}`, async () => { | ||
| logger.debug(`New tab requested for user ${this.userId}`); | ||
| await this.currentPage?.context().newPage(); | ||
| const lastTabIndex = this.currentPage ? this.currentPage.context().pages().length - 1 : 0; | ||
| await this.changeTab(lastTabIndex); | ||
| }); | ||
|
|
||
| this.socket.on(`closeTab:${this.userId}`, async (tabInfo) => { | ||
| logger.debug(`Close tab ${tabInfo.index} requested for user ${this.userId}`); | ||
| const page = this.currentPage?.context().pages()[tabInfo.index]; | ||
| if (page) { | ||
| if (tabInfo.isCurrent) { | ||
| if (this.currentPage?.context().pages()[tabInfo.index + 1]) { | ||
| // next tab | ||
| await this.changeTab(tabInfo.index + 1); | ||
| } else { | ||
| //previous tab | ||
| await this.changeTab(tabInfo.index - 1); | ||
| } | ||
| } | ||
| await page.close(); | ||
| logger.log( | ||
| 'debug', | ||
| `Tab ${tabInfo.index} was closed for user ${this.userId}, new tab count: ${this.currentPage?.context().pages().length}` | ||
| ); | ||
| // For backward compatibility | ||
| this.socket.on("captureDirectScreenshot", async (settings) => { | ||
| await this.captureDirectScreenshot(settings); | ||
| }); | ||
|
|
||
| // Listen for specific events for this user | ||
| this.socket.on(`rerender:${this.userId}`, async () => { | ||
| logger.debug(`Rerender event received for user ${this.userId}`); | ||
| if (this.renderingMode === "dom") { | ||
| await this.makeAndEmitDOMSnapshot(); | ||
| } else { | ||
| await this.makeAndEmitScreenshot(); | ||
| } | ||
| }); | ||
|
|
||
| this.socket.on("rerender", async () => { | ||
| logger.debug( | ||
| `General rerender event received, checking if for user ${this.userId}` | ||
| ); | ||
| if (this.renderingMode === "dom") { | ||
| await this.makeAndEmitDOMSnapshot(); | ||
| } else { | ||
| await this.makeAndEmitScreenshot(); | ||
| } | ||
| }); | ||
|
|
||
| this.socket.on(`settings:${this.userId}`, (settings) => { | ||
| this.interpreterSettings = settings; | ||
| logger.debug(`Settings updated for user ${this.userId}`); | ||
| }); | ||
|
|
||
| this.socket.on(`changeTab:${this.userId}`, async (tabIndex) => { | ||
| logger.debug( | ||
| `Tab change to ${tabIndex} requested for user ${this.userId}` | ||
| ); | ||
| await this.changeTab(tabIndex); | ||
| }); | ||
|
|
||
| this.socket.on(`addTab:${this.userId}`, async () => { | ||
| logger.debug(`New tab requested for user ${this.userId}`); | ||
| await this.currentPage?.context().newPage(); | ||
| const lastTabIndex = this.currentPage | ||
| ? this.currentPage.context().pages().length - 1 | ||
| : 0; | ||
| await this.changeTab(lastTabIndex); | ||
| }); | ||
|
|
||
| this.socket.on(`closeTab:${this.userId}`, async (tabInfo) => { | ||
| logger.debug( | ||
| `Close tab ${tabInfo.index} requested for user ${this.userId}` | ||
| ); | ||
| const page = this.currentPage?.context().pages()[tabInfo.index]; | ||
| if (page) { | ||
| if (tabInfo.isCurrent) { | ||
| if (this.currentPage?.context().pages()[tabInfo.index + 1]) { | ||
| // next tab | ||
| await this.changeTab(tabInfo.index + 1); | ||
| } else { | ||
| logger.log('error', `Tab index ${tabInfo.index} out of range for user ${this.userId}`); | ||
| } | ||
| }); | ||
|
|
||
| this.socket.on(`setViewportSize:${this.userId}`, async (data: { width: number, height: number }) => { | ||
| const { width, height } = data; | ||
| logger.log('debug', `Viewport size change to width=${width}, height=${height} requested for user ${this.userId}`); | ||
|
|
||
| // Update the browser context's viewport dynamically | ||
| if (this.context && this.browser) { | ||
| this.context = await this.browser.newContext({ viewport: { width, height } }); | ||
| logger.log('debug', `Viewport size updated to width=${width}, height=${height} for user ${this.userId}`); | ||
| } | ||
| }); | ||
|
|
||
| // For backward compatibility, also register the standard events | ||
| this.socket.on('settings', (settings) => this.interpreterSettings = settings); | ||
| this.socket.on('changeTab', async (tabIndex) => await this.changeTab(tabIndex)); | ||
| this.socket.on('addTab', async () => { | ||
| await this.currentPage?.context().newPage(); | ||
| const lastTabIndex = this.currentPage ? this.currentPage.context().pages().length - 1 : 0; | ||
| await this.changeTab(lastTabIndex); | ||
| }); | ||
| this.socket.on('closeTab', async (tabInfo) => { | ||
| const page = this.currentPage?.context().pages()[tabInfo.index]; | ||
| if (page) { | ||
| if (tabInfo.isCurrent) { | ||
| if (this.currentPage?.context().pages()[tabInfo.index + 1]) { | ||
| await this.changeTab(tabInfo.index + 1); | ||
| } else { | ||
| await this.changeTab(tabInfo.index - 1); | ||
| } | ||
| } | ||
| await page.close(); | ||
| //previous tab | ||
| await this.changeTab(tabInfo.index - 1); | ||
| } | ||
| }); | ||
| this.socket.on('setViewportSize', async (data: { width: number, height: number }) => { | ||
| const { width, height } = data; | ||
| if (this.context && this.browser) { | ||
| this.context = await this.browser.newContext({ viewport: { width, height } }); | ||
| } | ||
| await page.close(); | ||
| logger.log( | ||
| "debug", | ||
| `Tab ${tabInfo.index} was closed for user ${ | ||
| this.userId | ||
| }, new tab count: ${this.currentPage?.context().pages().length}` | ||
| ); | ||
| } else { | ||
| logger.log( | ||
| "error", | ||
| `Tab index ${tabInfo.index} out of range for user ${this.userId}` | ||
| ); | ||
| } | ||
| }); | ||
|
|
||
| this.socket.on( | ||
| `setViewportSize:${this.userId}`, | ||
| async (data: { width: number; height: number }) => { | ||
| const { width, height } = data; | ||
| logger.log( | ||
| "debug", | ||
| `Viewport size change to width=${width}, height=${height} requested for user ${this.userId}` | ||
| ); | ||
|
|
||
| // Update the browser context's viewport dynamically | ||
| if (this.context && this.browser) { | ||
| this.context = await this.browser.newContext({ | ||
| viewport: { width, height }, | ||
| }); | ||
| logger.log( | ||
| "debug", | ||
| `Viewport size updated to width=${width}, height=${height} for user ${this.userId}` | ||
| ); | ||
| } | ||
| } | ||
| ); | ||
|
|
||
| // For backward compatibility, also register the standard events | ||
| this.socket.on( | ||
| "settings", | ||
| (settings) => (this.interpreterSettings = settings) | ||
| ); | ||
| this.socket.on( | ||
| "changeTab", | ||
| async (tabIndex) => await this.changeTab(tabIndex) | ||
| ); | ||
| this.socket.on("addTab", async () => { | ||
| await this.currentPage?.context().newPage(); | ||
| const lastTabIndex = this.currentPage | ||
| ? this.currentPage.context().pages().length - 1 | ||
| : 0; | ||
| await this.changeTab(lastTabIndex); | ||
| }); | ||
| this.socket.on("closeTab", async (tabInfo) => { | ||
| const page = this.currentPage?.context().pages()[tabInfo.index]; | ||
| if (page) { | ||
| if (tabInfo.isCurrent) { | ||
| if (this.currentPage?.context().pages()[tabInfo.index + 1]) { | ||
| await this.changeTab(tabInfo.index + 1); | ||
| } else { | ||
| await this.changeTab(tabInfo.index - 1); | ||
| } | ||
| }); | ||
| } | ||
| await page.close(); | ||
| } | ||
| }); | ||
| this.socket.on( | ||
| "setViewportSize", | ||
| async (data: { width: number; height: number }) => { | ||
| const { width, height } = data; | ||
| if (this.context && this.browser) { | ||
| this.context = await this.browser.newContext({ | ||
| viewport: { width, height }, | ||
| }); | ||
| } | ||
| } | ||
| ); | ||
|
|
||
| this.socket.on('extractListData', async (data: { | ||
| listSelector: string, | ||
| fields: Record<string, any>, | ||
| currentListId: number, | ||
| pagination: any | ||
| this.socket.on( | ||
| "extractListData", | ||
| async (data: { | ||
| listSelector: string; | ||
| fields: Record<string, any>; | ||
| currentListId: number; | ||
| pagination: any; | ||
| }) => { | ||
| if (this.currentPage) { | ||
| const extractedData = await this.extractListData( | ||
| this.currentPage, | ||
| data.listSelector, | ||
| data.fields | ||
| ); | ||
|
|
||
| this.socket.emit('listDataExtracted', { | ||
| currentListId: data.currentListId, | ||
| data: extractedData | ||
| }); | ||
| } | ||
| }); | ||
| if (this.currentPage) { | ||
| const extractedData = await this.extractListData( | ||
| this.currentPage, | ||
| data.listSelector, | ||
| data.fields | ||
| ); | ||
|
|
||
| this.socket.emit("listDataExtracted", { | ||
| currentListId: data.currentListId, | ||
| data: extractedData, | ||
| }); | ||
| } | ||
| } | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Significant architectural improvement with room for optimization.
The user-specific event handling is a major improvement for multi-user support. However, the extensive code duplication between user-specific and general event handlers could be refactored for better maintainability.
Consider creating a helper function to reduce duplication:
+ private registerEventHandler(eventName: string, handler: (...args: any[]) => void) {
+ // Register user-specific event
+ this.socket.on(`${eventName}:${this.userId}`, handler);
+ // Register general event for backward compatibility
+ this.socket.on(eventName, handler);
+ }Then use it to replace the duplicated event registrations. Also consider reducing the verbose logging for production environments.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/browser-management/classes/RemoteBrowser.ts between lines 1289 and
1463, there is significant duplication in socket event handlers for
user-specific and general events. To fix this, create a helper function that
registers an event handler with optional userId scoping, then replace the
duplicated event registrations by calling this helper with appropriate
parameters. Additionally, refactor the logging to be less verbose or conditional
based on environment to improve maintainability and performance.
| } catch (e) { | ||
| console.warn( | ||
| `Cannot access ${element.tagName.toLowerCase()} content:`, | ||
| e | ||
| ); | ||
| continue; | ||
| } | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unnecessary continue statement.
The continue statement at line 616 is unnecessary as it's the last statement in the loop iteration.
Apply this diff to remove the unnecessary continue:
} catch (e) {
console.warn(
`Cannot access ${element.tagName.toLowerCase()} content:`,
e
);
- continue;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (e) { | |
| console.warn( | |
| `Cannot access ${element.tagName.toLowerCase()} content:`, | |
| e | |
| ); | |
| continue; | |
| } | |
| } else { | |
| } catch (e) { | |
| console.warn( | |
| `Cannot access ${element.tagName.toLowerCase()} content:`, | |
| e | |
| ); | |
| } | |
| } else { |
🧰 Tools
🪛 Biome (1.9.4)
[error] 616-616: Unnecessary continue statement
Unsafe fix: Delete the unnecessary continue statement
(lint/correctness/noUnnecessaryContinue)
🤖 Prompt for AI Agents
In maxun-core/src/browserSide/scraper.js around lines 611 to 618, the continue
statement inside the catch block is redundant because it is the last statement
in the loop iteration. Remove the continue statement to clean up the code
without changing its behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
src/components/recorder/DOMBrowserRenderer.tsx (1)
880-890: Fix undefined variable reference.The code references
minimalCSSwhich is not defined in the current scope. It should use thecssInjectionvariable that was constructed earlier.Apply this fix:
if (headTagRegex.test(rebuiltHTML)) { rebuiltHTML = rebuiltHTML.replace( "<head>", - `<head><base href="${snapshotData.baseUrl}">${minimalCSS}` + `<head>${cssInjection}` ); } else if (rebuiltHTML.includes("<html>")) { rebuiltHTML = rebuiltHTML.replace( "<html>", - `<html><head><base href="${snapshotData.baseUrl}">${minimalCSS}</head>` + `<html><head>${cssInjection}</head>` ); }
♻️ Duplicate comments (3)
src/components/recorder/RightSidePanel.tsx (1)
263-263: Avoid hardcoding the extraction limit.The extraction limit is still hardcoded to 5, which reduces flexibility. This should be configurable based on user settings or use a named constant.
Consider using the limit from the list step configuration:
+ const listStep = browserSteps.find(step => step.id === currentListId); + const limit = listStep?.limit || 10; // Default to 10 if not set + const extractedData = clientListExtractor.extractListData( iframeDoc, listSelector, fields, - 5 + limit );src/components/recorder/DOMBrowserRenderer.tsx (2)
110-110: Improve type safety by defining proper types for groupInfo.The
groupInfoparameter usesanytype, which reduces type safety and makes the code harder to maintain.Define a proper interface for groupInfo:
+interface GroupInfo { + isGroupElement: boolean; + groupSize: number; + groupElements: HTMLElement[]; + groupFingerprint?: ElementFingerprint; +} interface RRWebDOMBrowserRendererProps { // ... other props onHighlight?: (data: { rect: DOMRect; selector: string; elementInfo: ElementInfo | null; childSelectors?: string[]; - groupInfo?: any; + groupInfo?: GroupInfo; }) => void; onElementSelect?: (data: { rect: DOMRect; selector: string; elementInfo: ElementInfo | null; childSelectors?: string[]; - groupInfo?: any; + groupInfo?: GroupInfo; }) => void;Also applies to: 117-117
398-416: Remove redundant selector generation call.The code retrieves highlighter data again to get groupInfo, but this information should already be available from the currentHighlight state that was set during the last mouse move.
Use the existing highlight data:
if (currentHighlight && onElementSelect) { - // Get the group info for the current highlight - const highlighterData = - clientSelectorGenerator.generateDataForHighlighter( - { x: iframeX, y: iframeY }, - iframeDoc, - true, - cachedChildSelectors - ); - onElementSelect({ rect: currentHighlight.rect, selector: currentHighlight.selector, elementInfo: currentHighlight.elementInfo, childSelectors: cachedChildSelectors.length > 0 ? cachedChildSelectors - : highlighterData?.childSelectors || [], - groupInfo: highlighterData?.groupInfo, + : currentHighlight.childSelectors || [], + groupInfo: currentHighlight.groupInfo, }); }To support this, update the currentHighlight state interface to include groupInfo.
🧹 Nitpick comments (1)
src/components/recorder/DOMBrowserRenderer.tsx (1)
800-870: Consider simplifying the comprehensive CSS injection.While the enhanced CSS provides good styling coverage, the complexity might introduce unexpected side effects. Consider breaking this into smaller, more focused CSS blocks for easier maintenance.
The current implementation handles fonts, external stylesheets, scrollbar hiding, and general styling all in one large block. Consider modularizing:
+const createFontCSS = (fonts) => fonts.map(font => `...`).join('\n'); +const createScrollbarCSS = () => `/* Hide scrollbars */ ...`; +const createBaseCSS = () => `/* Base styles */ ...`; + +const enhancedCSS = [ + createBaseCSS(), + createScrollbarCSS(), + ...(snapshotData.resources.fonts?.length > 0 ? [createFontCSS(snapshotData.resources.fonts)] : []), + ...(snapshotData.resources.stylesheets?.length > 0 ? [externalCSS] : []) +].join('\n\n');
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/components/recorder/DOMBrowserRenderer.tsx(11 hunks)src/components/recorder/RightSidePanel.tsx(11 hunks)src/context/globalInfo.tsx(6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/components/recorder/DOMBrowserRenderer.tsx (1)
src/helpers/clientSelectorGenerator.ts (1)
clientSelectorGenerator(3576-3576)
🪛 Biome (1.9.4)
src/components/recorder/RightSidePanel.tsx
[error] 100-100: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
🔇 Additional comments (15)
src/components/recorder/RightSidePanel.tsx (6)
25-25: LGTM: Good addition of selector generator import.The import is properly added and will be used for cleanup operations throughout the file.
57-57: Excellent: Improved state management with global store.The integration with
updateDOMModefrom the global store provides better centralized state management for DOM mode and snapshot data.
88-88: LGTM: Consistent global DOM mode usage.All DOM mode state updates now properly use the global
updateDOMModefunction, ensuring consistent state management across the application.Also applies to: 94-94, 101-101, 116-116
246-257: Helpful debugging enhancement for XPath selectors.The logging provides valuable insights into field selector types and XPath detection, which will aid in troubleshooting extraction issues.
268-276: Excellent user feedback for extraction issues.The warning notification when no data is extracted provides valuable feedback to users about potential selector problems, improving the user experience.
460-460: Proper cleanup implementation.The
clientSelectorGenerator.cleanup()calls are strategically placed after capture operations complete or are discarded, ensuring proper state reset and preventing memory leaks.Also applies to: 529-529, 659-659, 690-690, 709-709
src/context/globalInfo.tsx (4)
30-63: Well-structured ProcessedSnapshot interface.The interface comprehensively covers all aspects of snapshot data including resources, metadata, and processing statistics. The typing is thorough and will help prevent runtime errors.
124-128: Excellent extension of GlobalInfo interface.The addition of DOM mode state and snapshot management properties follows good TypeScript practices with proper nullable types and clear method signatures.
158-159: Proper state initialization.The default values and state hooks are correctly implemented with appropriate initial values (
falsefor isDOMMode andnullfor currentSnapshot).Also applies to: 186-187
212-222: Smart DOM mode update logic.The
updateDOMModefunction handles the relationship between DOM mode and snapshot data correctly - clearing the snapshot when DOM mode is disabled and allowing optional snapshot updates.src/components/recorder/DOMBrowserRenderer.tsx (5)
101-101: Good addition of cachedChildSelectors support.The new prop enables cached child selectors to influence highlight generation, which aligns with the PR's goal of improving capture accuracy.
Also applies to: 150-150
212-222: Efficient early exit optimization.The early return when not in capture mode prevents unnecessary processing and improves performance, which is a good optimization.
228-229: Proper integration of cached child selectors.The integration of
cachedChildSelectorsinto the highlight data generation ensures consistent selector behavior across the highlighting system.
250-275: Improved highlighting logic for group elements.The refactored logic properly handles group element highlighting in the first phase and child selector validation in the second phase, with helpful debug logging.
356-358: Good performance optimization.Skipping mouse move processing when not in capture mode reduces unnecessary computation and improves overall performance.
What this PR does?
Improves the capture list action's accuracy in capturing data.
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Chores