feat: click next improvements #482
Walkthrough
The changes enhance pagination handling in the
Changes
Sequence Diagram(s)
sequenceDiagram
participant I as Interpreter
participant B as Browser/DOM
participant L as Logger
I->>B: Execute captureContentSignature (before click)
I->>B: Attempt to click pagination button
alt Click succeeds
I->>L: Log pagination success
else Click fails
I->>B: Dispatch fallback click event
I->>L: Log retry attempt and error
end
I->>B: Capture new signature post-click
I->>L: Compare signatures and set paginationSuccess
Actionable comments posted: 0
🧹 Nitpick comments (1)
maxun-core/src/interpret.ts (1)
774-781: Consider adding more specific error handling. In the catch block for the dispatch event with navigation, you're catching a generic error but not logging its specific details. This could make debugging more difficult in production.
  } catch (dispatchNavError) {
    try {
      await button.click();
      await page.waitForTimeout(2000);
    } catch (clickError) {
+     debugLog(`Click error after dispatch failure: ${clickError.message}`);
      await button.dispatchEvent('click');
      await page.waitForTimeout(2000);
    }
  }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
maxun-core/src/interpret.ts(2 hunks)
🔇 Additional comments (8)
maxun-core/src/interpret.ts (8)
732-741: Great addition of content signature tracking. The new captureContentSignature function smartly captures the URL, item count, and first few items before pagination, which will help reliably detect content changes even when the URL remains the same.
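As a rough illustration of the idea, a signature of this kind could be built as below. This is a minimal sketch, not the PR's implementation: the `ContentSignature` shape, the `buildContentSignature` name, and the `sampleSize` parameter are all assumptions, and the item texts are passed in directly rather than read from the page.

```typescript
// Hypothetical shape; the PR's captureContentSignature may return something different.
interface ContentSignature {
  url: string;
  itemCount: number;
  firstItems: string[];
}

// Pure helper (assumed name): summarise the current listing so two
// states of the page can be compared after a pagination click.
function buildContentSignature(
  url: string,
  itemTexts: string[],
  sampleSize = 3, // assumed default for "first few items"
): ContentSignature {
  return {
    url,
    itemCount: itemTexts.length,
    // Keep only a small, trimmed sample so the signature stays cheap to compare.
    firstItems: itemTexts.slice(0, sampleSize).map((text) => text.trim()),
  };
}
```

Keeping the helper pure means the DOM read can live in one place and the comparison logic can be tested without a browser.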
743-744: Good improvement to logging. Capturing the item count before clicking provides valuable debugging information and will make it easier to track pagination progress.
729-746: Better naming with paginationSuccess variable. Renaming from navigationSuccess to paginationSuccess more accurately reflects the operation being performed, improving code readability and maintenance.
760-783: Robust enhancement to click handling. The addition of a fallback mechanism using dispatchEvent('click') when the regular click fails is an excellent way to improve reliability. Some websites have complex event handling that might block regular clicks but respond to dispatch events.
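The fallback pattern described here can be sketched in isolation. The `Clickable` interface below is an assumption that mirrors only the two methods used in the diff (`click()` and `dispatchEvent(type)`), and `clickWithFallback` is a hypothetical name; the PR's actual ordering of attempts and its timeouts are not reproduced.

```typescript
// Assumed minimal surface of the button handle; only the two calls used here.
interface Clickable {
  click(): Promise<void>;
  dispatchEvent(type: string): Promise<void>;
}

type ClickOutcome = "click" | "dispatchEvent";

// Try a regular click first; if it throws, log why and dispatch a
// synthetic click event instead. Returns which path succeeded.
async function clickWithFallback(
  button: Clickable,
  log: (message: string) => void = console.log,
): Promise<ClickOutcome> {
  try {
    await button.click();
    return "click";
  } catch (error) {
    // Catch variables may be typed `unknown`, so narrow before reading `.message`.
    const message = error instanceof Error ? error.message : String(error);
    log(`Regular click failed (${message}); falling back to dispatchEvent`);
    await button.dispatchEvent("click");
    return "dispatchEvent";
  }
}
```

If the fallback dispatch also throws, the error propagates to the caller, which can then decide whether to retry.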
787-804: Comprehensive page change detection. The change detection now checks multiple indicators (URL change, content change, item count change) which will make pagination more reliable across different website implementations.
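A comparison over those three indicators might look like the sketch below. The `PageSnapshot` type and the `hasPageChanged` name are hypothetical, and the check treats any single differing indicator as a successful page change, which is an assumption about how the signals are combined.

```typescript
// Hypothetical snapshot of the page taken before and after the click.
interface PageSnapshot {
  url: string;
  itemCount: number;
  firstItems: string[];
}

// Returns true if any indicator differs: URL, item count, or sampled content.
function hasPageChanged(before: PageSnapshot, after: PageSnapshot): boolean {
  const urlChanged = before.url !== after.url;
  const countChanged = before.itemCount !== after.itemCount;
  const contentChanged =
    before.firstItems.length !== after.firstItems.length ||
    before.firstItems.some((item, index) => item !== after.firstItems[index]);
  return urlChanged || countChanged || contentChanged;
}
```

Comparing sampled content matters for sites that replace the list in place, where the URL and the item count can both stay the same across pages.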
806-807: Improved error logging. Including the error message in the log makes debugging much easier, especially for intermittent pagination issues.
818-820: Enhanced failure reporting. Adding a specific log message for pagination failure helps with debugging and understanding when the scraper has stopped due to pagination issues rather than other errors.
812-813: Clear retry attempt logging. Including the current attempt number and maximum retries in the log message provides better context for debugging pagination issues.
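For illustration, retry logging of this kind could be structured as follows. This is a sketch under assumptions: `retryWithLogging`, its parameters, and the exact log wording are hypothetical and are not taken from interpret.ts.

```typescript
// Runs `attemptFn` up to `maxRetries` times, logging "attempt N/M" each time.
// An attempt succeeds when it resolves to true; a thrown error counts as a failure.
async function retryWithLogging(
  attemptFn: () => Promise<boolean>,
  maxRetries: number,
  log: (message: string) => void = console.log,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    log(`Pagination attempt ${attempt}/${maxRetries}`);
    try {
      if (await attemptFn()) {
        return true;
      }
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      log(`Attempt ${attempt}/${maxRetries} failed: ${message}`);
    }
  }
  // Distinct final message so a pagination stop is easy to tell apart from other errors.
  log(`Pagination failed after ${maxRetries} attempts`);
  return false;
}
```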