feat: optimize HTML text extraction in TableWidgetV2#38153
feat: optimize HTML text extraction in TableWidgetV2#38153rahulbarwal merged 6 commits intoreleasefrom
Conversation
- Enhanced the getTextFromHTML function with error handling to prevent failures when processing invalid HTML. - Introduced logic to conditionally extract HTML text only when necessary, improving performance. - Updated sorting and filtering mechanisms to utilize extracted HTML text, ensuring accurate comparisons and display.
WalkthroughThe changes in the pull request enhance the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/client/src/widgets/TableWidgetV2/widget/derived.js(5 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
app/client/src/widgets/TableWidgetV2/widget/derived.js
[error] 540-543: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Unsafe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 544-547: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Unsafe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
🔇 Additional comments (3)
app/client/src/widgets/TableWidgetV2/widget/derived.js (3)
287-295: Good addition of error handling in getTextFromHTML
The try-catch block enhances robustness when processing invalid HTML content.
850-866: Efficiently processing HTML values with conditional checks
Optimizes performance by avoiding unnecessary calls to getTextFromHTML when not needed.
919-926: Verify existence of extracted HTML text keys
Ensure that the keys returned by getKeyForExtractedTextFromHTML exist in the data to prevent potential undefined values.
Please confirm that all possible HTML columns are accounted for and handled correctly.
- Optimized HTML text extraction logic to conditionally call getTextFromHTML, improving performance. - Updated filtering and sorting mechanisms to account for HTML columns, ensuring accurate data handling. - Removed redundant code related to HTML column checks, streamlining the implementation.
…es` to better reflect its purpose.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
app/client/src/widgets/TableWidgetV2/widget/derived.js (2)
287-295: Consider using DOMParser for better performanceWhile the current implementation works, using DOMParser would be more efficient than creating a DOM element.
- try { - const tempDiv = document.createElement("div"); - tempDiv.innerHTML = html; - return tempDiv.textContent || tempDiv.innerText || ""; - } catch (e) { - return ""; - } + try { + const parser = new DOMParser(); + const doc = parser.parseFromString(html, 'text/html'); + return doc.body.textContent || ""; + } catch (e) { + return ""; + }
924-935: Consider extracting repeated HTML text retrieval logicThe HTML text retrieval logic is duplicated. Consider extracting it into a helper function.
+ const getHTMLColumnText = (row, column) => { + return row[getKeyForExtractedTextFromHTML(column)] ?? + getTextFromHTML(row[column]); + }; + const originalColValue = isHTMLColumn - ? originalRow[ - getKeyForExtractedTextFromHTML(props.filters[i].column) - ] ?? getTextFromHTML(originalRow[props.filters[i].column]) + ? getHTMLColumnText(originalRow, props.filters[i].column) : originalRow[props.filters[i].column]; const displayedColValue = isHTMLColumn - ? displayedRow[ - getKeyForExtractedTextFromHTML(props.filters[i].column) - ] ?? getTextFromHTML(displayedRow[props.filters[i].column]) + ? getHTMLColumnText(displayedRow, props.filters[i].column) : displayedRow[props.filters[i].column];
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/client/src/widgets/TableWidgetV2/widget/derived.js(5 hunks)
🔇 Additional comments (2)
app/client/src/widgets/TableWidgetV2/widget/derived.js (2)
298-320: LGTM! Well-implemented optimization
The optimization to avoid unnecessary HTML text extraction is well implemented with:
- Clear conditions for when extraction is needed
- Efficient caching mechanism using generated keys
550-566: LGTM! Properly scoped implementation
The HTML sorting logic is well-implemented with proper block scoping and efficient use of cached values.
|
/build-deploy-preview skip-tests=true |
|
Deploying Your Preview: https://github.com/appsmithorg/appsmith/actions/runs/12387086558. |
|
Deploy-Preview-URL: https://ce-38153.dp.appsmith.com |
…o rahulbarwal/htmlcell/perf/enhancement
- Changed the implementation of HTML column alias storage from an array to a Set for improved performance in filtering and sorting operations. - Updated related filtering and sorting logic to utilize Set methods, enhancing efficiency. - Added a new test case to validate that search functionality does not filter based on HTML attributes.
|
/build-deploy-preview skip-tests=true |
|
Deploying Your Preview: https://github.com/appsmithorg/appsmith/actions/runs/12431487288. |
|
Deploy-Preview-URL: https://ce-38153.dp.appsmith.com |
- Updated the `getTextFromHTML` function to stringify JavaScript objects before processing, ensuring compatibility with HTML column handling. - Added a new test case to validate search functionality when a JavaScript object is passed in the HTML column, confirming that the search works as expected.
Description
Fixes #38275
or
Fixes
Issue URLWarning
If no issue exists, please create an issue first, and check with the maintainers if the issue is valid.
Automation
/ok-to-test tags="@tag.Table"
🔍 Cypress test results
Tip
🟢 🟢 🟢 All cypress tests have passed! 🎉 🎉 🎉
Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/12443677014
Commit: b98e9a3
Cypress dashboard.
Tags:
@tag.TableSpec:
Sat, 21 Dec 2024 09:18:53 UTC
Communication
Should the DevRel and Marketing teams inform users about this change?
Summary by CodeRabbit
New Features
Bug Fixes