[SPARK-21254] [WebUI] History UI Performance fixes #18777
Closed
What changes were proposed in this pull request?
As described in the JIRA ticket, the History page takes ~1 min to load when the number of jobs is 10k+.
Most of that time is spent on DOM manipulation and the costs it implies (browser repaints and reflows).
The goal of this PR is not to change any behavior but to reduce the History UI rendering time:
The most costly operation is setting innerHTML for the duration column within a loop, which performs very poorly. Refactoring this brought the load time down to 10-15s.
The second big gain, bringing page load time down to 4s, was achieved by detaching the table's DOM before parsing it with the DataTables jQuery plugin.
Another chunk of improvements (1, 2, 3) focused on removing unnecessary DOM manipulations that together contributed ~250ms to page load time.
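The first two points can be illustrated with a minimal sketch. This is not the code from this patch: `formatDuration`, `escapeHtml`, `buildRowsHtml`, `withDetached`, and the `app.id` / `app.durationMs` fields are hypothetical names chosen for illustration. It shows the general idea of building the row markup as a single string (so innerHTML is assigned once, not once per duration cell) and of doing heavy work on a node while it is detached from the document.

```javascript
// Hypothetical helper: format a duration in milliseconds as short text.
function formatDuration(ms) {
  const seconds = ms / 1000;
  if (seconds < 60) return seconds.toFixed(1) + " s";
  const minutes = seconds / 60;
  if (minutes < 60) return minutes.toFixed(1) + " min";
  return (minutes / 60).toFixed(1) + " h";
}

// Escape text before embedding it in markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the markup for all rows as ONE string. Assigning the result to
// innerHTML once lets the browser parse it in a single pass, instead of
// re-parsing on every iteration of a loop over the duration cells.
// The shape of `apps` ({id, durationMs}) is an assumption for this sketch.
function buildRowsHtml(apps) {
  return apps
    .map(app =>
      "<tr><td>" + escapeHtml(app.id) + "</td><td>" +
      formatDuration(app.durationMs) + "</td></tr>")
    .join("");
}

// Run `work` while `node` is detached from its parent, then re-insert it
// at its original position. Changes made to a detached subtree do not
// trigger repaints or reflows of the live page.
function withDetached(node, work) {
  const parent = node.parentNode;
  const next = node.nextSibling; // null means "node was the last child"
  parent.removeChild(node);
  try {
    work(node);
  } finally {
    parent.insertBefore(node, next); // a null reference appends at the end
  }
}
```

In a browser, one possible use would be `withDetached(table, t => { t.tBodies[0].innerHTML = buildRowsHtml(apps); })`, with the DataTables initialization also placed inside the callback so that the plugin processes the table while it is detached, as this PR describes.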
How was this patch tested?
Tested by existing Selenium tests in org.apache.spark.deploy.history.HistoryServerSuite. Version of HtmlUnitDriver had a bug that was preventing rendering the full table and making the test "ajax rendered relative links are prefixed with uiRoot (spark.ui.proxyBase)" constantly fail.
Changes were also tested on Criteo's spark-2.1 fork with 20k+ rows in the table, reducing load time to 4s.