-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-13547] [SQL] [WEBUI] Add SQL query in web UI's SQL Tab #14158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #62180 has finished for PR 14158 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to insert a "-" rather than leave this blank in the case it's not available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just checked and found out that the common practice in the Spark Web UI is to use a blank instead of a hyphen. I'll fix. Thanks!
|
Checked it out and gave it a test run and it looks good, just one issue. When the user only uses the dataframes api it will never show any SQL Text (only "-"), should we check if there is no SQL Text to display and not show the column in that case? |
|
Thanks, @ajbozarth. That's a great point. When no execution in a table has SQL text, this column shouldn't be shown. I'll fix this. |
|
In the commit above, I made three adjustments:
A snapshot of the SQL Tab: The following need to be discussed and implemented (suggested by @yhuai):
|
|
Test build #62207 has finished for PR 14158 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about a button to copy full query text into clipboard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary? The text can be easily highlighted and copied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. for very long query that wouldn't fit as per this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theres a "+more" button that expands to show the full query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, I understand you now, if we end up truncating the text at 1000 chars as proposed then yes this may be a useful feature, but if the text is available to copy wouldn't the size issue still exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean that it could be too long for clipboard? I don't think that would be an issue...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @felixcheung. When I am looking at the discussions again today, I find I am more clear about your suggestions. But actually, truncating very long queries is not because they are not visually pleasant to be displayed on the page, but it may take too many resources to store and transfer such a long unlimited string. So the queries longer than 1000 chars are truncated after parsing, which will never be recovered. So the idea of clipboard may not be used to resolve it.
But still, I believe the clipboard is a good idea. I do think we could extend the limit to 10k chars instead of 1k if we add a clipboard later.
|
Checked out your changes and I like the updates, plus I think the column looks better on the end. Addressing the questions you raised: i agree we should truncate long queries (1000 chars sounds like a decent number), we should make it clear they were truncated though (maybe with a "..."?) I also think a tool tip would be good. Maybe something like: "Left blank when the query was not issued by SQL"? |
|
@ajbozarth @felixcheung Thanks for viewing and discussing! I plan to implement the two updates I mentioned above. The idea of clipboard sounds awesome, but I think we might need more discussion about this new function. |
|
Updated by truncating long texts and adding a tooltip. |
|
Test build #62364 has finished for PR 14158 at commit
|
|
I'll try to take a look at this tomorrow |
|
The tooltip looks odd centered on the column instead of the title text but I believe that's how the whole Web UI works, other than that nit LGTM |
|
@srowen @JoshRosen what do you think of @nblintao 's pr? Some of my team really likes the addition and would love to see it merged |
|
@nblintao Could you rebase? I'd like to try and see if we can get this reviewed and merged. |
|
Please be certain that this works well even for very large queries. They are not commonplace, but I know that Spark SQL does sometimes get asked to handle SQL queries that are hundreds of lines long. |
|
Thanks @markhamstra that was already addressed earlier in review |
|
It's great if it is already addressed, but I just didn't see anything explicit in the discussion or examples that showed any query of the magnitude that I am talking about. Machine-generated queries can be staggeringly large (e.g. I've seen queries that were so long and complicated that they took Spark SQL more than 9 minutes just to parse), and this UI enhancement must be prepared to handle those. |
|
Thanks, @ajbozarth. Sure, I'd love to rebase it and get it merged. But I am really busy with my mid-terms and interviews recently. I'll take a look this week and fix it if it's easy to do. I really appreciate your concern about my pull requests. |
|
@markhamstra Thanks. Here is my related explanation at the top of this PR.
Is this what you mean? |
|
@nblintao I really haven't reviewed these changes closely enough to have a specific complaint or concern in mind, but I'm more concerned about what happens when you ask to see "more" when that "more" could be several tens of thousands of characters, not just the few thousand that many would consider to be a very long SQL query. |
|
@markhamstra Queries longer than 1000 chars are truncated to 1000 chars after the parsing. "+more" will only expand 140 chars to at most 1000 chars. |
|
@nblintao Got it; thanks. There may be distinct queries that will be entirely the same within the first 1000 characters, but that's just the nature of working with these very large queries -- there are lots of things that make them difficult, but it sounds like you've made sure not to make things difficult for more normal queries as a consequence, so that's all good. |
|
@ajbozarth I finally have a chance to rebase it in the winter break. Could you please have a look? Thanks! |
|
Test build #70700 has finished for PR 14158 at commit
|
|
retest this please |
|
Test build #70704 has finished for PR 14158 at commit
|
ajbozarth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments, but from a UI side this looks ok to me. @gatorsmile What do you think of this from the SQL side? Also @srowen and @vanzin second opinion on the UI side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't need this, scala can handle line wrapping for Node return values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary? Given the reasoning behind the max length I don't see a reason to include extra lines of code to include the suffix as part of the max length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think either way is okay. Here, I am considering keeping the text displayed (including suffix) less than 1000 chars.
|
Hi @nblintao, is it still active? |
|
@HyukjinKwon Thanks for reminding. I am finals this week. I will fix this after 15th. |
|
I have just rebased. @ajbozarth @HyukjinKwon @gatorsmile @srowen @vanzin |
| def refresh(): Unit = children.foreach(_.refresh()) | ||
|
|
||
| // Record the original sql text in the top logical plan for checking in the web UI. | ||
| var sqlText: Option[String] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using var for this PR should be avoided.
| } | ||
| } | ||
| logicalPlan.sqlText = Some(truncatedSqlText) | ||
| logicalPlan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solution in this PR looks intrusive to me. If we really want to store the original sql text, we can add it into the QueryExecution. The value can be initialized when we build the QueryExecution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out. I agree that QueryExecution is a better place for original SQL text. I have updated my code accordingly. Could you please have a look?
|
Test build #77510 has finished for PR 14158 at commit
|
|
Test build #77518 has finished for PR 14158 at commit
|
|
Test build #77523 has started for PR 14158 at commit |
|
retest this please |
|
Test build #78291 has started for PR 14158 at commit |
|
i'll retrigger this once jenkins is back up. |
|
test this please |
|
Test build #78299 has finished for PR 14158 at commit
|
|
retest this please |
|
Test build #78332 has finished for PR 14158 at commit
|
|
gentle ping @nblintao, is this PR active? If so, I guess the test failure should be fixed if related. |
|
@HyukjinKwon It's still active. I'll fix it when I'm available. Thanks. |
|
hey @nblintao, do you maybe happened to have some time to continue this one? |
|
@HyukjinKwon Sorry for the delay. I'm busy looking for jobs these days. I'll try my best to fix it in October. Thank you for reminding me! |
|
I'm going to close this PR because it goes stale, please feel free to reopen it or open another PR if anyone have more thoughts on this issue. |
Closes apache#11494 Closes apache#14158 Closes apache#16803 Closes apache#16864 Closes apache#17455 Closes apache#17936 Closes apache#19377 Added: Closes apache#19380 Closes apache#18642 Closes apache#18377 Closes apache#19632 Added: Closes apache#14471 Closes apache#17402 Closes apache#17953 Closes apache#18607 Also cc srowen vanzin HyukjinKwon gatorsmile cloud-fan to see if you have other PRs to close. Author: Xingbo Jiang <[email protected]> Closes apache#19669 from jiangxb1987/stale-prs.

What changes were proposed in this pull request?
For the web UI, this PR adds the raw SQL query text for each query in the new column "SQL Text" in the SQL Tab. To achieve this, it also passes the SQL text from the parser to the execution UI data, through the logical plan.
If the query has no related query text, show a blank in that field. If there is no query in the table has a query text, not show the column.
If the query text is long (i.e. longer than 140 chars), show the short abstract and display the full version on demand by the link "+more".
If the query text is too long (i.e. longer than 1000 chars), it will be truncated after parsing. This is to avoid downing browsers or web UI servers by running out of memory.
Also, a tooltip is added on the header "SQL Text", saying
Shows 140 characters by default. Click "+more" to see more. Long texts are truncated to 1000 characters. Left blank when the query was not issued by SQL..How was this patch tested?
Tested by trying multiple SQL queries.
A screenshot of the SQL Tab:

(Query 3 is identical to Query 2, but its SQL text is shown in detail by clicking "+more")