-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-4411] [Web UI] Add "kill" link for jobs in the UI #15441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@srowen @kayousterhout @tgravescs You have had input or the JIRA or previous PR, could you take a look? |
| val jobId = Option(request.getParameter("id")).getOrElse("-1").toInt | ||
| if (jobId >= 0 && killFlag && jobProgresslistener.activeJobs.contains(jobId)) { | ||
| sc.get.cancelJob(jobId) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating an Option only to immediately get the value out of it is poor style, and unnecessary.
val jobId = Option(request.getParameter("id"))
jobId.foreach { id =>
if (killFlag && jobProgresslistener.activeJobs.contains(id)) {
sc.get.cancelJob(id)
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, there is no need for sc.get. In fact, that's a bug if parent.sc really should be an Option and thus could be None.
sc.foreach(_.cancelJob(id))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if the Job isn't actually going to be canceled, then there is no need to delay the page refresh (which I'm not entirely happy with, but I'm not going to try to resolve that issue right now.) So...
sc.foreach { sparkContext =>
sparkContext.cancelJob(id)
Thread.sleep(100)
}And we better make that jobId an Option[Int] instead of an Option[String], so...
val jobId = Option(request.getParameter("id")).map(_.toInt)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this code from StagesTab.scala so I'll fix these issues for both in the morning
|
Test build #66775 has finished for PR 15441 at commit
|
|
Addressed @markhamstra comments for both |
| // Do a quick pause here to give Spark time to kill the job so it shows up as | ||
| // killed after the refresh. Note that this will block the serving thread so the | ||
| // time should be limited in duration. | ||
| Thread.sleep(100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're doing the sleep here even if sc is None, but I wouldn't expect that that is even possible or that it is handled completely correctly elsewhere in the Web UI if it actually is possible, so this is likely fine in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my thinking, that in the case of sc being None that this code shouldn't be able to be called. Plus by moving the sleep into the if statement it at least only run when an actual kill is attempted.
|
Test build #66843 has finished for PR 15441 at commit
|
|
@srowen @tgravescs any chance you'd be able to take a look at this before spark summit? |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is pretty OK, pending a few minor changes.
| // Do a quick pause here to give Spark time to kill the job so it shows up as | ||
| // killed after the refresh. Note that this will block the serving thread so the | ||
| // time should be limited in duration. | ||
| Thread.sleep(100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, in both this and the kill-stage endpoint I wonder why we have a 'terminate' flag at all. What would it mean to call this with terminate=false? it seems like this can be checked once, if at all, at the start of the method. Or just removed, if I'm not missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did some looking into the git history and thats left over from when there wasn't a separate /kill endpoint, so I'll remove that logic in my next commit
| connection.setRequestMethod(method) | ||
| connection.connect() | ||
| val code = connection.getResponseCode() | ||
| connection.disconnect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For total tidiness you might use try-finally here to disconnect on all paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might just because it's late and I'm tired but I'm not quite sure where you think the try-finally should be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just mean
val connection = url.openConnection().asInstanceOf[HttpURLConnection]
connection.setRequestMethod(method)
try {
connection.connect()
connection.getResponseCode()
} finally {
connection.disconnect()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't know scala could do that (how it returns), still learning new things every day
docs/configuration.md
Outdated
| <td>true</td> | ||
| <td> | ||
| Allows stages and corresponding jobs to be killed from the web ui. | ||
| Allows jobs and stages to be killed from the web ui. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: web ui -> web UI
|
@srowen I addressed most of your comments except the one about the try-finally I commented on above |
|
Test build #67406 has finished for PR 15441 at commit
|
|
Test build #67417 has finished for PR 15441 at commit
|
|
@srowen Addressed the try-finally, this look good to go now? |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple minor extra comments
| override def row(jobTableRow: JobTableRowData): Seq[Node] = { | ||
| val job = jobTableRow.jobData | ||
|
|
||
| val killLink = if (killEnabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, does this work? there's no 'else' so this becomes an Any and I don't know what happens below when it's rendered as a string. Should it be an empty string otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, since it's supposed to be html I'll add a else Seq.empty to each instance of this
| /** Initialize all components of the server. */ | ||
| def initialize() { | ||
| attachTab(new JobsTab(this)) | ||
| attachTab(jobsTab) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #15603 which reminds me that these two tabs don't need to be members. They can be locals. You can make the change for both here.
|
@srowen I addressed your latest comments |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that looks good pending tests.
|
Test build #67465 has finished for PR 15441 at commit
|
|
Jenkins, retest this please |
|
Test build #67471 has finished for PR 15441 at commit
|
|
Good to merge? @srowen |
|
Merged to master |
## What changes were proposed in this pull request? Currently users can kill stages via the web ui but not jobs directly (jobs are killed if one of their stages is). I've added the ability to kill jobs via the web ui. This code change is based on apache#4823 by lianhuiwang and updated to work with the latest code matching how stages are currently killed. In general I've copied the kill stage code warning and note comments and all. I also updated applicable tests and documentation. ## How was this patch tested? Manually tested and dev/run-tests  Author: Alex Bozarth <[email protected]> Author: Lianhui Wang <[email protected]> Closes apache#15441 from ajbozarth/spark4411.
## What changes were proposed in this pull request? Currently users can kill stages via the web ui but not jobs directly (jobs are killed if one of their stages is). I've added the ability to kill jobs via the web ui. This code change is based on apache#4823 by lianhuiwang and updated to work with the latest code matching how stages are currently killed. In general I've copied the kill stage code warning and note comments and all. I also updated applicable tests and documentation. ## How was this patch tested? Manually tested and dev/run-tests  Author: Alex Bozarth <[email protected]> Author: Lianhui Wang <[email protected]> Closes apache#15441 from ajbozarth/spark4411.
What changes were proposed in this pull request?
Currently users can kill stages via the web ui but not jobs directly (jobs are killed if one of their stages is). I've added the ability to kill jobs via the web ui. This code change is based on #4823 by @lianhuiwang and updated to work with the latest code matching how stages are currently killed. In general I've copied the kill stage code warning and note comments and all. I also updated applicable tests and documentation.
How was this patch tested?
Manually tested and dev/run-tests