[SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI #23637

Closed
deshanxiao wants to merge 9 commits into apache:master from deshanxiao:spark-26714

Conversation

@deshanxiao
Contributor

What changes were proposed in this pull request?

When a job has zero partitions, it still gets a job ID but is not shown in the UI, which is confusing. This PR makes such jobs visible in the UI.

Example:

In bash:

```bash
mkdir -p /home/test/testdir
```

In Scala:

```scala
sc.textFile("/home/test/testdir")
```

Some logs:

```
19/01/24 17:26:19 INFO FileInputFormat: Total input paths to process : 0
19/01/24 17:26:19 INFO SparkContext: Starting job: collect at WordCount.scala:9
19/01/24 17:26:19 INFO DAGScheduler: Job 0 finished: collect at WordCount.scala:9, took 0.003735 s
```

How was this patch tested?

UT

@deshanxiao
Contributor Author

It looks like:

[screenshot omitted]

Comment thread core/src/main/scala/org/apache/spark/ui/UIUtils.scala Outdated
Member


In this case is there actually an attempt 0?

Contributor Author


Yes! jobRow invokes the method to generate the job page. Maybe I can use an if to handle the zero-partition job.

Comment thread core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
Comment thread core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala Outdated
xiaodeshan added 2 commits January 28, 2019 00:12
Comment thread core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
Member

@srowen srowen left a comment


Hm, here's where I just don't know the code well enough to decide if that's the right place to return. I get that you're trying to account for the time taken to process the job, which waits on 0 tasks. Is that meaningful? Conceptually it takes no time at all. It always succeeds too, right? Can it even fail? It just seems a little weird to check this in two places, but it might make sense; not sure.

@deshanxiao
Contributor Author

deshanxiao commented Jan 28, 2019

Yes, it takes no time at all and it always succeeds. Maybe using the same time in SparkListenerJobStart and SparkListenerJobEnd would be better. In addition, submitJob is invoked in two places, and I don't want to handle this twice, so I think the best place for the code is in submitJob itself.
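The idea above can be sketched in plain Scala with no Spark dependency. ListenerBus, JobStart, and JobEnd here are simplified stand-ins for Spark's classes, not the real API; this is only a sketch of the short-circuit being discussed:

```scala
// Simplified stand-ins for Spark's listener events (illustrative only).
sealed trait JobEvent
case class JobStart(jobId: Int, time: Long) extends JobEvent
case class JobEnd(jobId: Int, time: Long) extends JobEvent

class ListenerBus {
  private val buf = scala.collection.mutable.ArrayBuffer.empty[JobEvent]
  def post(e: JobEvent): Unit = buf += e
  def posted: Seq[JobEvent] = buf.toSeq
}

// Sketch of the short-circuit in submitJob: a zero-partition job posts
// start and end events with the same timestamp, so the UI still lists it.
def submitJob(jobId: Int, numPartitions: Int, bus: ListenerBus, now: () => Long): Unit =
  if (numPartitions == 0) {
    val time = now() // one timestamp for both events: the job takes no time
    bus.post(JobStart(jobId, time))
    bus.post(JobEnd(jobId, time))
  } else {
    // ...normal scheduling path elided...
  }
```

Posting both events from one place keeps the zero-task case out of the rest of the scheduler, which is the point made above about not handling it twice.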

Member

@srowen srowen left a comment


Yeah I was more comfortable with the original change, as you have it now.

Comment thread core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala Outdated
@deshanxiao
Contributor Author

deshanxiao commented Jan 29, 2019

retest this please.


```scala
val jobId = nextJobId.getAndIncrement()
if (partitions.size == 0) {
  val time = clock.getTimeMillis()
```
Member


This is looking OK to me, though I wouldn't mind, say, @cloud-fan taking a quick look.

Contributor Author


@srowen Thank you! @cloud-fan Could you give me some suggestions?

@cloud-fan
Contributor

LGTM, cc @gengliangwang

@gengliangwang
Member

gengliangwang commented Jan 29, 2019

After the changes, the UI is kind of confusing. Should we revise the wording "(Unknown Stage Name)"?

@srowen
Member

srowen commented Jan 29, 2019

@gengliangwang actually now it will use the job.name as description if present, yes. That screenshot was from the original version of the PR.

@SparkQA

SparkQA commented Jan 29, 2019

Test build #4534 has finished for PR 23637 at commit 2185cb8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@deshanxiao
Contributor Author

@gengliangwang @srowen Sorry, maybe I didn't make it clear. The job.name will be (Unknown Stage Name) in this case whether or not this PR is applied. I changed the original PR to fit the code logic.

We can see where the default job.name is set:

```scala
// AppStatusListener.scala
val lastStageInfo = event.stageInfos.sortBy(_.stageId).lastOption
val lastStageName = lastStageInfo.map(_.name).getOrElse("(Unknown Stage Name)")
val jobGroup = Option(event.properties)
  .flatMap { p => Option(p.getProperty(SparkContext.SPARK_JOB_GROUP_ID)) }
val sqlExecutionId = Option(event.properties)
  .flatMap(p => Option(p.getProperty(SQL_EXECUTION_ID_KEY)).map(_.toLong))

val job = new LiveJob(
  event.jobId,
  lastStageName,
  if (event.time > 0) Some(new Date(event.time)) else None,
  event.stageIds,
  jobGroup,
  numTasks,
  sqlExecutionId)
liveJobs.put(event.jobId, job)
liveUpdate(job, now)
```

```scala
private class LiveJob(
    val jobId: Int,
    name: String,
    val submissionTime: Option[Date],
    val stageIds: Seq[Int],
    jobGroup: Option[String],
    numTasks: Int,
    sqlExecutionId: Option[Long]) extends LiveEntity {
```

So, if necessary, we could set job.name in one of two ways:

1. Add more info to SparkListenerJobStart.
2. Construct an empty stage containing a callsite in StageInfo.

But either way would be more complex and could introduce compatibility problems.

In the end, I changed it back to the original. Maybe the UI is kind of confusing, but this is acceptable: the job description here originally shows the last stage's info, and in this case there is no stage, so showing "(Unknown Stage Name)" is acceptable, I think.
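The fallback being discussed can be sketched in plain Scala. StageInfo here is a simplified stand-in for Spark's class, and jobDescription is a hypothetical helper, not Spark's API; it only illustrates why a job with zero stages ends up with the placeholder description:

```scala
// Simplified stand-in for Spark's StageInfo (illustrative only).
case class StageInfo(stageId: Int, name: String)

// Sketch of the AppStatusListener fallback: the job description is taken
// from the last stage's name, so a job with zero stages falls through
// to the placeholder string.
def jobDescription(stageInfos: Seq[StageInfo]): String =
  stageInfos.sortBy(_.stageId).lastOption
    .map(_.name)
    .getOrElse("(Unknown Stage Name)")
```

With no stages there is simply no name to fall back on, which is why the placeholder appears for zero-partition jobs regardless of this PR.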

@deshanxiao
Contributor Author

Here is the latest PR screenshot:

[screenshot omitted]

@srowen
Member

srowen commented Jan 30, 2019

Oh I see. Well it's consistent with the current logic which uses job.name in all cases as a fallback. I think this is OK; I'm not sure what other placeholder string would be more meaningful.

@deshanxiao
Contributor Author

deshanxiao commented Jan 30, 2019

@srowen Yes, I agree with you. @gengliangwang Could you give me some suggestions? What placeholder string do you think would be more meaningful?

@gengliangwang
Member

gengliangwang commented Jan 30, 2019

Hi @deshanxiao ,
yes, the description (Unknown Stage Name) is for all the jobs that have 0 stages, which is not directly related to this PR.
My point is, can we also update this job description? In the job page, it is confusing to me that the description is (Unknown Stage Name). I would suggest revising it to Unknown Job without stages or simply Unknown Job.
But this is just personal opinion. We can go without it. This PR LGTM overall.

@deshanxiao
Contributor Author

I get it. Thank you @gengliangwang! I think Unknown Job without stages will be better. I will change it. Thanks!

@deshanxiao
Contributor Author

Retest please.

Comment thread core/src/main/scala/org/apache/spark/status/AppStatusListener.scala Outdated
Member

@gengliangwang gengliangwang left a comment


LGTM

Comment thread core/src/main/scala/org/apache/spark/status/AppStatusListener.scala Outdated
@cloud-fan
Contributor

ok to test

@SparkQA

SparkQA commented Jan 31, 2019

Test build #101929 has finished for PR 23637 at commit 0c1ea7e.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

retest this please

@SparkQA

SparkQA commented Jan 31, 2019

Test build #101943 has finished for PR 23637 at commit 0c1ea7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Feb 2, 2019

Merged to master

@srowen srowen closed this in a0faabf Feb 2, 2019
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
Closes apache#23637 from deshanxiao/spark-26714.

Authored-by: xiaodeshan <xiaodeshan@xiaomi.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>