Skip to content

Conversation

@tgravescs
Copy link
Contributor

What changes were proposed in this pull request?

This adds UI updates to support stage level scheduling and ResourceProfiles. 3 main things have been added. ResourceProfile id added to the Stage page, the Executors page now has an optional selectable column to show the ResourceProfile Id of each executor, and the Environment page now has a section with the ResourceProfile ids. Along with this the rest api for environment page was updated to include the Resource profile information.

I debating on splitting the resource profile information into its own page but I wasn't sure it called for a completely separate page. Open to peoples thoughts on this.

Screen shots:
Screen Shot 2020-04-01 at 3 07 46 PM
Screen Shot 2020-04-01 at 3 08 14 PM
Screen Shot 2020-04-01 at 3 09 03 PM
Screen Shot 2020-04-01 at 11 05 48 AM

Why are the changes needed?

For user to be able to know what resource profile was used with which stage and executors. The resource profile information is also available so user debugging can see exactly what resources were requested with that profile.

Does this PR introduce any user-facing change?

Yes, UI updates.

How was this patch tested?

Unit tests and tested on yarn both active applications and with the history server.

@SparkQA
Copy link

SparkQA commented Apr 1, 2020

Test build #120691 has finished for PR 28094 at commit d249dce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 2, 2020

Test build #120735 has finished for PR 28094 at commit 0dee439.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

@HyukjinKwon @attilapiros anyone available for reviewing?

Copy link
Contributor

@attilapiros attilapiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this could be out of scope of this PR but I started to read the relevant changes and related to

val stageInfo = new StageInfo(1, 1, "me-stage", 1, Seq.empty, Seq(1, 2, 3), "details",
resourceProfileId = ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID)

I would expect a check of an old StageInfo JSON (without resourceProfileId) and whether that would be constructed the same with StageInfo constructed without any resourceProfileId as both expected to use the default value. So I believe StageInfo should have a default value for the resourceProfileId:

  @DeveloperApi
  class StageInfo(
    ...
     val resourceProfileId: Int = ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID) {

This can be checked in a separate unit test too as the name in case of failure would be misleading.

This way we could preserve the compatibility with earlier event logs. Is not it?

@tgravescs
Copy link
Contributor Author

correct, the default value should keep it compatible. I did some testing with old history files and they had worked. I can add unit test if I missed it and recheck just to make sure

Copy link
Contributor

@attilapiros attilapiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will continue reviewing this but here are my comments so far.

@attilapiros
Copy link
Contributor

I finished the first round. In addition to the shared review comments I miss a unit test from the JsonProtocolSuite where the new SparkListenerResourceProfileAdded is tested or at least one call of testEvent(...) with the new class.

@attilapiros
Copy link
Contributor

attilapiros commented Apr 29, 2020

Oh sorry and one more idea (this can be done separate PR too): on the Executors and Stage page where the resource profile ID is given we could add a tool tip with the resource profile content if you like.

@tgravescs
Copy link
Contributor Author

thanks for the review, will update the PR hopefully end of the week early next week as I got pulled into something.

@tgravescs
Copy link
Contributor Author

thanks, I'll add some tests in JsonProtocolSuite. I think the tooltip can be done separately so I'll file a jira for that

@SparkQA
Copy link

SparkQA commented May 4, 2020

Test build #122275 has finished for PR 28094 at commit 13ad0db.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 4, 2020

Test build #122280 has finished for PR 28094 at commit 42d0180.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

test passes locally, not sure why its failing here. I realize I accidentally left a test I don't need so removed it

@SparkQA
Copy link

SparkQA commented May 4, 2020

Test build #122291 has finished for PR 28094 at commit 644f9a4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@attilapiros
Copy link
Contributor

It fails as Resource Profile IDs are autogenerated and they come from a global state:

    private var _id = ResourceProfile.getNextProfileId

When you execute this suite only then it is really a 0 as in the given JSON string but on the Jenkins some earlier test increased it.

So just a setToDefaultProfile() is missing from the freshly built Resource profile.

@SparkQA
Copy link

SparkQA commented May 5, 2020

Test build #122316 has finished for PR 28094 at commit 92b0007.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@attilapiros attilapiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tiny things otherwise LGTM.

pom.xml Outdated
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>4.3.0</version>
<version>3.4.4</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must be a mistake I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes thanks for catching.

Comment on lines 46 to 47
val execStr = new StringBuilder()
execStr ++= s"\t${execReq.resourceName}: [amount: ${execReq.amount}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this lines can be merged into a bit more efficient one:

   val execStr = new StringBuilder(s"\t${execReq.resourceName}: [amount: ${execReq.amount}")

@SparkQA
Copy link

SparkQA commented May 7, 2020

Test build #122409 has finished for PR 28094 at commit 994e40f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

anyone else have comments @HyukjinKwon @mridulm

val attributes: Map[String, String],
val resources: Map[String, ResourceInformation])
val resources: Map[String, ResourceInformation],
val resourceProfileId: Int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While deserializing existing history files, which wont have resource profile related events, will it default to using ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID ?
Or is it backwardly incompatible ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the LiveEntity creates the ExecutorSummary and it has a default of the ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID and executorInfoFromJson will default it to ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID when that field isn't found as well.
So it will just default to that for older history files that don't have this field.

I had tested this with some 2.x history files originally and it was working that way as expected. But I should retest again after the changes from review comments but I don't think anything has broken that.

Definitely something we want to make sure its backwards compatible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Tom ! My only concern was around what happens if the resource event is not fired (which it wont for older files) and how this works w.r.t existing jsons which dont have rp field populated.

Once you have validated this, rest looked good to me - but I am not really an expert on the UI side of things, so someone more familiar should probably take a look too.

@tgravescs
Copy link
Contributor Author

thanks for looking @mridulm, I'll upmerge this and retest for backwards compatibility again.

@tgravescs
Copy link
Contributor Author

Upmerged and retested with 2.4, 3.0, and 3.1 history files. Everything is working with the older files from 2.4 and 3.0 and it defaults to use the DEFAULT_PROFILE when its missing from the history file.

@SparkQA
Copy link

SparkQA commented May 18, 2020

Test build #122813 has finished for PR 28094 at commit 11a8e04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

Merging to master

@asfgit asfgit closed this in b64688e May 21, 2020
@HyukjinKwon
Copy link
Member

Hey sorry for late response. I just found out during checking the queued PRs. I just skimmed and seems fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants