Conversation

@rajatahujaatinmobi commented Jun 21, 2020

What changes were proposed in this pull request?

When a Spark job is launched in cluster mode on YARN, the Application Master sets spark.ui.port to 0, which means the driver's web UI gets a random port even if we want to explicitly set a port range for the driver's web UI.

Why are the changes needed?

We access the Spark web UI via a Knox proxy, and there are firewall restrictions because of which we cannot access the Spark web UI, since the web UI gets a random port even when we set one explicitly.

This change checks whether a port is explicitly specified, so that Spark does not assign a random one.
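The proposed check can be sketched roughly as follows. This is a minimal illustration with hypothetical names — `UiPortGuard` and `applyDefault` are not Spark APIs; the real change lives in YARN's ApplicationMaster and uses Spark's `UI_PORT` config constant:

```scala
// Minimal sketch, assuming simplified property handling; UiPortGuard and
// applyDefault are illustrative names, not Spark code.
object UiPortGuard {
  val UiPortKey = "spark.ui.port"

  // Force an ephemeral port (0) only when no explicit port was configured.
  def applyDefault(props: scala.collection.mutable.Map[String, String]): Unit = {
    if (!props.contains(UiPortKey)) {
      props(UiPortKey) = "0"
    }
  }
}
```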

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested locally.

… (spark.ui.port) property if mentioned explicitly
@dongjoon-hyun
Member

ok to test

@dongjoon-hyun
Member

Thank you for moving to master, @rajatahujaatinmobi .

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32039][YARN][WEBUI][master] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly [SPARK-32039][YARN][WEBUI] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly Jun 21, 2020
// so we don't conflict with other spark processes running on the same box
if (System.getProperty(UI_PORT.key) != null) {
  System.setProperty(UI_PORT.key, "0")
}
Member

Hi, @rajatahujaatinmobi . While reviewing this PR, I noticed that SPARK-29465 is a previous one for this issue.

Contributor

yes, please see the discussions there; I think a range of ports would be much better. A single port set on a multi-tenant YARN cluster could easily end up with conflicts. What requirements do you have from Knox, a certain range of ports?
Normally on YARN I would expect you to use the resource manager web UI proxy on a secure cluster, so you wouldn't be accessing the UI directly.

Author

Correct me if I am wrong.

As per the discussions, the UI port always gets assigned within a range based on spark.port.maxRetries. So we can set the starting port, and afterward it will try to start the service on ports between (starting port, starting port + spark.port.maxRetries). So we are including the range implicitly.

Copying the comment:
other port types were spawned in the range (port_mentioned to port_mentioned + spark.port.maxRetries), except the UI port, which started on a random port.

We have a Knox proxy that accesses the UI via the RM link, and the Knox proxy can only access a certain range of ports on the cluster (18000-18159). Unless we start the web UI in that range, we cannot access the UI. We can achieve the above range if we set the following properties:
spark.ui.port=18000
spark.port.maxRetries=159

Hence, setting spark.ui.port as the starting port is enough, instead of it being assigned a random port number.
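The implicit range described here can be sketched with a small helper (`portsToTry` is hypothetical, not a Spark API): with spark.ui.port=18000 and spark.port.maxRetries=159, the candidates run from 18000 through 18159.

```scala
// Hypothetical helper illustrating the implicit port range formed by
// spark.ui.port plus spark.port.maxRetries.
def portsToTry(startPort: Int, maxRetries: Int): Seq[Int] =
  (0 to maxRetries).map(offset => startPort + offset)
```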

Member

yea, I think Spark could utilize the combination of spark.ui.port and spark.port.maxRetries to implement a "port range".

What other "port range" do you refer to? @tgravescs

Contributor

ah yes, you are correct, you can get a range that way. The ports originally wouldn't do the retries, or at least we didn't allow configuring maxRetries.

@rajatahujaatinmobi (Author) commented Jun 24, 2020

I blocked ports 18000 to 18050, so spark.ui.port cannot get a port in that range. In the Spark properties I set spark.ui.port to 18018 and spark.port.maxRetries=60, so the Spark service tries ports incrementally from 18018 up to 18018 + 60; since ports up to 18050 are blocked, spark.ui.port picks up port 18051. @Ngone51
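That scenario can be simulated with a short sketch (`firstFreePort` is a hypothetical stand-in for Spark's bind-with-retry behaviour, not actual Spark code): with ports 18000-18050 blocked, a start port of 18018, and 60 retries, the first free candidate is 18051.

```scala
// Hypothetical simulation of bind retries: try startPort, startPort + 1, ...,
// startPort + maxRetries and return the first port not in the blocked set
// (None if every candidate is taken).
def firstFreePort(startPort: Int, maxRetries: Int, blocked: Set[Int]): Option[Int] =
  (startPort to startPort + maxRetries).find(port => !blocked.contains(port))
```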

Member

I see. This is what we expected, right?

Author

Yes, that's expected. So I think we are good to go unless someone else has any other doubts.
@Ngone51 @tgravescs

Contributor

thanks for testing that; please update the code comment as my comment below requests.

Author

@tgravescs I have done that. Can we proceed further?

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32039][YARN][WEBUI] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly [SPARK-29465][YARN][WEBUI] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly Jun 21, 2020
@dongjoon-hyun
Member

@rajatahujaatinmobi . This is a partial revert of SPARK-3627 and has been discussed before. Please see SPARK-29465.
The last advice was that a port range might be an acceptable solution.

@SparkQA

SparkQA commented Jun 22, 2020

Test build #124342 has finished for PR 28880 at commit ce4b6ea.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rajatahujaatinmobi
Author

@rajatahujaatinmobi . This is a partial revert of SPARK-3627 and has been discussed before. Please see SPARK-29465.
The last advice was that a port range might be an acceptable solution.

If spark.ui.port is set, the web UI will start from that port and retry up to spark.port.maxRetries (default 16). So it takes care of the range.


spark.port.maxRetries (default: 16): Maximum number of retries when binding to a port before giving up. When a port is given a specific value (non 0), each subsequent retry will increment the port used in the previous attempt by 1 before retrying. This essentially allows it to try a range of ports from the start port specified to port + maxRetries.



@dongjoon-hyun
Member

cc @tgravescs

// Set the web ui port to be ephemeral for yarn if not set explicitly
// so we don't conflict with other spark processes running on the same box
if (System.getProperty(UI_PORT.key) != null) {
  System.setProperty(UI_PORT.key, "0")
}
Member

Shouldn't it be ==?

Author

Yes, changing the if condition.

@Ngone51
Member

Ngone51 commented Jun 22, 2020

Seems like the change was first introduced in c6de982 rather than SPARK-3627.


@HyukjinKwon
Member

cc @tgravescs FYI

… (spark.ui.port) property if mentioned explicitly
@SparkQA

SparkQA commented Jun 22, 2020

Test build #124349 has finished for PR 28880 at commit 10b9f7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs (Contributor) left a comment

I believe this will be fine, as I don't think we set the default values as properties unless they're explicitly set, but please test it. Test both cases: when it isn't set, it should go to 0; when it is set, it should iterate through the range with retries. From the != vs == bug I'm assuming this code wasn't run on an actual YARN cluster, so please do that.

// Set the web ui port to be ephemeral for yarn if not set explicitly
// so we don't conflict with other spark processes running on the same box
System.setProperty(UI_PORT.key, "0")
Contributor

let's update the comment to say that if explicitly set, it acts as a port range based on spark.port.maxRetries, so make sure that is sufficiently large for your setup to avoid port conflicts.

Author

@tgravescs Updated my comment. If that looks okay, can we merge this request? And I would like to backport this change back to Spark 2.4.

…ke_interval`

Change precision of seconds and its fraction from 8 to 18 to be able to construct intervals of max allowed microseconds value (long).

To improve UX of Spark SQL.

Yes

- Add tests to IntervalExpressionsSuite
- Add an example to the `MakeInterval` expression
- Add tests to `interval.sql`

Closes apache#28873 from MaxGekk/make_interval-sec-precision.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

[SPARK-31957][SQL] Cleanup hive scratch dir for the developer api startWithContext

Comparing to the long-running ThriftServer via start-script, we are more likely to hit the issue https://issues.apache.org/jira/browse/HIVE-10415 / https://issues.apache.org/jira/browse/SPARK-31626 in the developer API `startWithContext`

This PR applies SPARK-31626 to the developer API `startWithContext`

Fix the issue described in  SPARK-31626

Yes, the hive scratch dir will be deleted if cleanup is enabled for calling `startWithContext`

new test

Closes apache#28784 from yaooqinn/SPARK-31957.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

[SPARK-32039][YARN][WEBUI][Master] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly

[SPARK-32039][YARN][WEBUI][Master] Adding Check to not to set UI port (spark.ui.port) property if mentioned explicitly

Merge branch 'ahujarajat261/SPARK-32039-change-yarn-webui-port-range-with-property-latest-spark' of https://github.com/rajatahujaatinmobi/spark-1 into ahujarajat261/SPARK-32039-change-yarn-webui-port-range-with-property-latest-spark
@SparkQA

SparkQA commented Jun 27, 2020

Test build #124559 has finished for PR 28880 at commit b42c3ed.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Rajat Ahuja added 2 commits June 26, 2020 23:19
… (spark.ui.port) property if mentioned explicitly
@SparkQA

SparkQA commented Jun 27, 2020

Test build #124562 has finished for PR 28880 at commit 8fe0aa9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Ngone51
Member

Ngone51 commented Jun 29, 2020

LGTM

@rajatahujaatinmobi (Author) left a comment

LGTM

How can we go ahead and merge it with master?


@tgravescs
Contributor

rekicked the checks, once they pass I can commit



@rajatahujaatinmobi (Author)

rekicked the checks, once they pass I can commit

continuous-integration/appveyor/pr is failing. Not sure why that is.

@tgravescs can you rerun the tests and commit/merge with master if it looks okay? I cannot rerun the tests.

@tgravescs
Contributor

test this please

@SparkQA

SparkQA commented Jun 30, 2020

Test build #124664 has finished for PR 28880 at commit 8fe0aa9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 30, 2020

Test build #124687 has finished for PR 28880 at commit 1896905.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 30, 2020

Test build #124685 has finished for PR 28880 at commit 24ecfcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rajatahujaatinmobi
Author

test this please

Jenkins jobs seem to be doing fine:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124664/

Test build #124685 has finished for PR 28880 at commit 24ecfcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs Looks like all the test cases pass in the Jenkins build. Shall we go ahead and merge it?

@tgravescs
Contributor

this is odd, it looks like those are all R tests failing for some reason. @dongjoon-hyun have you seen that before?

@rajatahujaatinmobi
Author

rajatahujaatinmobi commented Jul 1, 2020

this is odd, it looks like those are all R tests failing for some reason. @dongjoon-hyun have you seen that before?

@dongjoon-hyun Any comments as to how can we proceed further?

@dongjoon-hyun
Member

For me, it passed with the following options, @tgravescs and @rajatahujaatinmobi .

-Phadoop-3.2 -Phive-2.3 -Pyarn -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest yarn/test network-yarn/test

@dongjoon-hyun (Member) left a comment

+1, LGTM. Thank you, @rajatahujaatinmobi , @tgravescs , @Ngone51 , @HyukjinKwon .
Merged to master for Apache Spark 3.1.

@dongjoon-hyun
Member

Hi, @rajatahujaatinmobi . What is your Apache JIRA id? I need to assign SPARK-29465 to you.
