5 changes: 5 additions & 0 deletions core/src/main/scala/org/apache/spark/SSLOptions.scala
@@ -56,6 +56,7 @@ private[spark] case class SSLOptions(
trustStorePassword: Option[String] = None,
trustStoreType: Option[String] = None,
protocol: Option[String] = None,
port: Int = 0,
enabledAlgorithms: Set[String] = Set.empty)
extends Logging {

@@ -147,6 +148,7 @@ private[spark] object SSLOptions extends Logging {
* $ - `[ns].trustStorePassword` - a password to the trust-store file
* $ - `[ns].trustStoreType` - the type of trust-store
* $ - `[ns].protocol` - a protocol name supported by a particular Java version
* $ - `[ns].port` - a port number
* $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
*
* For a list of protocols and ciphers supported by particular Java versions, you may go to
@@ -191,6 +193,8 @@ private[spark] object SSLOptions extends Logging {
val protocol = conf.getOption(s"$ns.protocol")
.orElse(defaults.flatMap(_.protocol))

val port = conf.getInt(s"$ns.port", defaultValue = defaults.map(_.port).getOrElse(0))

val enabledAlgorithms = conf.getOption(s"$ns.enabledAlgorithms")
.map(_.split(",").map(_.trim).filter(_.nonEmpty).toSet)
.orElse(defaults.map(_.enabledAlgorithms))
@@ -207,6 +211,7 @@
trustStorePassword,
trustStoreType,
protocol,
port,
enabledAlgorithms)
}
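The resolution above reads `$ns.port` from the conf, else falls back to the parsed defaults instance, else 0. A standalone sketch of that fallback chain, with a plain `Map` standing in for `SparkConf` (the names `PortFallbackSketch` and `resolvePort` are hypothetical, not Spark API):

```scala
// Hypothetical standalone sketch of the fallback chain used by SSLOptions.parse:
// a namespaced value (e.g. spark.ui.ssl.port) wins, then the defaults instance
// (e.g. parsed from spark.ssl.*), then 0, meaning "choose automatically".
object PortFallbackSketch {
  def resolvePort(conf: Map[String, String], ns: String, defaults: Option[Int]): Int =
    conf.get(s"$ns.port").map(_.toInt).getOrElse(defaults.getOrElse(0))

  def main(args: Array[String]): Unit = {
    val conf = Map("spark.ssl.port" -> "18999")
    val base = resolvePort(conf, "spark.ssl", None)        // explicit value: 18999
    val ui = resolvePort(conf, "spark.ui.ssl", Some(base)) // falls back to base: 18999
    val none = resolvePort(Map.empty, "spark.ssl", None)   // nothing set: 0
    println(s"$base $ui $none")
  }
}
```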

17 changes: 14 additions & 3 deletions core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
@@ -307,15 +307,26 @@ private[spark] object JettyUtils extends Logging {
connectors += httpConnector

sslOptions.createJettySslContextFactory().foreach { factory =>
require(sslOptions.port == 0 || (1024 <= sslOptions.port && sslOptions.port < 65536),
  "securePort should be between 1024 and 65535 (inclusive), or 0 to choose one automatically.")

val securePort =
if (currentPort != 0) {
if (sslOptions.port == 0) {
Contributor:

There's a problem with this logic: it ignores retries. Imagine you're setting the base HTTP and SSL ports for the Spark UI (not for the history server), and you want multiple drivers on the same host. So you set (names may not be totally correct):

spark.ui.port=1234
spark.ssl.ui.port=5678

The first driver comes up and binds to 1234 and 5678. Then the second driver comes up and both ports are taken; startServiceOnPort will take care of retrying the HTTP port up to maxRetries, but this code does not do the same for the SSL port: it will always try 5678. So the second driver will never run, because it will fail to bind to the SSL port.

You should instead use the port value passed to startServiceOnPort as the base to calculate the offset for the defined SSL port. That's not optimal, but it's probably the best you can do here.

Contributor Author:

Thank you for the comment. Yes, as you say, this logic can cause conflicts. But as I noted at the top of the PR description, startServiceOnPort is called from many unrelated methods, so I was aware that this PR affects many unrelated components.

Contributor:

It is not unrelated. As I described, your code will break things in the normal web UI if you set that option.

Contributor Author:

I know that some of the methods which call startServiceOnPort are related; for example, NettyBlockTransferService.init or RestSubmissionServer are unrelated, I think. But I'm fixing it according to your comment. It's in progress, so please wait.

Contributor:

NettyBlockTransferService is unrelated, but as I mentioned, the Spark UI also supports SSL and would break with your previous version of the code.
// If the new port wraps around, do not try a privileged port
(currentPort + 400 - 1024) % (65536 - 1024) + 1024
Contributor:

Can you add a comment in the code explaining the math done here? It will help the readability of the code.

} else {
// Use the configured sslOptions.port as the secure port
sslOptions.port
}
} else {
0
}
val scheme = "https"
// Create a connector on port securePort to listen for HTTPS requests.

val connector = new ServerConnector(server, factory)
connector.setPort(securePort)
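The arithmetic the reviewer asks about maps `currentPort` into a zero-based index over the 64512 unprivileged ports, adds the 400 offset, wraps with the modulo, and shifts back, so a base port near 65535 wraps around to a low unprivileged port instead of overflowing or landing below 1024. A standalone sketch of that expression (the object name is hypothetical):

```scala
// Sketch of the wrap-around offset used when sslOptions.port == 0:
// map currentPort into [0, 64512), add 400, wrap, then map back into
// [1024, 65536) so the result is never a privileged (< 1024) port.
object SecurePortSketch {
  def securePort(currentPort: Int): Int =
    (currentPort + 400 - 1024) % (65536 - 1024) + 1024

  def main(args: Array[String]): Unit = {
    println(securePort(4040))  // common case: simply 4040 + 400 = 4440
    println(securePort(65400)) // 65800 would overflow; wraps to 1288 instead
  }
}
```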

2 changes: 2 additions & 0 deletions core/src/test/scala/org/apache/spark/SSLOptionsSuite.scala
@@ -113,6 +113,7 @@ class SSLOptionsSuite extends SparkFunSuite with BeforeAndAfterAll {
"TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA")
conf.set("spark.ui.ssl.enabledAlgorithms", "ABC, DEF")
conf.set("spark.ssl.protocol", "SSLv3")
conf.set("spark.ssl.port", "18999")

val defaultOpts = SSLOptions.parse(conf, "spark.ssl", defaults = None)
val opts = SSLOptions.parse(conf, "spark.ui.ssl", defaults = Some(defaultOpts))
@@ -128,6 +129,7 @@
assert(opts.keyStorePassword === Some("12345"))
assert(opts.keyPassword === Some("password"))
assert(opts.protocol === Some("SSLv3"))
assert(opts.port === 18999)
assert(opts.enabledAlgorithms === Set("ABC", "DEF"))
}

10 changes: 10 additions & 0 deletions docs/configuration.md
@@ -1663,6 +1663,16 @@ Apart from these, the following properties are also available, and may be useful
page.
</td>
</tr>
<tr>
<td><code>spark.ssl.port</code></td>
<td>0</td>
<td>
    Port number to listen on for SSL connections. The port must be between 1024 and 65535
    (inclusive); the default value of 0 means the port is chosen automatically. The port can
    also be set individually for each service via properties of the form
    <code>spark.ssl.YYY.port</code>.
</td>
</tr>
<tr>
<td><code>spark.ssl.needClientAuth</code></td>
<td>false</td>
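As a hedged illustration of the entry above (the values and the choice of the historyServer namespace are for the example only, not taken from this PR), a spark-defaults.conf might pin one service's HTTPS port while leaving the rest automatic:

```
# Illustrative spark-defaults.conf fragment
spark.ssl.enabled              true
# Shared default: 0 means each service picks its secure port automatically.
spark.ssl.port                 0
# Per-service override using the spark.ssl.YYY.port form described above.
spark.ssl.historyServer.port   18480
```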
@@ -66,7 +66,7 @@ private[ml] trait VectorIndexerParams extends Params with HasInputCol with HasOu
* - This helps process a dataset of unknown vectors into a dataset with some continuous
* features and some categorical features. The choice between continuous and categorical
* is based upon a maxCategories parameter.
* - Set maxCategories to the maximum number of categorical any categorical feature should have.
* - Set maxCategories to the maximum number of categories that any categorical feature should have.
* - E.g.: Feature 0 has unique values {-1.0, 0.0}, and feature 1 values {1.0, 3.0, 5.0}.
* If maxCategories = 2, then feature 0 will be declared categorical and use indices {0, 1},
* and feature 1 will be declared continuous.
@@ -85,8 +85,10 @@ private[ml] trait DecisionTreeParams extends PredictorParams
* (default = 256 MB)
* @group expertParam
*/
final val maxMemoryInMB: IntParam = new IntParam(this, "maxMemoryInMB",
"Maximum memory in MB allocated to histogram aggregation.",
final val maxMemoryInMB: IntParam = new IntParam(this, "maxMemoryInMB", "Maximum memory in MB" +
Member:

Likewise, this looks unrelated.

Contributor Author:

It isn't related to SPARK-16987, as you say, but I think the documentation is insufficient and I'd like to fix it. It's very trivial, so I think we can handle it in this PR. Or is it better to create another JIRA issue for this? The comment in treeParams.scala is the same.

Member:

Yes, it should not be in this pull request, because it's not at all related. Sometimes it makes sense to touch up code while changing it for other reasons, but that's not the case here.

I think these changes may be too insignificant for their own PR. If you can review a module for typos, it would be better to do it all at once.

For this particular change, I don't know if the parameter's help message needs to have all its documentation. It's just describing the basic nature of the parameter.

Contributor Author:

I rebased out the unrelated changes. Please confirm.
" allocated to histogram aggregation." +
" If too small, then 1 node will be split per iteration," +
" and its aggregates may exceed this size.",
ParamValidators.gtEq(0))

/**