Skip to content
5 changes: 5 additions & 0 deletions core/src/main/scala/org/apache/spark/SSLOptions.scala
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ private[spark] case class SSLOptions(
trustStorePassword: Option[String] = None,
trustStoreType: Option[String] = None,
protocol: Option[String] = None,
port: Int = 0,
enabledAlgorithms: Set[String] = Set.empty)
extends Logging {

Expand Down Expand Up @@ -147,6 +148,7 @@ private[spark] object SSLOptions extends Logging {
* $ - `[ns].trustStorePassword` - a password to the trust-store file
* $ - `[ns].trustStoreType` - the type of trust-store
* $ - `[ns].protocol` - a protocol name supported by a particular Java version
* $ - `[ns].port` - a port number
* $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
*
* For a list of protocols and ciphers supported by particular Java versions, you may go to
Expand Down Expand Up @@ -191,6 +193,8 @@ private[spark] object SSLOptions extends Logging {
val protocol = conf.getOption(s"$ns.protocol")
.orElse(defaults.flatMap(_.protocol))

val port = conf.getInt(s"$ns.port", defaultValue = defaults.map(_.port).getOrElse(0))

val enabledAlgorithms = conf.getOption(s"$ns.enabledAlgorithms")
.map(_.split(",").map(_.trim).filter(_.nonEmpty).toSet)
.orElse(defaults.map(_.enabledAlgorithms))
Expand All @@ -207,6 +211,7 @@ private[spark] object SSLOptions extends Logging {
trustStorePassword,
trustStoreType,
protocol,
port,
enabledAlgorithms)
}

Expand Down
17 changes: 14 additions & 3 deletions core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
Original file line number Diff line number Diff line change
Expand Up @@ -307,15 +307,26 @@ private[spark] object JettyUtils extends Logging {
connectors += httpConnector

sslOptions.createJettySslContextFactory().foreach { factory =>
// If the new port wraps around, do not try a privileged port.

require(sslOptions.port == 0 || (1024 <= sslOptions.port && sslOptions.port < 65536),
"securePort should be between 1024 and 65535 (inclusive)," +
" or 0 for determined automatically.")

val securePort =
if (currentPort != 0) {
(currentPort + 400 - 1024) % (65536 - 1024) + 1024
// If the new port wraps around, do not try a privileged port.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is duplicated and not correct here

if (1024 <= sslOptions.port && sslOptions.port < 65536) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it seems easier to flip the if condition, and just check for the port being 0, rather than repeat this condition.

sslOptions.port
} else {
// If the new port wraps around, do not try a privilege port
(currentPort + 400 - 1024) % (65536 - 1024) + 1024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment in the code explaining the math done here ? It will help readability of the code.

}
} else {
0
}
val scheme = "https"
// Create a connector on port securePort to listen for HTTPS requests
// Create a connector on port securePort to listen for HTTPS requests.

val connector = new ServerConnector(server, factory)
connector.setPort(securePort)

Expand Down
2 changes: 2 additions & 0 deletions core/src/test/scala/org/apache/spark/SSLOptionsSuite.scala
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,7 @@ class SSLOptionsSuite extends SparkFunSuite with BeforeAndAfterAll {
"TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA")
conf.set("spark.ui.ssl.enabledAlgorithms", "ABC, DEF")
conf.set("spark.ssl.protocol", "SSLv3")
conf.set("spark.ssl.port", "18999")

val defaultOpts = SSLOptions.parse(conf, "spark.ssl", defaults = None)
val opts = SSLOptions.parse(conf, "spark.ui.ssl", defaults = Some(defaultOpts))
Expand All @@ -128,6 +129,7 @@ class SSLOptionsSuite extends SparkFunSuite with BeforeAndAfterAll {
assert(opts.keyStorePassword === Some("12345"))
assert(opts.keyPassword === Some("password"))
assert(opts.protocol === Some("SSLv3"))
assert(opts.port === 18999)
assert(opts.enabledAlgorithms === Set("ABC", "DEF"))
}

Expand Down
10 changes: 10 additions & 0 deletions docs/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -1663,6 +1663,16 @@ Apart from these, the following properties are also available, and may be useful
page.
</td>
</tr>
<tr>
<td><code>spark.ssl.port</code></td>
<td>0</td>
<td>
Port number to listen on for SSL connections.
The SSL port should be between 1024 and 65535 (inclusive).
Default value of 0 means the port will be determined automatically.
Attention that the port should be separated for each particular protocols.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be clearer to write something like "Note that the SSL port for individual services can be overridden by setting values like spark.ssl.YYY.port as described above."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code above you are ensuring that the SSL port is always between 1024 and 65536. You could document that here as "allowed range of values".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's clear what the last sentence means. I'd either remove it or say something like "the port can be specified for services individually, with properties like spark.ssl.YYYY.port, as described above."

</td>
</tr>
<tr>
<td><code>spark.ssl.needClientAuth</code></td>
<td>false</td>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ private[ml] trait VectorIndexerParams extends Params with HasInputCol with HasOu
* - This helps process a dataset of unknown vectors into a dataset with some continuous
* features and some categorical features. The choice between continuous and categorical
* is based upon a maxCategories parameter.
* - Set maxCategories to the maximum number of categorical any categorical feature should have.
* - Set maxCategories to the maximum number of categories which categorical feature should have.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look related? it should also be more like "number of categories that any categorical ..."

* - E.g.: Feature 0 has unique values {-1.0, 0.0}, and feature 1 values {1.0, 3.0, 5.0}.
* If maxCategories = 2, then feature 0 will be declared categorical and use indices {0, 1},
* and feature 1 will be declared continuous.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,8 +85,10 @@ private[ml] trait DecisionTreeParams extends PredictorParams
* (default = 256 MB)
* @group expertParam
*/
final val maxMemoryInMB: IntParam = new IntParam(this, "maxMemoryInMB",
"Maximum memory in MB allocated to histogram aggregation.",
final val maxMemoryInMB: IntParam = new IntParam(this, "maxMemoryInMB", "Maximum memory in MB" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, looks unrelated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't related to SPARK-16987 as you say. But I think the documentation is not sufficient, and I'd like to fix it. I think it's very trivial so we can handle at this PR.
Is it better to create another issue at jira for this?
And treeParams.scala's comment is same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should not be in this pull request because it's not at all related. Sometimes it makes sense to touch up code while changing it for other reasons but that's not the case here.

I think these changes may be too insignificant for their own PR. If you can review a module for typos, that would be better to do it all at once.

For this particular change, I don't know if the parameter's help message needs to have all its documentation. It's just describing the basic nature of the parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased unrelated changes. Please confirm it.

" allocated to histogram aggregation." +
" If too small, then 1 node will be split per iteration," +
" and its aggregates may exceed this size.",
ParamValidators.gtEq(0))

/**
Expand Down