Contributor

@vanzin vanzin commented Jan 13, 2017

The redirect handler was installed only for the root of the server;
any other context ended up being served directly through the HTTP
port. Since every sub page (e.g. application UIs in the history
server) is a separate servlet context, this meant that everything
but the root was accessible via HTTP still.

The change adds separate names to each connector, and binds contexts
to specific connectors so that content is only served through the
HTTPS connector when it's enabled. In that case, the only thing that
binds to the HTTP connector is the redirect handler.

Tested with new unit tests and by checking a live history server.
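
To make the before/after behavior concrete, here is a dependency-free toy model of connector-scoped routing. It is only a sketch and not the code from this patch: `Context`, `route`, and the two connector name values are hypothetical stand-ins, and the rule that a context without virtual hosts is reachable from every connector is a simplifying assumption about how the container behaves.

```scala
// Toy model of connector-scoped routing. The types here are hypothetical
// and are not Jetty or Spark classes.
object ConnectorRouting {
  // Placeholder connector names; the real constants' values are not shown in this thread.
  val SparkConnector = "Spark"
  val RedirectConnector = "HttpsRedirect"

  // A context serves one path prefix. virtualHosts holds "@connectorName" entries;
  // an empty list is modeled as "reachable from every connector".
  final case class Context(path: String, virtualHosts: Seq[String], label: String)

  private def matches(ctx: Context, connector: String, requestPath: String): Boolean = {
    val connectorOk =
      ctx.virtualHosts.isEmpty || ctx.virtualHosts.contains("@" + connector)
    val pathOk =
      ctx.path == "/" || requestPath == ctx.path || requestPath.startsWith(ctx.path + "/")
    connectorOk && pathOk
  }

  // Returns the label of the matching context with the longest path, if any.
  def route(contexts: Seq[Context], connector: String, requestPath: String): Option[String] =
    contexts
      .filter(ctx => matches(ctx, connector, requestPath))
      .sortBy(ctx => -ctx.path.length)
      .headOption
      .map(_.label)

  // Before: no context is tied to a connector, so the sub page wins on the HTTP port.
  val unbound: Seq[Context] = Seq(
    Context("/", Nil, "redirect"),
    Context("/history/app1", Nil, "app-ui"))

  // After: content is bound to the HTTPS connector, the redirect to the HTTP one.
  val bound: Seq[Context] = Seq(
    Context("/", Seq("@" + RedirectConnector), "redirect"),
    Context("/history/app1", Seq("@" + SparkConnector), "app-ui"))
}
```

In this model, a request for `/history/app1` arriving on the HTTP connector resolves to the application UI with the `unbound` contexts and to the redirect with the `bound` ones.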

@SparkQA

SparkQA commented Jan 14, 2017

Test build #71353 has finished for PR 16582 at commit 983f490.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 14, 2017

Test build #71359 has finished for PR 16582 at commit 67df755.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jan 14, 2017

Pinging some (random?) people @ajbozarth @srowen @sarutak

@ajbozarth
Member

The code changes look good to me, but my experience with SSL-related code is still limited, so someone with more experience should also double-check.

@sarutak
Member

sarutak commented Jan 17, 2017

I'll take a look at this within the weekend.

@sarutak
Member

sarutak commented Jan 20, 2017

@vanzin I'm looking into this change, and it works well in standalone mode but not in YARN mode.
I think it is because ResourceManager's web proxy might not handle https properly.
It seems httpclient in WebAppProxyServlet is not configured for SSL.
Do you have any idea?

The change itself seems good to me.

@vanzin
Contributor Author

vanzin commented Jan 20, 2017

> I think it is because ResourceManager's web proxy might not handle https properly.

Yeah, that's a known issue with enabling SSL for the web UI on YARN with self-signed certificates.

@sarutak
Member

sarutak commented Jan 20, 2017

I understand. If there are no additional comments from anyone by tomorrow, I'll merge this.


// redirect the HTTP requests to HTTPS port
httpConnector.setName(REDIRECT_CONNECTOR_NAME)
collection.addHandler(createRedirectHttpsHandler(securePort, scheme))
Member

I noticed one point.
If a port is already in use, collection.addHandler ends up being called multiple times, so redirection doesn't work properly.
Of course, it's not your fault. It would be good if you fixed it in this PR, but it's a separate issue, so otherwise I'll fix it in another PR.

Contributor Author

Actually that is caused by my change, so let me fix it.

@SparkQA

SparkQA commented Jan 21, 2017

Test build #71748 has finished for PR 16582 at commit eb0fcb7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@sarutak sarutak left a comment

It almost LGTM.
I've left some comments and have one question.
Why not simply remove old redirect handler like collection.removeHandler ?

gzipHandlers.foreach(collection.addHandler)
server.setHandler(collection)

server.setConnectors(connectors.toArray)
Member

Why did you move server.setConnectors(connectors.toArray) and gzipHandlers.foreach(collection.addHandler)?

Contributor Author

Mostly for grouping. "This is where all handlers are added to the server."

In any case I have another change (#16625) that kinda moves all this stuff around anyway...

Member

O.K. That's what I thought.

addFilters(handlers, conf)

val gzipHandlers = handlers.map { h =>
h.setVirtualHosts(Array("@" + SPARK_CONNECTOR_NAME))
Member

Do we need this code here? setVirtualHosts should always be called in addHandler for each handler, right?

Contributor Author

ContextHandlerCollection.addHandler does not call setVirtualHosts.

Member

By addHandler I meant ServerInfo#addHandler; sorry for the confusion.
I also noticed that serverInfo in WebUI can be None, and in that case the addHandler call in serverInfo.foreach(_.addHandler(handler)) is never executed within WebUI#attachHandler. So I understand this is a reasonable change.
Can you add a test case that justifies this change?

Contributor Author

I don't understand what you mean. Without this change the UI does not work at all. The test I added already covers it.

Member

@sarutak sarutak Jan 25, 2017

> Without this change the UI does not work at all. The test I added already covers it.

Hmm, that's odd. I commented out this change and ran the test cases you added (UISuite and UISeleniumSuite), but they passed.

Member

Of course I understand this change is needed. I've confirmed it manually.

@vanzin
Contributor Author

vanzin commented Jan 24, 2017

> Why not simply remove old redirect handler like collection.removeHandler ?

I find it cleaner to just not do something than to do it then have to undo things when they fail.
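
As an illustration of that trade-off, here is a small self-contained sketch. It is not the actual fix in this PR: `retryOnPorts`, `bind`, `Handlers`, and the port numbers are hypothetical stand-ins for a "try the next port if this one is busy" startup routine. It contrasts registering a redirect handler inside each bind attempt with registering it once after a bind has succeeded.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model; none of these names come from Spark or Jetty.
object RetrySketch {
  // Stand-in for a handler collection shared across bind attempts.
  final class Handlers { val names: ArrayBuffer[String] = ArrayBuffer.empty[String] }

  // Simulated bind: fails if the port is in the busy set.
  def bind(port: Int, busy: Set[Int]): Unit =
    if (busy.contains(port)) throw new java.net.BindException(s"port $port in use")

  // Tries port, port + 1, ... and returns the first port for which attempt succeeds.
  // Once retries are exhausted, the last failure propagates to the caller.
  @annotation.tailrec
  def retryOnPorts(port: Int, retriesLeft: Int)(attempt: Int => Unit): Int = {
    val ok =
      try { attempt(port); true }
      catch { case _: java.io.IOException if retriesLeft > 0 => false }
    if (ok) port else retryOnPorts(port + 1, retriesLeft - 1)(attempt)
  }

  // Side effect inside the attempt: every failed bind leaves a stale handler behind.
  def startEager(busy: Set[Int]): (Int, Handlers) = {
    val handlers = new Handlers
    val port = retryOnPorts(8080, 3) { p =>
      handlers.names += s"redirect->$p"
      bind(p, busy)
    }
    (port, handlers)
  }

  // Side effect deferred until the bind is known to have succeeded.
  def startDeferred(busy: Set[Int]): (Int, Handlers) = {
    val handlers = new Handlers
    val port = retryOnPorts(8080, 3)(p => bind(p, busy))
    handlers.names += s"redirect->$port"
    (port, handlers)
  }
}
```

With two busy ports, `startEager` leaves three handlers registered (two of them stale), while `startDeferred` registers exactly one, so there is nothing to undo after a failed attempt.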

@sarutak
Member

sarutak commented Jan 25, 2017

O.K., that's reasonable.


val (conf, sslOptions) = sslEnabledConf()
serverInfo = JettyUtils.startJettyServer(
"0.0.0.0", 0, sslOptions, Seq[ServletContextHandler](newContext("/")), conf)
Member

To test the necessity of setVirtualHosts in JettyUtils#startJettyServer correctly, you might need to add another ServletContextHandler instance like newContext("/test0") and a corresponding assertion.

Contributor Author

See commit 67df755 (the second commit in this PR). Tests failed before that commit.

@SparkQA

SparkQA commented Jan 25, 2017

Test build #72000 has finished for PR 16582 at commit 5b65c69.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Member

sarutak commented Jan 26, 2017

The latest change LGTM. Merging into master. Thanks @vanzin !

@asfgit asfgit closed this in d3dcb63 Jan 26, 2017
@sarutak
Member

sarutak commented Jan 26, 2017

This change cannot be applied to branch-2.0 and branch-2.1 cleanly so please open other PRs for those branches. Thanks.

@vanzin
Contributor Author

vanzin commented Jan 26, 2017

I wasn't really planning to backport this, unless someone is really interested in this functionality.

@vanzin vanzin deleted the SPARK-19220 branch January 26, 2017 17:29
vanzin pushed a commit to vanzin/spark that referenced this pull request Jan 26, 2017
Author: Marcelo Vanzin

Closes apache#16582 from vanzin/SPARK-19220.

(cherry picked from commit d3dcb63)
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
Author: Marcelo Vanzin

Closes apache#16582 from vanzin/SPARK-19220.
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
Author: Marcelo Vanzin

Closes apache#16582 from vanzin/SPARK-19220.