[SPARK-19554][UI,YARN] Allow SHS URL to be used for tracking in YARN RM. #16946
Allow an application to use the History Server URL as the tracking URL in the YARN RM, so there's still a link to the web UI somewhere in YARN even if the driver's UI is disabled. This is useful, for example, if an admin wants to disable the driver UI by default for applications, because the driver UI is harder to secure (it involves non-trivial SSL certificate and auth management that admins may not want to expose to user apps).

This needs to be opt-in, because of the way the YARN proxy works, so a new configuration was added to enable the option.

The YARN RM will proxy requests to live AMs instead of redirecting the client, so pages in the SHS UI will not render correctly since they'll reference invalid paths in the RM UI. The proxy base support in the SHS cannot be used since that would prevent direct access to the SHS.

So, for the feature to work end-to-end, a new YARN-specific filter was added that detects whether a request comes from the proxy and redirects the client appropriately. The SHS admin has to add this filter manually if they want the feature to work.

Tested with a new unit test, and by running with the documented configuration set in a test cluster. Also verified the driver UI is used when it's enabled.
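The two decisions described above can be sketched in plain Scala, without any Spark or servlet dependencies. This is only an illustration under stated assumptions, not the code in this PR: the object, the `Cookie` case class, `trackingUrl`, and `decide` are hypothetical names, the cookie name `"proxy-user"` is an assumed value for `YarnProxyRedirectFilter.COOKIE_NAME`, and redirecting to the request's own URL is an assumption about what "redirects the client appropriately" means.

```scala
// Dependency-free sketch; none of these names are Spark or YARN APIs.
object TrackingSketch {
  // Hypothetical stand-in for javax.servlet.http.Cookie.
  final case class Cookie(name: String, value: String)

  // Assumed cookie name; the PR only refers to YarnProxyRedirectFilter.COOKIE_NAME.
  val ProxyCookieName = "proxy-user"

  sealed trait Decision
  case object PassThrough extends Decision
  final case class RedirectTo(url: String) extends Decision

  // Application side: keep the driver UI URL when the UI is enabled;
  // fall back to the SHS URL only if the admin opted in.
  def trackingUrl(
      driverUiUrl: Option[String],
      shsAppUrl: Option[String],
      allowShsTracking: Boolean): Option[String] =
    driverUiUrl.orElse(if (allowShsTracking) shsAppUrl else None)

  // SHS side: a request carrying the proxy cookie is assumed to come from the
  // YARN proxy, so the client is sent to the same URL to reach the SHS directly.
  // The servlet API may return null for "no cookies", hence Option(...).
  def decide(cookies: Array[Cookie], requestUrl: String): Decision =
    Option(cookies).flatMap(_.find(_.name == ProxyCookieName)) match {
      case Some(_) => RedirectTo(requestUrl)
      case None    => PassThrough
    }
}
```

For example, `TrackingSketch.decide(Array(TrackingSketch.Cookie("proxy-user", "dr.who")), url)` yields a redirect to `url`, while `TrackingSketch.decide(null, url)` passes the request through, so direct access to the SHS keeps working.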
Test build #72967 has finished for PR 16946 at commit
@tgravescs pinging you since this is all YARN-side (doesn't really touch the UI).
Test build #73007 has finished for PR 16946 at commit
trying to get some eyes on this: @squito
Turn the redirect handler into a servlet, and install user filters, so that they can be applied before redirection to the HTTPS port. This can be used, for example, to perform redirection to the SHS from the YARN RM proxy without having to deal with trust stores on the YARN RM config.
On vacation back next Monday and will review.
Test build #73242 has finished for PR 16946 at commit
It may not be the best idea to install auth filters on the unencrypted connector, so don't do this and require admins to properly set up trust stores in YARN instead. We can add this as a new feature (with security properly accounted for) later on.
Test build #73287 has finished for PR 16946 at commit
squito left a comment
lgtm
very minor comments
val cookies = Array(new Cookie(YarnProxyRedirectFilter.COOKIE_NAME, "dr.who"))

val req = mock(classOf[HttpServletRequest])
when(req.getCookies()).thenReturn(cookies, null)
I was really confused by this test at first -- I didn't know that thenReturn lets you specify multiple values for consecutive calls. For anyone else as clueless as me, it would be helpful to drop in a comment here to draw attention to this, e.g. "First request has cookies with a user name, second request does not".
docs/running-on-yarn.md (outdated)
## Using the Spark History Server to replace the Spark Web UI

It is possible to use the Spark History Server application page as the tracking URL for running
applications in scenarios where it may be desired to disable the built-in application UI. Two steps
nit: first sentence reads a little funny. Maybe rephrase to:
It is possible to use the Spark History Server application page as the tracking URL for running
applications when the built-in application UI is disabled. This may be desirable on secure clusters or to avoid the memory usage on the driver from the UI.
Up to you. Maybe it doesn't even need the second sentence.
Test build #73296 has finished for PR 16946 at commit
Merging to master. @tgravescs we can address any feedback you might have when you're back.
Author: Marcelo Vanzin <[email protected]> Closes apache#16946 from vanzin/SPARK-19554.