[SPARK-21890] Credentials not being passed to add the tokens #19140
Conversation
ok to test
  // those will fail with an access control issue. So create new tokens with the logged in
  // user as renewer.
- val creds = fetchDelegationTokens(
+ val fetchCreds = fetchDelegationTokens(
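Presumably the local result is renamed because the enclosing method now receives the caller's Credentials under the name creds, so the freshly fetched set needs a different name. A minimal self-contained sketch of that shape (the method names and parameter lists below are assumptions pieced together from the snippets quoted in this thread, not the actual Spark diff):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.Credentials

// Assumed helper: merge delegation tokens for the given filesystems into the
// caller-supplied Credentials; tokens already present there are not re-fetched.
def fetchDelegationTokens(
    renewer: String,
    filesystems: Set[FileSystem],
    creds: Credentials): Credentials = {
  filesystems.foreach(_.addDelegationTokens(renewer, creds))
  creds
}

// Assumed caller: `creds` is now a parameter, so the fetched result is bound
// to `fetchCreds` instead of shadowing it.
def obtainDelegationTokens(
    hadoopConf: Configuration,
    renewer: String,
    filesystems: Set[FileSystem],
    creds: Credentials): Credentials = {
  val fetchCreds = fetchDelegationTokens(renewer, filesystems, creds)
  fetchCreds
}
```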
Also, comparing the diff between spark 2.2 and master, the PRINCIPAL (aka spark.yarn.principal) config check is missing. Not sure if we need to do this now. Let me know your opinion @vanzin @tgravescs

sparkConf.get(PRINCIPAL).flatMap { renewer =>
  val creds = new Credentials()
  hadoopFSsToAccess(hadoopConf, sparkConf).foreach { dst =>
    val dstFs = dst.getFileSystem(hadoopConf)
    dstFs.addDelegationTokens(renewer, creds)
  }
That code was in getTokenRenewalInterval; that call is only needed when principal and keytab are provided, so adding the code back should be ok. It shouldn't cause any issues if it's not there, though, aside from a wasted round trip to the NNs.
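For context, a rough sketch of how that guard could look, so the extra round trip to the NameNodes is only made when a principal is configured (the signature and the interval computation below are illustrative approximations, not the upstream code):

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.Credentials
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier

// Illustrative guard: skip the token fetch entirely when no principal is set,
// since the renewal interval only matters for principal/keytab logins.
def tokenRenewalInterval(principal: Option[String], filesystems: Set[FileSystem]): Option[Long] = {
  principal.flatMap { renewer =>
    // Fetch throwaway tokens with the logged-in user as renewer (tokens with
    // YARN as renewer could not be renewed here due to access control).
    val creds = new Credentials()
    filesystems.foreach(_.addDelegationTokens(renewer, creds))
    // Approximate the interval from the token identifiers; the real code may
    // derive it differently (e.g. by actually renewing a token).
    creds.getAllTokens.asScala
      .map(_.decodeIdentifier())
      .collect { case id: AbstractDelegationTokenIdentifier => id.getMaxDate - id.getIssueDate }
      .reduceOption(_ min _)
  }
}
```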
I'd prefer not to call it if we don't need to, so as long as adding the config back doesn't mess with the Mesos side of things (since this is now common code) I think that would be good. The PRINCIPAL config is a YARN-specific config, but looking at SparkSubmit it appears to be used for Mesos as well.
@vanzin do you happen to know if Mesos is using that as well? I haven't kept up with Mesos Kerberos support, so I'm not sure if more is going to happen there.
I'm pretty sure Mesos is not currently hooked up to the principal / keytab stuff. It just picks up the initial delegation token set, and when those expire, things stop working.
Adding the check back here is the right thing; it shouldn't affect Mesos when it adds support for principal / keytab (or if it does, it can be fixed at that time).
Previous discussion on this PR is here: #19103
Test build #81425 has finished for PR 19140 at commit

Force-pushed from 9a4966d to 5424972

Test build #81426 has finished for PR 19140 at commit

Test build #81431 has finished for PR 19140 at commit
@redsanket can you please test this with a secure Hadoop environment using spark-submit (not Oozie)? I don't want to bring in any regression here.
@jerryshao yes, will do, no issues. Thanks!
Added the principal check back and tested in a secure Hadoop env. Let me know if this looks fine to you @jerryshao @vanzin @tgravescs.
Test build #81471 has finished for PR 19140 at commit
Test build #81474 has finished for PR 19140 at commit
LGTM pending tests.
Test build #81477 has finished for PR 19140 at commit
+1
didn't merge to branch 2.2, will handle under #19103 |
I observed this while running an Oozie job trying to connect to HBase via Spark.
It looks like the creds are not being passed in https://github.com/apache/spark/blob/branch-2.2/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HadoopFSCredentialProvider.scala#L53 for the 2.2 release.
More info as to why it fails on a secure grid:
The Oozie client gets the necessary tokens the application needs before launching. It passes those tokens along to the Oozie launcher job (an MR job), which then actually calls the Spark client to launch the Spark app and passes the tokens along.
The Oozie launcher job cannot get any more tokens because all it has is tokens (you can't get tokens with tokens; you need a TGT or a keytab).
The error here is because the launcher job runs the Spark client to submit the Spark job, but the Spark client doesn't see that it already has the HDFS tokens, so it tries to get more, which ends with the exception.
There was a change with SPARK-19021 to generalize the HDFS credential provider, and with that change we no longer pass the existing credentials into the call that gets tokens, so it doesn't realize it already has the necessary tokens.
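To make the mechanism concrete (a hedged illustration; the helper name and arguments below are made up for this example): Hadoop's FileSystem.addDelegationTokens only obtains tokens that are not already present in the Credentials object passed to it, so handing it a fresh, empty Credentials hides the tokens Oozie provided and forces a new request, which the token-only launcher cannot satisfy.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.Credentials

// Illustrative helper: add any missing delegation tokens for `path` into `creds`.
// If `creds` already holds a token for that filesystem, this is effectively a no-op;
// if `creds` is empty, a new token request is made, which fails for a caller
// that only has tokens (no TGT or keytab).
def addTokensFor(path: Path, renewer: String, conf: Configuration, creds: Credentials): Credentials = {
  val fs = path.getFileSystem(conf)
  fs.addDelegationTokens(renewer, creds)
  creds
}
```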
https://issues.apache.org/jira/browse/SPARK-21890
Modified the code to pass the existing creds into the call that gets delegation tokens.
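A rough sketch of the overall flow this enables, under the assumption that the caller starts from the credentials the process already holds (for example the tokens handed over by Oozie) and merges anything newly fetched back into the current UGI; the function below is illustrative, not the PR's actual code:

```scala
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Hypothetical end-to-end flow: start from the tokens the process already has,
// fetch only what is missing, then make the result visible to later Hadoop calls.
def refreshFileSystemTokens(renewer: String, filesystems: Set[FileSystem]): Credentials = {
  val ugi = UserGroupInformation.getCurrentUser
  val creds = ugi.getCredentials                              // existing tokens stay visible
  filesystems.foreach(_.addDelegationTokens(renewer, creds))  // fetches only missing tokens
  ugi.addCredentials(creds)                                   // merge any new tokens back
  creds
}
```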