Conversation

cmirash commented Oct 22, 2016

What changes were proposed in this pull request?

When running in HTTP mode with Kerberos enabled, the Spark Thriftserver returns a 401 authentication error on beeline HTTP requests (with the end user as the Kerberos principal). The same command works against the Hive Thriftserver.

What we found is that Hive's CLIService logs in with both the Hive service principal and the SPNego principal when Kerberos is enabled, whereas the Spark Thriftserver only logs in with the Hive service principal.

CLIService.java

if (UserGroupInformation.isSecurityEnabled()) {
  try {
    HiveAuthFactory.loginFromKeytab(hiveConf);
    this.serviceUGI = Utils.getUGI();
  } catch (IOException e) {
    throw new ServiceException("Unable to login to kerberos with given principal/keytab", e);
  } catch (LoginException e) {
    throw new ServiceException("Unable to login to kerberos with given principal/keytab", e);
  }
  // Also try creating a UGI object for the SPNego principal
  String principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_PRINCIPAL);
  String keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_KEYTAB);
  if (principal.isEmpty() || keyTabFile.isEmpty()) {
    LOG.info("SPNego httpUGI not created, spNegoPrincipal: " + principal +
        ", ketabFile: " + keyTabFile);
  } else {
    try {
      this.httpUGI = HiveAuthFactory.loginFromSpnegoKeytabAndReturnUGI(hiveConf);
      LOG.info("SPNego httpUGI successfully created.");
    } catch (IOException e) {
      LOG.warn("SPNego httpUGI creation failed: ", e);
    }
  }
}

SparkSQLCLIService.scala

if (UserGroupInformation.isSecurityEnabled) {
  try {
    HiveAuthFactory.loginFromKeytab(hiveConf)
    sparkServiceUGI = Utils.getUGI()
    setSuperField(this, "serviceUGI", sparkServiceUGI)
  } catch {
    case e @ (_: IOException | _: LoginException) =>
      throw new ServiceException("Unable to login to kerberos with given principal/keytab", e)
  }
}

This patch adds the missing SPNego principal login to the Spark Thriftserver.
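
For reference, a sketch of the shape of the change to SparkSQLCLIService.scala, mirroring the CLIService.java block above. The helper calls and variable names follow the patch fragments quoted in the review below; the comments and the exact error handling here are illustrative, not the final patch.

  // Sketch only: also create a UGI for the SPNego principal when it is configured
  val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_PRINCIPAL)
  val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_KEYTAB)
  if (principal.isEmpty() || keyTabFile.isEmpty()) {
    // SPNego principal/keytab not configured; HTTP requests with Kerberos keep failing with 401
  } else {
    try {
      val httpUGI = HiveAuthFactory.loginFromSpnegoKeytabAndReturnUGI(hiveConf)
      setSuperField(this, "httpUGI", httpUGI)
    } catch {
      case e: IOException =>
        // log a warning and continue, as CLIService.java does
    }
  }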

How was this patch tested?

Ran manual tests with beeline through Spark against a Kerberized cluster.
Ran the Spark unit tests for hive, sql, and catalyst.

cmirash (Author) commented Nov 1, 2016

@steveloughran @yhuai Will you be able to help review this PR?

steveloughran (Contributor) commented:

I'm not a Spark committer, so I can't review it well enough to get it in; I was just watching it out of concern for the word "kerberos". How about asking on the Spark developer list for volunteers to review it?

Looking at the code, I'd drop using reflection for logging. You can just create an SLF4J log with the name "org.apache.hive.hiveserver2" (or whatever the full path is) and end up with the same log name as the parent class. As for setting the hive option, I do think it's an ugly pain that should really be addressed by making Hive extensible; until then, what choice do you have?
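
A sketch of that logging suggestion in Scala (the logger name and message below are assumptions, shown only to illustrate the idea of logging under the parent class's name instead of using reflection):

  import org.slf4j.{Logger, LoggerFactory}

  // Reuse the parent class's logger by name instead of pulling its private
  // LOG field out via reflection; the name here is an assumption.
  val log: Logger = LoggerFactory.getLogger("org.apache.hive.service.cli.CLIService")
  log.info("SPNego httpUGI not created; check the SPNego principal/keytab configuration")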

cmirash (Author) commented Nov 2, 2016

Thanks @steveloughran. I will check on the developer list.

Patch hunk under review:

} else {
try {
httpUGI = HiveAuthFactory.loginFromSpnegoKeytabAndReturnUGI(hiveConf)
setSuperField(this, "httpUGI", httpUGI)

Review comment (Member) on the hunk above:

We have similar code in CLIService.java which does not set the httpUGI field; do we need to make the behavior the same in both files?

lresende (Member) commented:

@vanzin I believe this might be your realm :) Could you please help review this?

cmirash (Author) commented Nov 18, 2016

Thanks Luciano. I am looking at the changes and will add them soon.

vanzin (Contributor) commented Nov 22, 2016

ok to test

vanzin (Contributor) left a comment:

I'm not really familiar with the thrift server (Spark's or Hive's) so can't really comment much. Maybe someone more familiar with this code can take a look (use git blame to find out?).

Patch hunk under review:

}
}

// Also try creating a UGI object for the SPNego principal

Review comment (Contributor) on the hunk above:

The indentation here is wrong. The following blocks also indent code with 4 spaces instead of 2.

Patch hunk under review:

val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_PRINCIPAL)
val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_SPNEGO_KEYTAB)
if (principal.isEmpty() || keyTabFile.isEmpty()) {
getAncestorField[Log](this, 3, "LOG").info(

vanzin (Contributor) commented on the hunk above:

This log message seems unnecessary.

cmirash (Author) replied:

Thanks @vanzin. The log message was added to replicate the same behavior as the Hive Thriftserver code block. @steveloughran added the Kerberos code, and his comments are above.

Reply (Contributor):

I don't know about the specific log policy here, but I do think using reflection to get at private fields is dangerous. I know it happens a lot in this code, but at some point it'd be nice to move away from it. You don't need reflection to get the same log as a parent; just go:

LoggerFactory.getLogger("org.apache.hive.thrift...whatever").info(s"something $key")

This is cleaner and won't break if Hive ever changes logging frameworks.

SparkQA commented Nov 22, 2016

Test build #69034 has finished for PR 15594 at commit 448f91f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

HyukjinKwon (Member) commented:

Hi @cmirash, is this still active?

asfgit closed this in 5d2750a on May 18, 2017.
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close PRs ...

  - inactive on review comments for more than a month
  - WIP and inactive for more than a month
  - with a Jenkins build failure and inactive for more than a month
  - suggested to be closed, with no comment against that
  - obviously inappropriate (e.g., Branch 0.5)

To make sure, I left a comment on each PR about a week ago and could not get a response back from the author on the PRs below:

Closes apache#11129
Closes apache#12085
Closes apache#12162
Closes apache#12419
Closes apache#12420
Closes apache#12491
Closes apache#13762
Closes apache#13837
Closes apache#13851
Closes apache#13881
Closes apache#13891
Closes apache#13959
Closes apache#14091
Closes apache#14481
Closes apache#14547
Closes apache#14557
Closes apache#14686
Closes apache#15594
Closes apache#15652
Closes apache#15850
Closes apache#15914
Closes apache#15918
Closes apache#16285
Closes apache#16389
Closes apache#16652
Closes apache#16743
Closes apache#16893
Closes apache#16975
Closes apache#17001
Closes apache#17088
Closes apache#17119
Closes apache#17272
Closes apache#17971

Added:
Closes apache#17778
Closes apache#17303
Closes apache#17872

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#18017 from HyukjinKwon/close-inactive-prs.