Create GenericHiveRecordCursor within doAs block to pass the correct credentials to storage#16333

Merged
vkorukanti merged 1 commit into prestodb:master from chliang71:avro-schema-auth
Jun 29, 2021

@chliang71
Contributor

@chliang71 chliang71 commented Jun 24, 2021

Add authentication when creating the Hive page source. This is needed when:

  1. the table schema is stored as an Avro schema file on a remote path, e.g. avro.schema.url = hdfs://...
  2. the remote HDFS path requires authentication

Currently, this code path in Presto does not include authentication, so fetching the remote Avro schema fails with an exception like the one below:

java.lang.RuntimeException: error initializing deserializer: com.facebook.presto.hive.avro.PrestoAvroSerDe
	at com.facebook.presto.hive.HiveUtil.initializeDeserializer(HiveUtil.java:443)
	at com.facebook.presto.hive.HiveUtil.getDeserializer(HiveUtil.java:396)
	at com.facebook.presto.hive.GenericHiveRecordCursor.<init>(GenericHiveRecordCursor.java:141)
	at com.facebook.presto.hive.GenericHiveRecordCursorProvider.createRecordCursor(GenericHiveRecordCursorProvider.java:79)
	at com.facebook.presto.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:357)
	at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:125)
	at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:51)
	at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:58)
....
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at java.security.jgss/sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
	at java.security.jgss/sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:126)
	at java.security.jgss/sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:191)
	at java.security.jgss/sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:218)
	at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:230)
	at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
	at jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
	... 68 more

Test plan - (Please fill in how you tested your changes)

Tested in our internal testing cluster. With this change, the exception no longer shows up.

== RELEASE NOTES ==
Hive Changes
* Fix a bug in reading Avro-format tables whose schema is located in a Kerberos-enabled, HDFS-compliant filesystem.

@chliang71
Contributor Author

@vkorukanti do you mind taking a look at this change? Thanks in advance!

@vkorukanti
Contributor

This seems like generic code applicable to all page sources. Can you move the deserializer creation into the doAs block in GenericHiveRecordCursorProvider here and pass it as an argument to GenericHiveRecordCursor?

cc. @zhenxiao

@vkorukanti
Contributor

Or just put the below block within a doAs block. It avoids passing a new param.

    return Optional.of(new GenericHiveRecordCursor<>(
            configuration,
            path,
            genericRecordReader(recordReader),
            length,
            schema,
            columns,
            hiveStorageTimeZone,
            typeManager));

Contributor

nit: you can directly return the value of the hdfsEnvironment.doAs(...) call, i.e. return hdfsEnvironment.doAs(...)

Contributor Author

thanks! updated
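
For readers following the thread, here is a minimal, self-contained sketch of the pattern discussed above: constructing an object whose constructor touches secured storage inside a doAs-style wrapper, and returning the wrapper's result directly. Everything in it is a hypothetical stand-in, not Presto code: `DoAsSketch`, its `doAs` helper, the `Cursor` class, the user name, and the schema location are assumptions made for illustration, and a thread-local is used only to simulate "credentials in scope". The actual change uses hdfsEnvironment.doAs and GenericHiveRecordCursor as named in the comments above.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration only; none of these are Presto classes.
final class DoAsSketch {
    // Simulates the credentials in scope for the current call: set only inside doAs.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Stand-in for an authentication wrapper: runs the action as the given user
    // and restores the previous state afterwards, even if the action throws.
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.get();
        }
        finally {
            if (previous == null) {
                CURRENT_USER.remove();
            }
            else {
                CURRENT_USER.set(previous);
            }
        }
    }

    // Stand-in for a cursor whose constructor reads from secured storage,
    // for example to fetch a remote schema file.
    static final class Cursor {
        final String schemaFetchedAs;

        Cursor(String schemaLocation) {
            String user = CURRENT_USER.get();
            if (user == null) {
                throw new IllegalStateException("No valid credentials provided for " + schemaLocation);
            }
            this.schemaFetchedAs = user;
        }
    }

    // The whole construction happens inside doAs, and the wrapper's result is returned directly.
    static Optional<Cursor> createCursor(String user, String schemaLocation) {
        return doAs(user, () -> Optional.of(new Cursor(schemaLocation)));
    }

    public static void main(String[] args) {
        // Hypothetical user and location, chosen only for this sketch.
        Cursor cursor = createCursor("alice", "hdfs://example-host/schema.avsc").get();
        System.out.println("schema fetched as: " + cursor.schemaFetchedAs);

        try {
            new Cursor("hdfs://example-host/schema.avsc"); // constructed outside doAs
        }
        catch (IllegalStateException e) {
            System.out.println("outside doAs: " + e.getMessage());
        }
    }
}
```

In this sketch the constructor fails whenever it runs outside the wrapper, which mirrors why the cursor creation, and not only the later reads, has to sit inside the doAs block.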

@vkorukanti vkorukanti changed the title from "Add authentication when creating Hive page source" to "Create GenericHiveRecordCursor within doAs block to pass the correct credentials to storage" Jun 28, 2021
@vkorukanti vkorukanti merged commit 21d6dd3 into prestodb:master Jun 29, 2021
@ajaygeorge
Contributor

Hi @chliang71 / @vkorukanti
Release notes are missing for this PR.
Please add them according to the guidelines mentioned here

@ajaygeorge ajaygeorge mentioned this pull request Jul 7, 2021
1 task

3 participants