Hive Meta Store impersonation access#15736
Conversation
8dbea7f to
a085454
Compare
|
ping @arhimondr Sorry that it has been a while since last time I worked on #14464, now that PR is staled since it has been a long time, I reopened it here. Please take another look, Thanks |
0746e03 to
bfc66b0
Compare
There was a problem hiding this comment.
This is still too high level API to wrap in the doAs statement. As explained in the comment here different metastore implementations (Hive / Glue / Inmemory / Thrift / File / etc.) have completely different authentication methods. The doAs mechanism is HiveMetastore specific (even Hadoop client specific). Using it at this level breaks the encapsulation and abstraction.
Additionally, historically in Presto we are trying to minimize the usage of the thread local contexts, as they tend to be very fragile. Think if there's a metastore implementation that does method invokation in a separate thread pool for efficiency (e.g.: multiple calls in parallel). The doAs based implementation will be broken for such a use case.
Thus I strongly recommend you passing the identity all the way down to the lower level interface and wrapping the invokations with doAs only at the ThriftHiveMetastore.class level
There was a problem hiding this comment.
Ping again, the doAs has been removed, and the new implementation is based on passing the MetastoreContext object explicitly in the interface.
4cdaf43 to
4b4af73
Compare
|
Sorry for the delay, the docker images were all deleted yesterday and I was trying to resolve that. I'll give reviewing this a shot. I don't have as much context as Andrii and people are pretty busy right now. Is any of this a cherrypick from Trino? |
|
Since the product tests aren't working we can't properly test this right now. Some of Andrii's feedback from the last PR hasn't been incorporated like baking stuff into the docker images and default value in MetastoreClientConfig. |
Hey @aweisberg Thanks for reviewing, basically, this is the PR that originally from #14464, and this is not exactly the trino PR of https://github.com/trinodb/trino/pull/1441/files, In this PR, it does the HMS impersonation access like above trino PR did, as well as adding metrics for each individual HMS, while the major concern from @arhimondr is the Thread Local Storage, I think I already addressed in the latest diff. For the Docker Image problem, I am wondering what's the plan here, are we going to fix the docker image repo and upgrade the memory setting (basically bump up Xmx) in the image? |
|
Right now we can't build the docker images because Cloudera put everything behind a paywall. We are trying to address that, but it might take a bit. I am asking about Trino because if there is code from Trino then we need co-author tags to reflect that. |
|
I'm sorry for the delay reviewing this. Both @arhimondr and myself have been pretty swamped with things that have deadlines and this is a non-trivial review. Especially for me as I don't have much context in this area so I would have to build it. |
That's OK. this diff is NOT ONLY containing the changes for HMS impersonation but also including the metrics break-down by multiple-HMS since it also enables the RR multiple HMS access here. It is a bit confusing for me how memory problems have been triggered by this diff. And when I tried to test it locally, as we discussed in the slack, even in loading the data, the yarn container has exited because of memeory issue, and I think the CI here has already been passed that step, it is not clear what's difference between local env and CI env which I feel like it is just a docker compose env..) |
|
Ping @aweisberg @arhimondr again, this seemed to take quite a long time, what's our plan here. |
...ive-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/StaticHiveCluster.java
Outdated
Show resolved
Hide resolved
presto-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/MetastoreContext.java
Outdated
Show resolved
Hide resolved
presto-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/MetastoreContext.java
Outdated
Show resolved
Hide resolved
presto-geospatial/src/test/java/com/facebook/presto/plugin/geospatial/TestSpatialJoins.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
4b4af73 to
9f87556
Compare
|
@arhimondr Addressed most of the comments and please help another run, Also I don't find there is any new commits on docker-image repo, since the read timeout is still happening, I still believe we need to update the memory limit in the docker container for service like hms. @aweisberg do you have some updates on how to updating the docker-image? |
arhimondr
left a comment
There was a problem hiding this comment.
A couple of minor comments + missed comments from the previous review. Otherwise looks good to me.
...to-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/CachingHiveMetastore.java
Outdated
Show resolved
Hide resolved
...metastore/src/main/java/com/facebook/presto/hive/metastore/HivePageSinkMetadataProvider.java
Outdated
Show resolved
Hide resolved
...ive-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/StaticHiveCluster.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreModule.java
Outdated
Show resolved
Hide resolved
...c/main/java/com/facebook/presto/hive/authentication/KerberosHiveMetastoreAuthentication.java
Outdated
Show resolved
Hide resolved
329d746 to
ade80d5
Compare
arhimondr
left a comment
There was a problem hiding this comment.
LGTM % nits.
Please take a look at the test failures, they seem to be related
presto-hive-metastore/src/main/java/com/facebook/presto/hive/MetastoreClientConfig.java
Outdated
Show resolved
Hide resolved
...to-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/CachingHiveMetastore.java
Outdated
Show resolved
Hide resolved
...to-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/CachingHiveMetastore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Github must've done something weird. I was leaving this comment in the memoizeMetastore method
...to-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/CachingHiveMetastore.java
Outdated
Show resolved
Hide resolved
...metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/StaticMetastoreConfig.java
Outdated
Show resolved
Hide resolved
...metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/StaticMetastoreConfig.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...store/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastoreClient.java
Outdated
Show resolved
Hide resolved
e07d946 to
46330fb
Compare
46330fb to
33babfa
Compare
This PR enabled the option of impersonation access of the Hive Metastore impersonation. It also includes the changes to enable multiple HMS instances load-balancing, and reporting the stats for each individual HMS instance.
33babfa to
cd5d841
Compare
This PR enabled the option of impersonation access of the Hive Metastore impersonation.
It also includes the changes to enable multiple HMS instances load-balancing and reporting the stats for each individual
HMS instance.
Test plan - (Please fill in how you tested your changes)
Hands-on testing when enabling the impersonation
Plan to add more integration testing cases during turning on the config
Fill in the release notes towards the bottom of the PR description.
See Release Notes Guidelines for details.