-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29284 Remove tight yarn-registry coupling from hive-exec #6146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
HIVE-29284 Remove tight yarn-registry coupling from hive-exec #6146
Conversation
|
hadoop-yarn-registry is also not needed for clusters managed by Kubernetes. |
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java
Outdated
Show resolved
Hide resolved
|
Hi @deniskuzZ thanks for the comments, I'll add my responses to the individual threads. |
0bd9517 to
43220b4
Compare
@deniskuzZ |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
Kubernetes manages resources on its own, it doesn't need YARN |
would the jar still be present in Hive dist/packaging?
|
@ishnagy, would the jar still be included in the Hive distribution tar.gz? Try building with the -Pdist profile and check whether it still present in the lib directory. |
YARN is not needed, correct regarding the pom changes: see this comment in the scope of TEZ-4008, please try to update dependencies to hadoop-registry if possible (if this part is already changed :) ) |
|
@abstractdog, @ayushtkn, I can’t find @ishnagy, are you using |
RegistryOperations is just a "random" class to locate the registry jar, I believe, what we really use from that artifact is ServiceRecord |
checking |
|
oh, ok, I think I get the source of the confusion now (or at least some of it).
(I thought you're telling meg to remove some additional "yarn" deps from some other k8s related component) |
|
I have found the registry classes only in not sure, if that's what expected, but at least the patch doesn't seem to remove any hadoop registry classes from the distributables. *I used to build the project. |
|
|
there are no registry classes in Hive dist. maybe we should just change the scope for |
hmm, it may work. I'll have to rerun a few builds to check if this is enough from the spark side. I'll try to post an update tomorrow with my results. |
not to abandon the "provided" dep method, |
It is, but I don’t understand why the JDBC jar ends up shading hadoop-yarn-server-resourcemanager. It was introduced by HIVE-19956, but the description is vague. it might just be a legacy leftover, because I don't see how it relates to the JDBC at all. In any case, your issue is with |



What changes were proposed in this pull request?
I am proposing to remove the tight dependency coupling between
hive-execandhadoop-yarn-registry(to address HIVE-29284).Why are the changes needed?
hive-execpulls inhadoop-yarn-registryas a direct dependency, but registry classes are only used for building a local resource map for LLAP. This map can be built without the actual class instances getting loaded, using class name literals only. Removinghadoop-yarn-registryas a direct dependency will prevent pulling in its whole transitive dependency tree when one only wants to usehive-execfunctionality without LLAP. (e.g apache spark)Does this PR introduce any user-facing change?
No
How was this patch tested?
This patch will be tested by the pre merge tests executed for this pull request.