[YUNIKORN-1851] Don't allow similar app ids by different users #628
Codecov Report
@@ Coverage Diff @@
## master #628 +/- ##
==========================================
- Coverage 71.90% 71.43% -0.47%
==========================================
Files 51 51
Lines 8076 8133 +57
==========================================
+ Hits 5807 5810 +3
- Misses 2073 2126 +53
- Partials 196 197 +1
... and 4 files with indirect coverage changes
I think this might be a case of the cure being worse than the disease. Shared environments like this are somewhat cooperative by nature, and reusing app ids can happen either accidentally or even automatically (especially for the autogenerated app id). Rejecting a pod in this case is very heavy-handed and will result in unexpected failures. Tracking users by app instead of pod leads to this (IMO small) issue, but I don't think we should disallow behavior that Kubernetes allows -- we should strive to be more compatible, not less. I don't see this issue being much worse than the ambiguity we have in the case of group quotas and multiple possible matches. It's not perfect, but probably good enough. I'm -1 on this approach due to the pain it will likely cause.
Also need to hook this into the one application ID per namespace setting. If you have one app ID per namespace this will probably break things.
This does not affect re-using application IDs. You can still do that without a problem. If the application is not currently known in the shim, the user info will be set without question. For the auto-generated app ID: I do think that is a case we need to handle and not reject.
K8s does not have the concept of applications and/or users on a workload, so there is no compatibility to look at. Looking at a simple case: I could set a Spark application ID on a pod as a completely different user, in a different namespace, with a different service account, etc. That pod will then be run as part of the already running Spark job, in a queue which I might not have access to, under a different quota. That is a huge gap which will show up as a problem. I am surprised that it has not as yet.
The group resolution is defined and far less of a problem than this is.
Found some smaller things.
+1 LGTM
I'll let Wilfred take a look at it too.
We need to be careful about this one. https://issues.apache.org/jira/browse/YUNIKORN-1961 was filed a few days ago and indicates that Spark uses different auth info when submitting executor pods, so somehow we need to take that into account here.
Closing this for now. We can't make this change, as it will break a lot of things (including Spark) as executors are launched using different security credentials (the spark service account) than the original driver.
What is this PR for?
Detect when the same app ID is used by different users for app submission, and reject the submission while the app is already running. The check is configurable and disabled by default.
What type of PR is it?
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-1851
How should this be tested?
Screenshots (if appropriate)
Questions: