[SPARK-28149][K8S] Added variable to disable negative DNS caching #24962
Conversation
|
Hi @srowen This is the other part, affecting only k8s. |
|
This change has some hard-coded paths and sed code that will cause problems. @skonto proposed modifying the |
|
I've been reviewing the discussion we had in #24702. We clarified that we can't set individual properties via java args, only the full file via properties. Then @skonto suggested this approach:
Which is exactly what I've implemented, but I am open to implementing it in any other way.
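For context, a minimal sketch of how such an override is usually wired up through Spark submit configuration. The file path below is hypothetical, and this only illustrates the standard `java.security.properties` JVM mechanism, not necessarily the exact approach referenced above:

```scala
import org.apache.spark.SparkConf

// Illustration only: point the JVM at an overriding security properties file
// (hypothetical path) whose single line would be:
//   networkaddress.cache.negative.ttl=0
val conf = new SparkConf()
  .set("spark.driver.extraJavaOptions",
    "-Djava.security.properties=/opt/spark/conf/dns.security")
  .set("spark.executor.extraJavaOptions",
    "-Djava.security.properties=/opt/spark/conf/dns.security")
```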
|
|
Hi |
|
@jlpedrosa I think the idiomatic way to do this (going forward) will be to:
Config map support is slated to be supported via user-supplied YAML, see: This feature will land in 3.0, so I'd propose verifying that this can be done without new code, starting with 3.0 |
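As a hedged illustration of that direction, pointing Spark at a user-supplied pod template is done through configuration along these lines (the template paths are hypothetical; the keys are the documented pod template options added for 3.0):

```scala
import org.apache.spark.SparkConf

// Hypothetical template paths; the keys are the 3.0 pod template options.
val conf = new SparkConf()
  .set("spark.kubernetes.driver.podTemplateFile", "/path/to/driver-template.yaml")
  .set("spark.kubernetes.executor.podTemplateFile", "/path/to/executor-template.yaml")
```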
|
Looking at the docs on master, I see that it is not possible to mount a config map as a file. The only types of mounts supported are: hostPath: mounts a file or directory from the host node’s filesystem into a pod. Do you want me to open a separate ticket to add that feature, "Capability to mount config maps"? |
|
@jlpedrosa I'm guessing no specific doc was added because config-files were intended to be just one of an arbitrary number of things that could be managed via templates, starting with #22146. If config-files are not working via pod template files, that needs to be reconciled somehow. |
|
I'm not sure if I've explained myself. What I was trying to say is that Spark (3.0/master) has capabilities to mount different types of volumes into the pods without templates. That [VolumeType] can only be one of the types I mentioned before (as in the doc).
If we were to extend the capabilities, it would be something like |
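For reference, the existing documented pattern being referred to looks roughly like the following for a hostPath volume; the extension the comment had in mind is not reproduced here, and the volume name is made up:

```scala
import org.apache.spark.SparkConf

// Existing Spark-on-K8s volume options; "checkpoints" is a hypothetical volume name.
val conf = new SparkConf()
  .set("spark.kubernetes.driver.volumes.hostPath.checkpoints.mount.path", "/checkpoints")
  .set("spark.kubernetes.driver.volumes.hostPath.checkpoints.mount.readOnly", "false")
  .set("spark.kubernetes.driver.volumes.hostPath.checkpoints.options.path", "/tmp/checkpoints")
```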
|
Parachuting into the conversation: we should not extend the config system to replicate functionality available through pod templates. That was the whole point of pod templates... |
|
I've tested the suggested approach and it works, but the process is a bit cumbersome for the user. So it is a question of usability.
sample template: |
|
Instead of trying to modify the java.security file, how about adding code in Spark to change the property programmatically? It seems like all you have to do is call: We would just need to add code so that it is done during driver and executor startup somehow. Shouldn't be hard. |
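The thread doesn't quote the exact call, but the standard JVM hook for this is `java.security.Security.setProperty`; a minimal sketch (assuming it runs before the address cache policy is initialized) would be:

```scala
import java.security.Security

// A TTL of 0 disables caching of failed DNS lookups.
// This must run early, before the first lookup is cached, to take effect.
Security.setProperty("networkaddress.cache.negative.ttl", "0")
```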
|
Hardcoding this behavior seems not quite right to me on general principles. However, the question would be "what if the DNS failure was 'real' instead of just a transitory startup artifact?" And maybe from a security point of view, "what if the failure was part of some attack - trying to connect to something hostile?" In the benign case, maybe not so bad. I'm less sure about a hypothetical vulnerability case. Thinking about the original problem, I have a more basic question: why is it not possible to reconfigure longer timeouts? |
|
Nobody suggested hardcoding anything. |
|
@jlpedrosa my working assumption was that the security file could be mounted as a config-map, without having to mess with custom images (steps 1 & 5). If that is truly necessary, then I'd consider it a reasonable case for some kind of code modification. As @vanzin alluded to above, the motivation for pod templates was to make k8s-related modifications possible without having to re-code the back end to pass through k8s config every time a user wanted to set some new feature. However, there was also the implicit assumption that it should be possible without a great deal of gymnastics from the user. |
|
@erikerlandson I'd say step 1) should be standard good practice: keep the images in your own repos (I'm not even sure there are public "official" repos for the Spark images; a quick Google search does not seem to find official ones). Point 5) was poorly explained by me; k8s best practice would say that if you change your code, you should upload a new version of the image (instead of loading it from hdfs). The file can be mounted just with the template and by pushing the config map. The image in step 5 should contain only the generated jar with the job, not the file. As you can see in #24702, I initially proposed the approach @vanzin suggested (which I agree with). Setting aside the "changing existing behaviour" concern, personally I think that Java's default of caching DNS resolution failures forever is an undesirable choice; I don't think any other programming language does that by default. I'd suggest adding a new config property to Spark so it is actionable for all schedulers, and keeping the JVM default behaviour (as it was). But I am open to implementing it in whatever way you consider best; just let me know the approach. |
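As a rough sketch of what that proposal could look like: the config key `spark.network.dns.negativeCacheTtl` below is invented purely for illustration and does not exist in Spark; the idea is only that driver/executor startup would translate a Spark conf entry into the JVM security property:

```scala
import java.security.Security
import org.apache.spark.SparkConf

// Hypothetical config key, shown only to illustrate the proposal.
val NegativeDnsTtlKey = "spark.network.dns.negativeCacheTtl"

def applyNegativeDnsCacheTtl(conf: SparkConf): Unit = {
  // Only override the JVM default if the user explicitly set the property.
  conf.getOption(NegativeDnsTtlKey).foreach { ttl =>
    Security.setProperty("networkaddress.cache.negative.ttl", ttl)
  }
}
```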
|
Another benefit I can see to the approach suggested by @vanzin is that it would be easier to port to the 2.4 branch. |
|
Can one of the admins verify this patch? |
|
I'll close this for now, feel free to update your branch to reopen the PR. |
What changes were proposed in this pull request?
This PR makes it possible to disable negative DNS caching in the JVM by adding a variable.
How was this patch tested?
I could not see a place to run these tests automatically, so I ran them manually by executing
docker run: