[SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service #28967
@@ -17,15 +17,13 @@

package org.apache.spark.network.shuffle;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.common.io.CharStreams;
import org.apache.commons.lang3.SystemUtils;
import org.apache.spark.network.shuffle.protocol.ExecutorShuffleInfo;
import org.apache.spark.network.util.MapConfigProvider;
import org.apache.spark.network.util.TransportConf;

@@ -145,29 +143,4 @@ public void jsonSerializationOfExecutorRegistration() throws IOException {
    assertEquals(shuffleInfo, mapper.readValue(legacyShuffleJson, ExecutorShuffleInfo.class));
  }

  @Test
  public void testNormalizeAndInternPathname() {

Member
Hi, @attilapiros. Could you explain why we need to remove the existing test coverage in this
Contributor
Author
@dongjoon-hyun sure, here you are: The When the test was created in the first place if they could call So now as we can call indirectly the
Contributor
So the test existed because Spark has been dealing with normalization by itself ( If we cannot rely on the JDK implementation then createNormalizedInternedPathname should be just rewritten to the optimized one and this test would then keep as it is, but I'm afraid it's good direction we don't trust JDK implementation.
Contributor
Author
Let's assume the JDK introduces a problem: then the path is not totally normalized, but that string is still interned when you use this PR, so you have saved the bytes. Your normalized path could be even better Regarding memory saving you are as good as close to get to
Contributor
Author
Here are the JDK tests for normalized paths:
Contributor
My bad, “if” was missing between “direction” and “we”. Sorry about that. I meant to make clear that I do trust the JDK implementation and its maintenance; otherwise I would have suggested just porting the optimized code here. That said, to avoid further miscommunication: I’m in favor of removing the test, as I also said in my previous comment.
Contributor
@dongjoon-hyun Are you OK with the answer? If you're OK with it I'll move this forward.
Member
Yep~

    String sep = File.separator;
    String expectedPathname = sep + "foo" + sep + "bar" + sep + "baz";
    assertPathsMatch("/foo", "bar", "baz", expectedPathname);
    assertPathsMatch("//foo/", "bar/", "//baz", expectedPathname);
    assertPathsMatch("/foo/", "/bar//", "/baz", expectedPathname);
    assertPathsMatch("foo", "bar", "baz///", "foo" + sep + "bar" + sep + "baz");
    assertPathsMatch("/", "", "", sep);
    assertPathsMatch("/", "/", "/", sep);
    if (SystemUtils.IS_OS_WINDOWS) {
      assertPathsMatch("/foo\\/", "bar", "baz", expectedPathname);
    } else {
      assertPathsMatch("/foo\\/", "bar", "baz", sep + "foo\\" + sep + "bar" + sep + "baz");
    }
  }

  private void assertPathsMatch(String p1, String p2, String p3, String expectedPathname) {
    String normPathname =
      ExecutorDiskUtils.createNormalizedInternedPathname(p1, p2, p3);
    assertEquals(expectedPathname, normPathname);
    File file = new File(normPathname);
    String returnedPath = file.getPath();
    assertEquals(normPathname, returnedPath);
  }
}
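
For readers following the thread, here is a minimal sketch of the idea discussed above: letting `java.io.File` perform the normalization and interning the result, rather than normalizing by hand. The class `PathNormalizationSketch` and the helper `normalizeAndIntern` are hypothetical names chosen for illustration; this is not the code of this PR or of `ExecutorDiskUtils`.

```java
import java.io.File;

public class PathNormalizationSketch {

  // Hypothetical helper (not Spark's API): joins the parts, lets java.io.File
  // normalize the pathname, then interns the normalized string so that equal
  // paths share a single String instance.
  public static String normalizeAndIntern(String p1, String p2, String p3) {
    String raw = p1 + File.separator + p2 + File.separator + p3;
    // The File constructor normalizes the pathname (for example, it collapses
    // repeated separators and drops trailing ones); getPath() returns that form.
    return new File(raw).getPath().intern();
  }

  public static void main(String[] args) {
    String a = normalizeAndIntern("/foo", "bar", "baz");
    String b = normalizeAndIntern("/foo/", "bar//", "baz///");
    System.out.println(a);
    // Both inputs normalize to the same path, and interning makes them the
    // same String instance, so the duplicate does not occupy extra heap.
    System.out.println(a.equals(b));
    System.out.println(a == b);
  }
}
```

Even if the JDK normalization were ever imperfect, the returned string would still be interned, which is the memory-saving point made in the thread.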