Skip to content

Conversation

@PavithraRamachandran
Copy link
Contributor

What changes were proposed in this pull request?

In IPv6 scenario, the current logic to split hostname and port is not correct.

Why are the changes needed?

to support IPV6 deployment scenario

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT and IPV6 spark deployment with yarn

@PavithraRamachandran PavithraRamachandran changed the title [SPARK-32103] [yarn] Handle host/port split in IPV6 in YarnRMClient [SPARK-32103] [yarn] Handle host/port split in IPV6 in YarnRMClient Jun 26, 2020
@PavithraRamachandran PavithraRamachandran changed the title [SPARK-32103] [yarn] Handle host/port split in IPV6 in YarnRMClient [SPARK-32103] Handle host/port split in IPV6 in YarnRMClient Jun 26, 2020
@dongjoon-hyun
Copy link
Member

ok to test

val prefix = WebAppUtils.getHttpSchemePrefix(conf)
val proxies = WebAppUtils.getProxyHostsAndPortsForAmFilter(conf)
val hosts = proxies.asScala.map(_.split(":").head)
val hosts = proxies.asScala.map(proxy => proxy.splitAt(proxy.lastIndexOf(":"))._1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @PavithraRamachandran . If possible, could you check the code base if this is the last instance or not?

Copy link
Contributor Author

@PavithraRamachandran PavithraRamachandran Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have identified few more places where IPV6 is not handled, i shall update my PR soon

@SparkQA
Copy link

SparkQA commented Jun 27, 2020

Test build #124558 has finished for PR 28931 at commit 3e2eb3a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32103] Handle host/port split in IPV6 in YarnRMClient [SPARK-32103][YARN] Handle IPv6 host/port split in YarnRMClient Jun 28, 2020
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, new implementation seems to introduce a new assumption and less robust as a result. For example, when we get host name only without port due to YARN API change, it will fail. It may be unlikely, but we had better keep the robustness here if there is no other reason @PavithraRamachandran .

scala> val proxies = Seq("host")
proxies: Seq[String] = List(host)

scala> proxies.map(_.split(":").head)
res16: Seq[String] = List(host)

scala> proxies.map(proxy => proxy.splitAt(proxy.lastIndexOf(":"))._1)
res17: Seq[String] = List("")

@dongjoon-hyun
Copy link
Member

Gentle ping, @PavithraRamachandran .

@PavithraRamachandran
Copy link
Contributor Author

@dongjoon-hyun thank you for identifying the missed scenario, i shall handle it and update the PR

@SparkQA
Copy link

SparkQA commented Jul 2, 2020

Test build #124919 has finished for PR 28931 at commit 98a737b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 2, 2020

Test build #124920 has finished for PR 28931 at commit becec4e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@PavithraRamachandran
Copy link
Contributor Author

@dongjoon-hyun i am getting 502 proxy error when i try to access the CI to check for failures. Could u help me?

@dongjoon-hyun
Copy link
Member

GitHub Action CI is okay, but UCB AmbLab Jenkins CI seems to be completely down now. In this case, there is no way for us to do. Let's wait for a while.

@SparkQA
Copy link

SparkQA commented Jul 5, 2020

Test build #124942 has finished for PR 28931 at commit becec4e.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Thank you for updating, @PavithraRamachandran .

@dongjoon-hyun
Copy link
Member

Retest this please

@dongjoon-hyun
Copy link
Member

cc @dbtsai

Utils.checkHost("0.0.0.0")
intercept[AssertionError] {
Utils.checkHost("0.0.0.0:0")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it would be great if we check the error message to prevent future regression.

assert(Utils.buildLocationMetadata(paths, 25) == "[path0, path1, path2, path3]")
}

test("checkHosts support IPV6 and IPV4") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • checkHosts -> checkHost
  • support -> supports
  • IPV6 and IPV4 -> both IPv4 and IPv6?

assert(hostnamePort._1.equals("0.0.0.0"))
assert(hostnamePort._2 === 0)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit. repeated blank lines.

@SparkQA
Copy link

SparkQA commented Jul 8, 2020

Test build #125379 has finished for PR 28931 at commit 4c9cb2e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 10, 2020

Test build #125609 has finished for PR 28931 at commit 768e7c4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32103][YARN] Handle IPv6 host/port split in YarnRMClient [SPARK-32103][CORE] Support IPv6 host/port in core module Jul 10, 2020
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @PavithraRamachandran .
Merged to master for Apache Spark 3.1.0. Definitely, this is a good step toward IPv6 support.

@PavithraRamachandran
Copy link
Contributor Author

thank u for merging @dongjoon-hyun

@gatorsmile
Copy link
Member

Is it the only blocker for IPV6 support?

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jul 15, 2020

Hi, @gatorsmile . Technically, this only handles host/port parsing inside core module. I'm sure that this is a meaningful step inside Spark. However, we didn't test anything on IPv6. Like what we did for JDK11, I expect lots of hurdle both inside and outside Spark.

wangyum pushed a commit that referenced this pull request May 26, 2023
### What changes were proposed in this pull request?
In IPv6 scenario, the current logic to split hostname and port is not correct.

### Why are the changes needed?
to support IPV6 deployment scenario

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT and IPV6 spark deployment with yarn

Closes #28931 from PavithraRamachandran/ipv6_issue.

Authored-by: Pavithraramachandran <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants