Skip to content

Conversation

@pan3793
Copy link
Member

@pan3793 pan3793 commented Dec 8, 2022

What changes were proposed in this pull request?

Backport #38901 to branch-3.2.

Fix the condition for judging Netty prefer direct memory on executor start, the logic should match org.apache.spark.network.client.TransportClientFactory.

Why are the changes needed?

The check logical was added in SPARK-27991, the original intention is to avoid potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient, this PR updates the logic to match org.apache.spark.network.client.TransportClientFactory

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual testing.

… executor start

### What changes were proposed in this pull request?

Fix the condition for judging Netty prefer direct memory on executor start, the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logical was added in SPARK-27991, the original intention is to avoid potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient, this PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.

Closes apache#38901 from pan3793/SPARK-41376.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
@dongjoon-hyun
Copy link
Member

The linter failure is irrelevant, but there is PySpark failure. Could you re-trigger the failed job?
Screen Shot 2022-12-08 at 2 06 34 PM

@pan3793
Copy link
Member Author

pan3793 commented Dec 9, 2022

Re-triggered.

@pan3793
Copy link
Member Author

pan3793 commented Dec 9, 2022

@dongjoon-hyun the pyspark and lint CI fail consistently, and I see there were also failed on previous commits. Sorry I'm not familiar w/ Python, cc @zhengruifeng @Yikun would you please take a look what happened on branch-3.2? Thanks.

@dongjoon-hyun
Copy link
Member

As I mentioned here, it's irrelevant to this PR and a known issue. You can ignore that, @pan3793 .

@Yikun
Copy link
Member

Yikun commented Dec 9, 2022

For the pyspark failare, let's see Yikun@3840beb works or not: https://github.com/Yikun/spark/actions/runs/3655117402/jobs/6176152837 If test passed, I can submit the PR to upstream.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Dec 9, 2022

Thank you, @Yikun . It seems that your test works, but linter job failed at SparkR issue again.

Any way, please make a PR, @Yikun . It looks reasonable enough for Python fix.

Yikun added a commit that referenced this pull request Dec 9, 2022
### What changes were proposed in this pull request?
According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html:
`pandas.io.formats.style.Styler.to_latex` introduced since 1.3.0, so before panda 1.3.0, should skip the check

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?
This test break the 3.2 branch pyspark test (with python 3.6 + pandas 1.1.x), so I think better add the `skipIf` it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed
- Test on 3.2 branch: Yikun#194, https://github.com/Yikun/spark/actions/runs/3655564439/jobs/6177030747

Closes #39002 from Yikun/skip-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
Yikun added a commit that referenced this pull request Dec 9, 2022
### What changes were proposed in this pull request?
According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html:
`pandas.io.formats.style.Styler.to_latex` introduced since 1.3.0, so before panda 1.3.0, should skip the check

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?
This test break the 3.2 branch pyspark test (with python 3.6 + pandas 1.1.x), so I think better add the `skipIf` it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed

Closes #39007 from Yikun/branch-3.3-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
Yikun added a commit that referenced this pull request Dec 9, 2022
### What changes were proposed in this pull request?
According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html:
`pandas.io.formats.style.Styler.to_latex` introduced since 1.3.0, so before panda 1.3.0, should skip the check

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?
This test break the 3.2 branch pyspark test (with python 3.6 + pandas 1.1.x), so I think better add the `skipIf` it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Closes #39008 from Yikun/branch-3.2-style-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Dec 9, 2022
…ic on executor start

### What changes were proposed in this pull request?

Backport #38901 to branch-3.2.

Fix the condition for judging Netty prefer direct memory on executor start, the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logical was added in SPARK-27991, the original intention is to avoid potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient, this PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.

Closes #38982 from pan3793/SPARK-41376-3.2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Merged to branch-3.2 too. Thank you, @pan3793 and @Yikun .

beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?
According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html:
`pandas.io.formats.style.Styler.to_latex` introduced since 1.3.0, so before panda 1.3.0, should skip the check

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: apache@58375a8

### Why are the changes needed?
This test break the 3.2 branch pyspark test (with python 3.6 + pandas 1.1.x), so I think better add the `skipIf` it.

See also apache#38982 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed
- Test on 3.2 branch: Yikun#194, https://github.com/Yikun/spark/actions/runs/3655564439/jobs/6177030747

Closes apache#39002 from Yikun/skip-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
### What changes were proposed in this pull request?
According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html:
`pandas.io.formats.style.Styler.to_latex` introduced since 1.3.0, so before panda 1.3.0, should skip the check

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: apache@58375a8

### Why are the changes needed?
This test break the 3.2 branch pyspark test (with python 3.6 + pandas 1.1.x), so I think better add the `skipIf` it.

See also apache#38982 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Closes apache#39008 from Yikun/branch-3.2-style-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
…ic on executor start

### What changes were proposed in this pull request?

Backport apache#38901 to branch-3.2.

Fix the condition for judging Netty prefer direct memory on executor start, the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logical was added in SPARK-27991, the original intention is to avoid potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient, this PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.

Closes apache#38982 from pan3793/SPARK-41376-3.2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants