[SPARK-41376][CORE][3.2] Correct the Netty preferDirectBufs check logic on executor start #38982
… executor start

### What changes were proposed in this pull request?

Fix the condition for judging whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.

Closes apache#38901 from pan3793/SPARK-41376.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Re-triggered.

@dongjoon-hyun the pyspark and lint CI fail consistently, and I see they also failed on previous commits. Sorry, I'm not familiar with Python. cc @zhengruifeng @Yikun, would you please take a look at what happened on branch-3.2? Thanks.

For the pyspark failure, let's see whether Yikun@3840beb works: https://github.com/Yikun/spark/actions/runs/3655117402/jobs/6176152837 If the test passes, I can submit the PR upstream.
### What changes were proposed in this pull request?

According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html, `pandas.io.formats.style.Styler.to_latex` was introduced in pandas 1.3.0, so the check should be skipped on earlier pandas versions.

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?

This test breaks the 3.2 branch pyspark tests (with Python 3.6 + pandas 1.1.x), so it is better to add a `skipIf` to it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- CI passed
- Test on 3.2 branch: Yikun#194, https://github.com/Yikun/spark/actions/runs/3655564439/jobs/6177030747

Closes #39002 from Yikun/skip-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
### What changes were proposed in this pull request?

According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html, `pandas.io.formats.style.Styler.to_latex` was introduced in pandas 1.3.0, so the check should be skipped on earlier pandas versions.

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?

This test breaks the 3.2 branch pyspark tests (with Python 3.6 + pandas 1.1.x), so it is better to add a `skipIf` to it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- CI passed

Closes #39007 from Yikun/branch-3.3-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
### What changes were proposed in this pull request?

According to https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.to_latex.html, `pandas.io.formats.style.Styler.to_latex` was introduced in pandas 1.3.0, so the check should be skipped on earlier pandas versions.

```
ERROR [0.180s]: test_style (pyspark.pandas.tests.test_dataframe.DataFrameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5795, in test_style
    check_style()
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_dataframe.py", line 5793, in check_style
    self.assert_eq(pdf_style.to_latex(), psdf_style.to_latex())
AttributeError: 'Styler' object has no attribute 'to_latex'
```

Related: 58375a8

### Why are the changes needed?

This test breaks the 3.2 branch pyspark tests (with Python 3.6 + pandas 1.1.x), so it is better to add a `skipIf` to it.

See also #38982 (comment)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

CI passed

Closes #39008 from Yikun/branch-3.2-style-check.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
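The version-gated skip described in these commit messages can be sketched roughly as follows. This is not the actual patch: `version_tuple`, `PANDAS_VERSION`, and `StyleTest` are hypothetical names, and the version string is hard-coded as a stand-in for `pandas.__version__` so the sketch runs without pandas installed.

```python
import unittest


def version_tuple(version):
    """Parse the leading numeric parts of a version string, e.g. '1.1.5' -> (1, 1, 5)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption: stand-in for pandas.__version__ on the failing CI image (pandas 1.1.x).
PANDAS_VERSION = "1.1.5"


class StyleTest(unittest.TestCase):
    @unittest.skipIf(
        version_tuple(PANDAS_VERSION) < (1, 3, 0),
        "Styler.to_latex was introduced in pandas 1.3.0",
    )
    def test_to_latex(self):
        # A real test would compare the to_latex() output here; on an old
        # pandas the decorator skips the test, so this body never runs.
        self.fail("not reached when the installed pandas is older than 1.3.0")


def run_suite():
    """Run StyleTest and return the unittest.TestResult for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StyleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the stand-in version above, `run_suite()` reports the test as skipped rather than failed, which is the behavior the commit messages ask for on pandas 1.1.x.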
…ic on executor start

### What changes were proposed in this pull request?

Backport #38901 to branch-3.2.

Fix the condition for judging whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.

Closes #38982 from pan3793/SPARK-41376-3.2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

### What changes were proposed in this pull request?

Backport #38901 to branch-3.2.

Fix the condition for judging whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?

The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual testing.
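As an illustration only, the idea that the startup check should use the same condition as the client factory can be modeled with one shared predicate. This is a language-neutral sketch in Python, not Spark or Netty code: every name below (`TransportSettings`, `prefers_direct_bufs`, `create_allocator`, `check_executor_start`) and the specific inputs to the condition are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportSettings:
    """Hypothetical stand-in for the transport configuration."""
    allow_direct_bufs: bool        # assumption: configuration permits direct buffers
    platform_prefers_direct: bool  # assumption: runtime reports direct buffers as preferred


def prefers_direct_bufs(settings):
    """Single predicate shared by the client factory and the startup check."""
    return settings.allow_direct_bufs and settings.platform_prefers_direct


def create_allocator(settings):
    """Hypothetical client-factory side: choose the buffer kind."""
    return "direct" if prefers_direct_bufs(settings) else "heap"


def check_executor_start(settings, max_direct_memory, largest_in_memory_fetch):
    """Hypothetical startup check: validate direct memory only when it will be used.

    The size comparison is an assumed example of an OOM guard, not the
    condition from the actual patch.
    """
    if prefers_direct_bufs(settings) and max_direct_memory < largest_in_memory_fetch:
        raise ValueError("direct memory limit is smaller than the largest in-memory fetch")
```

Because both functions call `prefers_direct_bufs`, the startup check cannot reject a configuration in which the allocator would fall back to heap buffers anyway, and it cannot miss one in which direct buffers are actually used.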