[SPARK-41412][CONNECT][FOLLOW-UP] Fix test_cast to pass with ANSI mode on #39034
Closed
Member
Author
cc @amaliujia @zhengruifeng FYI
zhengruifeng
approved these changes
Dec 12, 2022
Contributor
LGTM
Member
Author
Merged to master.
Contributor
late LGTM!
HyukjinKwon
added a commit
that referenced
this pull request
Dec 13, 2022
…ke the tests to pass with/without ANSI mode

### What changes were proposed in this pull request?

This PR is another follow-up to #39034 that, instead, makes the tests pass with or without ANSI mode.

### Why are the changes needed?

Spark Connect uses an isolated Spark session, so setting the configuration on the PySpark side does not take effect. Therefore, the test still fails; see https://github.com/apache/spark/actions/runs/3681383627/jobs/6228030132. We should make the tests pass with or without ANSI mode for now.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Manually tested via:

```bash
SPARK_ANSI_SQL_MODE=true ./python/run-tests --testnames 'pyspark.sql.tests.connect.test_connect_column'
```

Closes #39050 from HyukjinKwon/SPARK-41412.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
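As an illustration of what "pass with or without ANSI mode" can mean for a comparison test, here is a minimal sketch. It is not the code from the follow-up commit: `outcome`, `assert_same_outcome`, and `fake_cast_to_binary` are hypothetical names, and the fake cast only stands in for a call such as `df.select(df.id.cast(x)).toPandas()`. The idea is that the test accepts either outcome, as long as both sides agree.

```python
from typing import Any, Callable, Tuple


def outcome(fn: Callable[[], Any]) -> Tuple[str, Any]:
    """Run fn and report ("ok", value) on success or ("error", message) on failure."""
    try:
        return ("ok", fn())
    except Exception as exc:  # broad on purpose: the two clients may raise different types
        return ("error", str(exc))


def assert_same_outcome(connect_fn: Callable[[], Any], classic_fn: Callable[[], Any]) -> None:
    """Hypothetical helper: both sides must succeed with equal results, or both must fail."""
    left, right = outcome(connect_fn), outcome(classic_fn)
    assert left[0] == right[0], f"one side failed and the other did not: {left} vs {right}"
    if left[0] == "ok":
        assert left[1] == right[1], f"results differ: {left[1]!r} vs {right[1]!r}"


def fake_cast_to_binary(value: int, ansi_enabled: bool) -> bytes:
    """Stand-in for a Spark cast; an assumption for this sketch, not real Spark behaviour."""
    if ansi_enabled:
        raise ValueError('cannot cast "BIGINT" to "BINARY" with ANSI mode on')
    return value.to_bytes(8, "big")


# The same check holds whether the (simulated) ANSI mode is on or off.
for ansi in (True, False):
    assert_same_outcome(
        lambda: fake_cast_to_binary(1, ansi),
        lambda: fake_cast_to_binary(1, ansi),
    )
```

The helper deliberately compares only success versus failure, not exception types, because a Spark Connect client and a regular PySpark session would likely surface the same analysis error through different exception classes.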
beliefer
pushed a commit
to beliefer/spark
that referenced
this pull request
Dec 18, 2022
…e on

### What changes were proposed in this pull request?

This PR is a follow-up to apache#38970; it makes the test pass with ANSI mode on.

### Why are the changes needed?

To recover the build with ANSI mode on. It is currently broken as follows:

```
======================================================================
ERROR [2.651s]: test_cast (pyspark.sql.tests.connect.test_connect_column.SparkConnectTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_column.py", line 119, in test_cast
    df.select(df.id.cast(x)).toPandas(), df2.select(df2.id.cast(x)).toPandas()
  File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1466, in toPandas
    return self._session.client._to_pandas(query)
  File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
    return self._execute_and_fetch(req)
  File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
    for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
  File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
    return self._next()
  File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve "id" due to data type mismatch: cannot cast "BIGINT" to "BINARY" with ANSI mode on.
If you have to cast "BIGINT" to "BINARY", you can set "spark.sql.ansi.enabled" as 'false'.;
'Project [unresolvedalias(cast(id#31L as binary), None)]
+- SubqueryAlias spark_catalog.default.test_connect_basic_table_1
   +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet
"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {created_time:"2022-12-09T01:54:45.378316841+00:00", grpc_status:2, grpc_message:"[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve \"id\" due to data type mismatch: cannot cast \"BIGINT\" to \"BINARY\" with ANSI mode on.\nIf you have to cast \"BIGINT\" to \"BINARY\", you can set \"spark.sql.ansi.enabled\" as \'false\'.;\n\'Project [unresolvedalias(cast(id#31L as binary), None)]\n+- SubqueryAlias spark_catalog.default.test_connect_basic_table_1\n +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet\n"}"
>
```

https://github.com/apache/spark/actions/runs/3671813752

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

This PR fixes the unit test so that it passes. I tested it manually.

Closes apache#39034 from HyukjinKwon/SPARK-41412-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
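When triaging a failure like the one in this commit message, it can help to pull the error class out of the gRPC `details` string. The sketch below assumes the error class appears in square brackets and in upper case, as it does in the traceback quoted in this thread; `error_class` and `is_ansi_cast_failure` are hypothetical helper names, not part of PySpark or grpc.

```python
import re
from typing import Optional

# Matches a bracketed, dot-separated upper-case identifier such as
# [DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION]. The format is an assumption
# based on the error message quoted in this thread.
ERROR_CLASS_RE = re.compile(r"\[([A-Z][A-Z0-9_]*(?:\.[A-Z][A-Z0-9_]*)*)\]")


def error_class(details: str) -> Optional[str]:
    """Return the first bracketed error class in the message, or None if there is none."""
    match = ERROR_CLASS_RE.search(details)
    return match.group(1) if match else None


def is_ansi_cast_failure(details: str) -> bool:
    """True when the message carries the cast error class seen in this PR's traceback."""
    return error_class(details) == "DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION"


details = (
    '[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve "id" due to data type '
    'mismatch: cannot cast "BIGINT" to "BINARY" with ANSI mode on.'
)
print(error_class(details))
```

The lower-case bracketed plan fragment in the same message, `[unresolvedalias(...)]`, does not match the pattern, so the plan text is not mistaken for an error class.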
What changes were proposed in this pull request?
This PR is a follow-up to #38970; it makes the test pass with ANSI mode on.
Why are the changes needed?
To recover the build with ANSI mode on. It is currently broken, as the following run shows:
https://github.com/apache/spark/actions/runs/3671813752
Does this PR introduce any user-facing change?
No, test-only.
How was this patch tested?
This PR fixes the unit test so that it passes. I tested it manually.