Commit 00cb2f9
committed
[SPARK-28881][PYTHON][TESTS] Add a test to make sure toPandas with Arrow optimization throws an exception per maxResultSize
### What changes were proposed in this pull request?
This PR proposes to add a test case for:
```bash
./bin/pyspark --conf spark.driver.maxResultSize=1m
spark.conf.set("spark.sql.execution.arrow.enabled",True)
```
```python
spark.range(10000000).toPandas()
```
```
Empty DataFrame
Columns: [id]
Index: []
```
which can result in partial results (see #25593 (comment)). This regression was found between Spark 2.3 and Spark 2.4, and accidentally fixed.
### Why are the changes needed?
To prevent the same regression in the future.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Test was added.
Closes #25594 from HyukjinKwon/SPARK-28881.
Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>1 parent ab1819d commit 00cb2f9
1 file changed
+30
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
25 | | - | |
| 25 | + | |
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| |||
421 | 421 | | |
422 | 422 | | |
423 | 423 | | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
| 427 | + | |
| 428 | + | |
| 429 | + | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
| 435 | + | |
| 436 | + | |
| 437 | + | |
| 438 | + | |
| 439 | + | |
| 440 | + | |
| 441 | + | |
| 442 | + | |
| 443 | + | |
| 444 | + | |
| 445 | + | |
| 446 | + | |
| 447 | + | |
| 448 | + | |
| 449 | + | |
| 450 | + | |
| 451 | + | |
| 452 | + | |
424 | 453 | | |
425 | 454 | | |
426 | 455 | | |
| |||
0 commit comments