
Commit d160af3

zhengruifeng authored and dongjoon-hyun committed
[MINOR][PYTHON][DOCS] Fix the doctest of pivot
### What changes were proposed in this pull request?

### Why are the changes needed?

Fix the doctest of `pivot`, to make sure the example works.

### Does this PR introduce _any_ user-facing change?

Doc-only change.

### How was this patch tested?

Enabled doc-test.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52814 from zhengruifeng/py_test_pivot.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 2063c36)
Signed-off-by: Dongjoon Hyun <[email protected]>
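The doctest's `groupBy("year").pivot("course").sum("earnings")` pattern can be sketched in plain Python to show what the fixed example asserts (a minimal sketch with dictionaries; `group_pivot_sum` is a hypothetical helper for illustration, not a PySpark API):

```python
from collections import defaultdict

# Same rows as df1 in the doctest.
rows = [
    {"course": "dotNET", "year": 2012, "earnings": 10000},
    {"course": "Java", "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java", "year": 2013, "earnings": 30000},
]

def group_pivot_sum(rows, group_key, pivot_key, value_key):
    """Group rows by group_key, create one column per distinct
    pivot_key value, and sum value_key within each cell."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[group_key]][r[pivot_key]] += r[value_key]
    return {group: dict(cols) for group, cols in table.items()}

result = group_pivot_sum(rows, "year", "course", "earnings")
# result == {2012: {"dotNET": 15000, "Java": 20000},
#            2013: {"dotNET": 48000, "Java": 30000}}
```

This mirrors the expected table in the doctest: one output row per year, one column per course, earnings summed per cell.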
1 parent b5bc199 commit d160af3

File tree

1 file changed

+19
-16
lines changed


python/pyspark/sql/group.py

Lines changed: 19 additions & 16 deletions
@@ -456,7 +456,7 @@ def pivot(self, pivot_col: str, values: Optional[List["LiteralType"]] = None) ->
 
         Examples
         --------
-        >>> from pyspark.sql import Row
+        >>> from pyspark.sql import Row, functions as sf
         >>> df1 = spark.createDataFrame([
         ...     Row(course="dotNET", year=2012, earnings=10000),
         ...     Row(course="Java", year=2012, earnings=20000),
@@ -474,28 +474,30 @@ def pivot(self, pivot_col: str, values: Optional[List["LiteralType"]] = None) ->
         |dotNET|2013|   48000|
         |  Java|2013|   30000|
         +------+----+--------+
+
         >>> df2 = spark.createDataFrame([
         ...     Row(training="expert", sales=Row(course="dotNET", year=2012, earnings=10000)),
         ...     Row(training="junior", sales=Row(course="Java", year=2012, earnings=20000)),
         ...     Row(training="expert", sales=Row(course="dotNET", year=2012, earnings=5000)),
         ...     Row(training="junior", sales=Row(course="dotNET", year=2013, earnings=48000)),
         ...     Row(training="expert", sales=Row(course="Java", year=2013, earnings=30000)),
-        ... ])  # doctest: +SKIP
-        >>> df2.show()  # doctest: +SKIP
-        +--------+--------------------+
-        |training|               sales|
-        +--------+--------------------+
-        |  expert|{dotNET, 2012, 10...|
-        |  junior| {Java, 2012, 20000}|
-        |  expert|{dotNET, 2012, 5000}|
-        |  junior|{dotNET, 2013, 48...|
-        |  expert| {Java, 2013, 30000}|
-        +--------+--------------------+
+        ... ])
+        >>> df2.show(truncate=False)
+        +--------+---------------------+
+        |training|sales                |
+        +--------+---------------------+
+        |expert  |{dotNET, 2012, 10000}|
+        |junior  |{Java, 2012, 20000}  |
+        |expert  |{dotNET, 2012, 5000} |
+        |junior  |{dotNET, 2013, 48000}|
+        |expert  |{Java, 2013, 30000}  |
+        +--------+---------------------+
 
         Compute the sum of earnings for each year by course with each course as a separate column
 
         >>> df1.groupBy("year").pivot(
-        ...     "course", ["dotNET", "Java"]).sum("earnings").sort("year").show()
+        ...     "course", ["dotNET", "Java"]
+        ... ).sum("earnings").sort("year").show()
         +----+------+-----+
         |year|dotNET| Java|
         +----+------+-----+
@@ -512,9 +514,10 @@ def pivot(self, pivot_col: str, values: Optional[List["LiteralType"]] = None) ->
         |2012|20000| 15000|
         |2013|30000| 48000|
         +----+-----+------+
-        >>> df2.groupBy(
-        ...     "sales.year").pivot("sales.course").sum("sales.earnings").sort("year").show()
-        ... # doctest: +SKIP
+
+        >>> df2.groupBy("sales.year").pivot(
+        ...     "sales.course"
+        ... ).agg(sf.sum("sales.earnings")).sort("year").show()
         +----+-----+------+
         |year| Java|dotNET|
         +----+-----+------+
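The commit's core change is removing `# doctest: +SKIP` directives so the examples actually execute during testing. The effect of `+SKIP` can be demonstrated with the standard-library `doctest` module (a minimal sketch; `demo` is a hypothetical function, and the name `spark` is deliberately left undefined):

```python
import doctest

def demo():
    """
    >>> 1 + 1
    2
    >>> spark.stop()  # doctest: +SKIP
    """

# A skipped example is parsed but never executed, so even the
# undefined name `spark` above cannot fail the run.
results = doctest.testmod()
# results.failed == 0
```

This is why a `+SKIP`-marked example can silently rot: it is never run, so errors in it go unnoticed until the directive is removed, as this commit does.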

Comments (0)