[SPARK-46135][PYTHON][DOCS] Fix table format error in ipynb docs #44049
  "outputs": [],
  "source": [
- "!$HOME/sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:$SPARK_VERSION"
+ "!$HOME/sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.13:$SPARK_VERSION"
By the way, I also changed this line because our Scala version is now 2.13; 2.12 is no longer supported.
- "output_type": "execute_result"
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
Let's present the output results in text format instead of text/html format to avoid formatting errors.
Hm, this example should show the output nicely as spark.sql.repl.eagerEval.enabled is enabled. Wonder if we can fix the docs instead.
@HyukjinKwon Is the following presentation style appropriate for this special case?
  "source": [
- "df.toPandas()"
+ "from tabulate import tabulate\n",
+ "print(tabulate(df.toPandas(), headers = 'keys', tablefmt = 'psql'))"
Hm, the output format looks fine but the whole point of using spark.sql.repl.eagerEval.enabled is to show a pretty table format without applying any other operations in the notebook.
Can you maybe just manually fix the output text/html to be compatible with both the sphinx dark theme and jupyter notebook?
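For readers unfamiliar with the setting under discussion, here is a minimal sketch of how eager evaluation is typically switched on for a notebook session. The helper name `enable_eager_eval` is hypothetical, and `spark.sql.repl.eagerEval.maxNumRows` is a related Spark setting that this thread does not mention; the helper only assumes an object exposing `conf.set(key, value)`, as a `SparkSession` does.

```python
EAGER_EVAL_KEY = "spark.sql.repl.eagerEval.enabled"
# Related setting, not mentioned in this thread: caps the number of rows rendered.
MAX_ROWS_KEY = "spark.sql.repl.eagerEval.maxNumRows"


def enable_eager_eval(spark, max_rows=20):
    """Enable eager evaluation on a SparkSession-like object (hypothetical helper).

    `spark` only needs a `conf.set(key, value)` method, which SparkSession provides.
    """
    spark.conf.set(EAGER_EVAL_KEY, "true")
    spark.conf.set(MAX_ROWS_KEY, str(max_rows))
    return spark


# Usage in a notebook (requires pyspark; shown as comments, not executed here):
#   from pyspark.sql import SparkSession
#   spark = enable_eager_eval(SparkSession.builder.getOrCreate())
#   df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
#   df   # as the last expression of a cell, the DataFrame is rendered as a table
```

With the setting enabled, a bare `df` at the end of a cell is rendered as a table, which is why wrapping the result in an extra `print(...)` call defeats the purpose of the example.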
Oh, I understand what you mean.
For this document, I have modified the style so that spark.sql.repl.eagerEval.enabled still serves its purpose: showing a pretty table format without applying any other operations in the notebook.
For python/docs/source/getting_started/quickstart_df.ipynb, are we going to do something similar, given that this example does not use spark.sql.repl.eagerEval.enabled?
Nice fix! +1 for #44049 (comment); otherwise it looks good to me.
  "outputs": [
  {
  "data": {
  "text/html": [
Actually can you also manually fix the HTML here instead of using print(psdf)?
If that's possible, it would really be great :-).
Okay, I'll give it a try.
  " }\n",
  "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
+ "<table border=\"1\" class=\"dataframe\" style=\"table-layout: auto;margin-right: auto;margin-left: 0;\">\n",
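The manual edit in this hunk could also be applied mechanically across a notebook. The following is a minimal sketch, not the procedure used in this PR (the thread does not say how the notebooks were edited). The function name `patch_notebook_tables` and the constants are hypothetical; the tag strings are taken from the hunk above, and the notebook is assumed to follow the usual `cells` / `outputs` / `data` JSON layout.

```python
import json

# Tag strings taken from the diff hunk above.
OLD_TAG = '<table border="1" class="dataframe">'
NEW_TAG = ('<table border="1" class="dataframe" '
           'style="table-layout: auto;margin-right: auto;margin-left: 0;">')


def patch_notebook_tables(notebook: dict) -> int:
    """Add the inline style to stored text/html tables; return how many were patched."""
    patched = 0
    for cell in notebook.get("cells", []):
        # Markdown cells have no "outputs"; stream outputs have no "data".
        for output in cell.get("outputs", []):
            html = output.get("data", {}).get("text/html")
            if html is None:
                continue
            # Notebook JSON may store multi-line text as a list of lines or one string.
            lines = html if isinstance(html, list) else [html]
            new_lines = []
            for line in lines:
                patched += line.count(OLD_TAG)
                new_lines.append(line.replace(OLD_TAG, NEW_TAG))
            output["data"]["text/html"] = (
                new_lines if isinstance(html, list) else new_lines[0]
            )
    return patched


# Small in-memory demonstration (no files are touched).
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/html": ["</style>\n", OLD_TAG + "\n"]},
                }
            ],
        },
    ]
}
print(patch_notebook_tables(demo))
print(json.dumps(demo["cells"][1]["outputs"][0]["data"]["text/html"], indent=1))
```

Running the function a second time on the same notebook patches nothing, because the styled tag no longer matches `OLD_TAG`. To apply it to a file, the notebook would be loaded with `json.load`, patched, and written back with `json.dump`.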
The example result here is incorrect; we need to correct it:
https://spark.apache.org/docs/latest/api/python/getting_started/quickstart_ps.html#Grouping
HyukjinKwon left a comment
LGTM thanks for fixing this @panbingkun !!!
Merged to master.
### What changes were proposed in this pull request?
1. After pr apache#44012, the output format of some 'ipynb' tables displayed in HTML format has been disrupted. The pr aims to fix table format error in ipynb docs.
   - Before: <img width="792" alt="image" src="https://github.com/apache/spark/assets/15246973/2095a2ac-f0b5-44bd-a3c2-ce742d041243">
   - After: <img width="739" alt="image" src="https://github.com/apache/spark/assets/15246973/ec0be72d-4dc0-44f4-ab75-d9668e32fc51">
2. Fix some minor errors.

### Why are the changes needed?
Fix bug.

### Does this PR introduce _any_ user-facing change?
Yes, only for docs.

### How was this patch tested?
Manually test.
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44049 from panbingkun/SPARK-46135.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>