[SPARK-39456][DOCS][PYTHON] Fix broken function links in the auto-generated pandas API support list documentation #36895
@HyukjinKwon I modified the generator so that it automatically produces documentation only for functions that are newly declared or overridden in each class itself. Most of the broken links have been fixed. For Cases A, B, and C, it seems that we can add documents in
from inspect import getmembers, isfunction

class Parent:
    x = 0

    def __init__(self):
        pass

    def parent_function(self):
        print("parent")

class Child(Parent):
    x = 1

    def __init__(self):
        pass

    def child_function(self):
        print("child")

p = Parent()
c = Child()

# Get all child + parent functions
[i for i in dict([m for m in getmembers(Child, isfunction)])]
# Get only the names defined on Child itself
# (this also includes non-function attributes such as x and __module__)
[i for i in Child.__dict__]

Here is a simple test to help with the review. This change excludes functions that are defined only in the parent class, so it LGTM.
@beobest2 It may also help the review if you could print the diff of the list before and after this PR. :)
The difference was calculated and written as follows. Please refer to it when reviewing.
--- DIFF ---
pyspark.pandas.CategoricalIndex.all
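A before/after diff of this kind could be produced along the following lines. This is only a sketch under stated assumptions: the toy `Index` and `CategoricalIndex` classes and the `listed_names` helper are hypothetical stand-ins for illustration, not the script that was actually used for this PR.

```python
from inspect import getmembers, isfunction


class Index:
    """Toy parent class (hypothetical, not pyspark.pandas.Index)."""

    def all(self):
        return True


class CategoricalIndex(Index):
    """Toy subclass that inherits all() and declares one method of its own."""

    def remove_categories(self, removals):
        return self


def listed_names(cls, include_inherited):
    """Build qualified names such as 'CategoricalIndex.all' for public functions."""
    if include_inherited:
        # Behaviour before the change: inherited functions are collected too.
        names = [name for name, _ in getmembers(cls, isfunction)]
    else:
        # Behaviour after the change: only functions from the class body.
        names = [name for name, value in vars(cls).items() if isfunction(value)]
    return {f"{cls.__name__}.{name}" for name in names if not name.startswith("_")}


before = listed_names(CategoricalIndex, include_inherited=True)
after = listed_names(CategoricalIndex, include_inherited=False)

print("--- DIFF ---")
for name in sorted(before - after):
    print(name)
```

In this toy setup the only entry that drops out of the list is the inherited `CategoricalIndex.all`, which mirrors the kind of entry shown in the diff.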
Can one of the admins verify this patch?
HyukjinKwon
left a comment
Merged to master.
What changes were proposed in this pull request?
In the auto-generated documentation for the pandas API support list, some of the function and property links are broken and need to be corrected.
The current 'supported API generation' function dynamically compares the modules of pyspark.pandas and pandas to find the difference.

At this point, members inherited from parent classes are also collected, and their links are not generated correctly because they do not match the pattern of each API document (for example, CategoricalIndex.all() is used internally by inheriting Index.all()).

So I modified the generation to exclude methods that exist in the parent class.
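The exclusion described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `own_public_members` and `all_public_members` and the toy `Index` and `CategoricalIndex` classes are assumptions made for the example, and the real generator may filter members differently.

```python
import inspect


class Index:
    """Toy stand-in for a parent class (hypothetical, not pyspark.pandas.Index)."""

    def all(self):
        return True

    @property
    def size(self):
        return 0


class CategoricalIndex(Index):
    """Toy stand-in for a subclass that adds one method and one property."""

    def add_categories(self, new_categories):
        return self

    @property
    def codes(self):
        return []


def _is_documented_kind(value):
    """Treat plain functions and properties as documentable members."""
    return inspect.isfunction(value) or isinstance(value, property)


def own_public_members(cls):
    """Public functions/properties declared or overridden in cls itself.

    vars(cls) only holds attributes from the class body, so members
    inherited from parent classes are left out.
    """
    return sorted(
        name
        for name, value in vars(cls).items()
        if not name.startswith("_") and _is_documented_kind(value)
    )


def all_public_members(cls):
    """Public functions/properties of cls, including inherited ones."""
    return sorted(
        name
        for name, value in inspect.getmembers(cls)
        if not name.startswith("_") and _is_documented_kind(value)
    )


print(all_public_members(CategoricalIndex))
print(own_public_members(CategoricalIndex))
```

With the inherited `all` and `size` filtered out, each listed name belongs to the class whose own API page would document it, which is the property the link pattern relies on.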
Why are the changes needed?
To link to the correct API document.
Does this PR introduce any user-facing change?
Yes, the "Supported pandas APIs" page has changed as below.

How was this patch tested?
Manually checked the links in the generated documentation; the existing doc build should also pass.