suppress warning message of pandas_on_spark to_spark #1058
Conversation
The suppression is validated in our customized build of FLAML; it greatly reduces the redundant and unformatted information printed to the console.
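The general technique here can be sketched with the standard-library `warnings` module. This is a minimal illustration, not FLAML's actual code: `noisy_op` is a hypothetical stand-in for a call like `pandas_on_spark.to_spark()` that emits an advisory warning, and `quiet_op` suppresses warnings only for the duration of that call so global warning filters are left untouched.

```python
import warnings

# Hypothetical stand-in for pandas_on_spark.to_spark(), which emits an
# advisory warning on every call.
def noisy_op():
    warnings.warn("to_spark is expensive", UserWarning)
    return 42

# Suppress warnings only around the call; the global warning filters are
# restored automatically when the context manager exits.
def quiet_op():
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return noisy_op()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    quiet_op()
print(len(caught))  # no warnings escape the suppressed call
```

Scoping the filter with `warnings.catch_warnings()` avoids silencing unrelated warnings elsewhere in the user's program.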
The solution is OK. Just a reminder that these functions are on the critical path, and the import statement inside them can slow down performance.
Thanks for the reminder. Below is a simple test for the performance.

```python
def test_import():
    import warnings

    warnings.filterwarnings("ignore")
    y = 3 + 5
    x = y + 2
    return x, y


def test_no_import():
    y = 3 + 5
    x = y + 2
    return x, y


if __name__ == "__main__":
    import timeit

    num_calls = int(1e6)
    print(timeit.timeit("test_import()", setup="from __main__ import test_import", number=num_calls))
    print(timeit.timeit("test_no_import()", setup="from __main__ import test_no_import", number=num_calls))
```

Results on my local machine:
Looks like the overhead is acceptable.
For the non-Spark case, a 0.5 s overhead per call would be considered large, because the call happens many times in the inner loop, not just once.
The overhead is for 1e6 calls.
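For perspective, a total of 0.5 s over 1e6 calls amortizes to roughly 0.5 µs per call. A small variation on the benchmark (with hypothetical function names) isolates that per-call cost directly; after the first call, the `import warnings` statement only hits the cached module in `sys.modules`.

```python
import timeit

def with_import():
    # Repeated import of an already-loaded module is a cheap cache lookup.
    import warnings
    warnings.filterwarnings("ignore")
    return 3 + 5

def without_import():
    return 3 + 5

num_calls = 100_000  # smaller than the 1e6 used above, for a quick run
t_with = timeit.timeit(with_import, number=num_calls)
t_without = timeit.timeit(without_import, number=num_calls)
# Amortized extra cost of the cached import plus filterwarnings, per call
per_call_us = (t_with - t_without) / num_calls * 1e6
print(f"extra cost per call: {per_call_us:.3f} microseconds")
```

The per-call difference lands in the microsecond range, which matches the conclusion that the total overhead is acceptable even on a hot path.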
Why are these changes needed?
To suppress the warning message emitted by pandas_on_spark's to_spark, like the one below:
The warning message could appear many times during the AutoML/tuning process, which is annoying and unnecessary.
Related issue number
Checks