[Model serialization] Exporting TabularLearner via learn.export() leads to huge file size #2945
Comments
There's a temporary solution here: https://walkwithfastai.com/tab.export#Exporting-our-TabularPandas
I've narrowed down the issue to the …
Should be fixed now.
Hi, I upgraded fastai to the latest version (2.1.5) and I'm still experiencing the same issue. I created a notebook on Colab to reproduce the problem: …
I've done the same (after upgrading to 2.1.5) and I also experienced the same issue: https://colab.research.google.com/drive/1zhSKeJCB5CvTiQKgYWubey9w1VzbNiG2?usp=sharing
@claudiobottari I've found the issue. It's after we fit the model …
Here's a minimal reproducer showing that there is a duplicate validation dataframe being added after fit: https://gist.github.com/muellerzr/df3fc4a12b021be85639afddab3c5d32 @jph00 we should reopen this issue
The problem is that …
@muellerzr can you provide a snippet on how to deserialize a bugged model and re-serialize it with the correct size?
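In the meantime, the general pattern for shrinking an already-exported model is: unpickle it, drop the stray reference to the large dataframe, and serialize again. The sketch below illustrates this with plain `pickle` on a stand-in class — `FakeLearner` and its `cached_valid_df` attribute are hypothetical names for illustration, not fastai's actual API:

```python
import pickle

class FakeLearner:
    """Stand-in for a learner whose dataloaders accidentally kept a
    reference to the validation dataframe after fit (hypothetical class,
    not fastai's real Learner)."""
    def __init__(self, rows):
        self.weights = [0.0] * 100   # small model state
        self.cached_valid_df = rows  # large accidental reference

def strip_and_repickle(learner):
    # Drop the heavyweight cached data before serializing again.
    learner.cached_valid_df = None
    return pickle.dumps(learner)

bugged = FakeLearner(rows=list(range(1_000_000)))
before = len(pickle.dumps(bugged))
after = len(strip_and_repickle(bugged))
print(before, after)  # the stripped pickle is orders of magnitude smaller
```

For a real fastai learner the same idea applies: find the attribute holding the duplicate dataframe and clear it before calling export again.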
Please confirm you have the latest versions of fastai, fastcore, fastscript, and nbdev prior to reporting a bug (delete one): YES
Describe the bug
Exporting TabularLearner via learn.export() leads to huge Pickle file size (>80MB).
To Reproduce
Steps to reproduce the behavior:
1. Create a TabularLearner
2. Call learn.export(filepath)
Expected behavior
The pickle file should be much smaller.
Error with full stack trace
N/A
Additional context
By creating different learners from DataFrames of varying size, I noticed that the size of the pickled file grows with the dataset dimension, even though after re-loading the serialized file
learn.dls
is empty, as expected.
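The scaling observed above can be reproduced generically: any object that keeps a private reference to its training data will serialize in proportion to that data, even if the attribute that users normally inspect looks empty. A stand-alone illustration (no fastai involved; `Holder` and its `dls` property are hypothetical names):

```python
import pickle

class Holder:
    """Object whose public accessor hides a private reference to raw data."""
    def __init__(self, data):
        self._data = data  # rides along in every pickle of this object

    @property
    def dls(self):
        return []  # looks empty, but _data is still serialized

# Pickle size grows with the hidden payload, not with what `dls` shows.
sizes = [len(pickle.dumps(Holder(list(range(n))))) for n in (10, 1_000, 100_000)]
print(sizes)
```

This is why checking `learn.dls` after reloading is not enough to rule out retained data: the duplicate dataframe can live on another attribute entirely.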