Long import time #6726
Thanks for the report. I think one reason is that we import all the io libraries non-lazily (I think since the backend refactor). And many of the dependencies still use pkg_resources instead of importlib.metadata (pkg_resources is considerably slower to import). We'd need to take a look at the lazy loader.
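As a hedged illustration of the swap mentioned above (distribution name is a placeholder): importlib.metadata, in the stdlib since Python 3.8, covers the common pkg_resources use cases at a fraction of the import cost.

```python
# importlib.metadata replaces the pkg_resources APIs for querying
# installed distributions, without pkg_resources' heavy import-time work.
import importlib.metadata as md

# Enumerate installed distributions (pkg_resources.working_set equivalent).
names = [dist.metadata["Name"] for dist in md.distributions()]

# Query a version (pkg_resources.get_distribution(...).version equivalent);
# raises PackageNotFoundError if the distribution is not installed.
try:
    v = md.version("some-distribution")  # placeholder name
except md.PackageNotFoundError:
    v = None
```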
Useful for debugging:
I just had another look at this using `python -X importtime -c "import llvmlite" 2> import.log` and tuna for the visualization.
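For reference, a sketch of that profiling workflow (using the stdlib `json` module as a stand-in for the package being profiled):

```shell
# -X importtime writes a per-module import timing tree to stderr
python -X importtime -c "import json" 2> import.log

# Columns are "self us | cumulative us | module"; show the slowest imports
sort -t '|' -k 2 -rn import.log | head -n 10

# Optional: visualize the log interactively (pip install tuna)
# tuna import.log
```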
This should bring it down by another 0.25 s, but I agree it would be nice to have it even lower.
Some other projects are considering lazy imports as well: https://scientific-python.org/specs/spec-0001/
I think we could rework our backend solution to do the imports lazily:
I just checked: many backends import their external dependencies at module level in a try-except block. However, many backends also check for ImportError (not just ModuleNotFoundError), which occurs when a library is not correctly installed. I am not sure if in that case the backend should simply be disabled like it is now (at least cfgrib raises a warning instead). Not sure how much it actually saves, but it should be ~0.2 s (at least on my machine; it depends on the number of installed backends, and the fewer are installed, the faster the import should be).
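A minimal sketch of what deferring the dependency import could look like (the class, method, and dependency names are hypothetical, not xarray's actual backend API):

```python
import importlib
import importlib.util


class LazyBackend:
    """Hypothetical backend that defers its optional dependency import."""

    dependency = "netCDF4"  # illustrative optional dependency name

    @classmethod
    def installed(cls):
        # Cheap availability check: consults the import finders
        # without actually importing (and paying for) the module.
        return importlib.util.find_spec(cls.dependency) is not None

    def open_dataset(self, path):
        try:
            # Deferred import: the cost (and any breakage) only hits
            # when the backend is actually used, not at xarray import.
            lib = importlib.import_module(self.dependency)
        except ImportError as err:
            # Catch plain ImportError, not just ModuleNotFoundError,
            # so broken installs still produce an actionable message.
            raise ImportError(
                f"{self.dependency} is required to open {path!r}"
            ) from err
        return lib  # a real backend would return an opened dataset here
```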
Nice. Does it work on Python 3.8?
Sounds OK to error when trying to use the backend.
According to the docs, it has existed since 3.4.
In developing #7172, I found there are also some places where class types are used to check for features: Dask and sparse are big contributors due to the need to resolve the class name in question. Ultimately, I think it is important to constrain the problem. Are we OK with 100 ms over numpy + pandas? 20 ms? On my machines, the ~0.5 s that xarray is close to seems long... but every time I look at it, it seems to "just be a Python problem".
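One hedged way to put numbers on those budgets is to time a fresh interpreter per import, which avoids `sys.modules` caching (module names below are stand-ins; substitute numpy, pandas, or xarray):

```python
import subprocess
import sys
import time


def import_time(module):
    """Wall-clock seconds for a fresh interpreter to import `module`.

    Note: this includes interpreter startup; subtract the time for an
    empty `-c "pass"` run if you want the import cost alone.
    """
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


for name in ["json", "decimal"]:  # stand-in modules for the demo
    print(f"{name}: {import_time(name):.3f} s")
```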
What is your issue?
Importing the xarray package takes a significant amount of time. For instance:
compared to others
I am obviously not surprised that importing xarray takes longer than importing pandas, NumPy, or the datetime module, but 1.5 s is something you clearly notice when it is done, e.g., by a command-line application.
I inquired about import performance and found out about a lazy module loader proposal by the Scientific Python community. AFAIK SciPy uses a similar system to populate its namespaces without import time penalty. Would it be possible for xarray to use delayed imports when relevant?