Jedi is slow for pandas completion for pd.read_csv dataframes #1696

Closed
hwalinga opened this issue Nov 15, 2020 · 5 comments

Comments

@hwalinga

This is very similar to #520. That one is solved, so I wonder if the same resolution can be applied to this case.

The case here applies to a dataframe not created by pd.DataFrame but by pd.read_csv (my typical use case).

As you can see, it is very slow here:

 %timeit jedi.Script("import pandas as pd; a = pd.read_csv('file'); a.v1.cat.").completions()
4.06 s ± 332 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

While on my machine the example from #520 works great:

%timeit jedi.Script("import pandas as pd; a = pd.DataFrame({'v1': ['a','b','c']}); a.v1.cat.").completions()
35 ms ± 4.12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
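For anyone who wants to reproduce the two measurements outside IPython, here is a minimal sketch. The helper names (`time_call`, `complete`, `main`) are made up for illustration, and it uses `Script.complete()`, the newer name for the `Script.completions()` call above (available since Jedi 0.16). Numbers will differ per machine and Jedi version.

```python
import statistics
import time


def time_call(func, repeat=7):
    """Call func() `repeat` times; return (mean, stdev) of wall time in seconds."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread


def complete(source):
    """Return Jedi's completions at the end of `source`."""
    import jedi  # imported lazily so time_call works without Jedi installed

    # With no line/column given, Jedi completes at the end of the source.
    return jedi.Script(source).complete()


# The two one-liners from this report, unchanged.
READ_CSV = "import pandas as pd; a = pd.read_csv('file'); a.v1.cat."
DATAFRAME = "import pandas as pd; a = pd.DataFrame({'v1': ['a','b','c']}); a.v1.cat."


def main():
    """Print mean and spread for both cases (needs jedi and pandas installed)."""
    for label, source in [("read_csv", READ_CSV), ("DataFrame", DATAFRAME)]:
        mean, spread = time_call(lambda: complete(source))
        print(f"{label}: {mean:.3f} s ± {spread:.3f} s")
```

Call `main()` in an environment that has both jedi and pandas installed. The file named in `READ_CSV` does not need to exist.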

Also, somewhat unrelated: I use deoplete for completions and have disabled jedi-vim's completions with let g:jedi#completions_enabled = 0, but these issues still slow Vim down completely for me. deoplete-vim is completely usable in these scenarios, but I use jedi-vim for renaming and goto, and this slowness makes it unusable.


jedi: 0.17.2
Python: 3.9.0
Linux (Debian 10)

@Shougo

Shougo commented Nov 15, 2020

I think you should upload the sample CSV file.

@hwalinga
Author

No, it does not matter whether the file exists, or whether it is big or small (I tested this). I don't think jedi actually executes the code; I think it just analyzes its structure.

@davidhalter
Owner

davidhalter commented Dec 5, 2020

I can confirm (although a bit faster):

1.89 s ± 111 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

However, this is a known issue and part of #1059, so it cannot simply be fixed. I have tried my best with pandas, but the library is so big that without good caching it's simply hopeless.

If somebody wants to work on this, I'm also willing to accept special-casing like if in_pandas:, but Jedi definitely cannot deal with this at the moment, nor will it in the future. Sorry.

@hwalinga
Author

hwalinga commented Dec 5, 2020

Given the similarities with #520, I hoped there would be some possibility, but that's understandable. Thanks for looking into this.

@jdtsmith

jdtsmith commented Dec 7, 2021

I found this as well with IPython 7.29: df.[Tab] takes 3-5 s. As in this case, the one and only very slow entry is df.T (the DataFrame transpose); all other completions take a few tens of ms in total.

I wonder if it is possible to simply hard-code skipping the type determination for the .T attribute?
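To check which entry dominates on a given setup, something like the following rough sketch could help. It assumes the cost shows up when each completion's `.type` property is read, which may not match what every front end does. `rank_by_cost` and `slowest_completions` are hypothetical helper names, not part of Jedi.

```python
import time


def rank_by_cost(items, probe):
    """Time probe(item) for each item; return (seconds, item) pairs, slowest first."""
    timings = []
    for item in items:
        start = time.perf_counter()
        probe(item)
        timings.append((time.perf_counter() - start, item))
    # Sort on the elapsed time only, so the items never need to be comparable.
    return sorted(timings, key=lambda pair: pair[0], reverse=True)


def slowest_completions(source, top=5):
    """Return (name, seconds) for the `top` completions with the slowest type lookup."""
    import jedi  # lazy import: rank_by_cost itself has no Jedi dependency

    completions = jedi.Script(source).complete()
    # Assumption: the expensive step is resolving each completion's type,
    # triggered here by reading the `.type` property.
    ranked = rank_by_cost(completions, lambda c: c.type)
    return [(c.name, seconds) for seconds, c in ranked[:top]]
```

For example, `slowest_completions("import pandas as pd; df = pd.read_csv('file'); df.")` would show whether `T` stands out from the rest.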
