Jedi is slow for pandas completion for pd.read_csv dataframes #1696

hwalinga · 2020-11-15T21:57:46Z

This is very similar to #520. That one is solved, so I wonder if the same resolution can be applied to this case.

The case here applies to a dataframe not created by pd.DataFrame but by pd.read_csv (my typical use case).

As you can see, it is very slow here:

%timeit jedi.Script("import pandas as pd; a = pd.read_csv('file'); a.v1.cat.").completions()
4.06 s ± 332 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

While on my machine the example from #520 works great:

%timeit jedi.Script("import pandas as pd; a = pd.DataFrame({'v1': ['a','b','c']}); a.v1.cat.").completions()
35 ms ± 4.12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
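For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using only `timeit` and jedi. The helper names (`best_time`, `complete`, `compare`) are made up for this example. It reports the fastest of a few single runs rather than the mean that `%timeit` prints, so the numbers will not match the ones above exactly, and Jedi's own caching can make later runs faster than the first.

```python
import timeit

# The two snippets from this report; the file name 'file' does not need to exist.
SNIPPETS = {
    "read_csv": "import pandas as pd; a = pd.read_csv('file'); a.v1.cat.",
    "DataFrame": "import pandas as pd; a = pd.DataFrame({'v1': ['a','b','c']}); a.v1.cat.",
}


def best_time(func, repeat=3):
    """Return the fastest wall-clock time (seconds) over `repeat` single runs."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


def complete(source):
    """Ask jedi for completions at the end of `source`."""
    import jedi  # imported lazily so the timing helper works without jedi

    script = jedi.Script(source)
    # Newer jedi releases spell this complete(); older ones only have completions().
    if hasattr(script, "complete"):
        return script.complete()
    return script.completions()


def compare(repeat=3):
    """Time both snippets and return {snippet name: seconds}."""
    return {
        name: best_time(lambda src=src: complete(src), repeat=repeat)
        for name, src in SNIPPETS.items()
    }
```

Calling `compare()` in an environment with jedi and pandas installed should show the same gap between the two snippets, assuming the slowdown is still present in the installed versions.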

Also, somewhat unrelated: I use deoplete for completions and have disabled jedi-vim's completions with let g:jedi#completions_enabled = 0, but this issue still slows vim down completely for me. deoplete-vim stays completely usable in these scenarios, but I rely on jedi-vim for renaming and goto, and because of this slowdown those become unusable.

jedi: 0.17.2
Python: 3.9.0
Linux (Debian 10)


Shougo · 2020-11-15T22:52:19Z

I think you should upload the sample CSV file.

hwalinga · 2020-11-16T14:57:29Z

No, it does not matter whether the file exists, is big, or is small (I tested it). I don't think jedi actually executes the code; I think it just analyzes its structure.
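To illustrate the general idea of analyzing structure without running anything, here is a small sketch using the standard library's `ast` module. This is not how Jedi is implemented (Jedi has its own error-tolerant parser and inference engine); it only shows that source text can be inspected without `pd.read_csv` ever being called or the file being opened. The helper names are invented for the example, and the trailing dot is dropped from the snippet because `ast.parse` rejects incomplete code.

```python
import ast

# Same statement as in the report, minus the trailing "." that ast cannot parse.
SOURCE = "import pandas as pd; a = pd.read_csv('file'); a.v1.cat"


def dotted_name(node):
    """Turn an attribute chain such as pd.read_csv into the string 'pd.read_csv'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))


def called_names(source):
    """List the dotted names of all calls in `source` without executing it."""
    tree = ast.parse(source)
    return [
        dotted_name(node.func)
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
    ]


print(called_names(SOURCE))  # ['pd.read_csv']
```

Nothing here imports pandas or touches the filesystem, which is consistent with the file's existence and size making no difference to the timings.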

davidhalter · 2020-12-05T16:32:11Z

I can confirm (although a bit faster):

1.89 s ± 111 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

However, this is a known issue and part of #1059; it cannot simply be fixed. I have tried my best with pandas, but the library is so big that without good caching it's simply hopeless.

If somebody wants to work on this, I'm also willing to accept stuff like if in_pandas:, but Jedi definitely cannot deal with this at the moment, nor in the future. Sorry.

hwalinga · 2020-12-05T18:35:51Z

Given the similarities with #520, I hoped there would be some possibility, but that's understandable. Thanks for looking into this.

jdtsmith · 2021-12-07T21:24:58Z

I found this as well, with IPython 7.29: df.[Tab] takes 3-5 s. As in this case, the one and only very slow entry is df.T (the dataframe transpose); all other completions take a few tens of ms in total.

I wonder if it is possible to simply hard-code skipping the type determination for the .T attribute?
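A rough way to check which entry is slow, and to try the skip idea on the client side, is sketched below. It assumes that `jedi.Interpreter(...).complete()` is roughly what the shell calls and that reading each completion's `type` is what triggers the expensive inference; both are assumptions, not something confirmed in this thread. `time_each`, `slowest_completions`, `safe_type` and the `SKIP_NAMES` set are hypothetical names for this example, not part of Jedi or IPython.

```python
import time

# Hypothetical skip list: attribute names whose type lookup is bypassed.
SKIP_NAMES = frozenset({"T"})


def time_each(items, probe):
    """Time probe(item) for each item; return (name, seconds), slowest first."""
    timings = []
    for item in items:
        start = time.perf_counter()
        probe(item)
        elapsed = time.perf_counter() - start
        timings.append((getattr(item, "name", str(item)), elapsed))
    return sorted(timings, key=lambda pair: pair[1], reverse=True)


def safe_type(completion, skip=SKIP_NAMES, placeholder="unknown"):
    """Return completion.type, or a placeholder for names in the skip list."""
    if completion.name in skip:
        return placeholder  # never touches .type, so no inference is triggered
    return completion.type


def slowest_completions(source, namespace, top=5):
    """Time the type lookup of every completion jedi offers for `source`."""
    import jedi  # imported lazily so the helpers above work without jedi

    completions = jedi.Interpreter(source, [namespace]).complete()
    return time_each(completions, lambda c: c.type)[:top]
```

With a live dataframe `df`, `slowest_completions("df.", {"df": df})` should put `T` at the top if it really is the only slow entry. `safe_type` only shows what a hard-coded skip could look like in the calling code; a fix inside Jedi itself would look different.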

davidhalter closed this as completed Dec 5, 2020

hwalinga mentioned this issue Dec 5, 2020

Slow autocompletion in python/ipython console for large DataFrame containing strings pandas-dev/pandas#37947

Closed
