Spyder does not free up memory after closing windows with datasets in the Variable explorer #3513
Comments
Thanks for reporting. We'll take a look at it for Spyder 3.1.
I tried it out, and it looks to me that the memory is released, eventually. It does, however, take a while before the freed memory shows up in the memory usage displayed in Spyder. Judging by how long the window takes to open, it looks like all the data is copied when you open the dataframe editor. @ccordoba12, should it be? From reading the code, I get the impression that only a small amount of data is supposed to be copied, not the whole dataframe. I'm finding it really hard to understand the communication between the Spyder process and the IPython process, so an overview would be helpful.

It is not necessary to use the specific data set mentioned by the original poster. Instead, you can use, for instance:
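(A minimal sketch; the exact shape below is an assumption, chosen so the float64 values total roughly 1 GB.)

import numpy as np
import pandas as pd

# 12,500,000 rows x 10 columns of float64: 125 million values x 8 bytes each ≈ 1 GB
data = pd.DataFrame(np.random.randn(12500000, 10))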
This creates a data set occupying roughly 1 GB.
That's correct, data is serialized in the kernel and sent to Spyder so we can show it with our different editors.
Nope, what we do is show data in chunks in the dataframe editor, but at any moment we have access to the full dataframe. I don't know how we could do it otherwise :-)
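A sketch of that chunking idea (not Spyder's actual code; the function name is hypothetical): only the slice currently on screen is handed to the view, while the editor keeps a reference to the whole dataframe.

import pandas as pd

def visible_chunk(df, first_row, n_rows=100):
    # Hand the view only a small slice to render; `df` itself stays fully in memory.
    return df.iloc[first_row:first_row + n_rows]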
Ok, so these are the steps we follow to show a value in one of our editors:
I know this is very complex, but we have to do all this because the kernel runs in an external process (which can be local or remote, i.e. on a different server :-)
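A minimal sketch of the round trip described above, with plain pickle standing in for whatever channel Spyder actually uses: the kernel serializes the value, and the frontend deserializes it into an independent copy that stays alive until it is released.

import pickle
import pandas as pd

# Kernel side: serialize the full object into bytes for transport.
def serialize_value(value):
    return pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)

# Spyder side: deserialize, producing a full, independent copy in this process.
def deserialize_value(payload):
    return pickle.loads(payload)

payload = serialize_value(pd.DataFrame({'a': range(5)}))
frontend_copy = deserialize_value(payload)  # this copy is what the editor displays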
Thanks very much for taking the time to write that all down; I'm sure that will be helpful.
I think it was the following comment which gave me the wrong idea. This is, however, only done in the remote case, which means 'on a different server' in this context.
I hope this is something that gets resolved ASAP. This consistently slows down my entire machine and causes the IDE to crash regularly (every 10 minutes or so) when I'm working with large datasets. Is the only solution right now to restart my kernel every time I look at a couple of dataframes?

Version and main components
Dependencies
We need to garbage-collect values we grab from the kernel after users close our viewers. I'll try to do that for 3.0.2 :-)
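A sketch of that fix, assuming a Qt dialog like Spyder's editors (the class and attribute names here are hypothetical): drop the reference to the value when the viewer closes and trigger a collection right away.

import gc
from qtpy.QtWidgets import QDialog

class ValueViewer(QDialog):  # hypothetical stand-in for one of Spyder's editors
    def set_value(self, value):
        self._value = value  # the copy deserialized from the kernel

    def closeEvent(self, event):
        # Drop our reference and reclaim the memory as soon as the viewer closes.
        self._value = None
        gc.collect()
        QDialog.closeEvent(self, event)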
Description
What steps will reproduce the problem?
0. Optionally, display memory usage (Preferences --> General --> Advanced Settings --> check the box named 'Show memory usage').
1. Read a large CSV file into memory using the following code:
import pandas as pd
raw = pd.read_csv('transactions.csv')
2. Once you have read it into your RAM, take a look at it by double-clicking it in the Variable Explorer. You will see a window that displays your data set (a quick way to check its size in memory is sketched after these steps).
3. Close this window.
4. Repeat step 2 until your RAM is full.
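To see roughly how much memory each viewer copy costs, pandas can report the dataframe's own footprint; a quick check, assuming the `raw` dataframe from step 1:

# Total in-memory size of the dataframe, in megabytes
print(raw.memory_usage(deep=True).sum() / 1e6)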
What is the expected output? What do you see instead?
Expected: the memory used by the viewer is released once the window is closed. Instead, memory usage grows with every window opened in step 2 and is never given back, until the RAM fills up.
Version and main components
Dependencies