Possible memory leak using python 2.7, zodb 5 and cacheMinimize #203
I can reproduce the memory growth exactly as described with ZODB 5.4.0 and CPython 2.7.15 on macOS. Adding a `gc.collect()` call to the loop keeps the memory stable.
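One stdlib way to watch for the growth in-process (a sketch, not from the thread; the later comments use psutil for finer-grained numbers):

```python
# Sketch: track the process's RSS high-water mark with the stdlib
# resource module.  ru_maxrss is kilobytes on Linux, bytes on macOS.
import resource

def max_rss():
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = max_rss()
# ... run the reproduction loop here ...
print("RSS high-water mark grew by", max_rss() - before)
```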
I was able to simplify the case I was observing:

```python
from io import BytesIO
from ZODB._compat import PersistentUnpickler

def load_persistent(arg):
    return None

db_root = conn.root()
for i in range(NB_READS):
    o = conn._storage.load(db_root._p_oid)
    unpickler = PersistentUnpickler(None, load_persistent, BytesIO(o[0]))
    unpickler.load()
    unpickler.load()
    del unpickler
    del o
```

That still appears to leak a tuple on every iteration. (The tuple and the memory growth still go away when we call `gc.collect()`.)

It can be further simplified and still "leak" a tuple:

```python
import zodbpickle.fastpickle as cPickle

# The same record pickled with protocol 3 (p3) and with protocol 2 (p2):
p3 = b'\x80\x02}q\x02U\x04dataq\x03}q\x04U\x06foobarq\x05C\x08\x00\x00\x00\x00\x00\x00\x00\x01cBTrees.OOBTree\nOOBTree\nq\x06\x86Qss.'
p2 = b'\x80\x02}q\x02U\x04dataq\x03}q\x04U\x06foobarq\x05c_codecs\nencode\nq\x06X\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01q\x07U\x06latin1q\x08\x86RcBTrees.OOBTree\nOOBTree\nq\t\x86Qss.'

for i in range(NB_READS):
    unpickler = cPickle.Unpickler(BytesIO(p3))
    unpickler.persistent_load = load_persistent
    unpickler.load()
    del unpickler
```

Here, the `p3` (protocol 3) case leaks a tuple on every iteration, while the `p2` (protocol 2) case does not. Neither protocol version leaks under Python 3. So it appears that the implementation of BINPERSID in zodbpickle's Python 2 `fastpickle` is at fault.

@azmeuk can you confirm that adding a call to `gc.collect()` inside your loop stops the memory growth?
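For readers unfamiliar with the opcode: BINPERSID is the binary form of pickle's persistent-reference mechanism, which ZODB uses to store references between `Persistent` objects. A minimal sketch of that mechanism using only the standard-library pickle (not zodbpickle; the `Stub` class and the `"stub-oid"` id are made up for illustration):

```python
import pickle
from io import BytesIO

class Stub(object):
    """Stand-in for an object stored by reference rather than by value."""

def persistent_id(obj):
    # Returning a non-None id makes the pickler emit a PERSID/BINPERSID
    # opcode instead of pickling the object's state inline.
    return "stub-oid" if isinstance(obj, Stub) else None

buf = BytesIO()
pickler = pickle.Pickler(buf, protocol=2)
pickler.persistent_id = persistent_id
pickler.dump({"data": Stub()})

unpickler = pickle.Unpickler(BytesIO(buf.getvalue()))
# BINPERSID pops the saved id off the unpickling stack and calls
# persistent_load with it; whatever it returns takes the object's place.
unpickler.persistent_load = lambda pid: None
print(unpickler.load())  # {'data': None}
```

Judging by the opcode sequence in the pickles above (an 8-byte oid, a GLOBAL, then TUPLE2 and BINPERSID), the leaked tuple is presumably the `(oid, class)` reference tuple that BINPERSID hands to `persistent_load`.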
Thank you for taking some time for this issue. I can confirm this script does not seem to leak anymore with Python 2 and ZODB 5 when I watch the memory consumption with system monitoring tools:

```python
import gc, transaction, ZODB
from BTrees.OOBTree import OOBTree

conn = ZODB.DB(None).open()
conn.root()["foobar"] = OOBTree()
transaction.commit()

for i in range(10 ** 7):
    conn.root()["foobar"]
    conn.cacheMinimize()
    gc.collect()

conn.close()
```

However, with a slightly more complicated script it still seems to leak. For instance, here is the script I used to create a memory chart:
```python
# leak.py
import os, psutil, transaction, gc, ZODB
from BTrees.OOBTree import OOBTree

output = "{}.data".format(os.path.basename(os.environ["VIRTUAL_ENV"]))
nocollect = "nocollect" in os.environ["VIRTUAL_ENV"]
process = psutil.Process(os.getpid())

conn = ZODB.DB(None).open()
conn.root()["foobar"] = OOBTree()
transaction.commit()

with open(output, "w") as fd:
    for i in range(10 ** 5):
        conn.root()["foobar"]
        conn.cacheMinimize()
        nocollect or gc.collect()
        fd.write("{} {}\n".format(i, process.memory_full_info().rss))

conn.close()
```

I don't understand why the first script shows a steady memory consumption while, as soon as I try to monitor it from within the process, it behaves differently. That feels like quantum mechanics :) In my application daemon, the …
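An aside on why `cacheMinimize()` matters in these loops (a sketch, not from the thread): it ghostifies every object in the connection cache, so the next `conn.root()["foobar"]` lookup has to reload the root object's state through the unpickler — the code path the zodbpickle reduction above exercises.

```python
# Sketch: observing ghostification, assuming the conn/OOBTree setup
# above, run right after the commit.
tree = conn.root()["foobar"]
print(tree._p_status)   # 'saved'  -> state currently loaded
conn.cacheMinimize()
print(tree._p_status)   # 'ghost'  -> state released
len(tree)               # any access reloads state via the unpickler
print(tree._p_status)   # 'saved' again
```

So every iteration forces a fresh unpickling, which is why the loop amplifies a per-unpickle leak.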
Indeed. I think it has to do with the resolution of the monitoring tools and the differences between RSS, USS, and VMS. For me, the operating-system-supplied tools did not work at a fine enough resolution to detect the memory growth, which still occurs.

Here's the script that uses just zodbpickle to produce the problem:

```python
from __future__ import print_function
import sys
from io import BytesIO
import gc as gc_
import os
import psutil
import zodbpickle.fastpickle as cPickle

process = psutil.Process(os.getpid())
do_gc = True
p3 = b'\x80\x02}q\x02U\x04dataq\x03}q\x04U\x06foobarq\x05C\x08\x00\x00\x00\x00\x00\x00\x00\x01cBTrees.OOBTree\nOOBTree\nq\x06\x86Qss.'
p2 = b'\x80\x02}q\x02U\x04dataq\x03}q\x04U\x06foobarq\x05c_codecs\nencode\nq\x06X\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01q\x07U\x06latin1q\x08\x86RcBTrees.OOBTree\nOOBTree\nq\t\x86Qss.'
# Assigning output, do_gc and pickle (one of p3/p2) based on environment elided...

def load_persistent(ref):
    return None

with open(output, "w") as fd:
    for i in range(50000):
        unpickler = cPickle.Unpickler(BytesIO(pickle))
        unpickler.persistent_load = load_persistent
        unpickler.load()
        del unpickler
        if do_gc:
            gc_.collect()
        mem_info = process.memory_full_info()
        print(i, mem_info.uss, mem_info.vms, mem_info.pfaults, file=fd)
        if i % 100 == 0:
            print('.', end='', file=sys.stderr)
            sys.stderr.flush()
            fd.flush()
print()
```

Here's the USS growth. We can see that only Python 2.7 with pickle protocol 3 grows, whether or not we perform garbage collection:

[chart: USS over iterations, Python 2.7 and 3.x, protocols 2 and 3, with and without GC]

And here's VMS (only Python 2.7 is shown because the base memory usage of Python 2 and 3 is so different). Again, only the protocol 3 case grows — in a step function, as expected, due to the way the OS and Python internally allocate memory:

[chart: VMS over iterations, Python 2.7 only]

objgraph is not able to find any Python-object-level leaks.
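For anyone reproducing this, a minimal plotting sketch for the `.data` files the script writes (matplotlib is an assumption here, and the file names in the comment are made up; it only reads the iteration number and the first memory column, so it also works for the RSS files from `leak.py`):

```python
# plot_mem.py -- sketch: chart the memory columns recorded above.
import sys
import matplotlib.pyplot as plt

for path in sys.argv[1:]:            # e.g. py27-proto3.data py27-proto2.data
    iterations, mem = [], []
    with open(path) as fd:
        for line in fd:
            fields = line.split()
            iterations.append(int(fields[0]))   # loop counter
            mem.append(int(fields[1]))          # USS (or RSS from leak.py)
    plt.plot(iterations, mem, label=path)

plt.xlabel("iteration")
plt.ylabel("memory (bytes)")
plt.legend()
plt.savefig("memory.png")
```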
Fixed with the release of zodbpickle 1.0.1. I'll open a PR that specifies that as a minimum version for the next ZODB release.
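For downstream projects that want the fix immediately, the pin looks something like this (an illustrative `setup.py` sketch; the project name is a placeholder and the actual ZODB PR may word it differently):

```python
from setuptools import setup

setup(
    name="myapp",  # placeholder
    version="1.0",
    install_requires=[
        "ZODB >= 5.4.0",
        "zodbpickle >= 1.0.1",  # avoids the Python 2.7 memory leak (see #203)
    ],
)
```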
This is required to avoid a memory leak on Python 2.7. See #203.
Thank you!
After upgrading from ZODB 4.4.5 to ZODB 5.4.0, I experienced increasing memory consumption in a daemon process I developed. At some point it saturated all the memory on my server, so I tried to understand what was going on, and I managed to reproduce the memory growth with this script:
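A reconstruction of that script (assumed: it is the same as the confirmation script quoted earlier in the thread, minus the `gc.collect()` call):

```python
# Assumed reconstruction -- the script from the confirmation comment,
# without the gc.collect() call that was later shown to stop the growth.
import transaction, ZODB
from BTrees.OOBTree import OOBTree

conn = ZODB.DB(None).open()
conn.root()["foobar"] = OOBTree()
transaction.commit()

for i in range(10 ** 7):
    conn.root()["foobar"]
    conn.cacheMinimize()

conn.close()
```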
If you run this script with Python 2.7 and ZODB 5, the memory consumption of the Python process grows slowly but steadily. The size of the chunks I observed was sometimes 256 kB, sometimes 4 kB, but I do not really know how to specifically reproduce a growth of 256 kB versus 4 kB.

If you comment out the line `conn.cacheMinimize()` or the line `conn.root()["foobar"]`, the memory usage will not grow.

I wrote a proof of concept that generates a memory consumption chart comparison. I also asked other developers at my company to test how the memory consumption behaves on their computers with Python 2.7, 3.4, 3.5 and 3.6, and with ZODB 4.4.5 and 5.4.0. The only case that always grows is Python 2 with ZODB 5. You can see the results here, with the Linux distributions used. We did not test on Mac or Windows.