Closed
Description
To reproduce, put the following in repro.py:
import numpy as np
from numcodecs import blosc
from numcodecs.blosc import Blosc

if __name__ == "__main__":
    arr = np.random.random(25_000_000)  # 200MB
    blosc.compress(arr, b'lz4', 5, Blosc.SHUFFLE)
Then run the following (with numcodecs 0.15.1 and memray 1.16.0 installed):
memray run -f -o output.bin repro.py
memray flamegraph -f output.bin
The blosc.compress() function allocates 400MB rather than the expected 200MB (this is in addition to the original 200MB arr). The reason is that on this line, where the destination buffer is resized by slicing, another memory allocation occurs, since slicing a bytes object creates a copy.
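The copying behaviour can be checked without numcodecs. The following is a minimal sketch using only the standard library, not the numcodecs code itself: `dest`, `SIZE`, `KEEP` and `traced_peak` are hypothetical names chosen for illustration, and the sizes are scaled down to keep it fast. It contrasts slicing a bytes object with slicing a memoryview of the same object, which does not copy.

```python
import tracemalloc


def traced_peak(make_result):
    """Call make_result() and return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    result = make_result()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


SIZE = 10_000_000  # hypothetical over-allocated destination buffer size
KEEP = 6_000_000   # hypothetical number of bytes actually written

# Stand-in for the destination buffer; allocated before tracing starts,
# so it is not counted in the peaks measured below.
dest = bytes(SIZE)

# Trimming by slicing a bytes object allocates a new object and copies KEEP bytes.
copied, copy_peak = traced_peak(lambda: dest[:KEEP])

# Slicing a memoryview only creates a small view object over the same memory.
view, view_peak = traced_peak(lambda: memoryview(dest)[:KEEP])

print(f"bytes slice peak:      {copy_peak / 1e6:.2f} MB")
print(f"memoryview slice peak: {view_peak / 1e6:.4f} MB")
```

In this sketch the bytes slice accounts for roughly `KEEP` bytes of new allocation, which matches the extra allocation reported above. The memoryview is shown only as a contrast; whether it is a suitable fix for numcodecs is not something this sketch establishes.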
The lz4 and zstd codecs in numcodecs suffer from the same problem.
#656 is possibly related, but doesn't fix the issue.