Segfaults when run in Python's "development mode" #46

Closed
cgohlke opened this issue Aug 20, 2022 · 5 comments

@cgohlke
Contributor

cgohlke commented Aug 20, 2022

On Windows, blosc2.compress and blosc2.compress2 reproducibly crash when run on any current Python version in "development mode", with both the official blosc2 0.3.1 build and local builds, e.g.:

> py -3.10-64 -X dev -c"from blosc2 import compress;compress(b'\x00')"
Debug memory block at address p=00000245606EFD60: API 'o'
    66 bytes originally requested
    The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected.
    The 8 pad bytes at tail=00000245606EFDA2 are not all FORBIDDENBYTE (0xfd):
        at tail+0: 0x00 *** OUCH
        at tail+1: 0x00 *** OUCH
        at tail+2: 0xfd
        at tail+3: 0xfd
        at tail+4: 0xfd
        at tail+5: 0xfd
        at tail+6: 0xfd
        at tail+7: 0xfd
    Data at p: 00 00 00 00 00 00 00 00 ... 00 00 00 00 00 00 00 00

Enable tracemalloc to get the memory block allocation traceback

Fatal Python error: _PyMem_DebugRawFree: bad trailing pad byte
Python runtime state: initialized

Current thread 0x00003b64 (most recent call first):
  File "<string>", line 1 in <module>

Extension modules: blosc2.blosc2_ext (total: 1)
@cgohlke
Contributor Author

cgohlke commented Aug 20, 2022

I can also reproduce this crash in the imagecodecs library, so it likely originates in the c-blosc library. Unfortunately, I don't get any useful stack trace...

@cgohlke
Contributor Author

cgohlke commented Aug 21, 2022

Inputs of length 2 also crash. Inputs of length >= 3 work.
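
To check which input lengths are affected without losing the driving process to a crash, each call can be run in its own child interpreter in development mode. This is only a minimal sketch: the helper name `probe` and its `module` parameter are hypothetical and not part of blosc2. It assumes the module under test exposes a `compress` function that accepts a `bytes` object.

```python
import subprocess
import sys


def probe(length, module="blosc2"):
    """Call `compress` on `length` zero bytes in a child interpreter.

    The child runs with -X dev, so the debug memory hooks are active.
    Returns the child's exit code; a non-zero value means the child
    crashed or raised an error.
    """
    # `module` is a hypothetical parameter so the sketch can target any
    # module that exposes a `compress` function.
    code = f"from {module} import compress; compress(bytes({length}))"
    result = subprocess.run(
        [sys.executable, "-X", "dev", "-c", code],
        capture_output=True,
    )
    return result.returncode
```

Calling `probe(n)` for `n` in `range(1, 6)` and printing the exit codes should show which lengths fail on an affected build.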

@cgohlke
Contributor Author

cgohlke commented Aug 21, 2022

Apparently the output buffer is too small for input lengths <= 2. The following patch works for me, but the minimum size should be fixed, enforced, or documented in c-blosc2:

diff --git a/blosc2/blosc2_ext.pyx b/blosc2/blosc2_ext.pyx
index b5fc8ce..7f46433 100644
--- a/blosc2/blosc2_ext.pyx
+++ b/blosc2/blosc2_ext.pyx
@@ -397,7 +397,7 @@ cpdef compress(src, int32_t typesize=8, int clevel=9, shuffle=BLOSC_SHUFFLE, cna
     cdef int32_t len_src = <int32_t> len(src)
     cdef Py_buffer *buf = <Py_buffer *> malloc(sizeof(Py_buffer))
     PyObject_GetBuffer(src, buf, PyBUF_SIMPLE)
-    dest = bytes(buf.len + BLOSC2_MAX_OVERHEAD)
+    dest = bytes(max(buf.len, 3) + BLOSC2_MAX_OVERHEAD)
     cdef int32_t len_dest =  <int32_t> len(dest)
     cdef int size
     cdef int shuffle_ = shuffle.value if isinstance(shuffle, Enum) else shuffle
@@ -585,7 +585,7 @@ def compress2(src, **kwargs):
     cdef Py_buffer *buf = <Py_buffer *> malloc(sizeof(Py_buffer))
     PyObject_GetBuffer(src, buf, PyBUF_SIMPLE)
     cdef int size
-    cdef int32_t len_dest = <int32_t> (buf.len + BLOSC2_MAX_OVERHEAD)
+    cdef int32_t len_dest = <int32_t> (max(buf.len, 3) + BLOSC2_MAX_OVERHEAD)
     dest = bytes(len_dest)
     _dest = <void*> <char *> dest
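
The sizing rule used in the patch can be written as a small pure-Python helper. This is a sketch only: `dest_size` and `MIN_SAFE_INPUT` are hypothetical names, and the value 32 for `BLOSC2_MAX_OVERHEAD` is an assumption that should be checked against the blosc2.h of the c-blosc2 version in use.

```python
# Assumed value of the c-blosc2 constant; verify against blosc2.h.
BLOSC2_MAX_OVERHEAD = 32

# Smallest input length reported in this thread to work without the patch.
MIN_SAFE_INPUT = 3


def dest_size(src_len):
    """Return the destination buffer size the patch would allocate."""
    # Inputs shorter than MIN_SAFE_INPUT are sized as if they had
    # MIN_SAFE_INPUT bytes, so the output buffer never drops below
    # MIN_SAFE_INPUT + BLOSC2_MAX_OVERHEAD bytes.
    return max(src_len, MIN_SAFE_INPUT) + BLOSC2_MAX_OVERHEAD
```

With these assumed values, inputs of length 1, 2, and 3 all get the same destination size, and longer inputs are sized exactly as before the patch.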

@FrancescAlted
Member

Mmmh, from your description it looks like I introduced this issue in Blosc/c-blosc2@29d7709. Unfortunately, I cannot reproduce the error in a new test: Blosc/c-blosc2@55e9b06#diff-673ada4866f9bbc5e91d59f712c94e6918ddbd0e3f961581673e119d363b3d3eR241. It would be nice if we could reproduce your issue, so if you can figure out what is missing in my test, shout!

Anyway, I am re-introducing the check for small buffers, although reducing the BLOSC_MIN_BUFFERSIZE symbol from 128 to 32 (I think this should be safe enough, but tell me if you are still getting issues).

@cgohlke
Contributor Author

cgohlke commented Aug 21, 2022

I can confirm that c-blosc2 2.3.1.dev fixes this issue. No more crashes with small buffers in python-blosc2 and imagecodecs.

It's difficult to reproduce memory corruption errors. Try enabling Page Heap Verification in some Windows CI runs and running the python-blosc2 tests in development mode (-X dev).
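
One way to turn on development mode for a whole test run is the PYTHONDEVMODE environment variable, which has the same effect as -X dev and is inherited by any child interpreters the tests spawn. A minimal sketch follows; the helper names `dev_mode_env` and `run_python` are hypothetical, and Page Heap Verification is a separate Windows setting that is not covered here.

```python
import os
import subprocess
import sys


def dev_mode_env(base=None):
    """Return a copy of the environment with development mode enabled."""
    env = dict(os.environ if base is None else base)
    env["PYTHONDEVMODE"] = "1"  # equivalent to passing -X dev
    return env


def run_python(args):
    """Run a child interpreter with the given arguments in development mode."""
    return subprocess.run(
        [sys.executable, *args],
        env=dev_mode_env(),
        capture_output=True,
        text=True,
    )
```

Assuming the test suite is run with pytest, `run_python(["-m", "pytest"])` would execute it with the debug memory hooks active.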

@cgohlke cgohlke closed this as completed Aug 21, 2022