
Assertion failed: (md->work_bufs[BUFFER_SPAN].size == 0) - Thread safety related? #570

Open
boazsegev opened this issue Aug 7, 2016 · 3 comments

Comments

@boazsegev

First off, thanks for an amazing gem!

I did come across a glitch... I think.

I was testing the rendering system under high stress.

I ran the system in multi threaded mode (running on 16 threads).

While testing, I disabled my own cache management to stress test the rendering engine (rather than the caching system, which was storing the rendered string).

I repeatedly asked for the same file to be rendered. 12 threads were doing the asking, 16 threads were doing the rendering, it was all running through Iodine's async IO using HTTP requests, about 200 concurrent requests at a time.

Every time I ran the test, my server would crash with the error:

Assertion failed: (md->work_bufs[BUFFER_SPAN].size == 0), function sd_markdown_render, file markdown.c, line 2898

I'm not complaining about the server crashing; I think failing fast is the right thing to do when experiencing data corruption. But I am worried about the underlying cause and the risk of production-time failures.

I'm assuming this is multi-threading related, since I didn't get the error in single thread mode.

The following is just my guess (my 2¢) and I hope this helps track down the issue:

At first look, the C code doesn't seem to release the Global VM Lock, so I thought I might be wrong. However, after a quick read through the code (that's a lot of code and very nicely written, wow), I think the issue is the markdown struct storing stateful rendering data.

I believe that md->refs and md->work_bufs are susceptible to corruption. It also seems to me that these could be separated from the markdown engine into a separate "state struct", so that multi-threaded rendering with the same markdown render engine wouldn't cause data corruption.

I'm thankful for your time and hope for a fix or a workaround.

jivanvl pushed a commit to gitlabhq/gitlabhq that referenced this issue Oct 3, 2017
The Redcarpet library is not thread-safe as described in
vmg/redcarpet#570. Since we instantiate
the Redcarpet renderer in a class variable, multiple Sidekiq threads
can access the work buffer and corrupt the state. We work around
this issue by memoizing the renderer on a thread basis.

Closes #36637
@neilvcarvalho

We were able to reproduce this error with this spec:

    it 'renders wonky content' do
      str = %Q[---------------------------------------------------------------------------\nTypeError                                 Traceback (most recent call last)\nCell In[1], line 8\n      5 img = Image.open(\"/mnt/data/file-oT44n0LEldQZ2ymgGHzrwz8Q\")\n      7 # Use pytesseract to do OCR on the image\n----> 8 text = pytesseract.image_to_string(img)\n     10 text\n\nFile ~/.local/lib/python3.11/site-packages/pytesseract/pytesseract.py:413, in image_to_string(image, lang, config, nice, output_type, timeout)\n    408 \"\"\"\n    409 Returns the result of a Tesseract OCR run on the provided image to string\n    410 \"\"\"\n    411 args = [image, 'txt', lang, config, nice, timeout]\n--> 413 return {\n    414     Output.BYTES: lambda: run_and_get_output(*(args + [True])),\n    415     Output.DICT: lambda: {'text': run_and_get_output(*args)},\n    416     Output.STRING: lambda: run_and_get_output(*args),\n    417 }[output_type]()\n\nFile ~/.local/lib/python3.11/site-packages/pytesseract/pytesseract.py:416, in image_to_string.<locals>.<lambda>()\n    408 \"\"\"\n    409 Returns the result of a Tesseract OCR run on the provided image to string\n    410 \"\"\"\n    411 args = [image, 'txt', lang, config, nice, timeout]\n    413 return {\n    414     Output.BYTES: lambda: run_and_get_output(*(args + [True])),\n    415     Output.DICT: lambda: {'text': run_and_get_output(*args)},\n--> 416     Output.STRING: lambda: run_and_get_output(*args),\n    417 }[output_type]()\n\nFile ~/.local/lib/python3.11/site-packages/pytesseract/pytesseract.py:273, in run_and_get_output(image, extension, lang, config, nice, timeout, return_bytes)\n    263 def run_and_get_output(\n    264     image,\n    265     extension='',\n   (...)\n    270     return_bytes=False,\n    271 ):\n--> 273     with save(image) as (temp_name, input_filename):\n    274         kwargs = {\n    275             'input_filename': input_filename,\n    276             'output_filename_base': temp_name,\n   (...)\n    
281             'timeout': timeout,\n    282         }\n    284         run_tesseract(**kwargs)\n\nFile /usr/local/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)\n    135 del self.args, self.kwds, self.func\n    136 try:\n--> 137     return next(self.gen)\n    138 except StopIteration:\n    139     raise RuntimeError(\"generator didn't yield\") from None\n\nFile ~/.local/lib/python3.11/site-packages/pytesseract/pytesseract.py:194, in save(image)\n    192     yield f.name, realpath(normpath(normcase(image)))\n    193     return\n--> 194 image, extension = prepare(image)\n    195 input_file_name = f.name + extsep + extension\n    196 image.save(input_file_name, format=image.format)\n\nFile ~/.local/lib/python3.11/site-packages/pytesseract/pytesseract.py:175, in prepare(image)\n    173 extension = 'PNG' if not image.format else image.format\n    174 if extension not in SUPPORTED_FORMATS:\n--> 175     raise TypeError('Unsupported image format/type')\n    177 if 'A' in image.getbands():\n    178     # discard and replace the alpha channel with white background\n    179     background = Image.new(RGB_MODE, image.size, (255, 255, 255))\n\nTypeError: Unsupported image format/type\n]
      expect { helper.render_markdown(str) }.not_to raise_error
    end

This spec itself passes, but the next helper.render_markdown call will crash. It looks like this text messes with some Redcarpet internal state. We were caching the Redcarpet renderer in a class constant; we tried moving it to Thread.current, and we tried using a mutex. Neither worked.

@boazsegev
Author

boazsegev commented Mar 25, 2024

IMHO, looking at the new data, this issue may be caused by a delayed failure on non-zeroed memory rather than by a race condition.

This often happens when tests run for the first time and either the stack memory or the malloc'd memory is all zeros (the initial state offered by the OS, especially in containers). In that state, a whole family of bugs can't be detected, such as variable-initialization bugs.

Later on, as malloc starts to return junk data (which usually happens only once the free list starts to be reused), or as the stack unwinds and the next cycle uses "dirty" memory, non-initialization issues and str* issues (e.g., strlen) start to rear their heads as weird bugs.

@boazsegev
Author

Looking a bit at the code, it could be related to this:

md = malloc(sizeof(struct sd_markdown));
if (!md)
    return NULL;
memcpy(&md->cb, callbacks, sizeof(struct sd_callbacks));

A temporary test to see if this could fix anything would be to add:

md = malloc(sizeof(struct sd_markdown));
if (!md)
    return NULL;
*md = (struct sd_markdown){0}; /* added line to test if this helps */
memcpy(&md->cb, callbacks, sizeof(struct sd_callbacks));

Or, replace with:

md = malloc(sizeof(struct sd_markdown));
if (!md)
    return NULL;
*md = (struct sd_markdown){.cb = *callbacks}; /* copies the callbacks and zeros out the rest */
