Searchd 6.2.12 thread hangs when executing a SELECT query #1774
Comments
I've uploaded additional data to s3.manticoresearch.com, manticore/write-only/issue-1774: core.26109.gz - searchd process core dump;
What's interesting, after the 1st select thread hangs, the same query made from a new connection repeatedly works correctly:
Would you like to try to reproduce it in the latest dev version (https://mnt.cr/nightly)?
We only face the issue on a production system, where we'd rather not use nightly builds unless absolutely necessary. We're unable to reproduce the issue on a test system, although the test system doesn't have the same load as the production environment.
An update: unfortunately, we were unable to reproduce the issue in our test systems, including an exact clone of the affected system. After we rebooted the affected system the issue disappeared and we couldn't reproduce it in the last few days. It is unclear what caused specific queries to get stuck, as the affected system didn't show any abnormal readings or other problems, but it's no longer happening after the system reboot.
BTW I tried to reproduce this issue on our side on 6.2.12 by sending the above-mentioned select query with some concurrency, alongside a few concurrent insert/update threads, and it didn't hang or crash.
Same here, we were unable to reproduce the issue even at a rather high request rate with high concurrency. We're rather puzzled, as it's unclear what could have been the culprit.
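For anyone who wants to repeat this kind of reproduction attempt, here is a minimal sketch of a concurrent load harness. It is not the script used by anyone in this thread. The names `run_load`, `select_fn` and `write_fn` are hypothetical, and the two callables are assumed to wrap whatever client you use to send the SELECT and the inserts/updates to searchd. Workers that are still running when the timeout expires are counted as stuck.

```python
import threading
import time


def run_load(select_fn, write_fn, readers=8, writers=2, rounds=50, timeout=10.0):
    """Run select_fn and write_fn concurrently.

    Returns (stuck, errors): the number of workers still running when the
    timeout expired, and the exceptions raised by workers that failed.
    """
    errors = []

    def worker(fn):
        try:
            for _ in range(rounds):
                fn()
        except Exception as exc:  # collect the error instead of losing it in the thread
            errors.append(exc)

    # Daemon threads, so a worker that never returns cannot block interpreter exit.
    threads = [threading.Thread(target=worker, args=(select_fn,), daemon=True)
               for _ in range(readers)]
    threads += [threading.Thread(target=worker, args=(write_fn,), daemon=True)
                for _ in range(writers)]
    for t in threads:
        t.start()

    # One shared deadline for all workers rather than a full timeout per thread.
    deadline = time.monotonic() + timeout
    for t in threads:
        t.join(max(0.0, deadline - time.monotonic()))

    stuck = sum(t.is_alive() for t in threads)
    return stuck, errors
```

A non-zero `stuck` count only shows that a client call did not return in time. Whether the corresponding searchd thread is actually spinning still has to be confirmed on the server side.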
Describe the bug
Searchd 6.2.12 thread hangs indefinitely when executing a select query and consumes 100% of a CPU core. The query cannot be killed with the kill CLI command; searchd must be killed with system tools to clear the stuck thread. Index checks run fine and show no issues.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The query is expected to complete and return results in a finite time. We have another system with identical OS and software versions and similarly sized index (with somewhat different data), where the same query returns results very quickly without any issues.
Describe the environment:
uname -a (if on a Unix-like system):
Messages from log files:
There's nothing relevant in searchd.log and query.log.
Additional context
I'll provide additional information separately, as some files are too large to upload to GitHub.
threads-output.txt
lsof-output.txt
gdb-output.txt