xtdb can get into a zombie state after out-of-memory (OOM) #652
Comments
An alternative to #676 is to fix it upstream: dekkers/xtdb-http-multinode#3
dekkers/xtdb-http-multinode#3 is merged; we'll need to update the docs so that xtdb-http-multinode v1.0.4 is used.
This issue can be closed now, right?
I think there is still some docs work to do; also see #681 (comment)
Created #703 to improve the docs, so everything in this issue is resolved now.
In a new setup crux ended up in a zombie state again. It looks like the
And after increasing memory it was this:
Let's investigate why the OOM option is missing in the default setup.
Apparently we forgot to add the option in the Dockerfile; dekkers/xtdb-http-multinode#8 fixes this.
dekkers/xtdb-http-multinode#8 was merged and because we use the
xtdb can get into a zombie state after the JVM runs out of memory (OOM). In my local environment with a lot of findings and OOIs, xtdb logged the following message:
after which for every query the following message was logged:
Octopoes logged the following messages; note that xtdb has trouble parsing a completely valid query (`Query didn't match expected structure`):
After restarting the xtdb container, it functions properly again.
This issue is primarily meant to document this xtdb zombie state so we know how to handle it if it occurs in the future. I haven't gotten around to debugging xtdb internals to properly resolve it. As a workaround, we could start the xtdb JVM using the `ExitOnOutOfMemoryError` (exit the JVM on the first occurrence of an out-of-memory error) or `CrashOnOutOfMemoryError` (which produces crash files for debugging) flag.
Relates to #429
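To make the workaround concrete, here is a minimal sketch of a helper that makes sure a JVM options string carries one of the two OOM flags. The helper name `with_oom_flag` is hypothetical, and how xtdb-http-multinode actually receives its JVM options is not covered in this issue; the sketch assumes the options are passed as a single string, for example through the standard `JAVA_TOOL_OPTIONS` environment variable.

```python
import shlex

# HotSpot flags named in this issue.
EXIT_ON_OOM = "-XX:+ExitOnOutOfMemoryError"
CRASH_ON_OOM = "-XX:+CrashOnOutOfMemoryError"


def with_oom_flag(jvm_opts: str, flag: str = EXIT_ON_OOM) -> str:
    """Return jvm_opts with an OOM-handling flag appended if none is present.

    Hypothetical helper: it only manipulates the option string and does not
    start or inspect a JVM.
    """
    opts = shlex.split(jvm_opts)
    if EXIT_ON_OOM in opts or CRASH_ON_OOM in opts:
        # One of the two flags is already set; leave the options untouched.
        return jvm_opts
    return shlex.join(opts + [flag])


if __name__ == "__main__":
    # Assumed example value; the real heap settings depend on the deployment.
    print(with_oom_flag("-Xmx2g"))
```

With `ExitOnOutOfMemoryError` the JVM exits instead of lingering in the zombie state, so a container restart policy (if one is configured) can bring xtdb back up; `CrashOnOutOfMemoryError` is the better choice when crash files are wanted for debugging.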