Describe the bug
We are using the JetStream bus in several places. We've noticed that after it has been working correctly for some time, when one of the pods is restarted by k8s, it does not come back up.
To Reproduce
Steps to reproduce the behavior:
Create a JetStream bus with 3 pods and persistent storage
Wait for k8s to restart a pod. I was also able to reproduce it by deleting one of the pods after they had been alive for 1 day
Expected behavior
The pod comes up successfully and syncs with the cluster
Environment (please complete the following information):
[7] 2025/02/05 23:24:52.702284 [WRN] Filestore [_meta_] Stream state magic and version mismatch
[7] 2025/02/05 23:24:52.702306 [WRN] Filestore [_meta_] Recovering stream state from index errored: corrupt state file
[7] 2025/02/05 23:24:52.720351 [WRN] Filestore [S-R3F-TbLcjFQM] Stream state magic and version mismatch
[7] 2025/02/05 23:24:52.720367 [WRN] Filestore [S-R3F-TbLcjFQM] Recovering stream state from index errored: corrupt state file
[7] 2025/02/05 23:24:52.734598 [WRN] Filestore [default] Stream state magic and version mismatch
[7] 2025/02/05 23:24:52.734611 [WRN] Filestore [default] Recovering stream state from index errored: corrupt state file
[7] 2025/02/05 23:24:52.751038 [WRN] Filestore [C-R3F-ptSslgK2] Stream state magic and version mismatch
[7] 2025/02/05 23:24:52.751063 [WRN] Filestore [C-R3F-ptSslgK2] Recovering stream state from index errored: corrupt state file
[7] 2025/02/05 23:24:52.793448 [WRN] Consumer create failed for 'js > default > group-3223285732': error creating store for consumer: cipher: message authentication failed (10104)
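To see at a glance which filestores report the corruption warnings, the log lines above can be grouped by filestore name. This is only a rough sketch: the regular expression is inferred from the lines quoted in this issue, not from any documented NATS log format, and `group_filestore_warnings` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this issue (an assumption,
# not a documented format), e.g.:
# [7] 2025/02/05 23:24:52.702284 [WRN] Filestore [_meta_] Stream state magic and version mismatch
FILESTORE_WARNING = re.compile(
    r"\[WRN\] Filestore \[(?P<store>[^\]]+)\] (?P<message>.+)$"
)


def group_filestore_warnings(log_text):
    """Return {filestore name: [warning messages]} in first-seen order."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = FILESTORE_WARNING.search(line)
        if match:
            grouped[match.group("store")].append(match.group("message").strip())
    return dict(grouped)


sample = """\
[7] 2025/02/05 23:24:52.702284 [WRN] Filestore [_meta_] Stream state magic and version mismatch
[7] 2025/02/05 23:24:52.702306 [WRN] Filestore [_meta_] Recovering stream state from index errored: corrupt state file
[7] 2025/02/05 23:24:52.734598 [WRN] Filestore [default] Stream state magic and version mismatch
"""

for store, messages in group_filestore_warnings(sample).items():
    print(store, len(messages))
```

Lines that do not match the pattern, such as the `Consumer create failed` warning, are ignored by this sketch.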
Once it happens to one pod, it will 100% happen to the 2 other pods when they are restarted.
To fix the issue we have to delete the PVC of the pod; after that, the pod comes up correctly. So to recover the cluster we need to do this for all 3 pods (it can be done one by one or all together).
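The workaround above could be scripted roughly as follows. This is a minimal sketch that only builds and prints the `kubectl` commands and does not run them. The pod, PVC, and namespace names are hypothetical placeholders, not the real names from our cluster. It assumes the PVC is deleted with `--wait=false` because Kubernetes keeps a PVC in `Terminating` while a pod still uses it, so the pod is deleted right afterwards.

```python
import shlex


def recovery_commands(pod, pvc, namespace):
    """Build the kubectl commands for resetting one pod's storage.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        # Mark the PVC for deletion without blocking; it stays in
        # Terminating until the pod that mounts it is gone.
        ["kubectl", "delete", "pvc", pvc, "-n", namespace, "--wait=false"],
        # Delete the pod so the PVC can actually be removed and the
        # controller can recreate the pod with fresh storage.
        ["kubectl", "delete", "pod", pod, "-n", namespace],
    ]


# Hypothetical names, for illustration only.
for command in recovery_commands(
    pod="example-js-0",
    pvc="example-vol-example-js-0",
    namespace="example-namespace",
):
    print(shlex.join(command))
```

Printing the commands first makes it easy to review them before applying the same steps to each of the 3 pods. Note that this discards the data stored on that pod's volume.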
Additional context
Logs:
The subsequent logs are a combination of pings and this message
until k8s restarts the pod due to a failed health check.
There are several errors there; some are probably the cause of the others:
Not sure if this is relevant, as I saw the same on the healthy pods as well.
Event bus configuration
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.