Better handling of validation errors #172

Open
ePaul opened this issue Jul 6, 2023 · 0 comments
ePaul commented Jul 6, 2023

Current situation

When a batch of events is submitted to Nakadi and one of them fails validation, Nakadi rejects the whole batch. In the response, the failing event is marked as failed and all other events as aborted.

Nakadi-Producer then retries all of them in its next run and hits the same error again.
As a result, not only the failing events are blocked from being submitted, but also any other events that end up in the same batch. In the extreme case, this can block all event sending of a service.

Nakadi behaves this way to guarantee the order of events submitted together. But since Nakadi-Producer doesn't guarantee that order anyway, this behavior brings no benefit in our case.
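
To illustrate the blocking effect, here is a minimal Java model of the behavior described in this section. It is not Nakadi or Nakadi-Producer code: the class `BatchRejectionModel`, the `Status` enum and the `submit` method are hypothetical names, and the model assumes that only the first invalid event is reported as failed while every other event is reported as aborted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the behavior described above; hypothetical names,
// not the Nakadi API and not Nakadi-Producer code.
final class BatchRejectionModel {

    enum Status { SUBMITTED, FAILED, ABORTED }

    /**
     * Models one batch submission. Each entry of validFlags says whether the
     * event at that position passes schema validation.
     * Assumption: only the first invalid event is marked FAILED.
     */
    static List<Status> submit(List<Boolean> validFlags) {
        int firstInvalid = validFlags.indexOf(Boolean.FALSE);
        List<Status> result = new ArrayList<>();
        for (int i = 0; i < validFlags.size(); i++) {
            if (firstInvalid < 0) {
                result.add(Status.SUBMITTED); // whole batch accepted
            } else if (i == firstInvalid) {
                result.add(Status.FAILED);    // the event that broke validation
            } else {
                result.add(Status.ABORTED);   // valid, but rejected with the batch
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Boolean> batch = Arrays.asList(true, false, true);
        // Resubmitting the unchanged batch yields the same result in every run.
        for (int run = 1; run <= 3; run++) {
            System.out.println("run " + run + ": " + submit(batch));
        }
    }
}
```

In this model every run prints `[ABORTED, FAILED, ABORTED]`, so the two valid events never get through as long as they share a batch with the invalid one.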

Possible improvement

If some events fail validation and others are aborted, the aborted ones should be retried before the failing ones.
We could also reduce the retry frequency for events with validation failures, as those won't become valid on their own, only through a change to the event type's schema on the Nakadi side.
We shouldn't skip these events completely, though, so that they still show up in monitoring and the problem can be fixed.
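
One way this could look is sketched below in Java. This is only an illustration under stated assumptions, not a proposal for the actual implementation: `RetryPlanner`, `PendingEvent`, `markValidationFailed`, `dueForSubmission` and the delay constants are all hypothetical names and values that do not exist in Nakadi-Producer. Events that failed validation get an exponentially growing (capped) delay before their next attempt and are sorted behind events that never failed, but they stay in the pending list so monitoring can still see them.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names exist in Nakadi-Producer.
final class RetryPlanner {

    static final class PendingEvent {
        final String eid;
        int validationFailures;            // how often this event was reported as failed
        Instant notBefore = Instant.EPOCH; // earliest time of the next attempt

        PendingEvent(String eid) { this.eid = eid; }
    }

    // Assumed values, chosen only for illustration.
    static final Duration BASE_DELAY = Duration.ofMinutes(1);
    static final Duration MAX_DELAY = Duration.ofHours(1);

    /** Records a validation failure and postpones the next attempt (capped exponential backoff). */
    static void markValidationFailed(PendingEvent e, Instant now) {
        e.validationFailures++;
        int exponent = Math.min(e.validationFailures - 1, 10); // bound the shift
        Duration delay = BASE_DELAY.multipliedBy(1L << exponent);
        if (delay.compareTo(MAX_DELAY) > 0) {
            delay = MAX_DELAY;
        }
        e.notBefore = now.plus(delay);
    }

    /** Returns the events due at 'now'; events with fewer validation failures come first. */
    static List<PendingEvent> dueForSubmission(List<PendingEvent> pending, Instant now) {
        List<PendingEvent> due = new ArrayList<>();
        for (PendingEvent e : pending) {
            if (!e.notBefore.isAfter(now)) {
                due.add(e);
            }
        }
        // List.sort is stable, so events with equal failure counts keep their order.
        due.sort(Comparator.comparingInt((PendingEvent e) -> e.validationFailures));
        return due;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-07-06T12:00:00Z");
        PendingEvent a = new PendingEvent("a");
        PendingEvent b = new PendingEvent("b");
        PendingEvent c = new PendingEvent("c");
        List<PendingEvent> pending = Arrays.asList(a, b, c);

        // Pretend the last response marked b as failed and a, c as aborted.
        markValidationFailed(b, now);

        for (PendingEvent e : dueForSubmission(pending, now)) {
            System.out.println("retry now: " + e.eid);
        }
    }
}
```

Aborted events are left untouched in this sketch, so they are due again in the next run. Ordering only helps if batches are then cut from the ordered list, so that previously failing events land in a later batch than the ones that were merely aborted.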
