
Dead Letter Queue for stuck events #371

Open
adyach opened this issue Sep 12, 2022 · 2 comments

adyach commented Sep 12, 2022

Events which the consumer is not able to process block the whole pipeline. Such events have to be skipped manually in order to unblock the pipeline and continue processing the following events. It would be helpful to have a configurable feature that skips unprocessable events, continues processing, and publishes the "broken" events to another event type for later investigation.

A dead letter queue is functionality that many teams are missing, and every team has implemented their own local version of a DLQ. This is very costly, and in the absence of best practices the implementations often differ vastly from each other. A central implementation of a DLQ for Nakadi can greatly reduce the software/technology complexity and cost for many individual teams.


dehora commented Oct 12, 2022

A central implementation of a DLQ for Nakadi can greatly reduce the software/technology complexity

Can you clarify what you expect from a client with a DLQ over a distributed ordered log? Typically DLQs imply storage offload, and a common solution would mean imposing common storage on client users. For example, this feature is built into SQS because its competing-consumer message-popping protocol allows it to make sense, whereas it's not available in Nakadi alternatives like Kinesis or Kafka, which have stronger implications around ordering. The existing client has enough affordances to allow a user to skip past the event and checkpoint.

A dead letter queue is functionality that many teams are missing, and every team has implemented their own local version of a DLQ. This is very costly, and in the absence of best practices

You need to back up and qualify statements like "every team", "very costly" and "differ vastly". For the latter, I suspect that teams doing this differently is in part related to the processing semantics of events and how they relate to each other. Bear in mind that a structurally invalid event won't get into Nakadi for the most part, because the service insists on a schema.

As an exercise, it would help to consider how Nakadi itself would offer a DLQ (again akin to SQS, perhaps by exposing an API for it rather than a service configuration: every event type is in a unique resource space, as is the consumer subscription, so it seems possible to define an API for it). That might indicate what can be generalised and what is specific to event data and their consumers.
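
For illustration, a minimal sketch of the skip-and-checkpoint approach mentioned above. The `Event`, `EventBatch` and `CursorCommitter` types are hypothetical stand-ins, not the actual nakadi-java API:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins for the client's event, batch and cursor handling.
record Event(String id, String payload) {}
record EventBatch(List<Event> events, String cursor) {}

interface CursorCommitter {
  void commit(String cursor);
}

// Skips events it cannot process and checkpoints anyway, so a single bad
// event does not block the partition.
final class SkippingHandler {
  private final Consumer<Event> processor;
  private final CursorCommitter committer;

  SkippingHandler(Consumer<Event> processor, CursorCommitter committer) {
    this.processor = processor;
    this.committer = committer;
  }

  void onBatch(EventBatch batch) {
    for (Event event : batch.events()) {
      try {
        processor.accept(event);
      } catch (RuntimeException e) {
        // Poison event: log and fall through so the cursor still advances.
        System.err.println("skipping unprocessable event " + event.id() + ": " + e);
      }
    }
    // Committing the batch cursor moves the subscription past the failed
    // events as well as the successful ones.
    committer.commit(batch.cursor());
  }
}
```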

ePaul commented Oct 26, 2022

A central implementation of a DLQ for Nakadi can greatly reduce the software/technology complexity

Can you clarify what you expect from a client with a DLQ over a distributed ordered log?

A simple idea would be something like: if processing an event (from a subscription) fails (with an exception in the callback), it is published to some other event type, and the subscription cursor is then committed anyway (if the submission was successful).
(The semantics might need to be refined for batch consumption.)

The client configuration would just have the second event type when setting up a consumer.

Of course, this means that these events will be processed out of order (if at all), but this is often preferable to not being able to process any other events on this partition. Builders should be empowered to make this decision.

This can be set up on top of Nakadi-Java (and I guess this is what teams are doing), but having it integrated in the client makes it easier.
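
As a rough, non-authoritative sketch of that idea, reusing the hypothetical `Event`, `EventBatch` and `CursorCommitter` types from the sketch above and adding an illustrative `DlqPublisher` for the second event type (none of these are existing nakadi-java interfaces):

```java
import java.util.function.Consumer;

// Hypothetical publisher for the "dead letter" event type; in practice this
// would wrap the client's publishing API for the second, configured event type.
interface DlqPublisher {
  void publish(Event failedEvent, Throwable cause);
}

// Wraps normal processing: a failed event is forwarded to the DLQ event type,
// and the cursor is committed only if that submission succeeded.
final class DlqHandler {
  private final Consumer<Event> processor;
  private final DlqPublisher dlq;
  private final CursorCommitter committer;

  DlqHandler(Consumer<Event> processor, DlqPublisher dlq, CursorCommitter committer) {
    this.processor = processor;
    this.dlq = dlq;
    this.committer = committer;
  }

  void onBatch(EventBatch batch) {
    for (Event event : batch.events()) {
      try {
        processor.accept(event);
      } catch (RuntimeException e) {
        // If this publish throws, the exception propagates, the cursor is not
        // committed, and the batch will be redelivered.
        dlq.publish(event, e);
      }
    }
    committer.commit(batch.cursor());
  }
}
```

Integrating this into the client would then mostly be a matter of accepting the DLQ event type as part of the consumer configuration.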
