Possible risk: Database counter overflow #160

Open
ePaul opened this issue Nov 29, 2022 · 1 comment


ePaul commented Nov 29, 2022

Background / Current situation

When the library sends out events:

  • Events are stored in a PostgreSQL DB table with an ID column of type SERIAL (= auto-incrementing 4-byte integer).
  • When reading from the DB, the ID is mapped to a Java Integer (which is also a 4-byte type).
  • This is then widened to a Java long (8-byte integer) by a simple cast, which sign-extends the value: the upper 4 bytes are filled with 0 bits for non-negative numbers and with 1 bits for negative ones.
  • This long value is then used as the last 8 bytes of a (16-byte) UUID; the first 8 bytes are set to 0.
  • This UUID is converted to a string and used as metadata.eid for the events submitted to Nakadi.
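
The mapping described in the steps above can be reproduced with a minimal sketch. The class name `EidMapping` and the method `toEid` are hypothetical and are not taken from the library's code; only `java.util.UUID` and `Integer.longValue()` are standard JDK APIs.

```java
import java.util.UUID;

// Hypothetical helper, not the library's actual code.
final class EidMapping {

    // Maps a 4-byte database ID to the eid string as described above.
    static String toEid(Integer dbId) {
        // Widening int -> long sign-extends: the upper 4 bytes become
        // all 0 bits for non-negative IDs and all 1 bits for negative IDs.
        long widened = dbId.longValue();
        // First 8 bytes of the UUID are 0, last 8 bytes are the widened ID.
        return new UUID(0L, widened).toString();
    }

    public static void main(String[] args) {
        System.out.println(toEid(1));                 // 00000000-0000-0000-0000-000000000001
        System.out.println(toEid(Integer.MAX_VALUE)); // 00000000-0000-0000-0000-00007fffffff
    }
}
```

The second line printed is the last eid that can be produced before the 4-byte counter is exhausted.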

The positive range of the 4-byte types is from 1 to 2147483647.

After this number of events has been produced, the database sequence will refuse to produce more values, which means event production is broken.

A manual reset to negative values is possible, which will then result in the eid continuing from 00000000-0000-0000-ffff-ffff80000000 (counting up). When 0 is reached again (i.e. after reaching 00000000-0000-0000-ffff-ffffffffffff), it will start reusing the originally used eid values starting with 00000000-0000-0000-0000-000000000000, which also means that the sequence is then out of order.
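
To illustrate the wraparound, here is a small sketch that checks how the eids compare before and after a reset to negative values. It assumes that a consumer orders eids by plain string comparison, which is an assumption for illustration only; `EidWraparound`, `toEid` and `sortsBefore` are hypothetical names.

```java
import java.util.UUID;

// Hypothetical demonstration, not the library's actual code.
final class EidWraparound {

    // Same mapping as described above: 8 zero bytes, then the sign-extended ID.
    static String toEid(int dbId) {
        return new UUID(0L, (long) dbId).toString();
    }

    // True if the eid of a sorts strictly before the eid of b as a string.
    static boolean sortsBefore(int a, int b) {
        return toEid(a).compareTo(toEid(b)) < 0;
    }

    public static void main(String[] args) {
        // First eid after a reset to the most negative value.
        System.out.println(toEid(Integer.MIN_VALUE)); // 00000000-0000-0000-ffff-ffff80000000
        // Last eid before the counter reaches 0 again.
        System.out.println(toEid(-1));                // 00000000-0000-0000-ffff-ffffffffffff
        // The old positive range still sorts before the negative range ...
        System.out.println(sortsBefore(Integer.MAX_VALUE, Integer.MIN_VALUE)); // true
        // ... but once 0 is reached, new eids sort before recent ones.
        System.out.println(sortsBefore(-1, 0));       // false
    }
}
```

Under that assumption, the reset only buys time: order is preserved across the negative range and breaks as soon as the counter crosses 0 and starts repeating old values.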

Goal

We need to come up with a way of preventing this from happening.

Possible ideas:

  • Use a larger counter type like BIGSERIAL (8 bytes), and adjust the Java code to use Long to take advantage of it.
  • Introduce some (manual) way of resetting the counter, but use a configured prefix for the eid to keep the relative order.
  • ...
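
A rough sketch of the prefix idea, under stated assumptions: the prefix is a configured number (called `resetCount` here, a hypothetical name) that is incremented manually each time the counter is reset to 1, and it is placed in the first 8 bytes of the UUID, which are currently always 0. This is one possible reading of the idea, not a decided design.

```java
import java.util.UUID;

// Hypothetical sketch of the "configured prefix" idea.
final class PrefixedEid {

    // resetCount: assumed configuration value, 0 before the first reset.
    // dbId: the (positive) database counter value.
    static String toEid(long resetCount, int dbId) {
        // With resetCount == 0 this yields exactly the eids produced today.
        return new UUID(resetCount, (long) dbId).toString();
    }

    public static void main(String[] args) {
        String lastBeforeReset = toEid(0L, Integer.MAX_VALUE);
        String firstAfterReset = toEid(1L, 1);
        System.out.println(lastBeforeReset); // 00000000-0000-0000-0000-00007fffffff
        System.out.println(firstAfterReset); // 00000000-0000-0001-0000-000000000001
        // As strings, eids after the reset sort after all earlier ones.
        System.out.println(lastBeforeReset.compareTo(firstAfterReset) < 0); // true
    }
}
```

Because a prefix of 0 reproduces the current format, already published eids would stay valid; whether consumers rely on string ordering of the eid is not something this sketch can answer.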

ePaul commented Jun 5, 2024

So now it has happened for the first time in an actual application (internal link).

It was decided to reset the counter to a negative value, to give a bit more time. Fortunately in this case it seems like nobody was using the eid for ordering purposes.

But this gives the issue a bit more urgency.
