Update lib/events/dynamoevents to use aws-sdk-go-v2 #44363
Merged
rosstimothy merged 1 commit into master on Jul 23, 2024
This continues the conversion of DynamoDB components to the latest version of the SDK that was started in #44356. It should have feature parity with the existing backend except for Prometheus metrics. To keep the changes here isolated, the metrics are omitted for the time being and will be added in a follow-up. In addition, a few of the events test suite cases were updated to be more reliable when testing against a real backend.
Friendly ping @smallinsky @lxea
smallinsky approved these changes on Jul 23, 2024
zmb3 approved these changes on Jul 23, 2024
hugoShaka added a commit that referenced this pull request on Sep 5, 2025

This commit fixes a data loss bug that caused the DynamoDB cursor's EventIndex field to be slightly altered by a lossy type conversion. Because this field is used to index events, paginated queries could return the wrong events, from either before or after the requested page. In the worst case, this could cause a livelock as the query continuously processes the same events.

The data loss is caused by improper JSON unmarshalling of large integers. Three factors combined to produce it:

- JSON has a single number type for both integers and floating-point values, and many implementations treat it as an IEEE 754 binary64. Go's encoding/json library uses the destination field type to decide whether a number is stored in an int64 or a float64.
- [The AWS SDK v2 migration PR](#44363) changed the cursor JSON unmarshalling logic to unmarshal the cursor into `map[string]any`. This caused every integer field of `event` to round-trip through float64.
- [The Emit event fallback PR](#40854) changed the EventIndex value from a small incremental integer to a large Unix nanosecond timestamp in case of conflict. The large value was no longer safe to store in a float64, which represents every integer exactly only up to 2^53.

Together, these three factors corrupted the cursor EventIndex and caused unexpected event query index offsets. When presented with a non-existent document, DynamoDB still hashes it and starts the query from its supposed location in the index, which is why the issue went undetected for so long. Its consequences were:

- duplicated events returned on two consecutive pages (handled properly by the event forwarder, which keeps track of the last processed event)
- a livelock if the number of duplicated events exceeds the page size
- non-forwarded events if the index offset was in the future