Pagination on event table with dynamo DB doesn't keep the sort order. #128

Open
bbouchard31 opened this issue May 4, 2018 · 5 comments

Comments

@bbouchard31

Hi,

I'm trying to paginate over the events table by passing a limit and a skip value to the getEvents method.
I'm using DynamoDB, and my issue is that the events come back in an effectively random order because of the DynamoDB scan.
As a result, I have no way to rebuild my state without taking snapshots.
Is there a way to preserve the order when paginating over all the events of a specific aggregate?

@adrai
Contributor

adrai commented May 4, 2018

@developmentalmadness @chriscosgriff can you confirm/help?

@chriscosgriff
Contributor

chriscosgriff commented May 5, 2018

@bbouchard31 Can you confirm whether the results for an individual aggregate stream/instance are arriving in a random order for you?

Although I haven't found AWS documentation explicitly stating this, my experience has been that the DynamoDB scan operation returns the items for an individual partition key (aggregateID in the case of node-eventstore) in the order in which they were originally stored. However, they won't necessarily all arrive in the same batch, because results are gathered from multiple partitions during the scan. If you or @developmentalmadness have a different understanding, please let me know.

In terms of the getEvents method specifically, in our project we wanted to utilise ES6 generators, so we created our own getEvents function that utilises scan and is used for rebuilding projections. So far, after 6 months of production usage, we have not had any known errors in terms of events for an individual aggregate stream arriving out of sequence during a rebuild (which we explicitly check for). However, we only have approx. 50,000 events, so we may not be at the scale required to experience issues with the scan operation.
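
For readers following along, a generator-based scan with an explicit sequence check might look roughly like the sketch below. This is not the code from our project or anything shipped with node-eventstore: `scanPage` is a hypothetical function standing in for one paged scan request, `scanEvents` and `assertInOrder` are illustrative names, and the `aggregateId` / `streamRevision` field names are assumed.

```javascript
// Hypothetical sketch: `scanPage(cursor)` stands in for one paged scan request
// and is assumed to resolve to { items, nextCursor }, with nextCursor
// undefined on the last page. It is not a node-eventstore or AWS SDK call.
async function* scanEvents(scanPage) {
  let cursor; // undefined on the first request
  do {
    const { items, nextCursor } = await scanPage(cursor);
    for (const item of items) {
      yield item;
    }
    cursor = nextCursor;
  } while (cursor !== undefined);
}

// Wraps any async iterable of events and throws as soon as an event's
// revision is not strictly greater than the last one seen for its aggregate.
// Assumes each event has `aggregateId` and a numeric `streamRevision`.
async function* assertInOrder(events) {
  const lastRevision = new Map(); // aggregateId -> last streamRevision seen
  for await (const event of events) {
    const previous = lastRevision.get(event.aggregateId);
    if (previous !== undefined && event.streamRevision <= previous) {
      throw new Error(
        `Out-of-order event for aggregate ${event.aggregateId}: ` +
          `revision ${event.streamRevision} after ${previous}`
      );
    }
    lastRevision.set(event.aggregateId, event.streamRevision);
    yield event;
  }
}
```

A rebuild would then consume `assertInOrder(scanEvents(scanPage))` with `for await`, so an out-of-sequence event stops the rebuild instead of silently corrupting a projection.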

@bbouchard31
Author

@chriscosgriff Indeed, it works for an individual aggregateID because a query is performed rather than a scan. After switching to MongoDB, everything works fine. But our usage may not be optimal, as we need to rebuild the state of all our aggregates without knowing the individual aggregateIDs. Thanks for your quick answers.

@chriscosgriff
Contributor

Sorry @bbouchard31, after rereading the question I asked you above, I realise it was not clear enough. I meant when performing a scan (not a query), as in the way we rebuild our projections that I described in my final paragraph. For me, when performing a scan, the events for individual aggregates arrive in order, but potentially not all of the events come in the same page/batch, and they may also be interspersed with events for other aggregates. In practice this shouldn't be a problem, though: if you're rebuilding the projection state for all of the aggregates, you just keep processing until you have worked through all the records in the table, after which all aggregates will be up to date.
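
The "keep processing until you have worked through all the records" idea can be sketched as a fold that keeps one state per aggregate, so interleaved events from different aggregates do not disturb each other. This is a minimal illustration only: `rebuildAll`, `apply` and `initialState` are hypothetical names, the `aggregateId` field is assumed, and the result is only correct if each aggregate's own events arrive in order.

```javascript
// Folds a mixed stream of events into one state per aggregate.
// `apply(state, event)` returns the next state for that event's aggregate;
// `initialState()` creates the starting state for a newly seen aggregate.
// Both are supplied by the caller (hypothetical names for this sketch).
function rebuildAll(events, apply, initialState) {
  const states = new Map(); // aggregateId -> current projection state
  for (const event of events) {
    const current = states.has(event.aggregateId)
      ? states.get(event.aggregateId)
      : initialState();
    states.set(event.aggregateId, apply(current, event));
  }
  return states;
}
```

Events may be fed in page by page and interspersed across aggregates; once the last page has been processed, every entry in the returned map reflects all events seen for that aggregate.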

@bbouchard31
Author

bbouchard31 commented May 7, 2018

@chriscosgriff On our side, the scan did return some events for the same aggregateIds in the wrong order. Again, that may be a configuration issue on our side.
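
If a scan cannot be relied on for per-aggregate ordering, one possible workaround is to buffer the scanned events, group them by aggregate, and sort each group by revision before applying them. The sketch below is only an illustration under assumptions: `orderByAggregate` is a hypothetical helper, the `aggregateId` and numeric `streamRevision` field names are assumed, and it requires all events to fit in memory.

```javascript
// Groups events by aggregate and sorts each group by ascending revision.
// Assumes each event carries `aggregateId` and a numeric `streamRevision`.
function orderByAggregate(events) {
  const byAggregate = new Map(); // aggregateId -> events of that aggregate
  for (const event of events) {
    if (!byAggregate.has(event.aggregateId)) {
      byAggregate.set(event.aggregateId, []);
    }
    byAggregate.get(event.aggregateId).push(event);
  }
  for (const stream of byAggregate.values()) {
    stream.sort((a, b) => a.streamRevision - b.streamRevision);
  }
  return byAggregate;
}
```

Each aggregate's state can then be rebuilt from its own sorted list, at the cost of holding the whole table in memory during the rebuild.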

3 participants