Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficiencies and correctness of sqlite db reads and writes #40

Closed
ellisonbg opened this issue Nov 15, 2022 · 2 comments · Fixed by #42
Closed

Inefficiencies and correctness of sqlite db reads and writes #40

ellisonbg opened this issue Nov 15, 2022 · 2 comments · Fixed by #42

Comments

@ellisonbg
Copy link

I believe our current approach to reading and writing to/from the sqlite database in the SQLiteYStore is inefficient and could lead to incorrect reads that cause difficult to debug bugs.

The approach we are using currently causes 2 db writes for each update, the first one is (roomid, update) and the second is (roomid, metadata). See https://github.com/y-crdt/ypy-websocket/blob/main/ypy_websocket/ystore.py#L153 for details. These two writes are done asynchronously. In the read phase, we query all rows for the path/room and then read the alternating rows as updates and metadata.

There are a couple of issues with this:

  1. By storing the update and metadata as separate rows in the database (rather than as separate columns) we are doubling the number or rows, doubling the number of writes, and adding extra processing logic in python to separate out the update and metadata.
  2. Because the writes of the path and update are async, it is possible that other read/writes can lead to the interleaving of the alternating update/metadata writes being broken. Thus, we could end up with (update1, update2, metadata1, metadata2). When this is read, it would end up erroneously treating update2 as metadata1 and metadata1 as update2.
  3. We are relying on an assumption that the insert order will be preserved in the read order, which is not the case for sqlite (see https://stackoverflow.com/questions/5674385/sqlite-insertion-order-vs-query-order). If this assumption is broken, we would again have mixed updates and metadata.

To fix this, I propose the following:

  1. Write/read each update as a single row with (roomid, update, metadata) columns.
  2. If we need to preserve insert order upon read, use ORDERBY ROWID. If we write each update as a single row though, this may not be needed though.
@davidbrochart
Copy link
Collaborator

davidbrochart commented Nov 15, 2022

Thanks for the detailed issue @ellisonbg, I will open a PR.
Also, I think we should use locks when writing to the file-based YStores.

@ellisonbg
Copy link
Author

I think the file based stores are probably more robust than the sqlite one, but agree that using locks to write would be a good idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants