
Concurrency control in multi-user environments #1069

Open
SebAlbert opened this issue Feb 19, 2018 · 7 comments

@SebAlbert

Dear PostgREST community

Is there an "idiomatic" way (meaning PostgREST principles) to handle concurrent updates in a multi-user environment? I.e., with PostgREST as "backend", having a frontend (say webapp) for users to retrieve, edit, and then store changes to entities (rows in tables). How would you handle two users attempting to edit the very same entity (row) concurrently, while maintaining integrity?

There are some suggestions over here: https://blog.4psa.com/rest-best-practices-managing-concurrent-updates/ - how would this translate to PostgREST?

How do you handle this in real world applications?

Sebastian

@begriffs
Member

Concurrency issues are handled by the database. I wrote an article about the different kinds of concurrency problems and how PostgreSQL handles them: https://begriffs.com/posts/2017-08-01-practical-guide-sql-isolation.html

See which of those issues might apply in your situation. PostgREST uses the "Read Committed" level of transaction isolation. Your question makes me think maybe this level should be configurable from the PostgREST server configuration file.
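
For reference, a configurable level would presumably just change what PostgREST issues at the start of each transaction; in PostgreSQL that is a one-liner:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... statements generated from the HTTP request ...
COMMIT;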

Another thing is that we don't yet have a way to specify that multiple client requests happen in a single transaction. Some ideas for the interface are proposed in #286. Currently, to execute several queries in one transaction, you must create a stored procedure and have the client execute it via /rpc/function_name.
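
A minimal sketch of that workaround, with a hypothetical transfer function (the two UPDATEs succeed or fail together because a function call runs in a single transaction):

-- hypothetical example; the client would call it as POST /rpc/transfer_credits
CREATE FUNCTION transfer_credits(from_id int, to_id int, amount int)
RETURNS void AS $$
  UPDATE accounts SET credits = credits - amount WHERE id = from_id;
  UPDATE accounts SET credits = credits + amount WHERE id = to_id;
$$ LANGUAGE sql;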

@begriffs
Member

One more note: if we raise the isolation level we can run into the serialization_failure exception. The client would need to retry the request if it received that exception, or else we'd need to add logic inside PostgREST to retry, perhaps with exponential backoff.
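
To illustrate the failure mode, here is the classic write-skew case that SERIALIZABLE detects (a sketch with a hypothetical accounts table):

-- Session A:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;

-- Session B, concurrently:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- Session A again:
UPDATE accounts SET balance = balance - 100 WHERE id = 2;
COMMIT;
-- Session A now fails with:
-- ERROR: could not serialize access due to read/write dependencies among transactions
-- (serialization_failure, SQLSTATE 40001), and the request has to be retried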

@jackfirth

jackfirth commented Feb 19, 2018

@begriffs HTTP ETags and conditional requests would be a good way to expose retryable concurrent-update failures to clients. See the ETag and If-Unmodified-Since headers along with the 412 Precondition Failed status code. Also, PostgREST can include a Retry-After header in 412 responses so clients don't have to implement the logic for calculating exponential backoff themselves; they can just wait however long PostgREST tells them to.
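
A sketch of that exchange, with made-up header values:

PATCH /items?id=eq.42
If-Unmodified-Since: Mon, 19 Feb 2018 10:00:00 GMT

412 Precondition Failed
Retry-After: 3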

@SebAlbert
Author

My concern really wasn't about database transactions (although it would be neat if we could control them for PostgREST, too!). It is usually strongly discouraged anyway to hold a database connection with a transaction open for the whole time a user (a human being in front of the computer :-)) takes to edit the fields of some entity (and who may go away for any amount of time, maybe even forever).

@jackfirth I don't think exponential backoff (i.e., just retrying the very same request later) applies to the case where an update is rejected because the entity has been updated by somebody else in the meantime (which is the meaning of "If-Unmodified-Since"). Something that is NOT unmodified since time X will never again be unmodified since that same time X, I think.

@mordae

mordae commented Mar 12, 2018

Raising a serialization_failure exception is a good way to signal to the client that their update failed and that they need to reload the data in order to retry. I would advise against automatic retries; not all transactions are idempotent.

+1 for making the isolation level configurable.

@steve-chavez
Member

Once #1176 is done, this could be solved with an If-Match: <etag> header. For an example, see https://www.hl7.org/fhir/http.html#concurrency.
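
Following the FHIR example, the flow could look like this (the ETag value is illustrative):

GET /tbl?id=eq.1

200 OK
ETag: "abc123"

PATCH /tbl?id=eq.1
If-Match: "abc123"

200 OK

If another client modified the row between the GET and the PATCH, the ETag no longer matches and the server would answer 412 Precondition Failed, so the client re-fetches and edits the fresh version.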

@steve-chavez
Member

steve-chavez commented Mar 23, 2023

So I've been checking the WebDAV RFC and it has the LOCK/UNLOCK HTTP methods, which could solve this use case. These can be mapped to PostgreSQL advisory locks, which can be used for distributed locking (this is also related to this idea on #286). No need to leave a transaction open in this case.

So basically, we could do:

LOCK /tbl?id=eq.1

200 OK
Lock-Token: <token>

The user can then store the Lock-Token locally and later use it for:

UNLOCK /tbl?id=eq.1
Lock-Token: <token>

200 OK

This Lock-Token would get mapped to a session-level advisory lock (see the pg docs).
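
As a sketch, the mapping could use PostgreSQL's two-argument advisory lock functions; deriving the key from the table name and primary key is an assumption:

-- LOCK /tbl?id=eq.1 -> try to take a session-level advisory lock
SELECT pg_try_advisory_lock(hashtext('tbl'), 1);  -- true => 200 OK, false => 423 Locked

-- UNLOCK /tbl?id=eq.1 (with a valid Lock-Token) -> release it
SELECT pg_advisory_unlock(hashtext('tbl'), 1);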

Whenever other users try to PATCH/DELETE the same resource, they'd get an error:

PATCH /tbl?id=eq.1

423 Locked

Seems this should work in theory.


We'd have to somehow restrict PATCH/DELETE to only be based on PK equality, similar to how PUT is restricted right now. Otherwise we wouldn't be able to prevent a PATCH /tbl?id=lt.1000&id=gt.500 on the resource.


Since releasing session-level advisory locks is up to the application, we'd need a way to do this. We would definitely need a timeout. One lazy idea would be to leave it to the pool idle timeout, but it's not guaranteed that a connection would be idle. A connection lifetime (nikita-volkov/hasql-pool#28) could also be too long.

So we would need to maintain some state in PostgREST, namely the advisory lock ids, and then use another thread that periodically frees them, as sketched below.
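
On the database side the periodic cleanup itself is simple; the hard part is the bookkeeping, since session-level advisory locks can only be released from the connection that acquired them:

-- release one expired lock (must run on the connection that took it)
SELECT pg_advisory_unlock(hashtext('tbl'), 1);

-- or drop every session-level advisory lock held by the current connection
SELECT pg_advisory_unlock_all();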
