sql: support rolling back nested transactions containing schema changes #10735
It's not just a followup txn at the same timestamp: when you ROLLBACK TO SAVEPOINT, everything before the SAVEPOINT is still visible, but everything after the SAVEPOINT has been rolled back. That's not something we can feasibly do. We could easily fake SAVEPOINT in a way that technically has the correct semantics, although it wouldn't actually make savepoints useful: whenever ROLLBACK TO SAVEPOINT is executed, return a TransactionRetryError to restart the whole transaction from zero (and SAVEPOINT and RELEASE SAVEPOINT commands become no-ops). This would make us compatible with drivers/frameworks that unconditionally use savepoints.
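For reference, a minimal SQL illustration of the ROLLBACK TO SAVEPOINT semantics described above (table and values are hypothetical):

```sql
BEGIN;
INSERT INTO accts VALUES (1);  -- before the savepoint: stays visible
SAVEPOINT sp;
INSERT INTO accts VALUES (2);  -- after the savepoint: undone below
ROLLBACK TO SAVEPOINT sp;
SELECT * FROM accts;           -- sees row 1 but not row 2
COMMIT;                        -- commits only row 1
```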
Thanks Ben. Indeed, what you describe is the basic behavior for cases where a client just wants the syntax to work. We discussed this with Andrei yesterday and I started to understand what's needed. Conceptually, what we need here is "sub-transactions" in KV, which are able to see both their own intents and those of their parent(s). A lot of plumbing will be needed, but I believe there's not much more to it than that (conceptually, that is).
Any chance of seeing some progress on this front? I am surprised that Ecto/Elixir is the only ORM leveraging savepoints to assist with testing. Thanks a lot in advance!
Hi Christian, thanks for the heads up. I do not know how far savepoints are exploited by different frameworks, but as a data point, we worked on compatibility with various other ORMs last quarter and savepoints didn't come up as a requirement yet. Nevertheless, there is a good chance of progress on this front eventually, since we can already foresee a technical approach that amounts to an incremental change. However, it's not on our short-term roadmap for now; I predict our focus will be on stability and performance for the first half of 2017. As usual, our roadmap can be influenced; feel free to contact @dianasaur323 for details.
Thanks for the feedback @knz, much appreciated. It's true that Ecto is doing something pretty fancy/advanced in its use of savepoints for tests, so it's no wonder other ORMs don't leverage them much... I will investigate a workaround on the Ecto side for the time being.
Could you maintain some kind of per-transaction log in which savepoints act as markers? When a rollback to a given savepoint is requested, you would roll back the txn and then start a follow-up txn at the same timestamp, replaying the log until the marker is reached.
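A sketch of what that replay could look like on the wire, assuming a hypothetical client-side statement log:

```sql
-- Statement log so far: BEGIN; INSERT #1; SAVEPOINT sp (marker); INSERT #2.
-- On ROLLBACK TO SAVEPOINT sp, the shim would issue:
ROLLBACK;                  -- throw away the whole txn
BEGIN;                     -- follow-up txn, ideally at the same timestamp
INSERT INTO t VALUES (1);  -- replay the log up to the sp marker
-- ...then hand control back to the client as if positioned at the savepoint
```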
You could, but that seems dangerous.
Isn't that how Postgres achieves some kind of HA/replication, by streaming and replaying its XLOG? At the same time, I am not so clear on @bdarnell's suggestion. Would emitting a TransactionRetryError to the client allow the client driver to transparently restart the txn? Or does it impact the end-user code somehow?
I'm not sure I understand all of the semantics of savepoints. Seems like we get part or all of the way there if creating a savepoint incremented the logical part of the transaction timestamp and we had a mechanism to revert all intents that were >= savepoint_timestamp. We'd also need to maintain multiple intent versions at each key, but that doesn't seem impossible. I'm also not sure of the ramifications of having writes within a transaction that are at different timestamps (though only the logical part would differ). Or perhaps, instead of incrementing the logical part of the transaction timestamp, there is an additional optional savepoint sequence number associated with a transaction.
It better be, since in Cockroach it's always possible for the entire transaction to get restarted at any time. This negates the benefits of savepoints, but it shouldn't break things for client applications/ORMs any more than our optimistic transaction model already does.
Not impossible, but hard (it would have more benefits than savepoints, though).
Yeah. Intents already have epochs; the best way to do this is probably to add a new field to TxnMeta. Savepoints can (theoretically) be nested, which means that determining which savepoints are currently valid can get a bit complicated (unless we make some simplifying restrictions). But I'm not convinced that "real" savepoints are a good fit for our transaction model: since full-transaction restarts are already relatively common, are you really going to be able to get enough value out of partial savepoint restarts to make them worth the complexity?
I was thinking perhaps there is another angle here. Suppose a savepoint also caused a sub-txn record to appear (perhaps this should rather be a new generation of the main txn, but what matters for my argument here is that it is a separate record), and all intents created after that linked to the sub-txn.

I think this buys us partial retries. Conflicts with other txns could then push only that new record, if we can detect that the conflict overlaps only with its intents. This way the client does not need to retry everything.

Of course, a conflict with an earlier intent needs to push both its txn record and all the linked txn records created since then. That's perhaps complicated.

But I think in long-running txns we could reduce the amount of work on both client and server this way, if the client issues savepoints at regular intervals.
That's basically my opinion: savepoints really just don't fit in well. I haven't investigated this thoroughly, but my impression was that savepoints are mostly used because "they're there". For example, they're convenient during tests (see Ecto).

With any amount of complexity added, the behavior @bdarnell suggested earlier (the "easy solution") would still have to be handled by clients, just in fewer cases. Short-running operations (which is hopefully what an ORM does most of the time) don't profit from being broken up into multiple stages of savepoints, so there we don't need them, unless they're doing something really fancy like using a nested savepoint scope to "test" whether they can get away with certain operations that would otherwise bust the parent scope if they failed (if such uses are around, it would be instructive to see them).

Long-running operations which write large amounts of data are intrinsically problematic already (I wonder how Postgres handles them), and the subset of those that can profit from multiple stages might be rather small. Likely more could be gained by reducing the frequency of those transactions in the first place (for example, an import rarely needs to be in a transaction) or making them less likely to restart.

ISTM that there would have to be specific use cases to inform anything more complicated than the "easy way out", and something more complicated could be added later without any changes to public interfaces.
Definitely. I too doubt that ORMs would use savepoints in normal usage. If they do, I'd love to see some concrete examples so we could understand why. And I'd like to understand how they are using savepoints to simplify testing. I can imagine the testing usage doesn't have proper retry loops, which would make the simplified approach a non-starter.
Wouldn't keeping a sequence number for the savepoints be sufficient? That is, if you create savepoints A, B, and C with sequence numbers 1, 2, 3 and then roll back to A, it would be straightforward to determine that only savepoint A is still valid.
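For reference, this matches standard Postgres behavior, where rolling back to a savepoint destroys the savepoints established after it:

```sql
BEGIN;
SAVEPOINT a;              -- sequence number 1
SAVEPOINT b;              -- sequence number 2
SAVEPOINT c;              -- sequence number 3
ROLLBACK TO SAVEPOINT a;  -- a survives; b and c are destroyed
ROLLBACK TO SAVEPOINT b;  -- error: savepoint "b" does not exist
```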
I am going to ask @fishcakez if he can help us understand better why Ecto leverages savepoints for testing, and in which situations it helps beyond beginning and rolling back a transaction for each test. That is something I am not so clear on myself...
The complexity with multiple savepoints is that when you roll back to A and invalidate savepoint sequence numbers 2 and 3, what do you do when your partially-retried transaction advances to savepoint B again? If you give it ID 4, you have to remember now that 1 and 4 are valid but 2 and 3 are not. If you reuse ID 2, you have to erase all trace of the old savepoint (by synchronously resolving all its intents at the time of the rollback).
I think the basic idea is to speed the tests up by running each test in a transaction and rolling it back when the test finishes. This means you don't have to do any work to restore the database to its original state between tests. But you want this to be transparent to the tests (so they can use their own transactions without being aware that they're in a larger transaction), which requires nested transactions (i.e. savepoints).
It is exactly this, and it can allow tests to run concurrently if some care is taken. We start a sandbox that a test case interacts with; it holds a single connection that wraps the test in a transaction and turns the test's own transactions into savepoints. The latter functionality is executed efficiently over the wire using a batch of queries. This means that Ecto always uses savepoints to create nested transactions, and does not have interleaving savepoint logic. Ecto also uses the sandbox for integration testing each SQL database, which means it is not currently possible to run those tests against CockroachDB.
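For concreteness, a minimal SQL sketch of the sandbox pattern described above (table name and values hypothetical):

```sql
BEGIN;                         -- sandbox opens the outer transaction
SAVEPOINT test_txn;            -- the test's own BEGIN becomes a savepoint
INSERT INTO users VALUES (1);  -- work done by the test
RELEASE SAVEPOINT test_txn;    -- the test's COMMIT releases the savepoint
ROLLBACK;                      -- sandbox discards everything after the test
```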
This is just the general problem that ORMs for PG likely don't do proper retries anywhere. No matter what implementation of savepoints we choose, everything will have to expect retries at any stage of the transaction. So it doesn't really seem to inform this issue one way or another. In the example of Ecto, if the test cases were written in a restart-aware manner (i.e. as a closure with a retry handler), they should run just fine even with all savepoints collapsed internally, though with a lot of odd restarts (as Ecto would release savepoints frequently, which would force a restart of the transaction). That is, of course, ignoring issues such as #12123.
Just to follow up on the Ecto use case for testing, with some help from @fishcakez, I was able to implement an Ecto Sandbox for CockroachDB that emulates the implicit savepoints created during tests using a log replay strategy.
Another place where SQL savepoints are used is ActiveRecord. It uses them for making explicitly nested transactions. I'm not sure how common they are in usage, but they are probably more common in tests. (This can be worked around by monkey-patching ActiveRecord to raise an error if the user creates a new nested transaction while using the CockroachDB adapter.) For example, the Rails test suite for ActiveRecord uses them specifically while populating the database with fixtures; they don't seem to be used elsewhere in the tests. This snippet gives a short explanation of how ActiveRecord transactions work and when savepoints will be used.

```ruby
# Assuming there was no transaction previously, a new one will be
# created at the beginning of the block.
ActiveRecord::Base.transaction do
# This implicitly calls transaction(require_new: false). This does not
# make a new transaction.
ActiveRecord::Base.transaction do
end
  # This will attempt to create another transaction using a SAVEPOINT statement.
ActiveRecord::Base.transaction(require_new: true) do
# If an error is encountered here, an automatic rollback will occur.
    # This will cause a ROLLBACK TO SAVEPOINT, and we just exit one level
# of the code block.
raise(Exception)
end
  # Once we exit the enclosing code block, this will send a COMMIT
end
# A similar transaction interface is provided on any user-defined models. For
# example:
class Customer < ActiveRecord::Base
end
# The primary use case of this is when using two models that are stored in
# different databases. In this case, if we nest multiple transactions on these two
# models, we are actually creating two transactions on different databases.
# Again, these don't involve any savepoints until we pass transaction(require_new: true)
Customer.transaction do
ADifferentDatabaseCustomer.transaction do
# ...
end
end
```
This patch adds support for SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>, and ROLLBACK TO SAVEPOINT <foo>. Before this patch, we only had support for the special savepoint cockroach_restart, which had to be placed at the beginning of the transaction and was specifically intended for dealing with transaction retries. This patch implements general support for savepoints, which provide an error recovery mechanism.

The connExecutor now maintains a stack of savepoints. Rolling back to a savepoint uses the recent KV API for ignoring a range of write sequence numbers.

At the SQL level, savepoints differ in two characteristics:

1) Savepoints placed at the beginning of a transaction (i.e. before any KV operations) are marked as "initial". Rolling back to an initial savepoint is always possible. Rolling back to a non-initial savepoint is not possible after the transaction restarts (see below).

2) The savepoint named "cockroach_restart" retains special RELEASE semantics: releasing it (attempts to) commit the underlying KV txn. This continues to allow for discovering deferred serializability errors (i.e. write timestamp pushes by the ts cache). As before, this RELEASE can fail with a retriable error, at which point the client can do ROLLBACK TO SAVEPOINT cockroach_restart (which is guaranteed to work because cockroach_restart needs to be an "initial" savepoint). The transaction continues to maintain all its locks after such an error. This is all in contrast to releasing any other savepoint, which cannot commit the txn and also never fails. See below for more discussion. The cockroach_restart savepoint is only special in its release behavior, not in its rollback behavior.

With the implementation of savepoints, the state machine driving a SQL connection's transactions becomes a lot simpler. There's no longer a distinction between an "Aborted" transaction and one that's in "RestartWait". Rolling back to a savepoint now works the same way across the two states, so RestartWait is gone.

This patch also improves the KV savepoints. They now capture and restore the state of the read spans and the in-flight writes.

Some things don't work (yet):

a) Rolling back to a savepoint created after a schema change will error out. This is because we don't yet snapshot the transaction's schema change state.

b) After a retriable error, you can only roll back to an initial savepoint. Attempting to roll back to a non-initial savepoint generates a retriable error again. If the transaction has been aborted, I think this is the right behavior; no recovery is possible since the transaction has lost its write intents. In the usual case where the transaction has not been aborted, I think we want something better, but it will take more work to get it. I think the behavior we want is the following: after a serializability failure, retrying just part of the transaction should be doable by attempting a ROLLBACK TO SAVEPOINT. This rollback should succeed if all the non-rolled-back reads can be refreshed to the desired timestamp. If they can be refreshed, then the client can simply retry the rolled-back part of the transaction. If they can't, then the ROLLBACK should return a retriable error again, allowing the client to attempt a deeper rollback, and so on until the client rolls back to an initial savepoint (which succeeds by definition).

Implementing this would allow for the following nifty pattern:

```
func fn_n() {
    for {
        SAVEPOINT savepoint_n
        try {
            fn_n+1()
        } catch retriable error {
            err := ROLLBACK TO SAVEPOINT outer
            if err != nil {
                throw err
            }
            continue
        }
        RELEASE SAVEPOINT savepoint_n
        break
    }
}
```

The idea here is that the client is trying to re-do as little work as possible by successively rolling back to earlier and earlier savepoints. This pattern will technically work with the current patch already, except it will not actually help the client in any way, since all the rollbacks will fail until we get to the very first savepoint.

There's an argument to be made for making RELEASE SAVEPOINT check for deferred serializability violations (and perhaps other deferred checks, like deferred constraint validation), although Postgres doesn't do any of these. Anyway, I've left implementing this for a future patch because I want to do some KV work to support it nicely. Currently, the automatic restart behavior that KV transactions have is a pain in the ass, since it works against what we're trying to do. For the time being, non-initial savepoints remember their txn ID and epoch; if those have changed, attempting to roll back to them produces a retriable error automatically.

Fixes cockroachdb#45477
Touches cockroachdb#10735

Release note (sql change): SQL savepoints are now supported. SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>, and ROLLBACK TO SAVEPOINT <foo> now work. `SHOW SAVEPOINT STATUS` can be used to inspect the current stack of active savepoints.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
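For reference, a minimal usage sketch of the statements this patch enables (table name and values hypothetical):

```sql
BEGIN;
INSERT INTO t VALUES (1);
SAVEPOINT foo;
INSERT INTO t VALUES (2);
SHOW SAVEPOINT STATUS;      -- inspect the active savepoint stack
ROLLBACK TO SAVEPOINT foo;  -- undoes the second insert; foo remains active
RELEASE SAVEPOINT foo;
COMMIT;                     -- commits only the first insert
```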
41569: rfcs: SQL savepoints r=knz a=knz
Informs #10735. Fixes #46075.
Release justification: doc-only change.
Latest edits found at: https://github.com/knz/cockroach/blob/20191014-rfc-savepoints/docs/RFCS/20191014_savepoints.md
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@knz if this isn't completely fixed by your recent work, I would suggest closing it and replacing it with a tighter issue explaining the remaining missing bits.
Thanks for the reminder. |
@lucy-zhang I understand that partial DDL rollback is not on the critical path for now (@awoods187 can confirm, perhaps), so this is low priority, but I think it needs to be roadmapped/prioritized for 21.1 or not too long after.
Hi guys! How's this coming along? |
cc @ajwerner for re-triage |
We'll revisit this after we finish adopting the declarative schema changer. I don't expect progress on this in the year 2022. |
CockroachDB 20.1 is introducing support for nested transactions via Postgres' (and the SQL standard's) savepoint protocol: SAVEPOINT, RELEASE SAVEPOINT, ROLLBACK TO SAVEPOINT.
However, as of 20.1, a (nested) transaction containing schema changes (or any type of DDL statement, really) cannot be rewound. This is because the logic needed to invalidate cached schema elements in the transaction has not been implemented yet.
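A minimal illustration of the remaining gap (table name hypothetical, exact error message elided):

```sql
BEGIN;
SAVEPOINT before_ddl;
CREATE TABLE t (x INT);            -- schema change inside the transaction
ROLLBACK TO SAVEPOINT before_ddl;  -- errors out as of 20.1
```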
This gap remains to be closed in order to advertise PostgreSQL compatibility at a level that satisfies most uses in client apps.
Note: the discussion from Nov 2016 through February 2020 was about the initial design of nested transactions in CockroachDB. Discussion from March 2020 onward is about rolling back over DDL statements.
Jira issue: CRDB-6134