-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why does #lock_tables need to be the outermost transaction? #64
Comments
That branch passes, so there isn't any test coverage to help explain it. There also isn't any git history on the locking file to help us :/ https://github.com/envato/double_entry/commits/master/lib/double_entry/locking.rb In theory, if the code inside I think a main point is you want all your locks at the top, and DoubleEntry certainly does when it loads and/or creates the accounts. It does make using DoubleEntry very hard in cases like the above, adding itself as an outer layer to any code that wants to do a financial transaction inside it, adding the dependency of knowing the accounts to lock to the code that just wants to do some database query along side a DoubleEntry transfer. I would prefer if that wasn't the case, as I like the look of the code above. It would be great to know exactly why is is required, so that if we understand the risks can design around them opt to not have DoubleEntry as an outer layer. |
It's been a while, and this is complicated, but it goes something like this: You want to grab all your locks at the start of the transaction, and do them in a consistent order, to minimise deadlocks. Deadlocks will happen though, particularly when we have to create AccountBalance records on the fly. That means we need the capability to retry the transaction. The fun is, when you get a deadlock, MySQL will always roll back the outermost transaction. If double entry didn't own the outermost transaction, that could roll back other things you've done. In the example above, if you wrote anything to the DB between Make any sort of sense? |
Or, put another way: You want reliable locking and deadlock handling? Hard things are hard. |
@rabidcarrot had also mentioned something about concurrency issues if all accounts weren't locked at the outermost level |
Yep. If you don't do all your locking in one place and in a consistent order, you're way more likely to get deadlocks. |
Thanks @notahat, I wonder how much of that is MySQL specific? I'm trying to find some docs on how deadlocks are handled in MySQL and Postgres, but as you put it...hard things are hard :P |
I'm not sure whether Postgres behaves the same way. |
DoubleEntry doesn't prevent deadlocks from within the lock accounts block. You could do DoubleEntry.lock_accounts(*accounts) do
@user.lock
@product.lock
end in one place DoubleEntry.lock_accounts(*accounts) do
@product.lock
@user.lock
end in another, which could cause a deadlock so it's really up to the programmer to avoid this. But the current locking API has really caused issues due to the outer transaction requirement ( the main issue is demonstrated by the lack of ability to write code like in the first post). Maybe there is a way to improve this API?
In the Keith's example above, is that a problem?
If this is the main reason, I don't see any need for the creation of account balance records to have to happen during the transaction. It could be a separate transaction that happened some point earlier, for example by calling |
I don't think there is a way to improve the locking without either a) changing ActiveRecord's behaviour, b) changing MySQL's behaviour, or c) changing the database model used. Keep in mind that this database model is a legacy model that had locking grafted on after the fact. If you were designing to make locking easy, it would likely look pretty different. (If I was designing from scratch, I'd actually design for eventual consistency and not have to muck around with all this locking. That's the way industrial strength financial systems work in most places.)
Keith's example has a "...". It depends on what's in the "...". More to the point, there's no way to programmatically know if something happened in the "..." that might cause a problem, so if you allow stuff to happen, you'll get random nasty failures. The failures will be timing related, so are very unlikely to show up in testing. (See the |
Requiring DoubleEntry.lock_tables to be the outermost transaction can result in less than ideal code. Keith's example would be great to be able to do. I believe it is possible to change the api without a) b) or c), but by doing d) - requiring account balance records to be created before doing a double entry transfer. The creation of the account balance records is the tricky thing that really needs that outer lock, due to the rollback and retry stuff. Looking at the locking code, it is very I believe Keith's example could work provided he simply guaranteed that all required double entry account records already existed. This would require more thought on the part of someone implementing a double entry solution, but it could be as simple as creating a a matching DoubleEntry::AccountBalance every time you create a CreditCard record. That way, when DoubleEntry tries to lock accounts, the lock won't fail due to no record existing. |
The If I was building finance code from scratch today, I'd make it eventually consistent. i.e. Just record that money was moved inside the transaction, and calculate balances after the fact. That's a better model in many ways, but it's not the model of this library, and there's not a way to make this library use that model without fundamentally changing the underlying data structures. |
@notahat I'm not saying do away with account balance records, or per-account locking, I'm saying the really complicated "ensure outermost transaction" locking code is due to the creation of account balance records, not locking them. ( or, trying to lock them when they don't exist ). My hypnosis is: that if, as part of your application, you ensure that all account balance records are created separately before using them ( and you may need to "ensure outermost transaction" for that ), once you have that assurance, you can safely do DoubleEntry transactions within another transaction. |
@eadz: There's definitely a lot of complexity in there to support lazy creation of account balances. And yes, you could simplify things if you assumed all account balances were created up front. I don't think that lazy creation is the only reason for the outermost transaction requirement though. |
I think it is the only reason that really matters. I think this is the essence of this Issue "Why does #lock_tables need to be the outermost transaction?" So we can agree that it's because of the account balance creation, but why else? What is the other reason for being an outermost transaction? ( Other than deadlocks which IMHO having an outermost transaction requirement doesn't solve that problem on its own, although it may help ). And the other question is, whatever the reason, is it so important that it prevents us from having a cleaner API? |
Deadlocks are the problem because of ActiveRecord and MySQL having broken behaviour around them. If you get a deadlock in a nested transaction, ActiveRecord behaves as though just the inner transaction is being rolled back, but MySQL rolls back the outermost transaction. Makes it impossible to retry after a deadlock unless you can be sure you own the outermost transaction. So you could abandon the requirement for being the outermost transaction as long as you abandon the retrying on deadlock. |
You may be able to pass I would suggest removing the automatic retry instead: because of the potential benefits you get to the API, removal of database specific and very complicated transactional / locking code, and the reduction in surprises such as the following: DoubleEntry.lock_accounts(account_a, account_b) do
payment = CreditCard::Charger.charge(@invoice.account.credit_card...)
if payment.success?
DoubleEntry.transfer(Money.new(20_00), :from => account_a, :to => account_b, :code => :purchase)
end
# other code that may cause a deadlock
end It's not clear in the documentation that |
What about a Retrying on deadlock and checking for outmost transaction remain the default, so it's safe by default, but if you know what you're doing, you can take your chances and handle retries higher up in the stack? When the option is on, deadlocks or any other errors just bubble up and fail the outer transaction, which is free to have its own retry mechanism (and even recommended so in the option's documentation) Maybe we could even provide a Having use for a transaction around your financial transfer is not an edge case, I'd argue it's one of the most common and important things you may need to wrap in a transaction with other stuff. It's been a recurring issue for us. Well worth the extra-complexity of an option in the code/interface I reckon. |
I think we definitely need some more options. I think there were some decisions made early on that have meant the API for using double entry is a little more difficult than it could be. Having double entry do some "magic" ( creating account balance records as part of locking ), seems to in my opinion complicate things enough to warrant an alternative. Some of the locking code seems to have some code smells;
def create_missing_account_balances
@accounts_without_balances.each do |account|
# Get the initial balance from the lines table. <<- this comment
balance = account.balance
# Try to create the balance record, but ignore it if someone else has done it in the meantime.
AccountBalance.create_ignoring_duplicates!(account: account, balance: balance)
end
end I don't understand how there can be a balance in the lines table, when a lock is required to create a line and therefore an account balance record. And finally, the DoubleEntry::Locking module seems to break SRP as it does at least two major things
From my reading, a potential way forward would be to make the following change:
This would remove a lot of the checks and edge cases around account balance records not existing. The creation of account balance records can happen in another place, as when you call From here there are a few options:
The creation of DoubleEntry account balance records DOES NOT need to happen with any locks - it can't, there is nothing to lock on, that's why it tries to create the record and if it fails it ignores the error ( duplicate error ). |
Based on these discussions, and also from #168 and #160. Wouldn't it be possible to eliminate locking by performing the following?
|
@kschutt for most cases, you wouldn't want to eliminate locking. I guess in theory you could rely on a line check but I don't believe it will fix any error it finds. It's a sanity checker, and doesn't fix any errors it finds AFAIK. For most cases you would want your balances to be accurate all the time, hence the locking. This thread is about the programmer's API for locking and the requirement to have double entry as the outer most transaction, not about removing locks entirely. |
I've been using advisory locks* recently, and I think could be another tool to potentially simplify the code. Both postgres and mysql ( >5.7 ) supports these. I believe that the following would allow kieth's code to work. Lets say the account is `"sales-#{customer.id}" we can do with_advisory_lock("double-entry-account-sales-2134") do
account = AccountBalance.find_or_create_by_ref("double-entry-account-sales-2134")
# this will never fail due to race conditions (? i think!! ?)
end so that means that # this is just an example of how it works.
# the actual code would map to names, sort the accounts, then run a bunch of nested advisory locks
# based on the names.
with_advisory_lock(account1-ref) do
with_advisory_lock(account2-ref) do
# find or create accounts
# yield to do transaction ( no need to lock the rows ).
end
end Not sure if this solves the outermost transaction issue - I think if it just fails on deadlock rather than retrying then that might be solved. And/Or @keithpitt could potentially use an advisory lock instead of |
Consider the following code:
This seems like pretty reasonable code to me, but unfortunately doesn't work because in the Invoice class, I'm locking the invoice (which creates a transaction), and then the
DoubleEntry#transfer
method fails because it's transaction is not the outer-most one.What's the reasoning behind making it the outer-most transaction? What would the world be like if it "just worked"?
The text was updated successfully, but these errors were encountered: