Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove race condition in postTransactionOld where pending (spent) UTxO could be selected as inputs #2827

Merged
merged 3 commits into from
Aug 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions lib/core/cardano-wallet-core.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -209,6 +209,7 @@ library
Cardano.Wallet.Unsafe
Cardano.Wallet.Version
Cardano.Wallet.Version.TH
Control.Concurrent.Concierge
Crypto.Hash.Utils
Data.Function.Utils
Data.Time.Text
Expand Down Expand Up @@ -299,6 +300,7 @@ test-suite unit
, http-types
, iohk-monitoring
, io-classes
, io-sim
, lattices
, lens
, memory
Expand Down Expand Up @@ -407,6 +409,7 @@ test-suite unit
Cardano.Wallet.RegistrySpec
Cardano.Wallet.TransactionSpec
Cardano.WalletSpec
Control.Concurrent.ConciergeSpec
Data.Function.UtilsSpec
Data.QuantitySpec
Data.Time.TextSpec
Expand Down
21 changes: 20 additions & 1 deletion lib/core/src/Cardano/Wallet/Api.hs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

{- HLINT ignore "Use newtype instead of data" -}

module Cardano.Wallet.Api
( -- * API
Api
Expand Down Expand Up @@ -146,6 +148,8 @@ module Cardano.Wallet.Api
, ApiLayer (..)
, HasWorkerRegistry
, workerRegistry
, WalletLock (..)
, walletLocks
, HasDBFactory
, dbFactory
, tokenMetadataClient
Expand Down Expand Up @@ -247,6 +251,8 @@ import Cardano.Wallet.TokenMetadata
( TokenMetadataClient )
import Cardano.Wallet.Transaction
( TransactionLayer )
import Control.Concurrent.Concierge
( Concierge )
import Control.Tracer
( Tracer, contramap )
import Data.ByteString
Expand Down Expand Up @@ -1073,14 +1079,21 @@ data ApiLayer s (k :: Depth -> Type -> Type)
(TransactionLayer k)
(DBFactory IO s k)
(WorkerRegistry WalletId (DBLayer IO s k))
(Concierge IO WalletLock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to put this lock variable into the WalletLayer type, right?
Then we wouldn't need a Concierge.

As I understand the problem, the lock we need is a spending coin selection lock. While there is a coin selection running, if the resulting TxIns will be marked as spent (pending), then we must prevent other threads from selecting and spending any UTxO from the same wallet.

Unfortunately, as you previously pointed out, much of postTransactionOld should be in the Wallet layer not API server layer. But it should still be possible to put this lock variable in the Wallet layer where it belongs.

Also, a future version of constructTransaction (the new transaction API) will have a query parameter to mark the UTxO which were selected as spent (pending). So we will need exactly the same coin selection locking there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to put this lock variable into the WalletLayer type, right?
Then we wouldn't need a Concierge.

Actually, it's not a single lock for the entire layer, but one lock for each wallet id. 😅 The first approach that comes to mind for managing these locks would be to create each one lock at the beginning of the lifecycle of each wallet id. However, I think that it's less intrusive to use this Concierge thing instead, because it doesn't need to know about the lifecycle.

Unfortunately, as you previously pointed out, much of postTransactionOld should be in the Wallet layer not API server layer. But it should still be possible to put this lock variable in the Wallet layer where it belongs.

My current thinking is that these locks around postTransactionOld are supposed to be a quick fix for an issue that Binance is having, but not more. Imposing an ordering on the POST transactions endpoint seemed like an adequate solution for that, and that is something I would attribute to the ApiLayer, even though I (perhaps paradoxically) agree that postTransactionOld should be part of the wallet layer.

On the Wallet layer, the possibility to do selectAssets and submitTx separately makes it possible to circumvent the lock. Hence, on the wallet layer, I think that postTransactionOld should come without a lock. I think that for this layer, we need a more principled solution, especially because it is possible to do a coin selection and then discard the result without ever submitting to the chain, e.g. as estimateFee does. I would prefer a solution in terms of pure data dependencies: A user can do as many coin selections as they like, but each of these will reference the wallet state from which the selection was made. Submitting one selection to the wallet will render the others invalid, because they do not fit to the new wallet state; this could result in a helpful error message.

TL;DR: My preference would be more pure code (later), less locks. 😅 Hence this "throwaway" solution to locking the POST transactions endpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I appreciate the quick fix approach, because as the name suggests, postTransactionOld is not going to be around for long.

I probably should have said it explicitly - there is one WalletLayer per WalletId, so it seems to me like the natural home for this lock variable. The lock would be needed in the medium term, even after postTransactionOld has gone.

I think we pretty much agree but perhaps don't have a common vocabulary for this thing. I differentiate between "coin selections" and "spending coin selections" - the latter is where the TxIns to be spent must be immediately marked as Used with a Pending transaction. The former don't need a lock, the latter do.

It would be preferable for the wallet to run spending coin selections sequentially, rather than potentially return error messages about conflicts, invalid txins, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should have said it explicitly - there is one WalletLayer per WalletId, so it seems to me like the natural home for this lock variable.

Oh. I did not know that. This fact was not apparent to me, because all the functions in Cardano.Wallet take a WalletId and go through great pains to throw an ErrNoSuchWallet exception if that Id is invalid, hence I assumed that the WalletLayer cannot be tailored to be a specific WalletId, for then there would be no reason to throw this exception. Oh my. Can I vote to put this on the "to renovate" wishlist? 😅

Hm. For the sake of argument, let me try to find a way to get a lock into the WalletLayer. A WalletLayer is created only with hoistResource applied to an ApiLayer:

https://github.com/input-output-hk/cardano-wallet/blob/3f9513b926f0c9ebb7884da386e7419e8da56e27/lib/core/src/Cardano/Wallet/Api.hs#L1090-L1095

This is used in withWorkerCtx which is called in pretty much every a REST API endpoint:

https://github.com/input-output-hk/cardano-wallet/blob/3f9513b926f0c9ebb7884da386e7419e8da56e27/lib/core/src/Cardano/Wallet/Api/Server.hs#L3109-L3140

Oh my. This means that in order to create a lock in the WalletLayer, I would have to create it in the ApiLayer first, and then percolate it to the right wallet Id. In other words, the collection of locks (the equivalent of the Concierge thing) will not go away, instead, now it also has to be propagated to each WalletLayer. And I would have to think about the lifecycle: When do I create the lock, e.g. call newMVar? I suppose in registerWorker, but I'm not entirely sure actually.

https://github.com/input-output-hk/cardano-wallet/blob/3f9513b926f0c9ebb7884da386e7419e8da56e27/lib/core/src/Cardano/Wallet/Api/Server.hs#L3071-L3102

Hm. I think the easiest solution is the following: If you feel that the WalletLayer should at least know about the locks, then I can add a Concierge field to the WalletLayer and propagate a vanilla copy from the ApiLayer. But adding a lifecycle for the lock seems to be more trouble than it's worth at the moment; I would rather work on the DB layer renovation which will allow us to make the code much more pure, which in turn makes most of these problems disappear. 😅

(TokenMetadataClient IO)
deriving (Generic)

-- | Locks that are held by the wallet in order to enforce
-- sequential executation of some API actions.
-- Used with "Control.Concurrent.Concierge".
data WalletLock = PostTransactionOld WalletId
deriving (Eq, Ord, Show)

instance HasWorkerCtx (DBLayer IO s k) (ApiLayer s k) where
type WorkerCtx (ApiLayer s k) = WalletLayer IO s k
type WorkerMsg (ApiLayer s k) = WalletWorkerLog
type WorkerKey (ApiLayer s k) = WalletId
hoistResource db transform (ApiLayer _ tr gp nw tl _ _ _) =
hoistResource db transform (ApiLayer _ tr gp nw tl _ _ _ _) =
WalletLayer (contramap transform tr) gp nw tl db

{-------------------------------------------------------------------------------
Expand Down Expand Up @@ -1115,6 +1128,12 @@ tokenMetadataClient
tokenMetadataClient =
typed @(TokenMetadataClient IO)

walletLocks
:: forall ctx. (HasType (Concierge IO WalletLock) ctx)
=> Lens' ctx (Concierge IO WalletLock)
walletLocks =
typed @(Concierge IO WalletLock)

{-------------------------------------------------------------------------------
Type Families
-------------------------------------------------------------------------------}
Expand Down
17 changes: 15 additions & 2 deletions lib/core/src/Cardano/Wallet/Api/Server.hs
Original file line number Diff line number Diff line change
Expand Up @@ -178,8 +178,10 @@ import Cardano.Wallet.Api
, HasDBFactory
, HasTokenMetadataClient
, HasWorkerRegistry
, WalletLock (..)
, dbFactory
, tokenMetadataClient
, walletLocks
, workerRegistry
)
import Cardano.Wallet.Api.Server.Tls
Expand Down Expand Up @@ -578,6 +580,7 @@ import qualified Cardano.Wallet.Primitive.Types.TokenBundle as TokenBundle
import qualified Cardano.Wallet.Primitive.Types.Tx as W
import qualified Cardano.Wallet.Primitive.Types.UTxO as UTxO
import qualified Cardano.Wallet.Registry as Registry
import qualified Control.Concurrent.Concierge as Concierge
import qualified Data.Aeson as Aeson
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
Expand Down Expand Up @@ -1850,7 +1853,8 @@ postTransactionOld ctx genChange (ApiT wid) body = do
, txTimeToLive = ttl
}

(sel, tx, txMeta, txTime) <- withWorkerCtx ctx wid liftE liftE $ \wrk -> do
(sel, tx, txMeta, txTime) <- withWorkerCtx ctx wid liftE liftE $ \wrk ->
atomicallyWithHandler (ctx ^. walletLocks) (PostTransactionOld wid) $ do
w <- liftHandler $ W.readWalletUTxOIndex @_ @s @k wrk wid
sel <- liftHandler
$ W.selectAssets @_ @s @k wrk w txCtx outs (const Prelude.id)
Expand Down Expand Up @@ -3093,7 +3097,8 @@ newApiLayer tr g0 nw tl df tokenMeta coworker = do
re <- Registry.empty
let trTx = contramap MsgSubmitSealedTx tr
let trW = contramap MsgWalletWorker tr
let ctx = ApiLayer trTx trW g0 nw tl df re tokenMeta
locks <- Concierge.newConcierge
let ctx = ApiLayer trTx trW g0 nw tl df re locks tokenMeta
listDatabases df >>= mapM_ (startWalletWorker ctx coworker)
return ctx

Expand Down Expand Up @@ -3216,6 +3221,14 @@ withWorkerCtx ctx wid onMissing onNotResponding action =
re = ctx ^. workerRegistry @s @k
df = ctx ^. dbFactory @s @k

{-------------------------------------------------------------------------------
Atomic handler operations
-------------------------------------------------------------------------------}
atomicallyWithHandler
:: Ord lock
=> Concierge.Concierge IO lock -> lock -> Handler a -> Handler a
atomicallyWithHandler c l = Handler . Concierge.atomicallyWith c l . runHandler'

{-------------------------------------------------------------------------------
Error Handling
-------------------------------------------------------------------------------}
Expand Down
88 changes: 88 additions & 0 deletions lib/core/src/Control/Concurrent/Concierge.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE RankNTypes #-}

{- HLINT ignore "Use newtype instead of data" -}

-- |
-- Copyright: © 2021 IOHK
-- License: Apache-2.0
--
-- This module provides a utility for ordering concurrent actions
-- via locks.
module Control.Concurrent.Concierge
( Concierge
, newConcierge
, atomicallyWith
, atomicallyWithLifted
)
where

import Prelude

import Control.Monad.Class.MonadFork
( MonadThread, ThreadId, myThreadId )
import Control.Monad.Class.MonadSTM
( MonadSTM
, TVar
, atomically
, modifyTVar
, newTVarIO
, readTVar
, retry
, writeTVar
)
import Control.Monad.Class.MonadThrow
( MonadThrow, bracket )
import Control.Monad.IO.Class
( MonadIO, liftIO )
import Data.Map.Strict
( Map )

import qualified Data.Map.Strict as Map

{-------------------------------------------------------------------------------
Concierge
-------------------------------------------------------------------------------}
-- | At a 'Concierge', you can obtain a lock and
-- enforce sequential execution of concurrent 'IO' actions.
--
-- Back in the old days, hotel concierges used to give out keys.
-- But after the cryptocurrency revolution, they give out locks. :)
-- (The term /lock/ is standard terminology in concurrent programming.)
Comment on lines +49 to +51
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like you're drawing a long bow with this Concierge name. 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂 I have a sweet spot for figurative names. I briefly considered "Locksmith", but that suggested that the locks were being created on the fly instead of being rented out. The drawback of figurative names is that they take a moment of explanation, but the drawback of generic names like "lock collection", "lock factory", "lock registry", … is that it's easy to forget what they do.

data Concierge m lock = Concierge
{ locks :: TVar m (Map lock (ThreadId m))
}

-- | Create a new 'Concierge' that keeps track of locks.
newConcierge :: MonadSTM m => m (Concierge m lock)
newConcierge = Concierge <$> newTVarIO Map.empty

-- | Obtain a lock from a 'Concierge' and run an 'IO' action.
--
-- If the same (equal) lock is already taken at this 'Concierge',
-- the thread will be blocked until the lock becomes available.
--
-- The action may throw a synchronous or asynchronous exception.
-- In both cases, the lock is returned to the concierge.
atomicallyWith
:: (Ord lock, MonadIO m, MonadThrow m)
=> Concierge IO lock -> lock -> m a -> m a
atomicallyWith = atomicallyWithLifted liftIO

-- | More polymorphic version of 'atomicallyWith'.
atomicallyWithLifted
:: (Ord lock, MonadSTM m, MonadThread m, MonadThrow n)
=> (forall b. m b -> n b)
-> Concierge m lock -> lock -> n a -> n a
atomicallyWithLifted lift Concierge{locks} lock action =
bracket acquire (const release) (const action)
where
acquire = lift $ do
tid <- myThreadId
atomically $ do
ls <- readTVar locks
case Map.lookup lock ls of
Just _ -> retry
Nothing -> writeTVar locks $ Map.insert lock tid ls
release = lift $
atomically $ modifyTVar locks $ Map.delete lock
83 changes: 83 additions & 0 deletions lib/core/test/unit/Control/Concurrent/ConciergeSpec.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
{-# LANGUAGE RankNTypes #-}
module Control.Concurrent.ConciergeSpec
( spec
) where

import Prelude

import Control.Concurrent.Concierge
( atomicallyWithLifted, newConcierge )
import Control.Monad.Class.MonadFork
( forkIO )
import Control.Monad.Class.MonadSay
( say )
import Control.Monad.Class.MonadTimer
( threadDelay )
import Control.Monad.IOSim
( IOSim, runSimTrace, selectTraceEventsSay )
import Control.Monad.Trans.Class
( lift )
import Control.Monad.Trans.Except
( ExceptT, catchE, runExceptT, throwE )
import Test.Hspec
( Spec, describe, it, parallel )
import Test.QuickCheck
( Property, (===) )

spec :: Spec
spec = do
parallel $ describe "Control.Concurrent.Concierge" $ do
it "Atomic operations do not interleave"
unit_atomic

it "throwE in ExceptT releases lock"
unit_exceptT_release_lock

{-------------------------------------------------------------------------------
Properties
-------------------------------------------------------------------------------}
-- | Deterministic test for atomicity.
-- We have to compare a program run that interleaves
-- against one that is atomic.
unit_atomic :: Bool
unit_atomic =
("ABAB" == sayings testInterleave) && ("AABB" == sayings testAtomic)
where
sayings :: (forall s. IOSim s a) -> String
sayings x = concat . selectTraceEventsSay $ runSimTrace x

testAtomic = do
concierge <- newConcierge
test $ atomicallyWithLifted id concierge ()
testInterleave = test id

test :: (forall a. IOSim s a -> IOSim s a) -> IOSim s ()
test atomically = do
_ <- forkIO $ atomically (delay 1 >> action "B")
atomically $ action "A"
delay 4

action :: String -> IOSim s ()
action s = say s >> delay 2 >> say s

delay :: Int -> IOSim s ()
delay n = threadDelay (fromIntegral n*0.1)

-- | Check that using 'throwE' in the 'ExceptE' monad releases the lock
unit_exceptT_release_lock :: Property
unit_exceptT_release_lock =
["A"] === selectTraceEventsSay (runSimTrace $ runExceptT test)
where
liftE :: IOSim s a -> ExceptT String (IOSim s) a
liftE = lift

test :: ExceptT String (IOSim s) ()
test = do
concierge <- liftE newConcierge
let atomically = atomicallyWithLifted liftE concierge ()
_ <- tryE $ atomically $ throwE "X"
atomically $ liftE $ say "A"

-- not exported in transformers <= 0.5.6
tryE :: Monad m => ExceptT e m a -> ExceptT e m (Either e a)
tryE action = (Right <$> action) `catchE` (pure . Left)
1 change: 1 addition & 0 deletions nix/.stack.nix/cardano-wallet-core.nix

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.