Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call getAccountBalance for all wallets at once on tip change #2034

Merged
merged 9 commits into from
Aug 24, 2020

Conversation

Anviking
Copy link
Member

@Anviking Anviking commented Aug 12, 2020

Issue Number

#2005

Overview

  • Only send the LSQ query GetFilteredDelegationsAndRewardAccounts when the tip changes
  • Query all wallet reward accounts in the same GetFilteredDelegationsAndRewardAccounts

Listing stake pools with 10 wallets now takes ~5 seconds instead of 2-3 minutes!

  • Make the implementation easier to follow and actually sane

Comments

  • Not completely sure this doesn't mess up some some intricate interactions between the reward balance state and utxo state. The integration tests pass. But maybe there are some rollback scenarios which could lead to the reward balance being incorrectly reported for a while… Would be nice to think through properly and synchronise them.

I think this happens on tip changes:

  1. Wallet worker updates UTxO
  2. Tip worker re-feches balances and writes to TVar
  3. (I think on next tip change) Wallet reads the balance from the TVar and writes it to Sqlite
  4. New reward balance is queryable from API

With this PR, I wonder if it may not risk being out of date for one more tip-change

@Anviking
Copy link
Member Author

bors try

iohk-bors bot added a commit that referenced this pull request Aug 12, 2020
@Anviking Anviking self-assigned this Aug 12, 2020
@Anviking Anviking changed the base branch from master to anviking/2005/time-lsq-queries August 12, 2020 18:36
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 12, 2020

try

Build failed

@@ -328,6 +349,8 @@ withNetworkLayer tr np addrInfo versionData action = do
connectNodeTipClient
:: HasCallStack
=> RetryHandlers
-> TVar IO (Map W.ChimericAccount (Maybe W.Coin))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using 0 as a default value instead of Nothing. We do already default to 0 when it comes to the reward balance and it is quite plausible / will not cause any particular harm to report a balance as 0 until we fetch it.

Copy link
Member Author

@Anviking Anviking Aug 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to keep "we already default to 0" separate from the core caching logic.

With the slight refactoring, I made new entries get added to a separate toBeObserved :: Set, instead of Nothing values inside the Map, but the "query" still returns Maybe W.Coin like here, requiring an explicit fromMaybe.

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the principle here of fetching them all in one query. We need to make sure however to clean accounts from the list when wallets are deleted. Otherwise, we end up fetching unused account over and over. We need some form "pub / sub" mechanism.

@Anviking Anviking mentioned this pull request Aug 13, 2020
1 task
@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from 22ac654 to d7dc020 Compare August 17, 2020 10:45
@Anviking Anviking changed the base branch from anviking/2005/time-lsq-queries to master August 17, 2020 10:46
@Anviking
Copy link
Member Author

bors r+

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 17, 2020

👎 Rejected by too few approved reviews

@Anviking
Copy link
Member Author

Meant:
bors try

iohk-bors bot added a commit that referenced this pull request Aug 17, 2020
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 17, 2020

try

Build failed

@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from d7dc020 to f66a86e Compare August 17, 2020 11:10
@Anviking
Copy link
Member Author

bors try

iohk-bors bot added a commit that referenced this pull request Aug 17, 2020
[ "Querying the reward account balance for"
, pretty acct
, fmt $ listF accts
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With 10 wallets:

[cardano-wallet.network:Info:15] [2020-08-17 10:37:49.31 UTC] Querying the reward account balance for  [1ed0bdd4, 2a98402c, 5237aa81, 634bea28, 69e2c672, 75765c1e, 84132baa, c662444a, d55deafa, f4ea9095] at 70617265<-[c90c882f-6094377#4570261]

@Anviking Anviking marked this pull request as ready for review August 17, 2020 11:39
@Anviking Anviking changed the title WIP: Call getAccountBalance for all wallets at once on tip change Call getAccountBalance for all wallets at once on tip change Aug 17, 2020
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 17, 2020

try

Build succeeded

@Anviking Anviking requested a review from KtorZ August 17, 2020 13:49
, stopObserving = \k ->
atomically $ do
modifyTVar' toBeObservedVar (Set.delete k)
modifyTVar' cacheVar (Map.delete k)
Copy link
Member Author

@Anviking Anviking Aug 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration tests / STAKE_POOLS_JOIN_01 pass, meaning this should work.

But if this Observer abstraction is something we want to keep, it would be nice to add more extensive unit tests. newRewardBalanceFetcher would probably have to be made more abstract. (Because the record itself provides very little structure and can't be tested)

I also guess there is similarities between this and the WorkerRegistry, that perhaps could be unified in the future.

getAccountBalance is called by each wallet worker, I think on each
restoration step (which is often).

For some reason, sending multiple getAccountBalance queries at the same
time slows it down from e.g. 0.002s to 40s.

The LSQ query even supports querying multiple accounts in a single
query, so let's use it!

This commit makes getAccountBalance merely lookup in a Map in a TVar,
which is updated with a single query on every tip-change.

Somewhat crude implementation so far.

I'm yet not familiar with the code that is calling getAccoutBalance. I
suspect there are layers of indirection that can be removed.
@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from f66a86e to 7d1ff70 Compare August 17, 2020 14:07
@Anviking Anviking added the RESOLVING ISSUE Mark a PR as resolving issues, for auto-generated CHANGELOG label Aug 18, 2020
-> TVar IO (Set W.ChimericAccount)
-> Tip (CardanoBlock sc)
-> IO ()
refresh cacheVar toBeObservedVar tip = do
Copy link
Member

@KtorZ KtorZ Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it surprising that refresh is not part of the Observer "interface". The function clearly has a privileged access to both TVar and make strong assumption about how they are manipulated. So to me, it seems like it belong to the same abstraction and having it as a separate function is a bit confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, Observer is concerned about subscribing to and accessing values. It theoretically be passed around and escape the Network module.

refresh is coupled to the creation of the Observer. It's more like an implementation detail, specific to the Network module.

If say the wallet registry were to have access to an Observer, it should not be allowed to manually call refresh.

, Tip (CardanoBlock sc) -> IO ()
-- Call on tip-change to refresh
)
newRewardBalanceFetcher tr gp queryRewardQ = do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to be able to test this bit of logic in isolation. This is currently hard because the management of the keys / accounts is mixed with the logic for fetching the reward. I'd suggest to replace the ChimericAccount with a Ord k => k and to replace Coin with an abstract a, and pass a function that yield a as argument. So, something like:

data Observer m key value = Observer
   { startObserving :: key -> m ()
   , stopObserving :: key -> m ()
   , query :: key -> m (Maybe value)
   , refresh :: tip -> m tip
   }

newObserver 
   :: Ord key
   => (tip -> [key] -> m (Map key value))
   -> Observer m key value
newObserver getValues = ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way, we can separate the network logic from the observer logic and test them separately (at least, test the observer).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to be able to test this bit of logic in isolation.

Yes 👍 I alluded to it in #2034 (comment), but didn't want to explore it before hearing your feedback.

Before this commit keys would start in `toBeObserved`. They would then
get passed to the `fetch` function. If the supplied `fetch` function
chose to not include the same keys in the resulting map of values, they
would silently dissappear.

This way, `newObserver` takes full responsibility of maintaining the set
of keys which should be observed, by replacing the TVars (cache,
toBeObserved) with (cache, observedKeys)
@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from f4a4d39 to 99ee84a Compare August 20, 2020 09:04
{ startObserving = \k -> do
atomically $ do
modifyTVar' observedKeysVar (Set.insert k)
traceWith tr $ MsgAddedObserver k
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since "startObserving" is called every time getAccountBalance is called, this would log a line MsgAddedObserver in a rather misleading way and much more than it needs. We could use something like Set.alterF and return Just _ or Nothing depending on whether the element was added (and log accordingly).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. It's connected to nullTracer now, so one doesn't notice, but should connect to the real tracer and change this.

describe "Observer" $ do
describe "(query k) with typical use" $ beforeAll mockObserver $ do
-- Using monadic-property tests /just/ for the sake of testing with
-- multiple keys seem worthless.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to test with multiple keys, since this is the main point of this PR actually. Coming up with some properties would help testing possible edge-cases that aren't caught here, although may be a bit tricky here. At least, we should add two additional test scenarios:

  • registering the same key twice
  • registering multiple keys

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should say, I’m much more worried about this causing further balance discrepancies than that it doesn’t handle registering the same key twice, or multiple keys, etc (though sure, good to test).

@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from 72100ff to 0312071 Compare August 20, 2020 14:04
[ MsgAddedObserver k
, MsgWillFetch $ Set.singleton k
, MsgDidFetch $ Map.singleton k v
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this test, unless this test depends on the previous ones which would be quite bad / misleading. Tests should work in isolation.

Copy link
Member Author

@Anviking Anviking Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to experiment with slightly different usage of describe/it for more readable test output.

Here we get:

        when refresh fails
          (query k) returns the existing v
          only MsgWillFetch is traced

which I think is quite neat.

(And in total:)

      typical use
        startObserving
          (query k) returns Nothing before startObserving
          (query k) returns v after (startObserving k >> refresh)
          traced MsgAddedObserver, MsgWillFetch, MsgDidFetch
        calling startObserving a second time
          (query k) is still v
        when refresh fails
          (query k) returns the existing v
          only MsgWillFetch is traced
        stopObserving
          makes (query k) return Nothing

With no shared state, the whole typical use tree would be a single it.

I'll admit it gets a bit weird here though, where the refresh False belongs to the whole when refresh fails tree, but is run in the first it. (We cannot have it as a runIO or beforeAll, because we cannot access the refresh function from outside an it).

We'd need some

mapBeforeAll :: (a -> IO b) -> SpecWith b -> Spec a

which doesn't exist

Copy link
Member Author

@Anviking Anviking Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably add a Arbitrary (Observer, refresh), and do run most of these checks in isolation though 🤔

Copy link
Member

@KtorZ KtorZ Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit that the separation is neat, but it is puzzling. So at the very least, there should be a clear note as comment explaining basically what you just explained here.

@Anviking Anviking force-pushed the anviking/2005/batch-getAccountBalance branch from 5ddfc4c to 03ed329 Compare August 20, 2020 15:14
@Anviking
Copy link
Member Author

bors try

@Anviking Anviking requested a review from KtorZ August 20, 2020 15:20
iohk-bors bot added a commit that referenced this pull request Aug 20, 2020
Observer
{ startObserving = \k -> do
wasAdded <- atomically $ do
notAlreadyThere <- Set.notMember k <$> readTVar observedKeysVar
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems Set.alterF doesn't exist with our version of containers , and I didn't figure out how it worked, but think a separate readTVar inside atomically is perfectly fine.

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 20, 2020

try

Build failed

     Private Key
        +++ OK, passed 100 tests (59% conflicting db entries).
    parallel puts replace values in _any_ order
      Checkpoint
building of '/nix/store/wx550kfdn839bdg2racdn5755ng85fhl-cardano-wallet-core-test-unit-2020.8.3-check' timed out after 900 seconds of silence

@Fell-x27
Copy link

There is a bug. This build doesn't sync reward balance for fresh-restored wallets. Probably it also doesn't get reward from pools incoming after wallet sync.

@Fell-x27
Copy link

Fell-x27 commented Aug 20, 2020

Hmm..looks like the very last build (03ed329) doesn't have this problem. Prev had.

@KtorZ
Copy link
Member

KtorZ commented Aug 21, 2020

Do you mind describing the context, expected behavior and observed behavior?

There are a few things that are counter intuitive at first when dealing with reward balances. And there's also a known "latency" issue we are tracking as part of anothrr ticket.

@Fell-x27
Copy link

Fell-x27 commented Aug 21, 2020

Do you mind describing the context, expected behavior and observed behavior?

There are a few things that are counter intuitive at first when dealing with reward balances. And there's also a known "latency" issue we are tracking as part of anothrr ticket.

I mean, fresh-restored wallets were restored without reward balance. And even pre-restored wallets with non-zero reward balance randomly "lost" rewards. But with the last build problem is gone. And rewards were back. But there is another one issue with fees described in #2005 (comment)

@KtorZ
Copy link
Member

KtorZ commented Aug 24, 2020

bors r+

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 24, 2020

Build succeeded

@iohk-bors iohk-bors bot merged commit b5b292b into master Aug 24, 2020
@iohk-bors iohk-bors bot deleted the anviking/2005/batch-getAccountBalance branch August 24, 2020 08:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RESOLVING ISSUE Mark a PR as resolving issues, for auto-generated CHANGELOG
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants