Mark addresses as used while discovering them. #2033
Conversation
Force-pushed from 8ba5a3f to eed460c
I wholly approve of the change to mark sequential addresses as used once they are discovered, so that this information is easily known when listing addresses.
I also like the idea of streaming responses, but the topic needs further discussion in a separate PR.
```haskell
-- separator), and decoding them as a JSON list.
decodeOne = Aeson.decodeStrict . mconcat
decodeStream chunks = do
    xs <- traverse (Aeson.decodeStrict @Aeson.Value . B8.init) chunks
```
I don't think you can necessarily assume that the chunks are split nicely at the newline.
The OS might interrupt the socket read early, and you will have a line in two pieces, neither of which are well-formed json.
This streaming change is worthy of its own PR, where the details can be sorted out.
I like the idea of streaming in general, because it removes one case of unbounded memory usage.
But on the other hand, if you have 1 million addresses, that would be perhaps 80MB of encoded json, as an indication of size. So the memory cost of not streaming is smallish. Without streaming there is a delay while the server is reading the list from sqlite, but unless the clients also have stream processing, so what?
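The split-chunk concern above could be handled by buffering the trailing partial line and only emitting complete newline-terminated pieces. A minimal sketch (hypothetical; `rechunk` and its type are illustrative, not the PR's actual code):

```haskell
import qualified Data.ByteString.Char8 as B8

-- Re-group raw socket chunks on '\n' so that every emitted piece is a
-- complete line, no matter where the OS interrupted the read.
-- Any unterminated trailing piece is flushed at end of stream.
rechunk :: [B8.ByteString] -> [B8.ByteString]
rechunk = go B8.empty
  where
    go buf []     = [buf | not (B8.null buf)]
    go buf (c:cs) = case reverse (B8.split '\n' (buf <> c)) of
      []              -> go buf cs          -- empty input, nothing to emit
      (rest:complete) -> reverse complete ++ go rest cs
```

Each emitted piece is then a well-formed JSON line that can be fed to `Aeson.decodeStrict` without the `B8.init` trick.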
@rvl actually.. there's one test failing because of this, which shows that sometimes there are chunks that only contain half of an address (it fails on the "import 15,000 addresses and list them" scenario; I haven't noticed this below 100 addresses).
So yeah, this may require some additional thought. I'll split the streaming approach into a separate PR.
Force-pushed from 02b126e to b94f9f8
@rvl I ended up adding a simple migration for this. Reviewing the wallet's overall migration approach can be done as a separate task / debt. I've tested the query against a hand-crafted database with 100k arbitrarily generated address and txout rows (preserving the existing relationships between tables), and the query executes almost instantly.
In the end, migrating an existing database was quite easy given the right SQL incantation. We need to mark any 'known' address as used, where a known address is simply an address that appears in one of the wallet's known txouts. We don't care whether the transaction was outgoing or incoming; just seeing an address there immediately qualifies it as 'used'.
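The migration described here might look roughly like the following (a sketch only; `seq_state_address`, `tx_out`, and the column names are illustrative, not the actual cardano-wallet schema):

```sql
-- Mark every sequential-state address as 'used' if it appears in any
-- known transaction output, incoming or outgoing alike.
UPDATE seq_state_address
   SET status = 'used'
 WHERE address IN (SELECT address FROM tx_out);
```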
Force-pushed from 29ed552 to dc6db76
Migration looks OK
Force-pushed from 1486920 to 3a01713
bors r+
Build succeeded
Issue Number
#2032
Overview
dd3cff5
📍 mark addresses as used as they are discovered
The main issue of the current list address handler is that it loads in memory and crawls the ENTIRE transaction history of the wallet. This is done merely to know whether addresses have been used or not. This is unnecessary for random addresses because used and unused addresses are already separated in two different Maps. For sequential wallets it's a bit more subtle but, we can mark addresses as "Used" during discovery and store this information in the database.
61bbd7b
📍 add two properties for Random & Sequential wallets showing the effect of IsOurs on the address state
eed460c
📍 fix random address import to not mark every imported address as 'Used'
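The per-address marking described in the first commit can be sketched roughly as follows (hypothetical names, assuming a simple Map-based pool; the actual wallet state and `IsOurs` class are more involved):

```haskell
import qualified Data.Map.Strict as Map

data AddressStatus = Unused | Used deriving (Eq, Show)

-- On discovery, decide whether an address is ours and, if so, flip its
-- status to Used right away, so that listing addresses never needs to
-- crawl the wallet's transaction history.
discover :: Ord addr => addr
         -> Map.Map addr AddressStatus
         -> (Bool, Map.Map addr AddressStatus)
discover addr pool = case Map.lookup addr pool of
  Nothing -> (False, pool)                      -- not one of ours
  Just _  -> (True, Map.insert addr Used pool)  -- ours: mark as used
```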
Comments
NOTE 1: This change will require a database migration forcing wallets to re-synchronize from scratch.
NOTE 2: I haven't yet fixed the unit tests, which need some adjustment since some of the interfaces have changed.
NOTE 3: I'll try measuring and comparing the behavior with a stress-wallet that owns many addresses.