-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wallet Id Calculation #156
Conversation
We needed a way to compute wallet ids in a deterministic manner, without taking the risk of exposing the root private key. Since the mnemonic (and therefore, the root key) is the only real common piece of information between two wallets, we got to derive _something_ from it in a deterministic manner. The original idea was to derive a number / seed that would seed a UUID random generator, though, we can also just stop the process at the seed (no need for the extra step of computing a uuid). Ids are therefore just a hex-encoded hash of the public key. MD5 would have been sufficient, but going for Blake2b 160 because... [insert valid reason here]
}, | ||
"id": "00002008-0000-008a-0000-1a7100002145" | ||
"id": "8ed0b58003abc9109c76328fa7e0fd1876c58d20" | ||
} | ||
] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-generated golden tests
k <- choose (0, 10) | ||
return $ PrimaryKey $ WalletId $ UUID.fromWords k 0 0 0 | ||
bytes <- B8.pack . pure <$> elements ['a'..'k'] | ||
return $ PrimaryKey $ WalletId $ hash bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here, we try to keep the generator's entropy quite low because we do want to generate conflicts in the properties.
"samples": [ | ||
{ | ||
"passphrase": { | ||
"last_updated_at": "1864-04-24T03:49:30.165952219949Z" | ||
"last_updated_at": "1864-06-06T17:04:22.128113480808Z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why here year 1864?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is arbitrarily generated 🤷♂️
uuidFromWords :: (Word32, Word32, Word32, Word32) -> UUID.UUID | ||
uuidFromWords (a, b, c, d) = UUID.fromWords a b c d | ||
arbitrary = do | ||
bytes <- BS.pack <$> replicateM 16 arbitrary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 16 here, because address pool gap is 16?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Just arbitrary.
uuidFromWords (a, b, c, d) = UUID.fromWords a b c d | ||
shrink _ = [] | ||
arbitrary = do | ||
bytes <- BS.pack <$> replicateM 16 arbitrary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same question as before
walletIdInjective | ||
:: (NewWallet, NewWallet) | ||
-> Property | ||
walletIdInjective (walletA, walletB) = monadicIO $ liftIO $ do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would persuade me more here if it would be imprinted explicitly that a substantial number of conflicts are tested (maybe adding cond for this is a good idea)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean ? Verifying that the mnemonic are indeed different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, ok - there is :
145 instance Arbitrary NewWallet where
146 -- No shrinking
147 arbitrary = NewWallet
148 <$> arbitrary
149 <*> arbitrary
150 <*> pure (WalletName "My Wallet")
151 <*> arbitrary
152 <*> arbitrary
and we use digest $ publicKey rootXPrv
. I am always thinking when testing injectivity to find as many legal common fields ... and for rootXPrv
we need seed
, secondFactor
, and passprase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I gap
and name
do not affect calculation of rootXPrv
. It is depending on seed
, secondFactor
, and passprase
. But what when only one of this triple
is unique and the rest are the same...but maybe I am here too picky. And it is more about testing properties of generateKeyFromSeed
... Sorry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To some extent, yes. We do not test the injectivity of generateKeyFromSeed
and just take it as "granted" (since it's coming from cardano-crypto
and it's tested there).
One thing we could add however is to check that, mnemonics are actually different (it should be 100% coverage with QC cover
so that we are sure that our generator is good enough).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to sum up the idea I was loudly thinking above which is in one sentence showing in tests that our assumption about MINIMAL stuff needed to determine the uniqueness of walletID is right: I was just trying to say that it is informative for test to convey the message : when we have some param(s) and they are different being n-tuple, then we have unique walletId; when they are the same then the same walletId is expected. Some params does not contribute here (like name, gap) and for the same n-tuple they can be different and the walletId is the same. With that embedded in the test I do not have to look at the implementation to understand what are the rules behind walletID computation. Moreover, if we artificially limit number of possibilities of generation of each field in NewWallet (in contrast to what is used now instance Arbitrary NewWallet
) , then we can easily verify what is here minimal subset of params needed to determine unique walletID. cover
with specially designed cond
would do the job. My next question was also continuing this investigation - are we sure seed
, secondFactor
, and passprase
constitute this n-tuple? (of course we believe in generateKeyFromSeed
here, but it can be confirmed ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -188,5 +209,6 @@ instance Arbitrary AddressPoolGap where | |||
|
|||
instance Arbitrary WalletId where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we share instance Arbitrary WalletId
between specs ? (ie, the same is defined in TypeSpec)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't share instances between tests to help keeping them separate and well-scoped. Every test file is sufficient to itself and doesn't need extra context.
Having shared generator is usually a bad idea because, depending on the properties you're testing, you don't want the generators to be identical (we do actually have, in several places, generators for same data types that are completely different). Even in this PR, the generator for wallet Id in the MVarSpec generates much more conflicting wallet ids than in others. This is on purpose.
Generally speaking, we don't want to over-engineer our test suite; every tests are already complex enough to also share the burden of code architecture and structuring. Ideally, we would use generators as data and not class (like Hedgehog does it), but we are quite acquainted to QC (and the tooling seems also of better quality here..) so, we went for having orphan instances NOT shared between tests modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Issue Number
#145
Overview
Comments