Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address Format in Shelley Era (Part 1: abstract over the address encoding) #313

Merged
merged 4 commits into from
May 27, 2019

Conversation

KtorZ
Copy link
Member

@KtorZ KtorZ commented May 23, 2019

Issue Number

#217

Overview

  • I have abstracted the address encoding to text over a backend target t
  • I have adjusted API types and tests accordingly

Comments

See #217 (comment)

@KtorZ KtorZ requested review from Anviking and akegalj May 23, 2019 15:36
@KtorZ KtorZ self-assigned this May 23, 2019
(decodeBase58 bitcoinAlphabet $ T.encodeUtf8 x)
fromText = bimap textDecodingError Address
. convertFromBase Base16
. T.encodeUtf8
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've kept the FromText and ToText as they are conveniently used to serialize addresses to disk, but went for an agnostic Base16 encoding here. This isn't user-facing, so there's no reason to pick an heavier encoding format like bech32 or base58 for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would even go and put this comment in the code under -- NOTE: ... as other devs might be interested in the encoding choice

. T.encodeUtf8
where
decodingError _ = TextDecodingError
"Unable to decode Address: expected Base16 encoding"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have just used toText and fromText here 🤔 This is also arbitrary format for testing.

@Anviking
Copy link
Member

Anviking commented May 23, 2019

The downside of this approach (the Text) is obvious lost of type-safety

Will this loss really be noticeable? We won't ever use Bech32 and Base58 at the same time?

I'm naturally sceptical of adding ts everywhere if it can be avoided. On the other hand it is not quite everywhere… 🤔

Copy link
Contributor

@akegalj akegalj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding additional constraint parameter does add a bit of clutter but it doesn't look that bad from my perspective. Maybe someone more experienced (like duncan) might disagree .

For two other approaches:

  1. Change the ApiAddress.id to Text
    • I wonder how would FromJSON and ToJSON instances know which implementation to pick in this scenario? Aeson (and/or other serializing classes) would still have to have information about the parameter and which version of decoding is used. Or are you proposing that these classes try to encode/decode all known schemes in some order:
instance FromJSON Address where
  parseJSON =
    decodeOldScheme
    <|> decodeNewScheme
    <|> fail "Can't decode"

in this case Address might be sum type:

data Address = OldScheme Type1 | NewCheme Type2

This looks to me like a way to keep type safety with this sum type - where you still loose the dependency on type parameter t. I might be wrong though

  1. Drop Base58 in core and simply go for Bech32 all-the-way.
    • can you expand on this a bit more? If we go for beach32 all the way, how will current scheme that runs on mainnet atm (base58?) be backward compatible? I must be missuderstanding something - but will someone running cardano from its first release be able to run it if we only support beach32 for address?

(decodeBase58 bitcoinAlphabet $ T.encodeUtf8 x)
fromText = bimap textDecodingError Address
. convertFromBase Base16
. T.encodeUtf8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would even go and put this comment in the code under -- NOTE: ... as other devs might be interested in the encoding choice

toText = T.decodeUtf8 . encodeBase58 bitcoinAlphabet . getAddress
-- | An abstract class to allow encoding of addresses depending on the target
-- backend used.
class EncodeAddress t where
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is probably very stupid question but I am confused which instance of EncodeAddress and DecodeAddress are picked here?

I don't see any instance/implementation of these two classes provided (except DummyTarget in spec)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@KtorZ KtorZ May 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the point. The instance is required when we create a server (see: Api/Server and we happen to make a choice at the call-site in exe/wallet/Main.hs (notice the type application @HttpBridge))

Instances for these various parametric interfaces are defined in the Compatbility modules of each backend option, for instance for http-bridge:

https://github.com/input-output-hk/cardano-wallet/blob/KtorZ/239/address-format-in-shelley-era/lib/http-bridge/src/Cardano/Wallet/Compatibility/HttpBridge.hs

(note that I haven't yet defined the instance for EncodeAddress and DecodeAddress here, but that'd basically be where we pick toBase58 and fromBase58).

We would have similar instances in jormungandr picking toBech32 and fromBech32.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense now. Thanks for detailed walkthrough

Copy link
Contributor

@rvl rvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about universally adding the address encoding as a parameter ...


-- | https://input-output-hk.github.io/cardano-wallet/api/#operation/postTransaction
type CreateTransaction = "wallets"
type CreateTransaction t = "wallets"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this type constraint mean that the wallet API won't be able to make payments to addresses which are of an encoding different to t?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a valid point and the answer is: yes. This means we'll only parse address according to the encoding supported by t (note that, this is already implicitly the case at the moment. If one uses something else than Base58, the API returns an error expecting an address in Base58).

Whether we should allow different encoding for addresses is up to discussion I believe, although there's no clear use-case for it at the moment I believe. That's a very valid question to ask ourselves (and product) however.

@KtorZ
Copy link
Member Author

KtorZ commented May 24, 2019

@akegalj

Or are you proposing that these classes try to encode/decode all known schemes

Well, pretty-much, and we would have to handle the address format in the API handlers and wallet layer directly. I am not sure this is much simpler in the end, as we just defer the parsing of the address to later and have to conserve a bigger textual representation in memory (whereas at the moment, we only keep raw bytes which is slightly more efficient + safer since we do input validation at the API boundaries).

in this case Address might be sum type:

No. We would still have Address ByteString, but the encoding of this bytestring be "free" and only checked when needed :/ Now that I re-think about it, I don't think this option is viable at all.

can you expand on this a bit more? If we go for beach32 all the way, how will current scheme that runs on mainnet atm (base58?) be backward compatible? I must be missuderstanding something - but will someone running cardano from its first release be able to run it if we only support beach32 for address?

The chain is actually completely agnostic to the address encoding. The encoding is just a human view of the address payload. So, we would still have a system that is compatible provided that you have a software that represents addresses using a bech32 encoding. As a matter of fact, let's say that there's an address with a raw payload of "xxx". This can be encoded as A2w87k using a base-58 encoding, or bb7532 using a bech32 encoding (I made up those representations); in the end, both A2w87k and bb7532 represents the same address which appears on the chain as xxx once serialized.

So, the downside here would be that if we intend to use Daedalus or Yoroi to make some tests on Byron, we have to convert their base-58 addresses to bech-32 and vice-versa if we want to compare results.

The reason why I was thinking of keeping this flexible is that the current implementation-decision regarding address was proposed by the Rust team and I believe has also been accepted by the Haskell team. However, it isn't impossible that both team end-up following a different encoding schemes that are not compatible (I know already that the Haskell team has more address types, and they might not be inclined to use bech32 in the end). Who knows. So I prefer staying flexible here and minimizing the risk.

@KtorZ
Copy link
Member Author

KtorZ commented May 24, 2019

@Anviking

Will this loss really be noticeable? We won't ever use Bech32 and Base58 at the same time?

It's not much about using them at the same time. We know already that we can't use two different backend target at the same time anyway (and that's a design choice). So, the question is more like: Will we ever have to release several executables that support different address encoding (bech32, base58, whatever else people will come up with)? And the answer is: most likely. Cf the last part of my previous answer: #313 (comment)

@KtorZ KtorZ force-pushed the KtorZ/239/address-format-in-shelley-era branch from 8bc2f3b to a2c2b32 Compare May 24, 2019 14:45
@akegalj
Copy link
Contributor

akegalj commented May 24, 2019

@KtorZ
Thanks for detailed answer. This PR would make sense if it is clear what to do with what @rvl said #313 (comment) .

It seems to me that, as @rvl noticed, parametarization like this forbids sending transaction to an address with different encoding which is probably what people will be doing with daedalus (if we decide to support this)

@KtorZ
Copy link
Member Author

KtorZ commented May 27, 2019

like this forbids sending transaction to an address with different encoding which is probably what people will be doing with Daedalus

Hmmm. Yes and no. Again, the encoding is just a "view" on the address raw bytes. There's no problem converting from an encoding to another. Also, Daedalus will eventually end up using bech32 only (and the whole mainnet will be using bech32). Still, there will be "old" addresses out there that we'll have to support. And, we can most probably support a variety of encodings using a typed list 🤔 ....

Copy link
Member

@Anviking Anviking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the question is more like: Will we ever have to release several executables that support different address encoding (bech32, base58, whatever else people will come up with)? And the answer is: most likely.

Ok, then great 👍

  • I imagine we could do something with CPP or flags too, but maybe this properness is warranted.
  • Concerned about readability, I would have considered ApiTransaction addrEncoding instead of ApiTransaction t. Also because the t somehow feels less composable. But this is the same pattern as in Wallet s t, so I think i makes sense.

@KtorZ KtorZ force-pushed the KtorZ/239/address-format-in-shelley-era branch from a2c2b32 to f808176 Compare May 27, 2019 15:33
@KtorZ
Copy link
Member Author

KtorZ commented May 27, 2019

@akegalj I've updated the remaining part to use the type parameter I introduced. Also, I've tried to summarize the situation and the rationale behind this decision in a comment on the ticket itself:

#217 (comment)

This means also that, we can rather easily support both legacy byron addresses and new shelley addresses using bech32 at the same time without any change in the core code (apart from what this PR introduces). I'll have a second PR soon with actual instances for Jomungandr to show how it could work.

@akegalj
Copy link
Contributor

akegalj commented May 27, 2019

@KtorZ
Your proposal sounds good. When you said:

So here, depending on which backend we're dealing with, we may end up rejecting addresses that are encoded in bech32, or those with a wrong network magic.

did you have this use case in mind?

  • there are two people, person A and person B
  • person A runs on http-bridge backend and is using base58 encoding
  • person B runs on Jomungandr backend and is using bech32 encoding
  • person A runs generates a new address with its http-bridge backend (this address is encoded with base58) and sends it to person B via email
  • person B receives the email from person B
  • person B tries to send transaction to person A using the address from the email (base58 encoded)
  • Jomungandr backend on person B machine can't decode base58 address that it sees and it throws an error
  • person B and person A end up being confused

Everything else from your writeup makes sense - but if use case described above is possible to happen I don't see how that can be good

@KtorZ
Copy link
Member Author

KtorZ commented May 27, 2019

Jomungandr backend on person B machine can't decode base58 address that it sees and it throws an error

This is not correct. Jormungandr will comprehend both formats. That'll be handled internally by the instance of decodeAddress for Jormungandr, imagine something like:

decodeAddress _ bytes = decodeBech32 bytes <|> decodeBase58 bytes

And in the end, the address payload is just captured as a bunch of bytes and stored in Address ByteString. When encoding, a similar pattern match happens, looking at the raw bytes to determine the type of encoding that is required, and then, encoding the address. This latter part is the non-elegant part which buys us all the simplicity elsewhere; I think it's a fair price to pay...

@KtorZ KtorZ force-pushed the KtorZ/239/address-format-in-shelley-era branch from f808176 to d91bd70 Compare May 27, 2019 16:49
Copy link
Contributor

@akegalj akegalj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Matthias for clearing things up for me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants