Miscellaneous Bech32 library fixes and improvements. #323

Merged
KtorZ merged 7 commits into master from jonathanknowles/bech32-improvements
May 27, 2019

jonathanknowles
Contributor

A collection of small fixes and improvements.

Issue Number

None.

Overview

  • Remove inappropriate Arbitrary instance for Text.
  • Make what constitutes a valid DataPart much more explicit.
  • Make what constitutes a valid HumanReadablePart much more explicit.
  • Use deriving strategies for HumanReadablePart instances.

@jonathanknowles jonathanknowles self-assigned this May 27, 2019
@jonathanknowles jonathanknowles force-pushed the jonathanknowles/bech32-improvements branch 2 times, most recently from d9de3a4 to b480d91 May 27, 2019 07:05
dataCharFromWord = (dataCharFromWordArray Arr.!)

dataCharFromWordArray :: Array Word5 Char
dataCharFromWordArray = Arr.listArray (Word5 0, Word5 31) dataCharList

Member

Why not use a Map Word5 Char here too in the end 🤔 ?

Contributor Author

> Why not use a Map Word5 Char here too in the end

Good question.

The mapping from Word5 to Char covers all values of Word5. So the type we need is actually Word5 -> Char. There should be no valid value of Word5 that doesn't map to some Char.

Using Data.Map.lookup gives us a type of Word5 -> Maybe Char. So when performing a lookup, we'd have to pattern match on the result, and then call error in the case that the lookup fails.

What do you think?
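
To make the trade-off concrete, here is a minimal sketch contrasting the two lookups. The `Word5` newtype, `charArray`, and `charMap` below are hypothetical stand-ins rather than the library's actual definitions; the character list is the standard Bech32 charset. Array indexing is only total as long as no `Word5` outside `[0 .. 31]` can be constructed, which this sketch simply assumes.

```haskell
import Data.Array (Array, Ix, listArray, (!))
import qualified Data.Map.Strict as Map
import Data.Word (Word8)

-- Hypothetical stand-in for the library's Word5 type.
-- Assumption: only values in [0 .. 31] are ever constructed.
newtype Word5 = Word5 Word8
    deriving (Eq, Ord, Ix, Show)

-- The standard Bech32 character set, indexed by five-bit value.
charset :: String
charset = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

-- Array-backed lookup: the result type is a plain Char.
charArray :: Array Word5 Char
charArray = listArray (Word5 0, Word5 31) charset

viaArray :: Word5 -> Char
viaArray = (charArray !)

-- Map-backed lookup: the result is wrapped in Maybe, so every caller
-- must handle a Nothing case that should never occur.
charMap :: Map.Map Word5 Char
charMap = Map.fromList (zip (map Word5 [0 .. 31]) charset)

viaMap :: Word5 -> Maybe Char
viaMap w = Map.lookup w charMap
```

With these definitions, `viaArray (Word5 3)` is `'r'`, whereas `viaMap (Word5 3)` is `Just 'r'` and has to be unwrapped before use.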

Member

Okay, different needs so different data-structures. I didn't quite grasp that at first but it makes sense indeed to use Array here.

@@ -183,7 +223,8 @@ dataPartToWords = mapMaybe charToWord5 . T.unpack . dataPartToText
 -- | Represents the human-readable part of a Bech32 string, as defined here:
 -- https://git.io/fj8FS
 newtype HumanReadablePart = HumanReadablePart Text
-    deriving (Eq, Show)
+    deriving newtype (Eq, Monoid, Semigroup)
+    deriving stock Show

Member

I wonder if there's any difference between deriving newtype Eq and deriving stock Eq, since "no runtime cost" is more or less the promise of newtypes. I am just thinking out loud though 👍

Contributor Author

I'm also curious about this. :)
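
For reference, a small sketch of where the two strategies are observably different. The `HrpStock` and `HrpNewtype` types are hypothetical and wrap `String` instead of `Text` to avoid an extra dependency. It does not settle the runtime-cost question; it only shows which instance each strategy selects.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Hypothetical wrappers around String (the type in this PR wraps Text).

-- Stock deriving: GHC generates the instances structurally, so Show
-- includes the constructor name.
newtype HrpStock = HrpStock String
    deriving stock (Eq, Show)

-- Newtype deriving: the instances of the underlying String are reused,
-- so Show prints only the wrapped string.
newtype HrpNewtype = HrpNewtype String
    deriving newtype (Eq, Show, Semigroup, Monoid)
```

Here `show (HrpStock "ab")` includes the constructor name, while `show (HrpNewtype "ab")` is just the shown string, which is presumably why `Show` is derived with the stock strategy in the diff above. For `Eq`, both strategies compare the wrapped values and so agree on every input.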

instance Arbitrary DataChar where
    arbitrary = DataChar <$> elements Bech32.dataCharList
    shrink (DataChar c) =
        DataChar . Bech32.dataCharFromWord <$> shrink

Member

This shrinker seems a bit "clunky" now that I look at it. We shrink Word5 by shrinking Word8 and calling Bech32.word5 on the results. Internally, word5 does a bitwise AND with 31 to clear the top three bits, which means that in practice a whole range of Word8 values gets shrunk to the same Word5. Therefore, I believe it's quite easy to end up shrinking to the same Word5 by just accidentally picking from the wrong range of Word8.

This would cause QC to loop endlessly trying to shrink a DataChar, whereas there's very little value in shrinking such an element. What about just yielding an empty list?
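
The many-to-one behaviour of the mask can be illustrated with a short sketch. The `mask5` function below is a hypothetical stand-in for the masking step described in this comment, not the library's `word5` itself.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Hypothetical stand-in for the masking step: keep the low five bits
-- and discard the top three.
mask5 :: Word8 -> Word8
mask5 w = w .&. 31

-- All Word8 values that collapse onto the given five-bit value.
preimages :: Word8 -> [Word8]
preimages target = [w | w <- [minBound .. maxBound], mask5 w == target]
```

Every five-bit value has eight preimages in `Word8`; for example, `preimages 3` is `[3, 35, 67, 99, 131, 163, 195, 227]`.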

Contributor Author

> Therefore, I believe it's quite easy to end up shrinking to the same Word5 by just accidentally picking from the wrong range of Word8.

I've added a new commit to fix this. Instead of masking arbitrary Word8 values to generate our values of Word5 (which would give a skewed distribution), I instead use arbitraryBoundedEnum, and also provide Enum and Bounded instances for Word5.

The arbitraryBoundedEnum function yields a uniform distribution over [minBound .. maxBound], which in the case of Word5 is [0 .. 31].

The shrinker should now work, as internally it just calls shrinkIntegral, which will always yield values in the range [0 .. 30]. Such values will always fit within the range of a Word5.
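
A rough sketch of what such instances might look like, assuming a hypothetical `Word5` newtype over `Word8` (the actual definitions in the commit may differ). The `arbitraryBoundedEnum` and `shrinkIntegral` functions are the ones exported by QuickCheck.

```haskell
import Data.Word (Word8)
import Test.QuickCheck (Arbitrary (..), arbitraryBoundedEnum, shrinkIntegral)

-- Hypothetical stand-in for the library's Word5 type.
newtype Word5 = Word5 Word8
    deriving (Eq, Ord, Show)

instance Bounded Word5 where
    minBound = Word5 0
    maxBound = Word5 31

instance Enum Word5 where
    fromEnum (Word5 w) = fromIntegral w
    toEnum n
        | n >= 0 && n <= 31 = Word5 (fromIntegral n)
        | otherwise = error "Word5.toEnum: value out of range"

instance Arbitrary Word5 where
    -- Picks uniformly from [minBound .. maxBound].
    arbitrary = arbitraryBoundedEnum
    -- Shrinks the underlying Word8; candidates are strictly smaller than
    -- the input, so they remain valid five-bit values.
    shrink (Word5 w) = Word5 <$> shrinkIntegral w
```

Because every shrink candidate is strictly smaller than the value being shrunk, shrinking always terminates, and `shrink (Word5 0)` is the empty list.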


    it "dataPartFromBytes" $
        property $ \bytes ->
            dataPartIsValid (dataPartFromBytes bytes) === True

Member

@KtorZ KtorZ May 27, 2019

Note that there's little value in asserting === True; the only counterexample this can print is False.

Contributor Author

> Note that there's little value in asserting === True; the only counterexample this can print is False.

Understood. I've added another commit which improves the display of counterexamples (in all three constructor tests).
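
One common way to get more informative failures is QuickCheck's `counterexample` combinator. The sketch below shows the general idea; the `producesValid` helper is hypothetical and is not necessarily how the commit implements it.

```haskell
import Test.QuickCheck (Property, counterexample)

-- Hypothetical helper: wrap a Boolean validity check so that a failing
-- test prints the offending value rather than just "False".
producesValid :: Show b => (a -> b) -> (b -> Bool) -> a -> Property
producesValid construct isValid input =
    counterexample ("invalid result: " <> show result) (isValid result)
  where
    result = construct input
```

A test in the style above could then be written as `property (producesValid dataPartFromBytes dataPartIsValid)`, assuming the result type has a `Show` instance.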

Simplify the functions that convert between a data character and a word,
removing case-insensitivity (as this is now handled elsewhere).

Add tests to check that all constructors produce valid values.

Reasons that the `Arbitrary` instance for `Text` was inappropriate:

1. It only ever generated `Text` that could successfully be passed to
   the `humanReadablePartFromText` function.

2. The `arbitrary` function was actually unused.

3. The `shrink` function was only used from the `Arbitrary` instance for
   `HumanReadablePart`. We can simply inline that code.
@jonathanknowles jonathanknowles force-pushed the jonathanknowles/bech32-improvements branch from 93c3f6b to 6c8086a May 27, 2019 09:25
Use `arbitraryBoundedEnum` to generate arbitrary values of `Word5`.

This function should give a uniform distribution of values across the
whole range of `Word5`.
@jonathanknowles jonathanknowles requested a review from KtorZ May 27, 2019 12:23
@KtorZ KtorZ merged commit 8f553bc into master May 27, 2019
@iohk-bors iohk-bors bot deleted the jonathanknowles/bech32-improvements branch May 27, 2019 15:51