-
Notifications
You must be signed in to change notification settings - Fork 631
[CO-347] Wallet's UTXO histogram view #3402
[CO-347] Wallet's UTXO histogram view #3402
Conversation
This commit adds stubs to legacy and new-wallet, histogram and bar datatypes are accomodated, one-pass many-aggregates functionality is added
Kernel level implementation of WalletLayer is added, Legacy is not supported. The implementation is used in Handlers of V1 API. Refactoring from Integer to Word64 is done
Added integration test for Wallets and Transactions (here commented until migration to Handler happens). Added ToSchema instance needed to pass swagger integration tests. Decided to use [(Account, Utxo)] type in WalletLayer as it can be useful if the method is extended for account parameters. WalletLayers quickcheck required to add several instances as a consequence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I didn't check the logic of the histogram itself, merely how it has been integrated with the wallet layer and the kernel, which looks good. Would appreciate it though if you changed the style of the implementation of the layer ever so slightly though to bring it in line with the rest (see comment below).
Right accountsIxSet -> | ||
let hdRootId = HD.HdRootId . InDb $ rootAddr | ||
hdAccountId accountIndex = HD.HdAccountId hdRootId (HD.HdAccountIx accountIndex) | ||
in Right ( map (\acc -> (acc, accountUtxo db (hdAccountId $ V1.accIndex acc) ) ) $ IxSet.toList accountsIxSet ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses the style of the wallet layer implementation as it was before I refactored it. Take a look at what it is now, the use of ExceptT can make a bit easier to write, and then you can also use from existing combinators such as fromRootId
.
@@ -187,3 +187,34 @@ transactionSpecs wRef wc = do | |||
etxn <- postTransaction wc payment | |||
|
|||
void $ etxn `shouldPrism` _Left | |||
{-- Uncomment when new Client is Handler based rather than LegacyHandler based | |||
it "posted transactions gives rise to nonempty Utxo histogram" $ do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there's a xit
to declare pending tests. This way, we can still have GHC to verify the code as the codebase evolve, and as soon as it's ready, we can change the xit
to it
and have the test executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -0,0 +1,51 @@ | |||
module Cardano.Wallet.API.V1.Handlers.Internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than Cardano/Wallet/API/V1/Handlers/Internal
, I'd rather have a Cardano/Wallet/Types/UtxoStatistics
module with those functions and the type defined within it, then re-exported from V1/Types.hs
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved types and functions to Cardano/Wallet/Types/UtxoStatistics.hs and reexport them in V1/Types.hs
|
||
eresp <- getUtxoStatistics wc (walId wallet) | ||
utxoStatistics <- fmap wrData eresp `shouldPrism` _Right | ||
let possibleBuckets = fmap show $ ( map (\toPower -> 10^toPower :: Word64) [(1::Word64)..16] ) ++ [45 * (10^(15::Word64))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not re-use the generateBounds
function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
populateBuckets bounds = | ||
case bounds of | ||
(x:_) -> L.Fold (addCountInBuckets x) (initalizeMap bounds) (fmap (\pair -> HistogramBarCount (T.pack $ show $ fst pair, snd pair) ) . MS.toList) | ||
_ -> error "populateBuckets has to be powered with nonempty bounds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, use a NonEmpty Word64
instead of a [Word 64]
:D !
https://hackage.haskell.org/package/base-4.11.1.0/docs/Data-List-NonEmpty.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applied
utxosCoinValuesForAllAccounts pairs = | ||
concatMap (\pair -> map extractValue (M.elems $ snd pair) ) pairs | ||
in do | ||
return $ single (computeUtxoStatistics $ utxosCoinValuesForAllAccounts w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads better with the do
at the upper-most level:
Right w -> do
let extractValue :: TxOutAux -> Word64
extractValue =
getCoin . txOutValue . toaOut
let utxosCoinValuesForAllAccounts :: [(Account, Utxo)] -> [Word64]
utxosCoinValuesForAllAccounts pairs =
concatMap (\pair -> map extractValue (M.elems $ snd pair) ) pairs
return $ single (computeUtxoStatistics $ utxosCoinValuesForAllAccounts w)
Also, I'd rather move these functions inside the computeUtxoStatistics
function. It will also make its signature a bit clearer. Rather than [Word64] -> UtxoStatistics
it becomes NonEmpty (Account, Utxo) -> UtxoStatistics
which in my opinion gives already a much clearer idea of what the function is doing and how to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applied within do
. As underscored above I moved everything to Cardano/Wallet/Types/UtxoStatistics.hs
which is reexported in V1/Types.hs. I retained lets
here as there is Account which means in order to avoid cyclic dependencies one solution would be to move everything related with Account
to separate Cardano/Wallet/Types/Account.hs
and reexport it in V1.Types.hs. I am not sure if we want this
case floatingOrInteger val of | ||
Left (_ :: Double) -> HistogramBarCount ("0", 0 :: Word64) | ||
Right integer -> HistogramBarCount (key, integer) | ||
constructHistogram _ = HistogramBarCount ("0", 0 :: Word64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark as above, if it's not an positive integer, we probably want to fail explicitly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
& type_ .~ SwaggerNumber | ||
& minimum_ .~ Just 0 | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
@@ -94,6 +94,8 @@ bracketPassiveWallet = | |||
, _pwlCreateAddress = pwlCreateAddress | |||
, _pwlGetAddresses = pwlGetAddresses | |||
|
|||
, _pwlGetUtxos = error "Method not implemented for legacy handler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this level I am retrieving [(Account,Utxo)] for a given Wallet. Then ad the level of endpoint handler I will turn it into Utxo statistics. Why, maybe we are to extend endpoint and introduce some Account fields filtering, etc.
deriving Eq | ||
|
||
instance Show GetUtxosError where | ||
show = formatToString build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure here, but do we actually use show
to display our errors :s ?
Aren't we using their Buildable
instances? If not, then I'd open a ticket to change that, and wouldn't define any custom instance for Show
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Show
instances get used in some places (and are required when throwing these as expections). We do the same in a few places. I think the correct solution is to have Show
be auto-derived and then define displayException
in terms of the Buildable
instances, but typically displayException
is not used in top-level exception handlers. Either way, I'd say open a separate ticket about this because this is not just this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edsko Yes, that's exactly what I meant and I am not expecting such refactoring to happen in this PR :/ ... It is rather bad if we indeed rely on show
to display exceptions in the top-level handlers.
I think it would be fine to go directly for formatToString build
in those handlers (since the actual purpose of the Buildable
is to construct a user-friendly readable output) yet I don't mind having that exposed as a clear displayException
👍
genTxIn = oneof | ||
[ TxInUtxo <$> arbitrary <*> arbitrary | ||
, TxInUnknown <$> choose (1, 255) <*> scale (min 150) arbitrary | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have this defined here as a separated function 🤔 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was reused in one of my earlier versions, now only in one place -> fixed
eresp <- getUtxoStatistics wc (walId wallet) | ||
utxoStatistics <- fmap wrData eresp `shouldPrism` _Right | ||
let possibleBuckets = fmap show $ ( map (\toPower -> 10^toPower :: Word64) [(1::Word64)..16] ) ++ [45 * (10^(15::Word64))] | ||
let histogram = map (\ x -> curry HistogramBarCount x 0) possibleBuckets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: no space between \
and the variable it introduces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -0,0 +1,51 @@ | |||
module Cardano.Wallet.API.V1.Handlers.Internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this
) where | ||
|
||
|
||
import Prelude |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The project uses Universum
as the prelude instead of the stock Prelude.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
generateBounds :: BoundType -> [Word64] | ||
generateBounds bType = | ||
case bType of | ||
Log10 -> ( map (\toPower -> 10^toPower :: Word64) [(1::Word64)..16] ) ++ [45 * (10^(15::Word64))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I have a bunch of type signatures I usually like to write a "specialized identity function" to make it a little more natural:
asWord64 :: Word64 -> Word64
asWord64 = id
which makes this look like:
Log10 -> (map (\toPower -> asWord64 (10^toPower)) [1 .. asWord64 16]) ++ [45 * (10 ^ asWord64 15)]
Or, as the ambiguity appears to come from the polymorphic type of (^)
, you could define a specialized variant that removes the need for signatures:
(^!) :: Word64 -> Word64 -> Word64
(^!) = (^)
(map (\toPower -> 10 ^! toPower) [1..16]) ++ [45 * (10 ^! 15)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be expressed as:
take 16 (iterate (*10) 10) ++ [45 * (10 ^! 15)]
and since *
requires both operands to be the same type, you only need the type annotation once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion applied
import Cardano.Wallet.API.V1.Types | ||
|
||
import qualified Control.Foldl as L | ||
import qualified Data.Map.Strict as MS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the codebase had a convention to do it with import ... as Map
but it is actually split fairly evenly between as M
and as Map
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
-- | (b) avg or std of stake in a bucket | ||
-- | (c) topN buckets | ||
-- | to name a few | ||
newtype HistogramBar = HistogramBarCount (Text, Word64) deriving (Show, Eq, Ord, Generic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a newtype over a tuple, I'd prefer to see a data
with record fields:
data HistogramBar = HistogramBar
{ bucketName :: !Text
, bucketUpperBound :: !Word64
}
(,)
as a type is usually lazier than what you want. This also allows you to provide better information on the type for generic serialization/deserialization and schemas, and it is more robust to refactoring/changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applied
-- | (a) sum of stakes in a bucket | ||
-- | (b) avg or std of stake in a bucket | ||
-- | (c) topN buckets | ||
-- | to name a few |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haskell's doc comments do not use multiple |
characters for each line. IIRC this won't render correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed | from everywhere in the comments I introduced
data UtxoStatistics = UtxoStatistics | ||
{ theHistogram :: ![HistogramBar] | ||
, theAllStakes :: !Word64 | ||
} deriving (Show, Generic, Ord) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 space indent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
, "allStakes" .= allStakes ] | ||
|
||
instance FromJSON UtxoStatistics where | ||
parseJSON (Object v) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be derived generically. There's nothing about the type here that requires it to be manually written.
Once the HistogramBar
type is a record with proper serde instances, this should be derivable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to specification of the task endpoint expects to have:
"allStakes": 0,
"histogram": {
"10000000000000000": 0,
"100000000000000": 0,
"1000000000000": 0,
"10000000000": 0,
"100000000": 0,
"1000000": 0,
"10000": 0,
"100": 0,
"1000000000": 0,
"100000000000": 0,
"10": 0,
"45000000000000000": 0,
"1000": 0,
"10000000000000": 0,
"1000000000000000": 0,
"100000": 0,
"10000000": 0
}
}
rather than
"allStakes": 0,
"histogram": [
{
"bucketName" : "10",
"bucketUpperBound" : 0
},
.....
]
}
& at "10" ?~ (Inline $ mempty | ||
& type_ .~ SwaggerNumber | ||
& minimum_ .~ Just 0 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The repetition of the Inline schema should be factored out. You can do this with:
declareNamedSchema _ = do
wordRef <- declareSchemaRef (Proxy :: Proxy Word64)
pure $ ...
mempty & at "10" ?~ wordRef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also suggests a further condensation, where the hashmap is constructed from a list consisting of the relevant powers paired with the schemaref instead of manually enumerated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Thank you all for the reviews and suggestions. I have just made respective commit |
@@ -60,10 +61,10 @@ walletSpecs _ wc = do | |||
|
|||
eresp <- getUtxoStatistics wc (walId wallet) | |||
utxoStatistics <- fmap wrData eresp `shouldPrism` _Right | |||
let possibleBuckets = fmap show $ ( map (\toPower -> 10^toPower :: Word64) [(1::Word64)..16] ) ++ [45 * (10^(15::Word64))] | |||
let histogram = map (\ x -> curry HistogramBarCount x 0) possibleBuckets | |||
let possibleBuckets = fmap show $ (generateBounds Log10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parens or $
are unnecessary here; could be either fmap show (generateBounds Log10)
or fmap show $ generateBounds Log10
or even show <$> generateBounds Log10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
arbitrary = | ||
let possibleBuckets = fmap show (generateBounds Log10) | ||
possibleBars = NL.zipWith HistogramBarCount possibleBuckets (NL.fromList [0..]) | ||
in elements (NL.toList possibleBars) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this does what you want -- this makes it choose a bucket name and a constant number from the [0..]
list. Eg, for every HistogramBarCount x y
, the x
will uniquely determine the y
.
I think what you want instead is something like:
arbitrary = do
name <- elements $ map show (generateBounds Log10)
bound <- arbitrary
pure (HistogramBarCount name bound)
This will pick a random upper bound and name from the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
, "allStakes" .= allStakes ] | ||
|
||
instance FromJSON UtxoStatistics where | ||
parseJSON (Object v) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to use Data.Aeson.withObject :: String -> (Object -> Parser a) -> Parser a
rather than pattern matching directly on the constructors. The String
label is used for error messages, which is really helpful when debugging a parse failure.
So, replace:
parseJSON (Object v) = ...
parseJSON _ = empty
-- with:
parseJSON = withObject "UtxoStatistics" $ ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored as suggested
instance FromJSON UtxoStatistics where | ||
parseJSON (Object v) = | ||
let histogramListM = case HMS.lookup "histogram" v of | ||
Nothing -> empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty
fails without an error message. Try to avoid doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored as suggested
|
||
instance FromJSON UtxoStatistics where | ||
parseJSON (Object v) = | ||
let histogramListM = case HMS.lookup "histogram" v of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aeson
has an operator (.:) :: FromJSON a => Object -> Text -> Parser a
that does the lookup and reports an error message if it isn't present. So this can be:
parseJSON = withObject "UtxoStatistics" $ \o -> do
histo <- o .: "histogram" :: Parser (HashMap Text Word64)
let constructHistogram = uncurry HistogramBarCount
pure $ map constructHistogram $ HMS.toList histo
Here we specify the type of the parser, but it's unnecessary, because GHC would figure out what it needs to return. By requesting a HashMap Text Word64
, we will get the right error messages for invalid values in that type:
λ> eitherDecode "-123" :: Either String Word64
Left "Error in $: Word64 is either floating or will cause over or underflow: -123.0"
λ> eitherDecode "12345678901234567890" :: Either String Word64
Right 12345678901234567890
λ> eitherDecode "123456789012345678901234567890" :: Either String Word64
Left "Error in $: Word64 is either floating or will cause over or underflow: 1.2345678901234567890123456789e29"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored as suggested
Just (Object bars) -> do | ||
let constructHistogram (key, Number val) = | ||
case floatingOrInteger val of | ||
Left (_ :: Double) -> error "UtxoStatistics FromJson not integer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Parser
type, prefer fail
to error
as fail
reports the error using Aeson's parsing machinery, while error
bypasses all of that and blows up at runtime. It's a difference between decode :: FromJSON a => ByteString -> Maybe a
returning a Nothing
and blowing up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored as suggested
arbitrary = genTxIn | ||
arbitrary = oneof | ||
[ TxInUtxo <$> arbitrary <*> arbitrary | ||
, TxInUnknown <$> choose (1, 255) <*> scale (min 150) arbitrary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is what you want to do. This will set the size
parameter to QuickCheck generation to be the minimum of what it already is or 150. If you want to restrict the actual values, then you'd want suchThat
:
arbitrary `suchThat` (<= 150)
Would generate valid values up to 150.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I trusted what Core people used in core/test/Test/Pos/Core/Arbitrary/Txp.hs . Which is of course not a proof that it is correct. I will contact Eric concerning this. Maybe it will give rise to a new ticket and I inform him to affect also this instance
Correction commit done. I had previously exactly these linker errors in a number of checks: |
histogramBinNumCond histo = (length $ HMS.keys histo) <= 0 | ||
validateKeys = any (\key -> notElem key $ map show (NL.toList $ generateBounds Log10) ) | ||
histogramKeysCond = validateKeys . HMS.keys | ||
validateVals = any (< 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<= 0
? Are we fine with zero values? Or are we interested to catch any non-positive value ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are fine with zero values as bar represent count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was confused with the names at first (bounds vs values) from the constructor. As a matter of fact, we are totally fine with empty buckets 👍
histogramKeysCond = validateKeys . HMS.keys | ||
validateVals = any (< 0) | ||
histogramValsCond = validateVals . HMS.elems | ||
allStakesCond = (< 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting way of doing it. For those, I usually leverage the Monad
instance of Either
and the when
combinator. This way, we avoid the hassle and overhead of defining extra identifiers. I believe it reads a bit better:
validateUtxoStatistics
:: HashMap Text Word64
-> Word64
-> Either UtxoStatisticsError (HashMap Text Word64, Word64)
validateUtxoStatistics histogram allStakes = do
let (keys, elems) = (HMS.keys histogram, HMS.elems histogram)
let acceptedKeys = show <$> (toList $ generateBounds Log10)
when (length keys <= 0) $
Left ErrHistogramEmpty
when (any (flip notElem acceptedKeys) keys) $
Left ErrHistogramNamesInvalid
when (any (< 0) elems) $
Left ErrHistogramUpperBoundNegative
when (allStakes < 0) $
Left ErrAllStakesNegative
Right (histogram, allStakes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
instance Arbitrary HistogramBar where | ||
arbitrary = do | ||
possiblenames <- elements $ map show (NL.toList $ generateBounds Log10) | ||
bound <- arbitrary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one isn't entirely arbitrary, it is suchThat (\n -> n <= upperBound)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry :s .. I probably induced you to use suchThat
here. That's wasn't my best choice of words / example ...
data HistogramBar = HistogramBarCount | ||
{ bucketName :: !Text | ||
, bucketUpperBound :: !Word64 | ||
} deriving (Show, Eq, Ord, Generic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a confusion in the field names here 🙃 , took me a short while to figure it out from the rest of the code.
bucketUpperBound
actually refers to the value / number of stakes whereas bucketName
is the actual upper bound.
Hence the following question: what is the need for the bound (i.e. bucketName
here) to be a Text
? Why isn't this a Word64
, especially when it is generated using Word64
values and converted to Text using show
. If it's about using it as a key in the JSON representation, having a Word64
here shouldn't be much of a problem!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right, I made a change
module Cardano.Wallet.Types.UtxoStatistics | ||
( computeUtxoStatistics | ||
, UtxoStatistics (..) | ||
, HistogramBar (..) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we want to expose the internals to the outside world. I'd be in favor of opaque types with smart-constructors 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
instance ToJSON UtxoStatistics where | ||
toJSON (UtxoStatistics bars allStakes) = | ||
let histogramObject = Object . HMS.fromList . map extractBarKey | ||
extractBarKey (HistogramBarCount bound stake) = bound .= stake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what lead to the conclusions about the actual meaning of bucketUpperBound
and bucketName
in the comments above.
|
||
data UtxoStatistics = UtxoStatistics | ||
{ theHistogram :: ![HistogramBar] | ||
, theAllStakes :: !Word64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't allStakes
be derived from the list of histogram already? I mean, it's a nice-to-have in the API response, that's for sure, but I think here we better have a
allStake :: UtxoStatistics -> Word64
GHC-runtime is rather good at caching these kind of results anyway. And we only need it when serializing to JSON right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it, we can't actually compute it from the list of bars indeed, because we only have the buckets there so it'll only be an approximation of what's in the bucket. Though, I believe this is preferable whatsoever to having a allStakes :: Word64
next to the histogram.
let's say:
approxTotalStakes :: UtxoStatistics -> Word64
The argument behind this is that, having allStakes
in the data-structure means that we have to maintain and control some sort of invariant since there could be a discrepancy between the list of bars and the total number of stakes. That discrepancy can't realistically come from us but could come from external UtxoStatistics we could parse through JSON, which is also unlikely.
Maybe I am being too picky here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when computing histogram, in the same traverse I compute allStakes. I can add many more other statistics here in the future. And in one traverse it is going to be computed. Usually there is no easy (nonapproximative) way of going from one aggregation to other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I am just afraid about parsing UtxoStatistics
from an external source but I guess this will never happen in the end 👍
histogramKeysCond = validateKeys . HMS.keys | ||
validateVals = any (< 0) | ||
histogramValsCond = validateVals . HMS.elems | ||
allStakesCond = (< 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting way of writing this ^.^
When dealing with Either
like this, I usually go for the Monad
approach and leverage the when
combinator. It avoids all the hassle of defining extra identifiers (making sure variables don't conflicts etc..) and is, to my opinion, a bit more readable:
validateUtxoStatistics
:: HashMap Text Word64
-> Word64
-> Either UtxoStatisticsError (HashMap Text Word64, Word64)
validateUtxoStatistics histogram allStakes = do
let (keys, elems) = (HMS.keys histogram, HMS.elems histogram)
let acceptedKeys = show <$> (toList $ generateBounds Log10)
when (length keys <= 0) $
Left ErrHistogramEmpty
when (any (flip notElem acceptedKeys) keys) $
Left ErrHistogramNamesInvalid
when (any (< 0) elems) $
Left ErrHistogramUpperBoundNegative
when (allStakes < 0) $
Left ErrAllStakesNegative
Right (histogram, allStakes)
Also, this function looks like a good candidate for a smart constructor and could return an actual UtxoStatistics
rather than an HashMap and a Word64. You can find some inspiration here maybe:
(NOTE: We might want to expose this eitherToParse
function somewhere else ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
addCountInBuckets thefirst acc entry = | ||
case Map.lookupGE entry acc of | ||
Just (k, v) -> Map.insert k (v+1) acc | ||
Nothing -> Map.adjust (+1) thefirst acc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From foldl
documentation, we find this:
Fold (x -> a -> x) x (x -> b) -- Fold step initial extract
When there's a clear structure like this, it's usually a good idea to have the implementation follows this structure as it helps understanding (same for bracket
for instance).
What about something like:
foldBuckets :: NonEmpty Word64 -> L.Fold Word64 [HistogramBar]
foldBuckets bounds =
let
step x a =
case Map.lookupGE a x of
Just (k, v) -> Map.insert k (v+1) x
Nothing -> Map.adjust (+1) (head bounds) x
initial =
Map.fromList $ zip (NL.toList bounds) (repeat 0)
extract =
map (uncurry HistogramBar) . Map.toList
in
L.Fold step initial extract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
instance Arbitrary TxOut where | ||
arbitrary = genericArbitrary | ||
shrink = genericShrink |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does all this has to do with the PR 😮 ? Maybe an artifact from a rebase / merge conflict ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is unfortunately needed to support Utxo
|
||
instance Arbitrary UtxoStatistics where | ||
arbitrary = UtxoStatistics <$> arbitrary | ||
<*> arbitrary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "invariant" I was talking about enters into play here. We could generate an arbitrary statistics here where we have 14000
in the last bucket, but a total take of 42
ADA which doesn't make sense.
Hence the difficulty of having such parameter as part of the structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some code that addresses this but it goes forever ;-/
= ErrHistogramEmpty | ||
| ErrHistogramNamesInvalid | ||
| ErrHistogramUpperBoundsNegative | ||
| ErrAllStakesNegative |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the total stake is lower than the minimum amount that can be represented with all the buckets 😛 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed
summarizeUtxoStatistics bounds = | ||
UtxoStatistics | ||
<$> populateBuckets bounds | ||
<*> L.sum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit the same as below, minor styling remark but we could avoid the extra summarizeUtxoStatistics
and make the implementation a bit more straightforward:
computeUtxoStatistics :: [Word64] -> UtxoStatistics
computeUtxoStatistics = L.fold $ UtxoStatistics
<$> foldBuckets (generateBounds Log10)
<*> L.sum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
histogram <- arbitrary | ||
let (minPossibleValue, maxPossibleValue) = getPossibleBounds histogram | ||
let histoBars = map (uncurry HistogramBarCount) $ Map.toList histogram | ||
allStakes <- arbitrary `suchThat` (\s -> s >= minPossibleValue && s <= maxPossibleValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the way Quickcheck works usually makes suchThat
a poor choice in many case because the condition becomes harder and harder to satisfy as QuickCheck looks for values in a new range. A rule of the thumb is that it's good when you have to make sure some values get discarded, but when you want to pick values in a range, it's better to rely on choose
:
do
histogram <- arbitrary
allStakes <- choose (getPossibleBounds histogram)
...
possiblenames <- elements $ map show (NL.toList $ generateBounds Log10) | ||
bound <- arbitrary | ||
possiblenames <- elements (NL.toList $ generateBounds Log10) | ||
bound <- arbitrary `suchThat` (>= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Word64
are unsigned-integers 😛
NOTE: We're going to need this (#3432) for the arguments list build issue |
This was done using weeder and is for now necessary for the CI to kick in. We are starting to reach a critical point in terms of dependencies and this kind of solution might not work anymore in the future. There's a PR opened about doing something against these 'arguments list too long'. :finger_crossed:
- We do not expose type internals and provide one smart constructor for that purpose (computeUtxoStatistics) - Same for the 'BoundType' which should actually be part of the API if we intend to use it, making it an opaque type with exposed constructors allow for easy extension and maintainability - Reviewed a bit errors to make constructors a bit less specified in favor of constructor with arg - Added 'BoundType' as 'boundType' to the JSON representation - Made Aeson & Swagger imports explicit!
This is more semantically correct and type-safe than taking a raw list of 'Word64'. This way, we also get documentation for free simply by looking at the function signature and also makes calls for callers simpler (provided they have a list of available utxos, but why would they call the function if they hadn't? :) )
2b5478b
to
166665c
Compare
|
||
-- * Constructing 'BoundType' | ||
, log10 | ||
) where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paweljakubas Only exposing opaque types here with smart-constructors.
Note that I have removed the few externals calls to mkUtxoStatistics
for they were not really needed and could simply rely on computeUtxoStatistics
. This function should be our one and single constructor for the outside world.
We now use mkUtxostatistics
internally only (in the FromJSON
instance).
Note also, I've removed the Haphazard
constructor for the BoundType
and here again, to keep the module easy to extend, we hide the implementation to the outside world, only having a "smart-constructor" log10
which does the trick.
= ErrEmptyHistogram | ||
| ErrInvalidBounds !Text | ||
| ErrInvalidTotalStakes !Text | ||
deriving (Eq, Show, Read, Generic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paweljakubas I've factored out a couple of errors as they were semantically referring to the same thing, only with a small difference which can be captured in a message Text
. It makes the error type a bit less bloated but still rather expressive.
object | ||
[ "histogram" .= histogramObject bars | ||
, "allStakes" .= allStakes | ||
, "boundType" .= log10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we need "boundType"
in the JSON representation otherwise we simply force the bound type upon decoding a JSON object to Log10
and there's no point of keeping that flexible.
{-# INLINE log10 #-} | ||
|
||
-- | Compute UtxoStatistics from a bunch of UTXOs | ||
computeUtxoStatistics :: BoundType -> [Utxo] -> UtxoStatistics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a change here we've suggested in the past and then forgot about. The API used to take [Word64]
only, but that gives us poor type-guarantees and requires the caller to know about how to extract the correct value from an Utxo
. Here, we take care of that burden ourselves and even get documentation for free!
-- | ||
-- INTERNALS | ||
-- | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything below is actually internal stuff and doesn't need to be exported 👍
calculatePossibleBound fromPair = | ||
sum . | ||
concatMap (\pair -> matching fromPair pair $ createBracketPairs acceptedKeys) $ | ||
Map.toList histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly haven't verified much of this as my brain isn't much capable of it right now :(
Description
Adding Utxo statistics endpoint to API V1 with new data layer support.
Idea was to deliver solution that can in one pass calculate a number of statistics. Besides histogram I added as an example total number of stakes in a given wallet calculation - which in the last version can go out. The solution relies on Gabriel Gonzalez foldl library.
Alternatively, I was also considering the approach adopted:
http://tech.frontrowed.com/2017/09/22/aggregations/
As an example statistics look like:
data UtxoStatistics = UtxoStatistics { theHistogram :: ![HistogramBar] , theAllStakes :: !Word64 }
I added full functionality for Handler and stub for LegacyHandler as it is about to be removed soon. I added integration test in WalletSpec, and also commented one in TransactionsSpec (to be uncommented and checked when Client.hs usage of LegacyHandler will be replaced with Handler).
Moreover, I intentionally used
[(Account, Utxo)]
in WalletLayer. Maybe in the future installation one can extend this endpoint and filter using account info, etc.Linked issue
https://iohk.myjetbrains.com/youtrack/issue/CO-347
Type of change
Developer checklist
Testing checklist
QA Steps
To build:
./scripts/build/cardano-sl.sh wallet-new
To test:
stack test cardano-sl-wallet-new
nix-build release.nix -A tests.walletIntegration
Screenshots (if available)