Property Test Guide

Shelley property testing

Shelley property tests are defined on [source state, signal, target state] for a given STS, e.g. the following property states that the total ADA in the source ChainState does not change when we run the CHAIN rule with the given Block signal:

checkPreservation ::
  (UsesTxOut era, UsesValue era) =>
  SourceSignalTarget (CHAIN era) ->
  Property
checkPreservation SourceSignalTarget {source, target} =
  counterexample
    ( ... )
    $ sourceTotal === targetTotal
  where
    sourceTotal = totalAda source
    targetTotal = totalAda target

Generating Traces

To test the above property, we need to generate an initial ChainState0 and Block0 and then compute the new ChainState1 by applying the CHAIN rule to (ChainState_0, Block).

Given ChainState1 we can generate the next Block1 and apply the CHAIN rule again. In this way we build up a CHAIN "trace":


[ChainState0, Block0, ChainState1]
[ChainState1, Block1, ChainState2]
[ChainState2, Block2, ChainState3]
...

Most properties are tested for each of these state transitions in a trace.

To generate a trace for an STS Rule, we create an instance of the HasTrace class, for example see the CHAIN HasTrace instance. This class specifies a signal generator/shrinker and a Environment generator.

Along with this we need to define a generator for initial state, e.g. see the ChainState0 generator.

With the HasTrace instance and initial state generator, we can generate a trace e.g. generating a CHAIN trace:

traceFromInitState @(CHAIN era) testGlobals tl (genEnv p) (Just $ mkGenesisChainState (genEnv p))

Certificate Trace Generator

For the certificate trace generator we define the HasTrace instance on a dummy STS Rule because we want to generate certificates along with index and witnesses, and since there is not an STS Rule that has (cert,wits,ix) as signal, we need to create our own.

The HasTrace signal generator uses genDCert, which generates all types of certificates.

It is worth noting that each generated certificate affects the STS state (via the STS transition) and hence affects the generation context of the next certificate. This is why we must generate a list of certificates in sequence (and cannot generate certificates indepedently).

Transaction Trace Generator

As mentioned above for certificates, we must generate transactions in sequence. We do this with a HasTrace instance defined on the LEDGERS STS rule.

The HasTrace signal generator uses genTx, which defines the transaction generator.

genTx

To generate a transaction:

pick some inputs from the Utxo (in LEDGERS state) - this implicitly provides a spending balance for the transaction.
generate some Reward withdrawals - which also adds to the available spending balance (may generate none)
generate protocol parameter updates - (may generate no Updates) - see genUpdate. Note, these updates have potentially huge effects on the chain simulation (e.g. consider updates of the d parameter)
generate certificates (using the certificate trace generator described earlier). Certificates can affect the spending balance by requiring deposit amounts or providing refund amounts
all of the above components of a transaction need to be witnessed, see
the spending balance of the transaction ultimately emerges from spendable utxo (inputs), withdrawals, deposits and refunds
pick output addresses and distribute (or "split") the spending balance over the outputs - this requires an instance of the Split class for the given Era - since values are era dependent
now that we have all the building blocks for a transaction, generate an era-dependent body. Note that for some eras we may need additional witness data.
attempt to converge on a balanced and valid transaction.

Notes:

the coin selection algorithm is implicit and occurs in genInputs and the converge function
given some utxo (in the ledger state) a generated transaction will redistribute inputs to outputs. Over time (and successive transactions) it is possible that the utxo gets fragmented into dust. In this case the transaction generator may fail to provide enough spending balance from input utxo, no matter how many inputs or how few outputs it selects. Moreover, once the utxo is fragmented, it becomes more likely that the generator will not recover and keep failing in this way. This will typically manifest as a OutputTooSmallUTxO predicate failure

Chain Trace Generator

The CHAIN trace is defined by the CHAIN HasTrace instance, which relies on the block generator genBlock.

The block generator computes the next slot for which to produce a block (along with the issuer of that block) by computing the overlay schedule on the fly. The overlay schedule changes per epoch and depends on the ChainState (including, e.g. the current value of the d parameter). In this way, the block slot emerges rather than being explicitly defined.

Block transactions are generated using the LEDGERS trace generator's signal generator.

Constants: tuning trace generation

The generators are configurable via the Constants data type, e.g.:

minNumGenInputs and maxNumGenInputs are used in genTx and define the range for how many utxo inputs the transaction generator will select, see.
[minGenesisUTxOouts and maxGenesisUTxOouts] define the range of outputs in the transaction generator. Caution: If we tune this constant such that there are on average more outputs generated than inputs, we will find that utxo is fragmented over time into smaller and smaller values, ultimately leading to the CHAIN trace generator "running out of Utxo" while generating chains of blocks of transactions.
minGenesisOutputVal and maxGenesisOutputVal define the range of coin values when generating genesis coin. minGenesisUTxOouts and maxGenesisUTxOouts define the range for the number of genesis values. Together, these constants specify the amount of utxo that the CHAIN/LEDGER trace generators start with. If the total genesis coin is too small the trace generators will more likely run out of Utxo, in which case one could try to increase the minGenesisOutputVal, and perhaps also maxGenesisOutputVal.
maxCertsPerTx used in genDCerts in set at 3 for the pinned commit. By setting it to 300, say, one may start seeing test failures related to negative transaction fees, or that the max transaction size is exceeded. Caution in order to prevent creating new types of failures when debugging properties, be sure to change these constant values very gradually, if at all.
various constants define the frequency of generated items, e.g. frequencies of certificate types in genDCert

Generating Valid signals

Generators used in the "trace based" property tests all attempt to generate only valid Chains, Blocks, Transactions, Certificates, etc.

The onlyValidChainSignalsAreGenerated property tests for generation of valid blocks/transactions/certificates/etc.

Generating invalid things can be very expensive when trying to build chains of blocks of transactions that are valid by the STS rules. For example, a single invalid transaction can also invalidate the enclosing transaction and block.

Classifying traces

Consider the generator for RegKey certificates - it is be possible that for the given state there are no availableKeys to choose from, in which case the generator will return Nothing as a silent fail.

In these cases we want to ensure that we are getting "reasonable numbers" of generated things, e.g. the classifier asserts that there is at least one RegKey certificate for every 10 transactions

Another notable clause in the classifier is at least 5 epochs in a trace, 20% of the time. In the interest of making the classifier less brittle, this is a conservative lower bound.

EraGen

Since transactions (tx body, scripts types, value types, metadata) are era-independent, we need to abstract over eras when generating these data structures. This is achieved with the EraGen typeclass:

Property Tests and the STS hierarchy

Consider testing a property of the ledger UTXO, e.g. the checkPreservation property used as an example at the start of this document. Since this property refers to utxo balances, we could have expressed it i.t.o. a UTXO trace of transactions. However, such transactions would be generated in the context of the UTXO STS state, which does not include delegation state, i.e. we would only be able to generate transactions without certificates.

Testing the checkPreservation property on a UTXO trace is a weaker test than testing it on transactions from a CHAIN trace (which would include effects on the Utxo by certificates/updates/etc).

In this sense there are strong incentives to move "high up the STS hierarchy" when generating traces to test properties on:

for one, we need access to the state/context we want to generate our data in, e.g. to generate transactions with certificates, we need to base our trace on an STS with state that includes utxo and delegation state.
only the CHAIN trace reflects a realistic/plausible overlay schedule (and hence slot distribution for blocks). This is another incentive to test properties derived from CHAIN traces.

This is also the case for properties defined on "low level" STS state, e.g, delegProp. We could test this on a trace generated for the DELEG STS, but this would be a weaker test than basing it on a CHAIN STS trace, hence the use of delegTraceFromBlock - a function that "projects" the CHAIN trace down to DELEG STS level.

There are several such projections from CHAIN trace.

Shrinking

Shrinking transactions is hard, since a change to any part invalidates the whole transaction. The same is true for shrinking Blocks, Certificates (certificates in a transactions are possibly dependent on previous certificates), Updates, etc.

While we don't shrink most generated data structures, we do shrink traces themselves, see shrinkTrace.

The trace shrinker is wired into the trace generator forAllTraceFromInitState.

Troubleshooting

OutputTooSmallUTxO predicate failure, running out of Utxo

See the following sections for troubleshooting "running out of Utxo" (which will manifest as a OutputTooSmallUTxO predicate failure in the property tests):

see genTx section
see Constants section

Silent Discards

Quickcheck generator discards are expensive when generating deep data structures (e.g. chains of blocks of transactions). Discards are problematic because they hand over control to the Quickcheck test machinery - this can lead to a lot of collateral discarding (a discarded transaction would lead to discarding the block that the transaction is part of).

There is only one QuickCheck generator discard in the pinned repo.

When troubleshooting a property failure, it can be informative at times to ensure that there are not a large number of discards (by using a Debug.Trace statement, say).

Silent Predicate Failures

There are a couple of places in the testing machinery where an STS rule could silently fail, in which case the testing machinery simply moves and tries the next possibile action. During troubleshooting it is worth checking that there are not excessive failures. This can be done by Debug.traceing the predicate failure:

Assertions

In the presence of actual predicate failures, the Assertions machinery can obscure the failures by reporting downstream assertion failures. To temporarily disable assertions, change AssertionsAll to AssertionsOff in applySTSTest

Provide feedback

Saved searches

Use saved searches to filter your results more quickly