Confusing behavior / sometimes failing validators for semantically equivalent changes. #3648

KtorZ · 2021-07-30T08:18:53Z

Area

[?] Plutus Foundation Related to the GHC plugin, Haskell-to-Plutus compiler, on-chain code
[?] Plutus Application Framework Related to the Plutus application backend (PAB), emulator, Plutus libraries
~~[] Marlowe Related to Marlowe~~
~~[] Other Any other topic (Playgrounds, etc.)~~

Summary

History

We had a working contract and test suite.
Then, we upgraded Plutus to the most recent commit hash 35b1e41 (at the time):
Our test suite was then failing (without any change) after the upgrade: cardano-scaling/hydra@6255012
We upgraded some other parts, and then gave a shot at fixing this test suite starting from: cardano-scaling/hydra@ce6e38c

Making what should be semantically equivalent changes (replacing a truthy expression with True, or changing from and [a,b] to a && b) do change the behavior of the validator in an unpredictable but deterministic way (see below for details).

Some changes made the test suite pass, some made it fail again but without clear reason. This is confusing :/

Steps to reproduce

Starting point

To the reproduce the initial failure, please follow these steps:

$ git clone https://github.com/input-output-hk/hydra-poc/tree/ce6e38c4d7cf70c0e6131ab75b23b238bb334d4a
$ cd hydra-poc 
$ nix-shell 
$ cabal test hydra-plutus --test-options '-p "Can CollectCom"

1. `and` -> `&&`

From there the fun begins... The following diff, makes the test suite pass. Notice that the only change is removing the use of and in favor of multiple inlined &&. Conditions are still the same and it is fully reproducible. Switching back to and makes the test fail again, and vice-versa.

cardano-scaling/hydra@ab45e9e

See here the diff with the contract PIR cardano-scaling/hydra@15b43ca

2. Alternative (equivalent) implementation

That second diff, reverts the previous one but, changes the implementation of our mustForwardParty predicate to (what should be) an equivalent implementation. That is, we are inlining the internals of checkScriptContext with the corresponding code handling the MustSpendAtLeast and MustProduceAtLeast data-constructors.

Changing this makes the test pass.

cardano-scaling/hydra@847e72a

See here the diff with the contract PIR cardano-scaling/hydra@66ba442

3. Removing a (truthy) predicate

This one is our favorite.

On this diff, we removed one of the predicate on a conjunction. What led us here was the absence of trace ("PT not produced") in the failure from the Emulator. This changes... makes the passing test now fail... ?!?

cardano-scaling/hydra@da605c6

See here the diff with the contract PIR cardano-scaling/hydra@149a841

4. Changing unrelated code

This is even better. Changing an unrelated case in the validator (according to diff (1)) makes the test pass again. Please do notice that the diff is about the Abort branch, which is not executed in the test (which fails on the CollectCom). Changing this was actually a mistake of ours as we intended to change the CollectCom branch, but the results were surprising, if not mind-blowing.

cardano-scaling/hydra@e0797ac

See here the diff with the contract PIR cardano-scaling/hydra@43727bf

Expected behavior

None of these changes should result in different validator behavior (observed from the tests). They are semantically equivalent.

System info (please complete the following information):

OS: NixOS, Ubuntu, Manjaro (everywhere we could try it out).
Version: 21.5, 20-04
Plutus version: 35b1e41

Screenshots and attachments

Additional context

Might be related to #3488 & #3601

(cc @ch1bo @abailly-iohk)

Thanks for the help 🙏

The text was updated successfully, but these errors were encountered:

michaelpj · 2021-07-30T11:27:42Z

When you say the tests pass/fail, do the failures come from script failures, or otherwise? Maybe @j-mueller can help us track this down? Because the scripts don't look too bad right now.

michaelpj · 2021-07-30T11:28:21Z

It would be helpful to have the trace log for the script that we expect to pass/fail in each case...

michaelpj · 2021-07-30T11:40:53Z

Another useful thing: I have a vague hypothesis that this could be due to differing strictnesses of && and all. We could verify this by putting some trace statements around each of the conditions and seeing whether we get all of them in the log in both cases.

If one of the conditions actually throws an error, but one of && or all is lazy enough not to evaluate it, then that could explain the difference.

KtorZ · 2021-07-30T11:52:21Z

When you say the tests pass/fail, do the failures come from script failures, or otherwise?

According to the emulator, the failures comes from the validator which fails to validate the transaction produced by the offchain code.

I have a vague hypothesis that this could be due to differing strictnesses of && and all

How come though? For a conjunction, I'd expect a lazy evaluation to stop at the first falsy value. And, the validator would actually fail on every call no matter what (because one of the element is falsy). But here, we have sometimes passing validators, and sometimes failing ones; To have a validator being evaluated to True, then surely all branches of the conjunction must evaluate to True 🤔

KtorZ · 2021-07-30T12:19:45Z

It would be helpful to have the trace log for the script that we expect to pass/fail in each case...

Here's an example of the (end of the) trace log for the base contract, which fails because of some evaluation error for one of the script output. It complains about pretty much all conditions present in the validator.

Contracts tests
  Hydra Scenarios
    Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted: FAIL (0.25s)

[...]

[INFO] Slot 6: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
                 Receive endpoint call on 'collectCom' for Object (fromList [("contents",Array [Object (fromList [("getEndpointDescription",String "collectCom")]),Object (fromList [("unEndpointValue",Array [Object (fromList [("getPubKeyHash",String "35dedd2982a03cf39e7dce03c839994ffdec2ec6b04f1cf2d40e61a3")]),Array [Object (fromList [("txOutAddress",Object (fromList [("addressCredential",Object (fromList [("contents",Object (fromList [("getPubKeyHash",String "35dedd2982a03cf39e7dce03c839994ffdec2ec6b04f1cf2d40e61a3")])),("tag",String "PubKeyCredential")])),("addressStakingCredential",Null)])),("txOutDatumHash",Null),("txOutValue",Object (fromList [("getValue",Array [Array [Object (fromList [("unCurrencySymbol",String "")]),Array [Array [Object (fromList [("unTokenName",String "")]),Number 1000.0]]]])]))]),Object (fromList [("txOutAddress",Object (fromList [("addressCredential",Object (fromList [("contents",Object (fromList [("getPubKeyHash",String "977efb35ab621d39dbeb7274ec7795a34708ff4d25a01a1df04c1f27")])),("tag",String "PubKeyCredential")])),("addressStakingCredential",Null)])),("txOutDatumHash",Null),("txOutValue",Object (fromList [("getValue",Array [Array [Object (fromList [("unCurrencySymbol",String "")]),Array [Array [Object (fromList [("unTokenName",String "")]),Number 1000.0]]]])]))])]])])]),("tag",String "ExposeEndpointResp")])
[INFO] Slot 6: W1: Balancing an unbalanced transaction:
                     Tx:
                       Tx 1765d66cf878d1740a43e7fb1f750714e7691a2c07c93deb85c36881fed0034a:
                         {inputs:
                            - 22d479018f5cfeb6255e5f74769958691c6de75fabf00dc84b34f90ad3b4b907!1
                              <>
                            - e6556d453e86605eda29c0a221259cd831a91c1ad18c60f13a12142cb9964074!1
                              <>
                            - ebb0589a32f986407a2b2934a78b02fd03cfba5d289fde7c44a00222621f58cb!1
                              <>
                         collateral inputs:
                         outputs:
                           - Value (Map [(,Map [("",2000)]),(7823198bf3fa55a923f349eda13874c12deb28fa799b604b77430a82,Map [(0x35dedd2982a03cf39e7dce03c839994ffdec2ec6b04f1cf2d40e61a3,1),(0x977efb35ab621d39dbeb7274ec7795a34708ff4d25a01a1df04c1f27,1)])]) addressed to
                             addressed to ScriptCredential: 0d13999069c55537f3bb43f96040182c761c2d93d4b4fb031b9e6cff (no staking credential)
                         mint: Value (Map [])
                         fee: Value (Map [])
                         mps:
                         signatures:
                         validity range: Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}
                         data:
                           <>
                           <[<<<"\151~\251\&5\171b\GS9\219\235rt\236w\149\163G\b\255M%\160\SUB\GS\240L\US'">,
                           <>>,
                           [<"", [<"", 1000>]>],
                           <>>,
                           <<<"5\222\221)\130\160<\243\158}\206\ETX\200\&9\153O\253\236.\198\176O\FS\242\212\SOa\163">,
                           <>>,
                           [<"", [<"", 1000>]>],
                           <>>]>
                           <<<"\151~\251\&5\171b\GS9\219\235rt\236w\149\163G\b\255M%\160\SUB\GS\240L\US'">,
                           <>>,
                           [<"", [<"", 1000>]>],
                           <>>
                           <<<"5\222\221)\130\160<\243\158}\206\ETX\200\&9\153O\253\236.\198\176O\FS\242\212\SOa\163">,
                           <>>,
                           [<"", [<"", 1000>]>],
                           <>>}
                     Requires signatures:
                       35dedd2982a03cf39e7dce03c839994ffdec2ec6b04f1cf2d40e61a3
                     Utxo index:
                       ( 22d479018f5cfeb6255e5f74769958691c6de75fabf00dc84b34f90ad3b4b907!1
                       , - Value (Map [(,Map [("",1000)]),(7823198bf3fa55a923f349eda13874c12deb28fa799b604b77430a82,Map [(0x35dedd2982a03cf39e7dce03c839994ffdec2ec6b04f1cf2d40e61a3,1)])]) addressed to
                           addressed to ScriptCredential: be5719fcd50b32b8c5b37224c9165f84b881aeb99704ffb183358d1b (no staking credential) )
                       ( e6556d453e86605eda29c0a221259cd831a91c1ad18c60f13a12142cb9964074!1
                       , - Value (Map []) addressed to
                           addressed to ScriptCredential: 0d13999069c55537f3bb43f96040182c761c2d93d4b4fb031b9e6cff (no staking credential) )
                       ( ebb0589a32f986407a2b2934a78b02fd03cfba5d289fde7c44a00222621f58cb!1
                       , - Value (Map [(,Map [("",1000)]),(7823198bf3fa55a923f349eda13874c12deb28fa799b604b77430a82,Map [(0x977efb35ab621d39dbeb7274ec7795a34708ff4d25a01a1df04c1f27,1)])]) addressed to
                           addressed to ScriptCredential: be5719fcd50b32b8c5b37224c9165f84b881aeb99704ffb183358d1b (no staking credential) )
[WARNING] Slot 6: W1: Validation error: Phase2 1765d66cf878d1740a43e7fb1f750714e7691a2c07c93deb85c36881fed0034a: ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed","Missing datum","checkScriptContext failed","Check has failed"])
[WARNING] Slot 6: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
                    Contract instance stopped with error: WalletError (ValidationError (ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed","Missing datum","checkScriptContext failed","Check has failed"])))
  src/Plutus/Contract/Test.hs:243:
  Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted

Changing from and to multiple inlined && makes the test pass:

Contracts tests
  Hydra Scenarios
    Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted: OK (0.34s)

Changing it back to and but, inlining of the checkScriptContext in the validator branch makes it fail again with a similar error:

Contracts tests
  Hydra Scenarios
    Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted: FAIL (0.24s)

[...]

      [WARNING] Slot 6: W1: Validation error: Phase2 df90f713608a4db280c74edba6be533305cbf72e7ea60b3d05bbdfe4bd12b610: ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed","Missing datum","checkScriptContext failed","Check has failed"])
      [WARNING] Slot 6: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
                          Contract instance stopped with error: WalletError (ValidationError (ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed","Missing datum","checkScriptContext failed","Check has failed"])))
        src/Plutus/Contract/Test.hs:243:
        Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted

ch1bo · 2021-08-10T08:58:01Z

After updating plutus to f174d27 (master from yesterday) I do get now a bit more info about errors on our test failures:

      [WARNING] Slot 6: W1: Validation error: Phase2 69e0b9baf815d4316ec26ad97131847d8eb1d8550b2aa46df5a7915c53f40409: ScriptFailure (EvaluationError ["L4","Ld","L2","Ld","Pd"] "CekEvaluationFailure")
      [WARNING] Slot 6: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
                          Contract instance stopped with error: WalletError (ValidationError (ScriptFailure (EvaluationError ["L4","Ld","L2","Ld","Pd"] "CekEvaluationFailure")))
        src/Plutus/Contract/Test.hs:241:
        Init > Commit > Commit > CollectCom: Can CollectCom when all parties have submitted

1 out of 1 tests failed (0.19s)
Test suite hydra-plutus-test: FAIL

But what was passing before is not anymore and what was not passing before does work now. For example before rebase, this diff cardano-scaling/hydra@4bad81a was making it pass, but now (rebased) this commit cardano-scaling/hydra@a268d57 does make it fail, i.e. on the previous version True was failing and 1 == 1 was passing, but now it's the other way around.

Working hypothesis of the plutus team: could be ordering issues.

effectfully · 2023-02-01T23:13:21Z

i.e. on the previous version True was failing and 1 == 1 was passing, but now it's the other way around.

WAT?

@KtorZ @ch1bo Did you guys figure out what was going on in the end?

abailly-iohk · 2023-02-02T07:50:44Z

This is really old and given we have had our contracts written and running for a year and half, and we ditched use of trace emulator, PAB, Contract monad, etc. I would suggest to close this issue but @KtorZ might know better.

KtorZ · 2023-02-02T08:00:10Z

I've completely lost context of this by now. I believe it was an issue with the emulator and not with the on-chain code.

ch1bo · 2023-02-02T08:14:26Z

No clue either. If there is a problem with the plutus core or plutus-tx, we would have ran into this again. Which did not happen.. so I suggest we just close this.

effectfully · 2023-02-02T13:51:16Z

Thanks everyone!

KtorZ added the bug label Jul 30, 2021

KtorZ self-assigned this Jul 30, 2021

abailly-iohk mentioned this issue Aug 9, 2021

Cryptic failure report in test execution #3488

Closed

effectfully added the status: needs triage GH issues that requires triage label Feb 1, 2023

ch1bo closed this as completed Feb 2, 2023

effectfully removed the status: needs triage GH issues that requires triage label Feb 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Confusing behavior / sometimes failing validators for semantically equivalent changes. #3648

Confusing behavior / sometimes failing validators for semantically equivalent changes. #3648

KtorZ commented Jul 30, 2021 •

edited

Loading

michaelpj commented Jul 30, 2021

michaelpj commented Jul 30, 2021

michaelpj commented Jul 30, 2021

KtorZ commented Jul 30, 2021

KtorZ commented Jul 30, 2021

ch1bo commented Aug 10, 2021

effectfully commented Feb 1, 2023

abailly-iohk commented Feb 2, 2023

KtorZ commented Feb 2, 2023

ch1bo commented Feb 2, 2023

effectfully commented Feb 2, 2023

Confusing behavior / sometimes failing validators for semantically equivalent changes. #3648

Confusing behavior / sometimes failing validators for semantically equivalent changes. #3648

Comments

KtorZ commented Jul 30, 2021 • edited Loading

Area

Summary

History

Steps to reproduce

Starting point

1. and -> &&

2. Alternative (equivalent) implementation

3. Removing a (truthy) predicate

4. Changing unrelated code

Expected behavior

System info (please complete the following information):

Screenshots and attachments

Additional context

michaelpj commented Jul 30, 2021

michaelpj commented Jul 30, 2021

michaelpj commented Jul 30, 2021

KtorZ commented Jul 30, 2021

KtorZ commented Jul 30, 2021

ch1bo commented Aug 10, 2021

effectfully commented Feb 1, 2023

abailly-iohk commented Feb 2, 2023

KtorZ commented Feb 2, 2023

ch1bo commented Feb 2, 2023

effectfully commented Feb 2, 2023

KtorZ commented Jul 30, 2021 •

edited

Loading

1. `and` -> `&&`