Verify that equivalent RawVals and ScVals compare the same #743

Closed
brson opened this issue Mar 24, 2023 · 25 comments · Fixed by stellar/rs-soroban-sdk#957
Labels: bug (Something isn't working)

Comments

brson commented Mar 24, 2023

Per 225cf4f#r1144119247, it is crucial that every equivalent pair of RawVals and ScVals produces the same comparison result.

It looks to me like RawVal uses a custom Compare trait - RawVals don't appear to implement Ord, while ScVal does. So presumably the result of Compare on RawVals needs to be equivalent to the result of Ord::cmp (and PartialOrd::partial_cmp) on the corresponding ScVals.

This should be testable with the fuzzing infrastructure in stellar/rs-soroban-sdk#878, so I intend to try that. Fuzzing won't guarantee that all possible values compare the same, but it is better than nothing.
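
As a rough, self-contained illustration of the property I intend to fuzz (toy stand-in types only - none of these names are the real RawVal/ScVal API):

```rust
use std::cmp::Ordering;

// Toy stand-ins: the "raw" type uses a hand-written comparison, the "sc" type
// derives Ord, mirroring the RawVal/ScVal split described above.
#[derive(Clone, Copy, Debug)]
struct ToyRawVal(i64);

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ToyScVal(i64);

// Hand-written comparison, playing the role of Compare<RawVal>.
fn compare_raw(a: &ToyRawVal, b: &ToyRawVal) -> Ordering {
    a.0.cmp(&b.0)
}

// Conversion, playing the role of RawVal -> ScVal.
fn to_sc(v: &ToyRawVal) -> ToyScVal {
    ToyScVal(v.0)
}

// The property under test: for every pair of values, the hand-written
// comparison and the derived Ord of the converted values must agree.
fn orderings_agree(a: ToyRawVal, b: ToyRawVal) -> bool {
    compare_raw(&a, &b) == to_sc(&a).cmp(&to_sc(&b))
}

fn main() {
    for (x, y) in [(1, 2), (5, 5), (-3, 7)] {
        assert!(orderings_agree(ToyRawVal(x), ToyRawVal(y)));
    }
}
```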

cc @graydon @dmkozh

brson added the bug label Mar 24, 2023
brson commented Mar 26, 2023

I have started fuzzing as described and have found differences in how RawVal and ScVal implement comparison.

ScVals derive Ord, and RawVal implements Compare by hand, so to remediate issues I am currently focusing on making the RawVal implementations match the derived ScVal implementations.

It may be prudent, though, to write the Ord impls for XDR types by hand if it is important that they not change. As an example, the ScVal enum's derived Ord impl depends on the declaration order of the ScVal variants, something that is prone to changing as the XDR definitions change.
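
To make the concern concrete, here is a small self-contained example (not the real ScVal) showing that a derived Ord follows the declaration order of the variants, so reordering the definition silently changes comparison results:

```rust
use std::cmp::Ordering;

// Same two variants, declared in different orders.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
enum Before {
    U64(u64),
    Bytes(Vec<u8>),
}

#[derive(PartialEq, Eq, PartialOrd, Ord)]
enum After {
    Bytes(Vec<u8>),
    U64(u64),
}

fn main() {
    // The derived Ord compares the variant position first, so the "same"
    // pair of values orders differently in the two enums.
    assert_eq!(Before::U64(1).cmp(&Before::Bytes(vec![0])), Ordering::Less);
    assert_eq!(After::U64(1).cmp(&After::Bytes(vec![0])), Ordering::Greater);
}
```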

brson commented Mar 26, 2023

If it is crucial for these comparison functions to be the same, then perhaps it is possible to just remove the Ord impl from ScVal and always do comparisons as RawVals. It seems very high-risk to have these two impls that might diverge over time, even if we fix them once with some confidence.

Or maybe there is some way to unify the impls, though it seems difficult.

dmkozh commented Mar 27, 2023

Thanks for looking into this!

The issue is indeed concerning, though maybe it's not as bad as it seems.

For the particular case described in the PR, only equivalence of Eq comparisons is important - the internal order divergence doesn't really matter, as we're only interested in the key-retrieval interface.

However, I agree that Ord doesn't seem useful/correct for ScVal. Besides the mentioned issue of RawVal divergence, since ScVal is defined in XDR, different XDR libraries might define different comparison algorithms (as it's not part of the XDR standard). So even if I wanted to do some client-side sorting of ScVal, there is no way to guarantee that it would match the RawVal comparison logic. I don't think there should be cases where comparing ScVals is useful on the host side, though I might be wrong.

Looking for comments from @graydon as well.

brson commented Mar 31, 2023

Thanks for the clarifications @dmkozh.

I've continued this fuzzing under the assumption that the only comparisons that matter are those where one of the RawVal or ScVal comparisons returns Ordering::Equal.

Is it important that impl Compare<ScVal> for Budget has the same logic as impl Compare<RawVal> for Host and/or PartialOrd for ScVal?

I have found one bug in Compare<ScVec> for Budget where differing vector lengths are not accounted for. Likewise for Compare<ScMap> for Budget.

graydon commented Mar 31, 2023

Unfortunately it's not just a matter of equal comparisons; we also need identical behaviour with respect to the total order of the values. This is because we treat maps as ordered collections, and their order is canonical so as not to admit logically-equal maps that have physically-different (encoded) representations. This is better than the alternative, in which we don't canonicalize the order of values and then a signature on a message {"a": 10, "b": 11} doesn't apply to a message {"b": 11, "a": 10}.

Put another way: users think of maps as sets-of-pairs, and sets-of-pairs have no logical order, but hashing relies on a physical order, so we impose a canonical order to the not-logically-ordered artifact to eliminate the ability to make logically-equal but physically-different maps.

This is all 100% intentional and part of the design.
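
For intuition, here is a tiny self-contained sketch of the canonicalization argument (toy encoding, not the real host code): sorting entries by key before encoding makes two logically-equal maps produce identical encodings, so a hash or signature over one covers the other.

```rust
// Encode a map as a sorted list of "key:value" pairs; the sort imposes the
// canonical order on the not-logically-ordered set of pairs.
fn canonical_encoding(mut entries: Vec<(&str, i64)>) -> String {
    entries.sort_by(|a, b| a.0.cmp(b.0));
    entries
        .iter()
        .map(|(k, v)| format!("{k}:{v}"))
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    // {"a": 10, "b": 11} built in two different insertion orders.
    let m1 = vec![("a", 10), ("b", 11)];
    let m2 = vec![("b", 11), ("a", 10)];
    // Identical canonical encodings, so a hash/signature over one applies to the other.
    assert_eq!(canonical_encoding(m1), canonical_encoding(m2));
}
```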

graydon commented Mar 31, 2023

(Also practically speaking the "key-retrieval interface" relies on binary search in sorted maps, which relies on proper ordering, so key retrieval will fail if the ordering isn't correct.)
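
A small illustration of that point with toy data (not the host map type): binary search only upholds its contract on input sorted under the same ordering it searches with.

```rust
fn main() {
    // Sorted by key: binary search finds the entry.
    let sorted = vec![(1, "a"), (2, "b"), (3, "c")];
    assert!(sorted.binary_search_by_key(&2, |e| e.0).is_ok());

    // Mis-sorted: the key 1 is present, but the search precondition is
    // violated, so lookup is unreliable (here it reports "not found").
    let mis_sorted = vec![(2, "b"), (3, "c"), (1, "a")];
    println!("{:?}", mis_sorted.binary_search_by_key(&1, |e| e.0));
}
```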

dmkozh commented Apr 3, 2023

> Also practically speaking the "key-retrieval interface" relies on binary search in sorted maps, which relies on proper ordering, so key retrieval will fail if the ordering isn't correct.

Actually, what happens if we provide an incorrectly sorted map as a user input? Will the host fail or will we just instantiate a 'broken' map?

I think it would be nice to avoid having two different comparison implementations. The comparisons should normally happen only on RawVals. ScVal is just a serialization format, so as long as it doesn't shuffle the data during conversions, we should be good w.r.t. hashing. Basically, the host implementation of RawVal comparison should be the source of truth, and we shouldn't really need to sort ScVals on the host side, should we? The client implementations of course need to follow the host implementation, but that is much less dangerous than the host itself having divergent implementations.

graydon commented Apr 3, 2023

If you build a map out of a mis-sorted vector of pairs, the call to initialize the map fails. If you try to deserialize a mis-sorted ScMap into a host map, the deserialization fails.

The host never sorts ScVal by Ord but an external user / guest / other client can. Even if we turned off Ord on the client, nothing stops someone else from turning it on. If there are divergences between a naive structural sort as generated by #[derive(Ord)] and what we're doing in Compare<RawVal> I actually really want to know and understand what's causing them.

graydon commented Apr 3, 2023

(if divergence is arising from something deeply subtle and impossible to guard against, I can imagine being persuaded to turn off #[derive(Ord)] on the XDR, but it's generally a useful feature in XDR and it seems much more likely to me that a divergence arises from a bug in Compare<RawVal>. I want to find those!)

dmkozh commented Apr 3, 2023

> The host never sorts ScVal by Ord

But why do we implement Compare<ScVal> for Budget then, if it's not intended to be called by the host? Ord on XDR is fine because it's not part of the host implementation and divergence should (hopefully) only affect the client.

I agree that there might be bugs in RawVal comparisons as well and we should make sure it works as expected.

graydon commented Apr 3, 2023

The host does call Compare<ScVal> for Budget in the storage map. It doesn't call ScVal::Ord because that's un-metered.

dmkozh commented Apr 3, 2023

Right, that's what we do now. Would it be too silly to use RawVals for the storage comparisons as well? I realize that we have to compute the LedgerKey anyway to verify its presence in the footprint, but that would be a bit safer and the overhead IIUC would come only from storing an additional RawVal per map entry. Maybe it's not worth the effort, just something to consider.

brson commented Apr 4, 2023

There are many discrepancies between impl Ord for ScVal and the impls of Compare - so many that it would help to know for sure, before continuing, that it is definitely important that they agree. Furthermore, I have found that some cases of Compare<ScVal> for Budget do defer to impls of Ord for ScVal, and that does seem to result in potentially incorrect behavior that deviates from Compare<RawVal> for Env.

I'm a bit overwhelmed, not sure what should be changed and what should be ignored, but here are my concrete findings so far.

Patches

These are patches I've made to change the behavior of comparison functions, with explanations.

 impl Ord for Status {
     #[inline(always)]
     fn cmp(&self, other: &Self) -> Ordering {
-        let self_tup = (self.as_raw().get_major(), self.as_raw().get_minor());
-        let other_tup = (other.as_raw().get_major(), other.as_raw().get_minor());
+        let self_tup = (self.as_raw().get_minor(), self.as_raw().get_major());
+        let other_tup = (other.as_raw().get_minor(), other.as_raw().get_major());
         self_tup.cmp(&other_tup)
     }
 }

This makes Compare<RawVal> for Env agree with Ord for ScStatus - the minor part is the ScStatus variant, the major part the contained value.


+// Note that these must have the same order as the impl
+// of Ord for ScVal, re https://github.com/stellar/rs-soroban-env/issues/743
 fn host_obj_discriminant(ho: &HostObject) -> usize {
     match ho {
-        HostObject::Vec(_) => 0,
-        HostObject::Map(_) => 1,
-        HostObject::U64(_) => 2,
-        HostObject::I64(_) => 3,
-        HostObject::TimePoint(_) => 4,
-        HostObject::Duration(_) => 5,
-        HostObject::U128(_) => 6,
-        HostObject::I128(_) => 7,
-        HostObject::U256(_) => 8,
-        HostObject::I256(_) => 9,
-        HostObject::Bytes(_) => 10,
-        HostObject::String(_) => 11,
-        HostObject::Symbol(_) => 12,
-        HostObject::Address(_) => 13,
-        HostObject::ContractExecutable(_) => 14,
+        HostObject::U64(_) => 0,
+        HostObject::I64(_) => 1,
+        HostObject::TimePoint(_) => 2,
+        HostObject::Duration(_) => 3,
+        HostObject::U128(_) => 4,
+        HostObject::I128(_) => 5,
+        HostObject::U256(_) => 6,
+        HostObject::I256(_) => 7,
+        HostObject::Bytes(_) => 8,
+        HostObject::String(_) => 9,
+        HostObject::Symbol(_) => 10,
+        HostObject::Vec(_) => 11,
+        HostObject::Map(_) => 12,
+        HostObject::ContractExecutable(_) => 13,
+        HostObject::Address(_) => 14,
         HostObject::NonceKey(_) => 15,
     }

This makes the discriminants agree with Ord for ScVal, so that when objects of different types are compared they compare the same as Ord for ScVal.


@@ -211,13 +214,9 @@ impl Compare<ScVec> for Budget {
     type Error = HostError;

     fn compare(&self, a: &ScVec, b: &ScVec) -> Result<Ordering, Self::Error> {
-        for (a, b) in a.iter().zip(b.iter()) {
-            match self.compare(a, b)? {
-                Ordering::Equal => (),
-                unequal => return Ok(unequal),
-            }
-        }
-        Ok(Ordering::Equal)
+        let a: &Vec<ScVal> = &*a;
+        let b: &Vec<ScVal> = &*b;
+        self.compare(a, b)
     }
 }

This handles the case where the vecs have different lengths. Seems to just be a plain bug.


@@ -225,13 +224,22 @@ impl Compare<ScMap> for Budget {
     type Error = HostError;

     fn compare(&self, a: &ScMap, b: &ScMap) -> Result<Ordering, Self::Error> {
-        for (a, b) in a.iter().zip(b.iter()) {
-            match self.compare(&(&a.key, &a.val), &(&b.key, &b.val))? {
-                Ordering::Equal => (),
-                unequal => return Ok(unequal),
+        let a: &Vec<ScMapEntry> = &*a;
+        let b: &Vec<ScMapEntry> = &*b;
+        self.compare(a, b)
+    }
+}
+
+impl Compare<ScMapEntry> for Budget {
+    type Error = HostError;
+
+    fn compare(&self, a: &ScMapEntry, b: &ScMapEntry) -> Result<Ordering, Self::Error> {
+        match self.compare(&a.key, &b.key)? {
+            Ordering::Equal => {
+                self.compare(&a.val, &b.val)
             }
+            cmp => Ok(cmp),
         }
-        Ok(Ordering::Equal)
     }

This handles the case where the maps have different lengths. Seems to just be a plain bug.
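
To make the length issue in both patches concrete, here is a self-contained reproduction of the pattern (plain slices, not the actual Budget impls): an element-wise loop over zip() never looks at the leftover tail, so a strict prefix wrongly compares Equal, while the standard Vec/slice ordering treats the shorter prefix as Less.

```rust
use std::cmp::Ordering;

// The buggy pattern: element-wise comparison over zip() ignores extra
// elements in the longer input.
fn zip_compare(a: &[i32], b: &[i32]) -> Ordering {
    for (x, y) in a.iter().zip(b.iter()) {
        match x.cmp(y) {
            Ordering::Equal => (),
            unequal => return unequal,
        }
    }
    Ordering::Equal
}

fn main() {
    let a = vec![1, 2];
    let b = vec![1, 2, 3];
    assert_eq!(zip_compare(&a, &b), Ordering::Equal); // prefix wrongly compares Equal
    assert_eq!(a.cmp(&b), Ordering::Less);            // std Vec ordering: shorter prefix is Less
}
```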


Unresolved findings

I haven't made patches for these because they would take a bunch of effort that might not be necessary, depending on whether it matters how Ord for ScVal is implemented.

i128/u128 comparisons

When both values are i128 or u128, Compare does a 128-bit numeric comparison, whereas the derived Ord impls on the XDR types compare the two u64 struct fields in declaration order, lo then hi.

pub struct Int128Parts {
    pub lo: u64,
    pub hi: u64,
}
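
A small self-contained check of the divergence (using the field order of the struct above; the helper name is just for illustration): ordering by the (lo, hi) pair disagrees with numeric i128 ordering as soon as the high words differ.

```rust
use std::cmp::Ordering;

// Split an i128 into (lo, hi), matching the field order of the old
// Int128Parts, which is the order a derived Ord compares.
fn lo_hi(v: i128) -> (u64, u64) {
    (v as u64, (v as u128 >> 64) as u64)
}

fn main() {
    let a: i128 = 1 << 64; // hi = 1, lo = 0
    let b: i128 = 1;       // hi = 0, lo = 1
    assert_eq!(a.cmp(&b), Ordering::Greater);            // numeric order
    assert_eq!(lo_hi(a).cmp(&lo_hi(b)), Ordering::Less); // (lo, hi) field order
}
```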

Compare<ScVal> for Budget falls back to Ord for ScVal

Related to the above.

impl Compare<ScVal> for Budget {
    type Error = HostError;

    fn compare(&self, a: &ScVal, b: &ScVal) -> Result<Ordering, Self::Error> {
        use ScVal::*;
        match (a, b) {
            (Vec(Some(a)), Vec(Some(b))) => self.compare(a, b),
            (Map(Some(a)), Map(Some(b))) => self.compare(a, b),

            (Vec(None), _) | (_, Vec(None)) | (Map(None), _) | (_, Map(None)) => {
                Err(ScHostValErrorCode::MissingObject.into())
            }

            (Bytes(a), Bytes(b)) => {
                <Self as Compare<&[u8]>>::compare(self, &a.as_slice(), &b.as_slice())
            }

            (String(a), String(b)) => {
                <Self as Compare<&[u8]>>::compare(self, &a.as_slice(), &b.as_slice())
            }

            (Symbol(a), Symbol(b)) => {
                <Self as Compare<&[u8]>>::compare(self, &a.as_slice(), &b.as_slice())
            }

            (Bool(_), _)
            | (Void, _)
            | (Status(_), _)
            | (U32(_), _)
            | (I32(_), _)
            | (U64(_), _)
            | (I64(_), _)
            | (Timepoint(_), _)
            | (Duration(_), _)
            | (U128(_), _)
            | (I128(_), _)
            | (U256(_), _)
            | (I256(_), _)
            | (Bytes(_), _)
            | (String(_), _)
            | (Symbol(_), _)
            | (Vec(_), _)
            | (Map(_), _)
            | (ContractExecutable(_), _)
            | (Address(_), _)
            | (LedgerKeyContractExecutable, _)
            | (LedgerKeyNonce(_), _) => Ok(a.cmp(b)),
        }
    }
}

The fallthrough case defers to Ord for ScVal. As mentioned above, at least the i128 case disagrees with Compare<RawVal> for Env, because XDR i128s are represented not as 128-bit integers but as two 64-bit unsigned fields.

Tag values disagree with ScVal variants

This shows up when, e.g., comparing a U64 object and a small I128: Compare<RawVal> falls back to comparing the Tag discriminants, and a U64 object compares greater than an I128 small value, whereas in XDR all U64s are less than all I128s.

#[repr(u8)]
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub enum Tag {

    /// Tag for a [RawVal] that contains a [u64] small enough to fit in 56 bits.
    U64Small = 6,

    /// Tag for a [RawVal] that contains an [i64] small enough to fit in 56 bits.
    I64Small = 7,

    ...

    /// Tag for a [RawVal] that contains a [u128] small enough to fit in 56 bits.
    U128Small = 10,

    /// Tag for a [RawVal] that contains a [i128] small enough to fit in 56 bits.
    I128Small = 11,

    ...

    /// Tag for a [RawVal] that refers to a host-side [u64] number.
    U64Object = 64,

    /// Tag for a [RawVal] that refers to a host-side [i64] number.
    I64Object = 65,
}

graydon commented Apr 5, 2023

These are fantastic findings @brson thank you! Much to think about and discuss here. The i128 case probably has a similar companion issue around signed 256-bit values (since they're stored as 32-byte octet arrays in XDR, which won't sort correctly for negative values).

Concerning the vexing cases you identified but haven't patched yet, I think there are fixes to be had (besides "abandon the whole exercise and revisit the question of even having Ord on XDR", which I don't feel we're quite at the point of yet):

  • I think we can fix the 128-bit and 256-bit issues by changing the XDR:

    • Switching the representation of 128-bit numbers to a big-endian byte array, as with i256
    • Adding an explicit bool positive field before it, so negatives sort before positives and the body is always unsigned
    • Extending the ScVal validation code to exclude too-small, too-big, and negative-zero signed-number cases, for both i128 and i256
      (though we could also treat this whole exercise as a signal to remove signed 128 and 256 options, but that'd be much more invasive)
  • I'm not 100% sure about the Compare<RawVal> case you're describing. In the code I see, if either input to the comparison is an object type, we delegate to the environment's obj_cmp function (which has special cases for all obj-vs-small and small-vs-obj mixtures), and we only do a tag-vs-tag comparison in Compare<RawVal> in the residual case, which is small-vs-small. Though .. I see now that in the environment itself, in Host::obj_cmp, in the final residual branch (host.rs line 1714) when we were given a mix of object and small and they're not of the same underlying logical-ScValType, we do a tag-based comparison, which .. yeah, I suppose that case is wrong! Is that the one you meant? If so, great catch! I think it's fairly readily solvable by making a helper method on Tag that maps from Tag to its underlying (coarser) ScValType, and then comparing those (sketched below). We only ever wind up in this branch if we were given values of unequal logical-ScValType, so making that inequality the basis for comparison should work and make things like U64-object-less-than-i128-small compare the same as they do in XDR.
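
A minimal sketch of that helper idea, with illustrative type and variant names (the real enums have many more variants, and this is not the actual host code):

```rust
// Illustrative miniature of the Tag -> "underlying ScValType" mapping.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum LogicalType {
    U64,
    I64,
    U128,
    I128,
}

#[derive(Clone, Copy, Debug)]
enum Tag {
    U64Small,
    I64Small,
    U128Small,
    I128Small,
    U64Object,
    I64Object,
}

impl Tag {
    // The suggested helper: forget the small/object distinction and keep only
    // the logical value type.
    fn logical_type(self) -> LogicalType {
        match self {
            Tag::U64Small | Tag::U64Object => LogicalType::U64,
            Tag::I64Small | Tag::I64Object => LogicalType::I64,
            Tag::U128Small => LogicalType::U128,
            Tag::I128Small => LogicalType::I128,
        }
    }
}

fn main() {
    // In the real Tag enum, U64Object (64) is numerically above I128Small (11),
    // so a raw tag comparison says "greater"; comparing logical types instead
    // recovers the XDR order: U64 < I128.
    assert!(Tag::U64Object.logical_type() < Tag::I128Small.logical_type());
}
```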

brson commented Apr 5, 2023

> Though .. I see now that in the environment itself, in Host::obj_cmp, in the final residual branch (host.rs line 1714) when we were given a mix of object and small and they're not of the same underlying logical-ScValType, we do a tag-based comparison, which .. yeah, I suppose that case is wrong! Is that the one you meant?

Yes, I believe that is the case where I am seeing a failure.

There are several types that I don't have fuzzing for yet, including the 256-bit integers and Timepoint. I'll add support for them soon, start making the fixes you've suggested, and continue fuzzing.

graydon commented Apr 7, 2023

OK, I spent a bunch of time staring at the definition of two's complement and running tests that enumerate all the boundary conditions, because despite all attempts after decades of mistakes I just don't trust my guesses about splitting numbers and sign bits at all. But I am fairly convinced at this point that the following representation works right:

struct UInt128Parts {
    uint64 hi;
    uint64 lo;
};

// A signed int128 has a high sign bit and 127 value bits. We break it into a
// signed high int64 (that carries the sign bit and the high 63 value bits) and
// a low unsigned uint64 that carries the low 64 bits. This will sort in
// generated code in the same order the underlying int128 sorts.
struct Int128Parts {
    int64 hi;
    uint64 lo;
};

struct UInt256Parts {
    uint64 hi_hi;
    uint64 hi_lo;
    uint64 lo_hi;
    uint64 lo_lo;
};

// Repeat the same signed high word, unsigned low words thing with int256. 
struct Int256Parts {
    int64 hi_hi;
    uint64 hi_lo;
    uint64 lo_hi;
    uint64 lo_lo;
};

Feel free to pick better names for the fields that convey their most-to-least-significant order.

I think this is better than trying to recycle the "byte array" representation both because it's marginally faster and also because in XDR we don't have signed bytes at all, and we need a signed 2s complement integer type of some sort for the most significant byte-or-word-or-whatever of the encoding. Since int64 and uint64 are kicking around, we might as well use 'em.
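
A quick self-contained check of the ordering claim (not the real generated code; the helper name is just for illustration): splitting an i128 into a signed high i64 and an unsigned low u64 gives a (hi, lo) pair whose lexicographic ordering matches the numeric ordering, including across the sign boundary.

```rust
// Split into the proposed representation: signed high word, unsigned low word.
fn hi_lo(v: i128) -> (i64, u64) {
    ((v >> 64) as i64, v as u64)
}

fn main() {
    let samples: [i128; 7] = [
        i128::MIN,
        -(1_i128 << 64) - 1,
        -1,
        0,
        1,
        (1_i128 << 64) + 1,
        i128::MAX,
    ];
    // Pairwise: numeric ordering and (hi, lo) tuple ordering agree.
    for a in samples {
        for b in samples {
            assert_eq!(a.cmp(&b), hi_lo(a).cmp(&hi_lo(b)));
        }
    }
    println!("signed-hi / unsigned-lo ordering matches i128 ordering on all sample pairs");
}
```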

There's a bunch of code to change to make this work in the XDR, the guest, and the host. I talked to @jayz22 about this today and he suggested that he might want to do it (since doing the int256 host functions is currently on his plate anyway). I have no preference who does it -- I will if nobody else does! -- but I think this is the right change to make to both of the larger-integer types in XDR.

jayz22 commented Apr 7, 2023

@graydon Yes I am happy to do it!

brson commented Apr 9, 2023

I am working on a PR to just land the three patches from my previous comment. I am struggling to write comprehensive unit tests, though: these bugs are fixed in the soroban-env crate, but I want to write proptests in the soroban-sdk crate, and I can't write proptests without landing a bunch of my broken/incomplete fuzzing stuff.

So I'll try to land some of these fixes with cursory unit tests, then endeavor to write comprehensive tests as I get my fuzzing patches straightened out.

brson commented Apr 10, 2023

I put some initial fixes in #762

I'll rebase the fuzzer and try to get more results later this week.

jayz22 commented Apr 12, 2023

@brson The i/u128 issue should be fixed now by #763, which also fixed/added support for i/u256. You should be able to fuzz them now.

brson commented Apr 14, 2023

I've posted a patch to fix a case where Symbol objects can contain invalid chars: #765

Next I will fix the previously mentioned case where objects and smallvals of different types do not compare correctly.

brson commented Apr 17, 2023

I have written a patch to fix the comparison between objects and smallvals of different types, but not submitted a pull request.

The only new thing I've noticed is that Tag::Bad seems a bit special - any RawVal with an invalid tag will compare equal to any other RawVal with an invalid tag; and invalid RawVals also don't have an equivalent XDR type. I don't see a definite problem with Tag::Bad but I can imagine it being used to confuse container types, with multiple different RawVals comparing the same.

Right now the fuzzer is running without failing. I have a few more things to add to it. And once I'm not finding any more problems, I will look into converting the fuzz tests to proptests that will be easier to run as part of the test suite.

brson commented Apr 19, 2023

Here's the fix for comparisons between objects and smallvals of differing types: #767

brson commented May 1, 2023

As of now all of the bugs I'm aware of related to this issue are fixed. I still intend to land proptests for this, and possibly expand the fuzzer to more edge cases.

The latest curiosity I have discovered, which may or may not be a bug, is:

It is possible to create Status values in contracts that can't be converted to XDR types - Status values can contain arbitrary status codes, but the XDR definitions are limited to a fixed set of enumerated status codes. I don't know what impact this could have on contracts that create these semi-invalid statuses.

brson commented May 8, 2023

Here's one more minor problem, where some Statuses can't be converted to ScStatus even though they are representable in XDR: #803

leighmcculloch added a commit to stellar/rs-soroban-sdk that referenced this issue Jun 21, 2023
### What

Preliminary support for fuzzing Soroban contracts with
[cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz/). This patch is
primarily concerned with introducing a pattern for implementing the
[Arbitrary](https://docs.rs/arbitrary/latest/arbitrary/) trait, by which
fuzz tests generate semi-random Rust values.

This is a new revision of the previous PR #878. Little has changed
functionally since that PR.

This patch additionally uses the Arbitrary impls with the
[`proptest-arbitrary-interop`
crate](https://github.com/graydon/proptest-arbitrary-interop) to add
proptests that help ensure that RawVals and ScVals can be converted
between each other and that their comparison functions are equivalent,
which closes stellar/rs-soroban-env#743.

This patch introduces the SorobanArbitrary trait which looks like this:

```rust
    pub trait SorobanArbitrary:
        TryFromVal<Env, Self::Prototype> + IntoVal<Env, RawVal> + TryFromVal<Env, RawVal>
    {
        type Prototype: for<'a> Arbitrary<'a>;
    }
```

Basically every type relevant to Soroban contracts implements (or
should implement) this trait, including i32, u32, i64, u64, i128, u128,
(), bool, I256Val, U256Val, Error, Bytes, BytesN, Vec, Map, Address,
Symbol, TimepointVal, and DurationVal.

The `#[contracttype]` macro automatically derives an implementation,
along with a type that implements `SorobanArbitrary::Prototype`.

In use, the trait looks like this:

```rust
use libfuzzer_sys::fuzz_target; // provides the fuzz_target! macro
use soroban_sdk::{Address, Env, IntoVal, Vec}; // IntoVal is needed for into_val below
use soroban_sdk::contracttype;
use soroban_sdk::arbitrary::{Arbitrary, SorobanArbitrary};
use std::vec::Vec as RustVec;

#[derive(Arbitrary, Debug)]
struct TestInput {
    deposit_amount: i128,
    claim_address: <Address as SorobanArbitrary>::Prototype,
    time_bound: <TimeBound as SorobanArbitrary>::Prototype,
}

#[contracttype]
pub struct TimeBound {
    pub kind: TimeBoundKind,
    pub timestamp: u64,
}

#[contracttype]
pub enum TimeBoundKind {
    Before,
    After,
}

fuzz_target!(|input: TestInput| {
    let env = Env::default();
    let claim_address: Address = input.claim_address.into_val(&env);
    let time_bound: TimeBound = input.time_bound.into_val(&env);
    // fuzz the program based on the input
});
```

A more complete example is at
https://github.com/brson/rs-soroban-sdk/blob/val-fuzz/soroban-sdk/fuzz/fuzz_targets/fuzz_rawval_cmp.rs


### Why

This patch assumes it is desirable to fuzz Soroban contracts with
cargo-fuzz.

Soroban reference types can only be constructed with an `Env`, but the
`Arbitrary` trait constructs values from nothing but bytes. The
`SorobanArbitrary` trait provides a pattern to bridge this gap,
expecting fuzz tests to construct `SorobanArbitrary::Prototype` types,
and then convert them to their final type with `FromVal`/`IntoVal`.

There are a lot of docs here and hopefully they explain what's going on
well enough.

### fuzz_catch_panic

This patch also introduces a helper function, `fuzz_catch_panic`, which
is built off of the `call_with_suppressed_panic_hook` function in
soroban-env-host.

The `fuzz_target!` macro overrides the Rust panic hook to abort on
panic, assuming all panics are bugs, but Soroban contracts fail by
panicking, and I have found it desirable to fuzz failing contracts.
`fuzz_catch_panic` temporarily prevents the fuzzer from aborting on
panic.


### Known limitations

The introduction of SorobanArbitrary requires a bunch of extra
documentation to explain why Soroban contracts can't just use the stock
Arbitrary trait.

As an alternative to this approach, we could instead expect users to
construct XDR types, not SorobanArbitrary::Prototype types, and convert
those to RawVals. I have been assuming that contract authors should
rather concern themselves with contract types, and not the serialization
types; and presently there are a number of XDR types that have values
which can't be converted to contract types. The
SorobanArbitrary::Prototype approach does allow more flexibility in the
generation of contract types, e.g. we can generate some adversarial
types like invalid object references and bad tags.

Contracts must use `IntoVal` to create the final types, but these traits
are under-constrained for this purpose and always require type hints:

```rust
fuzz_target!(|input: <Address as SorobanArbitrary>::Prototype| {
    let env = Env::default();
    let address: Address = input.into_val(&env);
    // fuzz the program based on the input
});
```

This is quite unfortunate because it means the real type must be named
twice.

This patch introduces a new private module `arbitrary_extra` which
simply defines a bunch of new conversions like

```rust
impl TryFromVal<Env, u32> for u32 {
    type Error = ConversionError;
    fn try_from_val(_env: &Env, v: &u32) -> Result<Self, Self::Error> {
        Ok(*v)
    }
}
```

These conversions are required by `SorobanArbitrary`, which is only
defined when `cfg(feature = "testutils")`; the `arbitrary_extra` module
defines these conversions for all cfgs to ensure type inference is
always the same, but the impls are probably useless outside of the
SorobanArbitrary prototype pattern.

Crates that use `#[contracttype]` need to define a "testutils" feature
if they want to use Arbitrary.

This patch doesn't generate "adversarial" values, which might include:

- RawVals with bad tags
- objects that have invalid references
- objects that are reused multiple times
- deeply nested vecs and maps
- vecs and maps that contain heterogenous element types

The arbitrary module has unit tests, and these help especially ensure
the macros for custom types are correct, but the tests are not
exhaustive.

---------

Co-authored-by: Leigh McCulloch <[email protected]>
Co-authored-by: Graydon Hoare <[email protected]>