-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verify that equivalent RawVals and ScVals compare the same #743
Comments
I have started fuzzing as described and have found differences in how RawVal and ScVal implement comparison. ScVal's derive Ord, and RawVal implements Compare by hand, so to remediate issues I am currently focusing on making the RawVal implementations match the derived ScVal implementations. It may be prudent though to write the Ord impls for XDR types by hand if it is important that they not change. As an example, the ScVal enum's Ord impl depends on the lexical order of ScVal variants, something that is prone to changing as the XDR definitions change. |
If it is crucial for these comparison functions to be the same then perhaps it is possible to just remove the Ord impl from ScVal and always do comparisons as RawVals. It seems very high-risk to have these two impls that might diverge over time, even if we fix them once with some confidence. Or maybe there is some way to unify the impls, though it seems difficult. |
Thanks for looking into this! The issue is indeed concerning, though maybe it's not as bad as it seems. For the particular case described in the PR only equivalence of Eq comparisons is important - the internal order divergence doesn't really matter as we're interested in the key retrieval interface only. However, I agree that Looking for comments from @graydon as well. |
Thanks for the clarifications @dmkozh. I've continued this fuzzing under the assumption that the only comparisons that matter are those where one of the RawVal or ScVal comparisons returns Ordering::Equal. Is it important that I have found one bug in |
Unfortunately it's not just a matter of equal comparisons, we also need identical behaviour wrt. the total order of the values. This is because we treat maps as ordered collections, and their order is canonical so as not to admit logically-equal maps that have physically-different (encoded) representations. This is better than the alternative, in which we say we don't canonicalize the order of values and then a signature on a message Put another way: users think of maps as sets-of-pairs, and sets-of-pairs have no logical order, but hashing relies on a physical order, so we impose a canonical order to the not-logically-ordered artifact to eliminate the ability to make logically-equal but physically-different maps. This is all 100% intentional and part of the design. |
(Also practically speaking the "key-retrieval interface" relies on binary search in sorted maps, which relies on proper ordering, so key retrieval will fail if the ordering isn't correct.) |
Actually, what happens if we provide an incorrectly sorted map as a user input? Will the host fail or will we just instantiate a 'broken' map? I think it would be nice to avoid having two different comparison implementations. The comparisons should normally happen only on |
If you build a map out of a mis-sorted vector of pairs, the call to initialize the map fails. If you try to deserialize a mis-sorted ScMap into a host map, the deserialization fails. The host never sorts |
(if divergence is arising from something deeply subtle and impossible to guard against, I can imagine being persuaded to turn off |
But why do we implement I agree that there might be bugs in |
The host does call |
Right, that's what we do now. Would it be too silly to use |
There are many discrepancies between I'm a bit overwhelmed, not sure what should be changed and what should be ignored, but here are my concrete findings so far. PatchesThese are patches I've made to change the behavior of comparison functions, impl Ord for Status {
#[inline(always)]
fn cmp(&self, other: &Self) -> Ordering {
- let self_tup = (self.as_raw().get_major(), self.as_raw().get_minor());
- let other_tup = (other.as_raw().get_major(), other.as_raw().get_minor());
+ let self_tup = (self.as_raw().get_minor(), self.as_raw().get_major());
+ let other_tup = (other.as_raw().get_minor(), other.as_raw().get_major());
self_tup.cmp(&other_tup)
}
} This makes +// Note that these must have the same order as the impl
+// of Ord for ScVal, re https://github.com/stellar/rs-soroban-env/issues/743
fn host_obj_discriminant(ho: &HostObject) -> usize {
match ho {
- HostObject::Vec(_) => 0,
- HostObject::Map(_) => 1,
- HostObject::U64(_) => 2,
- HostObject::I64(_) => 3,
- HostObject::TimePoint(_) => 4,
- HostObject::Duration(_) => 5,
- HostObject::U128(_) => 6,
- HostObject::I128(_) => 7,
- HostObject::U256(_) => 8,
- HostObject::I256(_) => 9,
- HostObject::Bytes(_) => 10,
- HostObject::String(_) => 11,
- HostObject::Symbol(_) => 12,
- HostObject::Address(_) => 13,
- HostObject::ContractExecutable(_) => 14,
+ HostObject::U64(_) => 0,
+ HostObject::I64(_) => 1,
+ HostObject::TimePoint(_) => 2,
+ HostObject::Duration(_) => 3,
+ HostObject::U128(_) => 4,
+ HostObject::I128(_) => 5,
+ HostObject::U256(_) => 6,
+ HostObject::I256(_) => 7,
+ HostObject::Bytes(_) => 8,
+ HostObject::String(_) => 9,
+ HostObject::Symbol(_) => 10,
+ HostObject::Vec(_) => 11,
+ HostObject::Map(_) => 12,
+ HostObject::ContractExecutable(_) => 13,
+ HostObject::Address(_) => 14,
HostObject::NonceKey(_) => 15,
} This makes the discriminants agree with @@ -211,13 +214,9 @@ impl Compare<ScVec> for Budget {
type Error = HostError;
fn compare(&self, a: &ScVec, b: &ScVec) -> Result<Ordering, Self::Error> {
- for (a, b) in a.iter().zip(b.iter()) {
- match self.compare(a, b)? {
- Ordering::Equal => (),
- unequal => return Ok(unequal),
- }
- }
- Ok(Ordering::Equal)
+ let a: &Vec<ScVal> = &*a;
+ let b: &Vec<ScVal> = &*b;
+ self.compare(a, b)
}
} This handles the case where the vecs have different lengths. Seems to just be a plain bug. @@ -225,13 +224,22 @@ impl Compare<ScMap> for Budget {
type Error = HostError;
fn compare(&self, a: &ScMap, b: &ScMap) -> Result<Ordering, Self::Error> {
- for (a, b) in a.iter().zip(b.iter()) {
- match self.compare(&(&a.key, &a.val), &(&b.key, &b.val))? {
- Ordering::Equal => (),
- unequal => return Ok(unequal),
+ let a: &Vec<ScMapEntry> = &*a;
+ let b: &Vec<ScMapEntry> = &*b;
+ self.compare(a, b)
+ }
+}
+
+impl Compare<ScMapEntry> for Budget {
+ type Error = HostError;
+
+ fn compare(&self, a: &ScMapEntry, b: &ScMapEntry) -> Result<Ordering, Self::Error> {
+ match self.compare(&a.key, &b.key)? {
+ Ordering::Equal => {
+ self.compare(&a.val, &b.val)
}
+ cmp => Ok(cmp),
}
- Ok(Ordering::Equal)
} This handles the case where the maps have different lengths. Seems to just be a plain bug. Unresolved findingsI haven't made patches for these because they would take i128/u128 comparisonsWhen both values are i128 or u128, pub struct Int128Parts {
pub lo: u64,
pub hi: u64,
}
|
These are fantastic findings @brson thank you! Much to think about and discuss here. The i128 case probably has a similar companion issue around signed 256-bit values (since they're stored as 32-byte octet arrays in XDR, which won't sort correctly for negative values). Concerning the vexing cases you identified but haven't patched yet, I think there are fixes to be had (besides "abandon the whole exercise and revisit the question of even having
|
Yes, I believe that is the case where I am seeing a failure. There are several types that I don't have fuzzing for yet, including the 256-bit integers and Timepoint. I'll add support for them soon, start making the fixes you've suggested, and continue fuzzing. |
Ok I spent a bunch of time staring at the definition of 2s complement and running tests that enumerate all the boundary conditions because despite all attempts after decades of mistakes I just don't trust my guesses about splitting numbers and sign bits at all but .. I am fairly convinced at this point that the following representation works right:
Feel free to pick better names for the fields that convey their most-to-least-significant order. I think this is better than trying to recycle the "byte array" representation both because it's marginally faster and also because in XDR we don't have signed bytes at all, and we need a signed 2s complement integer type of some sort for the most significant byte-or-word-or-whatever of the encoding. Since int64 and uint64 are kicking around, we might as well use 'em. There's a bunch of code to change to make this work in both XDR, guest and host. I talked to @jayz22 about this today and he suggested that he might want to do it (since doing the int256 host functions is currently on his plate anyways). I have no preference who does it -- I will if nobody else does! -- but I think this is the right change to make to both of the larger-integer types in XDR. |
@graydon Yes I am happy to do it! |
I am working on a a PR to just land the three patches from me previous comment. I am struggling to write comprehensive unit tests though: these bugs are fixed in the soroban-env crate; but I want to write proptests in the soroban-sdk crate; but I can't write proptests without landing a bunch of my broken/incomplete fuzzing stuff. So I'll try to land some of these fixes with cursory unit tests, then endeavor to write comprehensive tests as I get my fuzzing patches straightened out. |
I put some initial fixes in #762 I'll rebase the fuzzer and try to get more results later this week. |
I've posted a patch to fix a case where Symbol objects can contain invalid chars: #765 Next I will fix the previously mentioned case where objects and smallvals of different types do not compare correctly. |
I have written a patch to fix the comparison between objects and smallvals of different types, but not submitted a pull request. The only new thing I've noticed is that Right now the fuzzer is running without failing. I have a few more things to add to it. And once I'm not finding any more problems, I will look into converting the fuzz tests to proptests that will be easier to run as part of the test suite. |
Here's the fix for comparisons between objects and smallvals of differing types: #767 |
As of now all of the bugs I'm aware of related to this issue are fixed. I still intend to land proptests for this, and possibly expand the fuzzer to more edge cases. The latest curiosity I have discovered, which I don't know if it is a bug or not is: It is possible to create Status values in contracts that can't be converted to XDR types - Status values can contain arbitrary status codes, but the XDR definitions are limited to a fixed set of enumerated status codes. I don't know what impact this could have on contracts that create these semi-invalid statuses. |
Here's one more minor problem where some Status's can't be converted to ScStatus even though they are representable in XDR: #803 |
### What Preliminary support for fuzzing Soroban contracts with [cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz/). This patch is primarily concerned with introducing a pattern for implementing the [Arbitrary](https://docs.rs/arbitrary/latest/arbitrary/) trait, by which fuzz tests generate semi-random Rust values. This is a new revision of previous pr #878. Little has changed functionally since that pr. This patch additionally uses the Arbitrary impls with [`proptest-arbitrary-interop` crate](https://github.com/graydon/proptest-arbitrary-interop) to add proptests that help ensure that: RawVals and ScVals can be converted between each other and their their comparison functions are equivalent, which closes stellar/rs-soroban-env#743. This patch introduces the SorobanArbitrary trait which looks like this: ```rust pub trait SorobanArbitrary: TryFromVal<Env, Self::Prototype> + IntoVal<Env, RawVal> + TryFromVal<Env, RawVal> { type Prototype: for<'a> Arbitrary<'a>; } ``` Basically every type relevant to soroban contracts implements (or should) this trait, including i32, u32, i64, u64, i128, u128, (), bool, I256Val, U256Val, Error, Bytes, BytesN, Vec, Map, Address, Symbol, TimepointVal, DurationVal. The `#[contracttype]` macro automatically derives an implementation, along with a type that implements `SorobanArbitrary::Prototype`. In use the trait looks like ```rust use soroban_sdk::{Address, Env, Vec}; use soroban_sdk::contracttype; use soroban_sdk::arbitrary::{Arbitrary, SorobanArbitrary}; use std::vec::Vec as RustVec; #[derive(Arbitrary, Debug)] struct TestInput { deposit_amount: i128, claim_address: <Address as SorobanArbitrary>::Prototype, time_bound: <TimeBound as SorobanArbitrary>::Prototype, } #[contracttype] pub struct TimeBound { pub kind: TimeBoundKind, pub timestamp: u64, } #[contracttype] pub enum TimeBoundKind { Before, After, } fuzz_target!(|input: TestInput| { let env = Env::default(); let claim_address: Address = input.claim_address.into_val(&env); let time_bound: TimeBound = input.time_bound.into_val(&env). // fuzz the program based on the input }); ``` A more complete example is at https://github.com/brson/rs-soroban-sdk/blob/val-fuzz/soroban-sdk/fuzz/fuzz_targets/fuzz_rawval_cmp.rs ### Why This patch assumes it is desirable to fuzz Soroban contracts with cargo-fuzz. Soroban reference types can only be constructed with an `Env`, but the `Arbitrary` trait constructs values from nothing but bytes. The `SorobanArbitrary` trait provides a pattern to bridge this gap, expecting fuzz tests to construct `SorobanArbitrary::Prototype` types, and then convert them to their final type with `FromVal`/`IntoVal`. There are a lot of docs here and hopefully they explain what's going on well enough. ### fuzz_catch_panic This patch also introduces a helper function, `fuzz_catch_panic`, which is built off of the `call_with_suppressed_panic_hook` function in soroban-env-host. The `fuzz_target!` macro overrides the Rust panic hook to abort on panic, assuming all panics are bugs, but Soroban contracts fail by panicking, and I have found it desirable to fuzz failing contracts. `fuzz_catch_panic` temporarily prevents the fuzzer from aborting on panic. ### Known limitations The introduction of SorobanArbitrary requires a bunch of extra documentation to explain why Soroban contracts can't just use the stock Arbitrary trait. As an alternative to this approach, we could instead expect users to construct XDR types, not SorobanArbitrary::Prototype types, and convert those to RawVals. I have been assuming that contract authors should rather concern themselves with contract types, and not the serialization types; and presently there are a number of XDR types that have values which can't be converted to contract types. The SorobanArbitrary::Prototype approach does allow more flexibility in the generation of contract types, e.g. we can generate some adversarial types like invalid object references and bad tags. Contracts must use `IntoVal` to create the final types, but these traits are under-constrained for this purpose and always require type hints: ```rust fuzz_target!(|input: <Address as SorobanArbitrary>::Prototype| { let env = Env::default(); let address: Address = input.into_val(&env); // fuzz the program based on the input }); ``` This is quite unfortunate because it means the real type must be named twice. This patch introduces a new private module `arbitrary_extra` which simply defines a bunch of new conversions like ```rust impl TryFromVal<Env, u32> for u32 { type Error = ConversionError; fn try_from_val(_env: &Env, v: &u32) -> Result<Self, Self::Error> { Ok(*v) } } ``` These conversions are required by `SorobanArbitrary`, which is only defined when `cfg(feature = "testutils")`; the `arbitrary_extra` module defines these conversions for all cfgs to ensure type inference is always the same, but the impls are probably useless outside of the SorobanArbitrary prototype pattern. Crates that use `#[contracttype]` need to define a "testutils" feature if they want to use Arbitrary. This patch doesn't generate "adversarial" values, which might include: - RawVals with bad tags - objects that have invalid references - objects that are reused multiple times - deeply nested vecs and maps - vecs and maps that contain heterogenous element types The arbitrary module has unit tests, and these help especially ensure the macros for custom types are correct, but the tests are not exhaustive. --------- Co-authored-by: Leigh McCulloch <[email protected]> Co-authored-by: Graydon Hoare <[email protected]>
Per 225cf4f#r1144119247 it is crucial that for all equivalent pairs of RawVals and ScVals, the pair has the same comparison results.
It looks to me that RawVal uses a custom Compare trait - RawVals don't appear to implement Ord; ScVal does implement Ord. So presumably the result of Compare on RawVals needs to be equivalent to the results of Ord::cmp (and PartialOrd::partial_cmp).
This should be testable with the fuzzing infrastructure in stellar/rs-soroban-sdk#878 so I intend to try that. Fuzzing won't guarantee all possible values compare the same, but better than nothing.
cc @graydon @dmkozh
The text was updated successfully, but these errors were encountered: