-
Notifications
You must be signed in to change notification settings - Fork 990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor/fix: use sort_unstable instead of sort, more efficient with_* methods for TransactionBody #2564
Conversation
…thods for TransactionBody
Agreed. Not so much a bug as undefined behavior. But should never happen. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Some minor comments but looks good. Thanks for this!
self.out_full.sort(); | ||
self.kern_full.sort(); | ||
self.kern_ids.sort(); | ||
self.out_full.sort_unstable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 These are good.
@@ -501,7 +501,7 @@ impl TransactionBody { | |||
kernels: Vec<TxKernel>, | |||
verify_sorted: bool, | |||
) -> Result<TransactionBody, Error> { | |||
let body = TransactionBody { | |||
let mut body = TransactionBody { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Originally wrote this to avoid making the body mut
unless we needed it to be (i.e. if we needed to sort it).
But it definitely reads better like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was torn on this one -- technically doesn't need to be a part of this PR, but I felt it was a bit less noisy.
generally agree though that only mut binding where absolutely needed is better.
inputs: new_ins, | ||
..self | ||
} | ||
pub fn with_input(mut self, input: Input) -> TransactionBody { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are actually (potentially) breaking changes.
The original intent was for these to be factories that left the original body unchanged, creating new instances each time.
I'm not sure if we actually needed this guarantee though (to leave the original one unchanged).
👍 If you are confident this is ok to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm -- I don't think that's quite true -- the contract of these functions is a pass by ownership (I'm only touching the mutability).
Because the self passed in is an owned self, the passed in instance is consumed so I figured it's OK to trash it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No you're right - we're consuming the instance here each time. 👍
self.inputs | ||
.binary_search(&input) | ||
.err() | ||
.map(|e| self.inputs.insert(e, input)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really torn here - it feels wrong to use the err() here as control flow, but it does seem to be a pattern used (and recommended) in Rust...
Result<usize, usize>
looks and feels like a nasty hack to me...
One suggestion for readability - can we use idx
instead of e
? Its technically an "error" but in practice this is really just the idx to insert at.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah! I agree 100% -- one of my least favorite things in the rust API.
Haskell's result type is the value-neutral 'Either'
http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Either.html where Left and Right don't really convey anything.
In Rust, we have a hatred/annoyance for the Right Err type, and using it in a normal control flow feels gross. Rust also used to have Either for this purpose rust-lang/rust#11149
In this case, this is the exact intended use of binary_search as per the docs:
If the value is found then Result::Ok is returned, containing the index of the matching element.
If there are multiple matches, then any one of the matches could be returned.
If the value is not found then Result::Err is returned,
containing the index where a matching element could be inserted while maintaining sorted order.
We could also use map_err
if you prefer, but I prefer to use err()
because it actively discards the Ok result, which feels safer in general (map_err
leaves it untouched which might cause you to later forget that the err()
was the desired type)
Previously, the codebase uses sort. In rust, sort performs a stable sort, which is slow and allocates memory. On the other hand, sort_unstable is supposedly faster and does not require the allocation of memory, so I switch to using unstable sort as we don't need sort stability (our vecs should already be or will be made unique).
According to RFC documentation, the ordinary sort may be faster in the following case:
As we're generally not sorting inputs from remote peers (we usually are validating they sent it sorted) this should be ok -- and in the worst case, it's not much slower.
I also switch the with_* functions to insert after a failed binary search for the kernel, input, or output rather than unconditionally insert. This has a improved runtime of O(n + log(n)) v.s. O(n log n), but I don't think we use it in any performance sensitive context, so the major change is the behavioral change. This is a minor breaking change -- previously, adding two of the same output, input, or kernel would result in two of that item being inserted. I think this is almost always a bug, so it may be worth handling this case differently (e.g., not returning anything on duplicate add). In theory, this is also a little bit less safe if a user passes in a TransactionBody with an unsorted vec, the
with_*
will do something unspecified. We'll get an error atverify_sorted_and_unique
in such a case.