Differences in parsing UUIDs from uint128 in Python and Rust #406
Comments
cc @KodrAus |
Looks like endianness? Will dig in properly in a bit, but maybe we’re reversing the bits in that 128bit number so they’re big endian whereas Python isn’t? |
Now I feel silly... that's the first thing I ruled out :P |
That's exactly the case. I did the following:
    if let Some(i) = int {
        handle = i.swap_bytes().into();
    }
and the tests pass. I'm not sure which implementation is correct (or if both are correct). |
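To make the effect of `swap_bytes` concrete, here is a minimal std-only sketch. It assumes the UUID printed in the bug report is read as a big-endian 128-bit integer; the `hyphenated` helper is hypothetical and is not part of the uuid crate.

```rust
/// Hypothetical helper: format 16 bytes as a hyphenated UUID string (8-4-4-4-12 hex digits).
fn hyphenated(bytes: [u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

fn main() {
    // Assumption: the UUID reported in the issue, read as a big-endian integer.
    let reported: u128 = 0x2cc645a4_ce9e_6fb3_964f_24e99f8aafd7;

    // Most significant byte first reproduces the reported text.
    println!("{}", hyphenated(reported.to_be_bytes()));

    // swap_bytes reverses all 16 bytes, so the same integer maps to a different UUID.
    println!("{}", hyphenated(reported.swap_bytes().to_be_bytes()));
}
```

Note that `x.swap_bytes().to_be_bytes()` yields the same bytes as `x.to_le_bytes()`, so the swap is equivalent to choosing the other byte order when laying out the integer.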
Yeh, I think because a I think we should consider deprecating the 128bit conversions. |
I disagree. |
@thedrow In that case you could define your own function in your bindings to Python with the semantics you need, though. The problem is we’ve chosen one arbitrary representation where the 128bit number has a platform-specific endianness but the representation you want is for the 128bit number to act like a Since there are multiple equally valid approaches here I think we should consider punting supporting it. |
I think a better way will be to provide another adapter |
The other thing that I was thinking last night was that we do not have any types that guarantee about the endianness of the bytes within. We could introduce a |
407: remove support for u128 r=Dylan-DPC a=kinggoesgaming

**I'm submitting a(n)** other

# Description
Remove the `u128_support` module and the current implementations for converting from `u128` to `Uuid`s. A new alternative will be provided later.

# Motivation
#406 shows that `u128` is a number type that is affected by the target's endianness.

# Tests
All tests related to `u128` have been removed. The rest of the tests should work as expected, as before.

# Related Issue(s)
#406

Co-authored-by: Hunar Roop Kahlon <[email protected]>
427: Breaking API features and refactoring for `0.8` r=kinggoesgaming a=KodrAus

**I'm submitting a(n)** (|feature|refactor|)

# Description

## Modules and Errors
This PR is a broad refactoring of our error types and module layout. The general pattern I've gone for is:
- Define specific error types in submodules. These are currently private.
- Collect these specific error types in an opaque root `Error`. All methods return this single `Error` type so it's easier for consumers to carry `uuid::Error`s in their own code.

It'll also include some implementations for open PRs as we've been discussing. I imagine we'll want to spend time working through these changes :) I've hidden our `prelude` module for now, because I'm not sure it's something we'll want to stabilize with (it's only got a few bits in it after all).

## 128bit support
Re-enables support for 128-bit numbers in the form of constructors on `Uuid`, following the pattern that method names without an endian suffix are BE.

## No-std
Refactors our `no-std` support so we always mark the crate as `no-std` so our std imports are always the same. This simplifies our std/core support so we need fewer modules.

## Timestamps
Includes the design proposed in #405 for timestamp support originally implemented by @jonathanstrong.

# Related Issue(s)
- #406

# Related PR(s)
- #405
- #416

Co-authored-by: Ashley Mannix <[email protected]>
Co-authored-by: Jonathan Strong <[email protected]>
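The naming pattern described in that PR text (no suffix means big endian, an `_le` suffix means little endian) can be sketched with a stand-in type. `MiniUuid` below is hypothetical and std-only; it only mirrors the convention and should not be read as the crate's actual implementation.

```rust
/// Hypothetical stand-in for a UUID: 16 bytes stored most significant first.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct MiniUuid([u8; 16]);

impl MiniUuid {
    /// No suffix: the integer's most significant byte becomes the first UUID byte.
    fn from_u128(v: u128) -> Self {
        MiniUuid(v.to_be_bytes())
    }

    /// `_le` suffix: the integer's least significant byte becomes the first UUID byte.
    fn from_u128_le(v: u128) -> Self {
        MiniUuid(v.to_le_bytes())
    }
}

fn main() {
    let v: u128 = 0xd8d2cf50_a5c6_433b_98e6_8c268fd84fa0;
    let be = MiniUuid::from_u128(v);
    let le = MiniUuid::from_u128_le(v);

    // The two constructors disagree on which end of the integer comes first.
    assert_eq!(be.0[0], 0xd8);
    assert_eq!(le.0[0], 0xa0);

    // The little-endian constructor equals the big-endian one applied to the swapped integer.
    assert_eq!(le, MiniUuid::from_u128(v.swap_bytes()));
    println!("{:?}\n{:?}", be, le);
}
```

Under this convention the result of each constructor is the same on every target, because the byte order is named explicitly rather than taken from the platform.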
The behavior here is not surprising, well, not to me at least. Literals are big endian and get converted to the target endianness. Given that most (all?) x86/x64 CPUs are little endian we end up with exactly that. Python – like the JVM – is probably normalizing exactly that so that users do not get confused. There is The – by far – safest choice is to simply call |
Looking at the code I was wondering why there even is a |
All primitive numbers follow the target platform’s endianness... the from_le_bytes and from_be_bytes are noop if endian already matches, IIRC |
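A small std-only sketch of that point, using only integer methods from the standard library. The helper name `native_matches_target` is made up for illustration.

```rust
/// Hypothetical helper: true if the native byte layout of `x` matches the
/// explicit layout for the target this code was compiled for.
fn native_matches_target(x: u128) -> bool {
    if cfg!(target_endian = "little") {
        x.to_ne_bytes() == x.to_le_bytes()
    } else {
        x.to_ne_bytes() == x.to_be_bytes()
    }
}

fn main() {
    let x: u128 = 0x0102_0304_0506_0708_090a_0b0c_0d0e_0f10;

    // Explicit round-trips recover the value on every target.
    assert_eq!(u128::from_be_bytes(x.to_be_bytes()), x);
    assert_eq!(u128::from_le_bytes(x.to_le_bytes()), x);

    // The native layout is whichever explicit layout the target uses.
    assert!(native_matches_target(x));

    // The value-level conversions are a no-op when the order already matches
    // the target and a byte swap otherwise.
    if cfg!(target_endian = "little") {
        assert_eq!(u128::from_le(x), x);
        assert_eq!(u128::from_be(x), x.swap_bytes());
    } else {
        assert_eq!(u128::from_be(x), x);
        assert_eq!(u128::from_le(x), x.swap_bytes());
    }
}
```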
Yeah, I'm not talking about the native endianness the CPU is using, that is abstracted away by the compiler anyway. But the "conceptual"/"logical" endianness perceived by the developer. I'm just saying that the edit: "should" as "The existence of |
I think I'm still in favour of deprecating the |
If we did want to keep this functionality maybe it would be better called |
Why do you say that it's dependent on the native endianness? const UUID: Uuid = Uuid::from_u128(0xd8d2cf50_a5c6_433b_98e6_8c268fd84fa0); is just the same on a little-endian machine and on a big-endian machine. There's no "endianness", because it gets abstracted away by the compiler. What matters here is how you're reading the data. And that's my argument, endianness should be handled on the reading side not in If you're reading a u128 from a file and need to convert it in |
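The read-side argument can be sketched with std only. The `read_u128` helper and the simulated file contents are assumptions made for illustration; nothing here comes from the crate.

```rust
/// Hypothetical read-side helper: interpret 16 stored bytes as an integer,
/// given the byte order the writer used.
fn read_u128(stored: [u8; 16], little_endian: bool) -> u128 {
    if little_endian {
        u128::from_le_bytes(stored)
    } else {
        u128::from_be_bytes(stored)
    }
}

fn main() {
    // The literal denotes the same number on every target; its top byte is always 0xd8.
    let value: u128 = 0xd8d2cf50_a5c6_433b_98e6_8c268fd84fa0;
    assert_eq!((value >> 120) as u8, 0xd8);

    // Simulated file contents, assumed to come from a writer that stored the
    // integer least significant byte first.
    let stored = value.to_le_bytes();
    assert_eq!(stored[0], 0xa0);

    // Reading with the writer's byte order recovers the value; reading with
    // the other order yields the byte-swapped integer.
    assert_eq!(read_u128(stored, true), value);
    assert_eq!(read_u128(stored, false), value.swap_bytes());
}
```

Once the integer has been recovered with the correct byte order at read time, a single integer-to-UUID conversion behaves identically on every platform.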
I don't find the API confusing and it is very useful for my fastuuid library. |
This is where the confusion is I think. I've added some docs to our I totally agree that handling any endianness at the read side probably makes the most sense if you're encoding them, but am not sure that means we'd need a
The |
I'm just coming through some triage. I think the specific issue here is "resolved" as use I'll go ahead and close this one for now, but please feel free to re-open or create new issues for anything else that comes up! |
Describe the bug
I'm creating a binding of this library to Python for learning purposes and possible production usage.
I've encountered a difference in the values Python provides and the values Rust provides when parsing a UUID from an unsigned 128-bit integer.
To Reproduce
When parsing a UUID from an unsigned integer in Python I get the following UUID:
However, when I use the following Rust code:
The output is the following UUID:
2cc645a4-ce9e-6fb3-964f-24e99f8aafd7
Expected behavior
I think that the value should be the same in both implementations.
If not, we need to figure out the difference and decide which implementation is correct.
Specifications (please complete the following information):
Additional context
See the Python implementation here.
Other
N/A