-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow returning bytes in MSB order #46
Comments
I agree that functionality like this would be nice in |
I might be able to contribute this, what would be your suggested API for it? |
Tough question. I haven't made up too many thoughts about it, yet but I always try to keep the API simple and contained and if it is not possible to keep it as such with the addition of a new feature I mostly think that it is some way to measure that a feature does not naturally belong in there. That's to say that there are some options forward of which I am not sure if I like them:
Don't get me wrong from the above. I think it might be nice for some projects that work mainly on bit-endian architectures to have this feature but I really don't want to introduce more complexity if we can prevent it. Now I am curious about your ideas, @henrikssn ! :) |
Thanks, 1 would work for my use cases and I can't come up with one where the same register needs to be sent both MSB and LSB (but it is certainly possible that someone else can). I don't think we can easily store e.g. integers as MSB if the platform is little endian, because each setter call would come with a non-trivial conversion cost. Actually, what is the current memory layout on big endian platforms? I assume integers are stored MSB then? What about the result of Are there any performance implications (memory) to this choice, as in: are there situations where the current implementation can optimize away the copy into a byte buffer? |
Hm, if I may put my two cents. My intuition leans towards option 2, both because e.g. u32 has In particular, I'm working on a program which generates Rust-friendly wrappers around register maps, so my code is generating I can't think of a third option though, sadly. |
It is always stored as little-endian at the moment. Even with
I think this is where we should put the focus. Do we actually need this additional complexity or will the compiler take care of it anyways? I actually believe the compiler will optimize all these conversion down to the bare minimum needed as compilers nowadays are brilliant when it comes to bit twiddling tricks.
Thank you very much for putting in your use case! This basically disqualifies option 1 in my opinion.
I think the third option is to create some proper benchmarks using |
My 2¢: in my opinion this is also deserving of attention, because I think and being able to |
Can you please elaborate why transmuting between bitfield structs and their underlying representation is important? In little-endian systems the So in conclusion I think we really should have some benchmarks to measure this impact before we further complicate the API of bitfields in this crate. I am willing to implement this (or be implemented by you) if it turns out to be useful. To emphasize what I mean:
With |
Hm. I didn't realize the |
But why would you do that if you could as well just read out the bitfield behind the pointer and use the (mostly) no-op |
Hm, I think I see the use case (correct me if I'm wrong)... imagine it's some HW register somewhere, you could:
vs. having to do a two-step process:
It'd prevent you from being able to atomically set/get all fields simultaneously, but that may be fine. In this case there's only one interesting field anyway. |
Just to summarize here, if we do these changes:
This should not have any performance implications, as it is essentially just a rename for existing use cases. I think allowing the user to additionally specify the memory layout of bitfield could probably be a FR on its own, as it would just determine whether |
Going a step further... The internal bit storage order being able to be reversed would be nice. Trying to write a driver for this (look at page 63+) and am finding that a crate like this would be quite helpful to define how to speak with the device, but the LE ordering of the crate makes it so I have to do a lot of driver internal fiddling, even on the I know you can swap the SPI controllers read/write mode to support LSB, but most controllers I've seen are MSB by default. It'd be nice to have a "protocol mode" as a result (for lack of a better term). |
Yeah, in the meanwhile, I have even worked with protocols which use mixed LSB/MSB encoding (yes, i am talking to you, IEEE 802.15.4). Probably the proper way is to introduce
How does the memory layout when |
I'm interested in this feature as well. Getting bitfields and endianness to work correctly is very tricky and needs a library solution. From my experience, the only way to handle this situation is to carefully define a bitfield layout in a way that's independent of endianness. You do this by defining it in terms of the underlying representation. So taking the example from the previous comment and using
Notice that this layout works regardless of the machine endianness: within the 16 bits of a (Note, however, that the order of the fields within this representation is completely ambiguous. As a separate step, you define the byte order when this representation needs to be viewed as bytes. There are three options: host endian, little endian, and big endian. The representation in memory should always be in the host endianness (so the underlying The above solution works with every interface I've worked with and layout I've seen from a C or C++ compiler. It's also compatible with how people do bitfields manually (bit shifts on integer values). (With the possible exception of mixed endianness mentioned above, though to be honest I don't even know what that really means in this context. Regardless, I think it's rare enough to ignore from a library perspective.) This interpretation apparently departs from the current implementation, which uses byte arrays under the hood. To be frank, I think that's a dubious implementation to begin with. However, it does allow for space optimization when the number of bits is something like 24 bits instead of 32, and it allows for very large bitfields. I'm not sure the advantages are significant (you can break up a bitfield if needed and the space overhead is likely minimal). As a compromise if that's still desired, I suggest using a byte array when a representation isn't provided, but consider not providing a way to access the bytes. In summary, I recommend doing the following:
|
Mixed endianness means the following: #[bitfield]
pub struct MyBitfield {
foo: LE<U16>, // Little endian u16
bar: BE<U16>, // Big endian u16
}
let s = MyBitfield {
foo: 0x1234.into(),
bar: 0x5678.into(),
};
s.into_bytes() should produce |
How common is mixed endianess in structs that could profit from I do not like the idea of limiting |
Yeah, if you can specify endianness on the struct level and it is supported to mix struct fields of different endianness in a bitfield struct, then I am happy (I would just wrap For me to understand correctly, the current standing proposal is to add an |
There have already been many proposals in this thread but this one I consider to be implementable without the need of having way too many design discussions about potential pitfalls. If a proc. macro cannot provide a proper solution to a problem I do not really consider implementing it since that's then better done at the user level. Proc. macros are already complicated enough so let's not try to creep them up with additional cognitive load that can be avoided and shifted onto the user layer. |
@henrikssn @Robbepop But there's some significant downsides, most of which come because you're forcing this to be a packed representation. Even aligned fields could require unaligned reads (unless you're aligning the byte array?) which can have significant performance impacts, you're forcing everything to go through getters and setters (also unfortunate for other reasons, a topic for a different day), etc. (On the plus side, the interface doesn't allow forming a pointer to a field so you can't fall into that particular undefined behavior trap). It takes a bit more work and has indirection to get at the fields you want, but I think it's overall better to see this as a member of a larger structure. Something like this, for example:
(It's slightly unfortunate that Rust doesn't have nested anonymous structures like C, which makes this a bit less messy.) From here, we can use something like the
(Incidentally, you can also solve byte-aligned mixed endianness this way since BinRead can apply different endianness to individual fields). We're actually tantalizingly close to this already, come to think of it. Just change it to: Anyway, this is why I'm suggesting |
I originally used the suggestion of creating a I considered the As a proof of concept, that allows setting #[derive(BitfieldSpecifier)]
#[bits = 16]
#[endian = "big"]
pub enum QClass {
Reserved = 0, // [RFC6895]
Internet = 1, // [RFC1035]
// ...
} The changes to the procedure macros seem simple enough, and this seems flexible to mix and match endian in a struct, but to otherwise be opinionated about the endian of the struct, without changing the API. I plan to support `endian = "host"` // the default and today's behaviour
`endian = "big"` // Big endian
`endian = "little"` // Little endian I'm happy to extend the |
Hi all. I also needed a way to represent big endian byte orders for a network protocol. I've modified the original POC by creating an enum for specifying the byte order, and also supporting structs. The only thing I have left to do is to add support for little-endian orders for B2 - B128. Any suggestions how you may want that to work? |
@vhdirk I would vary the order of the bytes which are being written out for types which are crossing the byte boundary. This means that aligned even bytes like #[bitfield]
#[endian(Little)]
struct example {
// First two bytes
foo: U15,
bit_one: bool,
// Third byte is always unaffected by endianness
bit_two: bool,
bar: U7,
}
// In serialized format:
// [foo(lsb)] [foo(msb) bit_one] [bit_two bar]
// with big endian:
// [foo(msb)] [foo(lsb) bit_one] [bit_two bar] |
I've been trying something like that, but I'm afraid it is still buggy. See PR here: #90 Basically, you can also set the
This would yield
I'm quite sure my implementation is not quite correct stil, and the're some conflicts in specifications as well:
What would be the byte order of |
I don't think a boolean should be affected by endianness. Additionally, the serialization of your example should be unaffected by endianness, since it is just one byte long. Even if I would advise to wait with supporting specifying (altering) endianness on nested structs, since it makes the proposal a lot more complex with more edge cases to consider. |
The order of the boolean in |
In my case (assembler instructions), handling endianness of structs and fields is essential. What is endianness?Basically, it is the order of whatever unit of data you are refering to... which is not part of the definition. The unit of data can be a bit, Endianess can be different for each unit of data. Endianness of a
|
Currently (at least on my platform, ARMv7), bytes are returned in LSB order but my driver requires them to be MSB. How tricky would it be to add a configuration option for this?
Example:
test.as_bytes()
returns[0x34, 0x12]
but I need it to be[0x12, 0x34]
. While I could reverse the order of the result fromas_bytes
, it is error-prone and would IMO be better captured by the bitfield itself.The text was updated successfully, but these errors were encountered: