Support Bit Packed Bools #10

Open
MiddleMan5 opened this issue Jun 13, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@MiddleMan5

Hey, I've been using this library and it's great!

One limitation I ran into is that the binary format I'm working with encodes bools as single bits instead of whole bytes. Could the library offer a mode that extracts individual bits as bools?

@ghostiam
Owner

Hi, thanks for your interest in the project!

I thought about this, but there might be a problem: what should happen to the remaining bits? Just ignore them?

For now, as a workaround, you can create your own enum type with bit flags.
You can also add methods to the new type to make it more convenient to get bool values.

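package main

import (
	"log"

	// Import path assumed from the repository name.
	"github.com/ghostiam/binstruct"
)

// MyDataFlags holds eight boolean flags packed into a single byte.
type MyDataFlags uint8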

const (
	MyDataFlag1 MyDataFlags = 1 << iota
	MyDataFlag2
	MyDataFlag3
	MyDataFlag4
	MyDataFlag5
	MyDataFlag6
	MyDataFlag7
	MyDataFlag8
)

func (f MyDataFlags) HasFlag1() bool {
	return f&MyDataFlag1 != 0
}

func (f MyDataFlags) HasFlag2() bool {
	return f&MyDataFlag2 != 0
}

// etc...

type MyData struct {
	Flags MyDataFlags
}

func main() {
	data := []byte{0b01010101}
	var actual MyData
	err := binstruct.UnmarshalBE(data, &actual)
	if err != nil {
		log.Fatal(err)
	}

	println(actual.Flags & MyDataFlag1) // 1
	println(actual.Flags & MyDataFlag2) // 0

	// or

	println(actual.Flags.HasFlag1()) // true
	println(actual.Flags.HasFlag2()) // false
}

@MiddleMan5
Author

Thanks for the example code! Yeah, that's basically the approach we're taking now. The downside is that we have to track a "bit offset" by hand, outside the decoding code, which is kind of a pain. Bools packed as single bits are pretty common in the binary formats I've worked with, so I think this is a valid use case.
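To illustrate the manual bookkeeping described above, here is a minimal sketch in which the caller carries the bit offset between reads. The helper name `bitAt` is hypothetical and not part of the library, and the MSB-first bit numbering is an assumption; many formats number bits the other way.

```go
package main

import "fmt"

// bitAt reports whether the bit at absolute bit offset off is set.
// Assumption: bits are numbered MSB-first, so offset 0 is the top bit of data[0].
func bitAt(data []byte, off int) (bool, error) {
	if off < 0 || off/8 >= len(data) {
		return false, fmt.Errorf("bit offset %d out of range", off)
	}
	shift := 7 - uint(off%8)
	return data[off/8]>>shift&1 == 1, nil
}

func main() {
	data := []byte{0b10100000}
	off := 0 // the caller has to carry this offset from one read to the next
	for i := 0; i < 3; i++ {
		v, err := bitAt(data, off)
		if err != nil {
			panic(err)
		}
		fmt.Println(v) // prints true, false, true on successive iterations
		off++
	}
}
```

Every call site has to update `off` correctly, which is the bookkeeping the proposal would move into the decoder.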

One possible implementation would track the total bit offset internally instead of the byte offset. As you pop byte-aligned chunks off the stream to decode, this offset would be incremented by 8 times the number of bytes read. When decoding, you could check whether the current offset is a multiple of 8 bits and, if so, keep the current logic.

Decoding bit-packed bool fields would be a little different: popping off a single bit leaves the bit offset at a non-multiple of 8. Any subsequent operation would need to read the correct number of bits from the stream and realign them to the data type.
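The two paragraphs above could be sketched roughly as follows. This is not the library's code: `bitReader`, `readBit`, `readByte`, and `errUnderflow` are hypothetical names, and MSB-first bit order is assumed. The reader keeps one total bit offset, uses a plain byte read when the offset is a multiple of 8, and stitches two adjacent bytes together otherwise.

```go
package main

import (
	"errors"
	"fmt"
)

// bitReader is a hypothetical reader that tracks a single total bit offset.
type bitReader struct {
	data []byte
	pos  int // total offset in bits from the start of data
}

var errUnderflow = errors.New("not enough bits")

// readBit pops one bit (MSB-first, an assumption) and advances the offset by 1.
func (r *bitReader) readBit() (bool, error) {
	if r.pos >= len(r.data)*8 {
		return false, errUnderflow
	}
	b := r.data[r.pos/8] >> (7 - uint(r.pos%8)) & 1
	r.pos++
	return b == 1, nil
}

// readByte reads 8 bits. When aligned it is an ordinary byte read; otherwise
// it combines the tail of one byte with the head of the next.
func (r *bitReader) readByte() (byte, error) {
	if r.pos+8 > len(r.data)*8 {
		return 0, errUnderflow
	}
	i, shift := r.pos/8, uint(r.pos%8)
	r.pos += 8
	if shift == 0 {
		return r.data[i], nil // aligned fast path
	}
	return r.data[i]<<shift | r.data[i+1]>>(8-shift), nil
}

func main() {
	r := &bitReader{data: []byte{0b10110011, 0b01000000}}
	bit, _ := r.readBit()  // offset becomes 1, no longer a multiple of 8
	b, _ := r.readByte()   // unaligned read spanning both bytes
	fmt.Printf("%v %08b\n", bit, b) // prints "true 01100110"
}
```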

I think offering a "read n bits" function would also be helpful, since it would let the user choose to drop the remaining bits.

Underflow logic would stay the same for aligned and unaligned reads: if you try to read a byte and only 7 bits remain, a regular underflow error occurs (the user or the input is wrong).
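A "read n bits" function with the underflow behavior described above might look like the sketch below. The names are hypothetical, bits are assumed to be MSB-first, and `io.ErrUnexpectedEOF` is used here only as a stand-in for whatever underflow error the library would return.

```go
package main

import (
	"fmt"
	"io"
)

// bitReader is a hypothetical reader holding a total bit offset.
type bitReader struct {
	data []byte
	pos  int // offset in bits
}

// readBits returns the next n bits (0 to 64), right-aligned in a uint64.
// It fails without consuming anything when fewer than n bits remain.
func (r *bitReader) readBits(n int) (uint64, error) {
	if n < 0 || n > 64 {
		return 0, fmt.Errorf("invalid bit count %d", n)
	}
	if r.pos+n > len(r.data)*8 {
		return 0, io.ErrUnexpectedEOF
	}
	var v uint64
	for i := 0; i < n; i++ {
		bit := r.data[r.pos/8] >> (7 - uint(r.pos%8)) & 1
		v = v<<1 | uint64(bit)
		r.pos++
	}
	return v, nil
}

func main() {
	r := &bitReader{data: []byte{0b10100000, 0x2A}}
	flags, _ := r.readBits(3) // three packed bools: 1, 0, 1
	_, _ = r.readBits(5)      // the caller chooses to drop the padding bits
	next, _ := r.readBits(8)  // aligned again
	fmt.Println(flags, next)  // prints "5 42"

	short := &bitReader{data: []byte{0xFF}, pos: 1}
	_, err := short.readBits(8) // only 7 bits remain
	fmt.Println(err == io.ErrUnexpectedEOF) // prints "true"
}
```

Dropping padding is just a read whose result is ignored, so no separate skip operation is strictly required.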

I might have time to open an example PR if you're interested. What do you think?

@ghostiam ghostiam added the enhancement New feature or request label Jun 18, 2024
@ghostiam
Owner

ghostiam commented Jun 18, 2024

Thanks for the detailed description!
Yes, I would be glad to see an example. Is there an open or popular protocol or data format that uses bit offsets?

I think we can add this, but we need an explicit way to switch to unaligned mode (maybe a tag meaning "read bits and remain unaligned", like bits:3,unalign) and an explicit way to return to aligned mode, which would discard the unread bits.
Ideally, we'd add an option to NewDecoder/Unmarshal, but the library doesn't support options at the moment :( (that improvement is planned).
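As a sketch of how a tag like `bits:3,unalign` could drive the offset, the code below parses such a string and computes where the next field would start. The tag syntax is only floated in this thread and is not implemented by the library; `bitsTag`, `parseBitsTag`, and `nextOffset` are hypothetical names, and the rule that a field without `unalign` discards bits up to the next byte boundary is one reading of the comment above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bitsTag is a hypothetical parsed form of a tag such as "bits:3,unalign".
type bitsTag struct {
	Bits    int  // number of bits to read
	Unalign bool // stay unaligned after the read
}

// parseBitsTag parses the hypothetical tag syntax "bits:N[,unalign]".
func parseBitsTag(tag string) (bitsTag, error) {
	var t bitsTag
	for _, part := range strings.Split(tag, ",") {
		part = strings.TrimSpace(part)
		switch {
		case strings.HasPrefix(part, "bits:"):
			n, err := strconv.Atoi(strings.TrimPrefix(part, "bits:"))
			if err != nil || n <= 0 {
				return t, fmt.Errorf("invalid bit count in %q", part)
			}
			t.Bits = n
		case part == "unalign":
			t.Unalign = true
		default:
			return t, fmt.Errorf("unknown option %q", part)
		}
	}
	if t.Bits == 0 {
		return t, fmt.Errorf("missing bits:N in %q", tag)
	}
	return t, nil
}

// nextOffset returns the bit offset after reading a field at off.
// Without unalign, unread bits up to the next byte boundary are discarded.
func nextOffset(off int, t bitsTag) int {
	off += t.Bits
	if !t.Unalign {
		off = (off + 7) / 8 * 8
	}
	return off
}

func main() {
	off := 0
	for _, tag := range []string{"bits:3,unalign", "bits:1,unalign", "bits:2"} {
		t, err := parseBitsTag(tag)
		if err != nil {
			panic(err)
		}
		off = nextOffset(off, t)
		fmt.Println(tag, "->", off) // offsets become 3, 4, then 8
	}
}
```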

> One possible implementation would involve tracking the total bit offset instead of byte offset internally.

I think I would still keep the byte offset, but add extra fields to track the bit shift. As long as the bit shift is 0, we can use the same logic as we have now. But I'll think about it some more.
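The alternative bookkeeping described here, a byte offset plus a separate bit shift, can be sketched as below. The `cursor` type and its methods are hypothetical and only illustrate the arithmetic; they are not taken from the library.

```go
package main

import "fmt"

// cursor is a hypothetical position made of the existing byte offset plus a
// bit shift within the current byte.
type cursor struct {
	byteOff int   // offset in whole bytes
	shift   uint8 // bits already consumed in the current byte, 0 to 7
}

// aligned reports whether the existing byte-oriented logic can be used as is.
func (c *cursor) aligned() bool { return c.shift == 0 }

// advance moves the cursor forward by nbits, carrying into the byte offset.
func (c *cursor) advance(nbits int) {
	total := int(c.shift) + nbits
	c.byteOff += total / 8
	c.shift = uint8(total % 8)
}

// align returns to aligned mode, discarding the unread bits of the current byte.
func (c *cursor) align() {
	if c.shift != 0 {
		c.byteOff++
		c.shift = 0
	}
}

func main() {
	var c cursor
	c.advance(16) // two whole bytes: still aligned
	fmt.Println(c.byteOff, c.shift, c.aligned()) // prints "2 0 true"
	c.advance(3) // three packed bits
	fmt.Println(c.byteOff, c.shift, c.aligned()) // prints "2 3 false"
	c.align() // drop the remaining 5 bits
	fmt.Println(c.byteOff, c.shift, c.aligned()) // prints "3 0 true"
}
```

Both representations carry the same information, since the total bit offset equals `byteOff*8 + shift`; the difference is mainly which one leaves the existing byte-oriented code untouched.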

@MiddleMan5
Author

MiddleMan5 commented Jun 19, 2024

I think I like the idea of controlling the switch between unaligned and aligned access modes and discarding bits on the transition, yeah!

The codebase I've worked with most recently that provides generic binary encoding/decoding with support for non-byte-aligned codecs is OpenC3. It is written in Python/Ruby and is by no means easy to read, but I'll include it here for reference. Specifically, see the packet item accessors that handle converting between input fields and binary representations:
https://github.com/OpenC3/cosmos/blob/a3f4b9a3ccb9097fd9d1a73645886ad60c83f754/openc3/python/openc3/accessors/binary_accessor.py#L157

The protocols I work with unfortunately are proprietary, but the ones I know of that are public that allow for non-byte aligned packing off the top of my head are:

@ghostiam
Owner

I thought about what functionality is needed to introduce bit-offset:

  • BE/LE tags should behave like MSB/LSB when working with bits (or should we add explicit aliases?);
  • Support for signed numbers when converting from bits;
  • What to do with offset? It's in bytes. Reset to aligned mode?
  • I think it’s worth reading several bytes at once into a buffer such as a uint64 (reading only 7 bytes, so that 1 byte remains for the shifted bits), which makes it easier to shift bits across several bytes at a time.
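Two of the points above, the uint64 buffer and signed values, can be sketched together. This is one possible reading of the idea, not the planned implementation: `window`, `extract`, and `signExtend` are hypothetical helpers, bits are taken MSB-first, and fields are limited to 56 bits so that a shift of up to 7 bits still fits in the 64-bit window.

```go
package main

import "fmt"

// window loads up to 8 bytes starting at byteOff into a big-endian uint64,
// padding with zeros past the end of data.
func window(data []byte, byteOff int) uint64 {
	var w uint64
	for i := 0; i < 8; i++ {
		w <<= 8
		if byteOff+i < len(data) {
			w |= uint64(data[byteOff+i])
		}
	}
	return w
}

// extract returns the n bits (1 to 56) that start shift bits (0 to 7) into
// the window, right-aligned, in MSB-first order.
func extract(w uint64, shift, n uint) uint64 {
	return w << shift >> (64 - n)
}

// signExtend interprets the low n bits (1 to 64) of v as two's complement.
func signExtend(v uint64, n uint) int64 {
	return int64(v<<(64-n)) >> (64 - n)
}

func main() {
	data := []byte{0b00010111, 0b01000000}
	w := window(data, 0)

	u := extract(w, 3, 5) // bits 3..7 of the first byte: 10111
	fmt.Println(u, signExtend(u, 5)) // prints "23 -9"

	x := extract(w, 3, 7) // bits 3..9, crossing the byte boundary: 1011101
	fmt.Println(x) // prints "93"
}
```

An LSB-first variant would need a different `extract`, which is where the BE/LE versus MSB/LSB question in the first bullet comes in.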

I will add more as ideas and questions arise.


2 participants