Conversation
|
@tqchen is it correct that if we use the new Bool dtype with a size of 1 bit, then the storage would be assumed to actually be packed into a single bit? There is some need for passing bit-masks around in the dataframe community, so it would be good to clarify this use-case, if valid. |
|
In this case I think it should ideally be represented as I am not too sure if we want to specify |
|
@tqchen to be honest, I have always had trouble understanding how lanes are to be used. If I have, say, 1111 elements in my array and bits does it work to say that |
|
The lane represents the lanes of the unit data type. Say we want to store a bit mask, which is represented by int32. To store 65 bits, we will need 3 integers, in this case, it is
in the low level, so we have 3 * 32 bits in total. If we store bool as a normal byte, and to store 65 bools, we need
|
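The storage arithmetic in the comment above can be checked with a short sketch. This is only an illustration of the numbers quoted (65 bits, int32 words); the helper name `words_needed` is hypothetical and is not part of DLPack.

```python
def words_needed(n_bits: int, word_bits: int = 32) -> int:
    """Number of word_bits-wide integers needed to hold n_bits packed bits."""
    # Ceiling division: round up to a whole number of words.
    return (n_bits + word_bits - 1) // word_bits


n_bits = 65

# Packed into int32 words: 3 integers, i.e. 3 * 32 = 96 bits of storage.
n_words = words_needed(n_bits, 32)
packed_bits = n_words * 32
padding_bits = packed_bits - n_bits   # bits in the last word that carry no data
packed_bytes = packed_bits // 8

# Stored as one normal byte per bool instead.
unpacked_bytes = n_bits

print(n_words, packed_bits, padding_bits, packed_bytes, unpacked_bytes)
```

Note that the packed buffer alone does not say how many of its 96 bits are meaningful, which is the concern raised in the next comment.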
|
@tqchen but that is a problem, because how do I pass the 65 bits information there since |
|
@seberg I get what you mean. I feel that could be addressed by enhancing the array information to include sub-byte boundary information. I am mainly describing what is being interpreted from the spec right now, in a way that is also mostly consistent with compilers like LLVM |
|
I guess that the use-case I was asking for could use/abuse "lanes", because there is a side-channel to pass the actual shape. But it doesn't feel ideal to me, so I am wondering if we can think of a pragmatic way to make this possible. |
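To make the side-channel idea concrete, here is a minimal sketch of a packed bitmask whose logical length travels separately from the buffer. The function names and the least-significant-bit-first order are assumptions for illustration only; nothing here is defined by the DLPack spec.

```python
def pack_mask(mask):
    """Pack a sequence of bools into bytes (assumed LSB-first within each byte).

    Returns the packed buffer and the logical bit count, which has to be
    passed alongside the buffer because the padding bits are indistinguishable
    from real False values.
    """
    packed = bytearray((len(mask) + 7) // 8)
    for i, flag in enumerate(mask):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed), len(mask)


def unpack_mask(packed, n_bits):
    """Recover the first n_bits bools from a buffer produced by pack_mask."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(n_bits)]


# 65 bools, matching the example size used in this thread.
mask = [True, False, True] * 21 + [True, True]
packed, n_bits = pack_mask(mask)
restored = unpack_mask(packed, n_bits)
```

Without `n_bits`, a consumer of `packed` would see 72 candidate bits and could not tell that only 65 are valid.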
|
Maybe it would make sense to either introduce a new dtype or some sort of flag for bitmasks? |
Close #75. Close #76. Supersedes #76.
It turns out we all forgot that @alonre24 had already pushed a PR (#76), so all I did was make minor edits to the docstrings and sync with the latest master, with his commit preserved.
cc: @tqchen