-
Notifications
You must be signed in to change notification settings - Fork 642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer #816
Buffer #816
Conversation
Very nice work! I don't see any road blockers for a port to Java, which I will do next. |
Travis is complaining for reasons I don't understand, so I've restarted the build... otherwise this looks like a good thing to have, but I have a couple of questions (I might have missed things due to only looking over the code quickly):
|
|
With my last 2 commits, the
comments no longer apply, and instead there is just |
@JanBessai Note that the list of primitives needed has changed slightly with my last two commits. |
Alright, the latest rebase moves the test into |
Should be okay to go now. And thanks for fixing the .gitignore - I forgot! There still appears to be a merge conflict though... Also, don't forget to add yourself to the CONTRIBUTORS file, for credit where it is due :). |
@edwinb rebased to fix merge conflict and added myself to |
@edwinb tests seem to be failing due to problems in the distribution tarball. I can't seem to get |
Note that tests work successfully for all backends locally. |
@edwinb ok latest rebase fixes sdist. All green! |
Implements Buffer, allocate, copy, append*, and peek* in terms of as-yet non-existent (unsafe) primitives.
Buffers don't have endianness, though values stored in buffers do.
It's just the show instance for Endianness
This simplifies the native code significantly, including removing GETTYs at every Buffer-related function call, at the cost of: * Needing to store an offset of 0 in cases where the old code could just refer directly to the struct Buffer * Boxing of the offset * When an append to a buffer requires an allocation + copy, all of the data before the offset needs to be allocated and copied too (as there doesn't seem to be a good way for the append functions to tell the idris code when the offset needs to be changed and when not) I think the simplification outweighs the cost, but I don't have hard numbers so YMMV.
As proposed in #802, this implements
Data.Buffer
, a pure interface to contiguous chunks of memory. At the interface level, this consists of the following terms inData.Buffer
:Buffer : Nat -> Type
, where theNat
represents the number of valid (i.e. initialized) bytes in theBuffer
allocate : ( sizeHint : Nat ) -> Buffer Z
, wheresizeHint
gives a hint as to how much space should be reserved for future appends to theBuffer
copy : Buffer n -> Buffer n
. MultipleBuffer
objects may point to the same underlying storage, which may mean that some relatively smallBuffer
is keeping a large chunk from being freed.copy
is semantically equivalent toid
except that the underlying storage should be initially uniquely pointed-to by the returnedBuffer
.append
functions for appending values to the buffer in various endiannesses. They are namedappend{Type}{EN}
, whereType
is one of[ Bits8, Bits16, Bits32, Bits64, Buffer ]
andEN
is one of[ Native, BE, LE ]
. Each such function has typeBuffer n -> ( count : Nat ) -> ( val : {Type} ) -> Buffer (n + (sizeof {Type}) * count
and addscount
repetitions ofval
represented in the relevant byte-order. For example,appendBits32LE : Buffer n -> ( count : Nat ) -> ( val : Bits32 ) -> Buffer (n + 4 * count)
will append the passedBits32
, in little-endian byte order, to theBuffer
count
times. ForBuffer
andBits8
the endianness has no real meaning, but theBE
andLE
variants are provided for symmetry.peek
functions for retrieving values from the buffer in various endiannesses. They are namedpeek{Type}{EN}
, whereType
is one of[ Bits8, Bits16, Bits32, Bits64, Buffer ]
andEN
is one of[ Native, BE, LE ]
. Each such function has typeBuffer (sizeof {Type} + n) -> ( offset : Fin S n ) -> ( val : {Type} )
and readsval
represented in the relevant byte-order at the given byteoffset
in the Buffer. For example,peekBits16Native : Buffer (2 + n) -> ( offset : Fin S n ) -> ( val : Bits16 )
will readBits16
, in machine-native byte order from theBuffer
starting atoffset
bytes. ForBuffer
andBits8
the endianness has no real meaning, but theBE
andLE
variants are provided for symmetry.Under the hood, each function is implemented on top of a primitive that takes
Bits64
s in place ofFin n
s andNat
s, some of which also explicitly take theBuffer
size as aBits64
. As I commented in theData.Buffer
module, there are a few issues with this interface:Buffer
size asFin (2 ^ WORD_BITS)
instead ofNat
Bits64
when really they should take the equivalent of C'ssize_t
(ideally unboxed)Int
,Char
, andFloat
, all have fixed (though possibly implementation-dependent) widths. Currently not in place due to lack of host system introspection.Fin
toBits64
(which, re2
, should eventually be a fixed-width implementation-dependent type) is likely inefficient relative to conversion fromNat
toBits64
For now, I've only implemented the
Buffer
primitives in the C codegen. There was a fair amount of duplication there, which might be reducible somewhat with macros. The implementation uses a fill value and power-of-2 allocation for efficient appends in many cases, see #802 or the code for details. I can try my hand at the LLVM codegen implementation once this is merged in some form.Fixes #802