Core protocol v3.0 - data types#18
alimanfoo merged 4 commits into zarr-developers:core-protocol-v3.0-dev
Core data types
~~~~~~~~~~~~~~~

.. list-table:: Data types
This should render as a table when built via sphinx.
I know it's a bit repetitive to enumerate every data type explicitly, given that many have both a big- and a little-endian form. However, for now I thought it simplest just to enumerate them.
This is a straw man for discussion. Tentatively, I'm suggesting that the core protocol be limited to boolean, integer, float and complex types, with protocol extensions free to define other data types. For example, datetime/timedelta data types, structured (struct) data types, and variable-length data types could each be addressed via separate protocol extensions. The idea behind dividing it up this way is to keep the core protocol small and simple to implement.
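To make the cost of explicit enumeration concrete, here is a minimal Python sketch that generates one identifier per type and byte order. The identifier style (NumPy-like strings such as `<f8`), the `CORE_KINDS` table and the set of sizes are all assumptions for illustration; they are not taken from the table proposed in this PR.

```python
# Hypothetical kinds and byte sizes, for illustration only.
# The actual list in the proposed spec table may differ.
CORE_KINDS = {
    "i": (1, 2, 4, 8),   # signed integers
    "u": (1, 2, 4, 8),   # unsigned integers
    "f": (2, 4, 8),      # floating point
    "c": (8, 16),        # complex (two floats)
}


def enumerate_core_types():
    """Return one identifier per (kind, size, byte order) combination."""
    names = ["bool"]  # a single byte, so byte order does not apply
    for kind, sizes in CORE_KINDS.items():
        for size in sizes:
            if size == 1:
                # Single-byte types have no endianness to distinguish.
                names.append(f"{kind}{size}")
            else:
                # Multi-byte types get a little-endian and a big-endian form.
                names.extend(f"{order}{kind}{size}" for order in "<>")
    return names


if __name__ == "__main__":
    for name in enumerate_core_types():
        print(name)
```

Under these assumed sizes the full list has 25 entries, which is repetitive but still short enough to write out by hand in a table.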
4631c80 to de32e65
I've also realised that I haven't mentioned a fixed-length bytes type (i.e., corresponding to types like 'S4' in numpy, for an array of length-4 byte strings) or a fixed-length unicode type (i.e., corresponding to types like '<U4' in numpy, for an array of strings of 4 unicode code points). It's up for discussion whether these should be in the core protocol spec.
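For anyone weighing up the implementation cost, here is a rough standard-library sketch of what storing one element of each type involves. It mirrors NumPy's layout, where 'S4' occupies 4 bytes and '<U4' occupies 16 bytes (4 bytes per code point), both padded with nulls. The function names are hypothetical and nothing here is proposed spec text.

```python
def encode_fixed_bytes(value: bytes, length: int) -> bytes:
    """Encode one element of a fixed-length bytes type (like 'S4')."""
    if len(value) > length:
        raise ValueError(f"value longer than {length} bytes")
    # Shorter values are right-padded with null bytes.
    return value.ljust(length, b"\x00")


def encode_fixed_unicode(text: str, length: int, little_endian: bool = True) -> bytes:
    """Encode one element of a fixed-length unicode type (like '<U4')."""
    if len(text) > length:  # len() counts code points in Python 3
        raise ValueError(f"text longer than {length} code points")
    # Each code point takes 4 bytes, so byte order matters here,
    # unlike for the bytes type above.
    codec = "utf-32-le" if little_endian else "utf-32-be"
    return text.ljust(length, "\x00").encode(codec)


if __name__ == "__main__":
    print(encode_fixed_bytes(b"ab", 4))
    print(len(encode_fixed_unicode("ab", 4)))
```

Even in this small form, the unicode type needs an encoding choice, padding rules and a byte order, which is the kind of special-casing the numeric types avoid.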
For ease of implementation in different languages, I would tend to move the string-type specs (S4, <U4) to a protocol extension. From our experience implementing the specs in Julia, the numeric types were very simple to implement, but the different fixed-size and variable-sized string encodings needed a lot of special-casing and made the code much less generic, because the Julia [...]. However, I don't have strong feelings about this and can definitely see the advantage of simply supporting all numpy dtypes.
Thanks @meggart, that's very useful to know. I'm in favour of making the core protocol as easy as possible to implement in different languages, and so would be happy if these types were addressed via a protocol extension.
de32e65 to 32f4ab2
In the interests of having content together in one place, I'd like to merge this PR into the core-protocol-v3.0-dev branch. We can still discuss, revise and revisit anything after merge. I'll merge tomorrow if there are no objections.
This PR proposes a section of the v3.0 core protocol specification describing a set of data types for array elements.
Some discussion/decision points: for each of the following, should it be defined in the core protocol or via a protocol extension: