Making chained compression an extension #38

@jakirkham

Sorry if I'm rehashing things. I asked some colleagues to take a look at the spec and provide feedback, and this is basically me transcribing their thoughts. If I missed things or made errors, I'd encourage them to hop in and correct me. 🙂

One of the things they found somewhat concerning about the existing Zarr implementation was its support for chained compression, namely that one could do things like apply GZIP and then LZ4. It would be preferable to allow just one form of compression by default. Allowing general-purpose chaining like this complicates implementation and likely adds little value. Maybe a single compressor is already the common case?
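
To make the concern concrete, here is a rough sketch of what "chained compression" means. It is only an illustration, not how Zarr implements it: the helper names are hypothetical, and `lzma` from the Python standard library stands in for LZ4 (which is not in the standard library).

```python
import gzip
import lzma


def chained_compress(data: bytes) -> bytes:
    """Apply two general-purpose compressors back to back.

    Hypothetical helper: GZIP first, then LZMA as a stand-in for LZ4.
    """
    return lzma.compress(gzip.compress(data))


def chained_decompress(blob: bytes) -> bytes:
    """Undo the chain in reverse order: LZMA first, then GZIP."""
    return gzip.decompress(lzma.decompress(blob))


payload = bytes(range(256)) * 64

single = gzip.compress(payload)
chained = chained_compress(payload)

# Print the sizes so the effect of the second stage can be inspected.
print("raw:", len(payload), "gzip:", len(single), "gzip+lzma:", len(chained))
```

A reader has to know the exact order of the chain to decode the data, which is the implementation burden being raised here. Whether the second stage helps depends on the data, so the sizes are printed rather than assumed.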

That said, they did note that other storage specs (like Parquet or ORC) may have a data packing step prior to compression (like RLE or dictionary coding). This seems to still make sense to support, but it would be preferable to have only one optional compressor follow an optional data packing step.
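
As a sketch of the shape being proposed (at most one packing step, followed by at most one compressor), something like the following could work. All names here (`Codec`, `ChunkPipeline`, `rle_encode`, `rle_decode`) are hypothetical and not part of Zarr or its spec, and `zlib` is used only because it ships with Python.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Optional


def rle_encode(data: bytes) -> bytes:
    """Toy run-length encoding: emit (run_length, byte) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode: expand each (run_length, byte) pair."""
    out = bytearray()
    for j in range(0, len(data), 2):
        out += bytes([data[j + 1]]) * data[j]
    return bytes(out)


@dataclass(frozen=True)
class Codec:
    encode: Callable[[bytes], bytes]
    decode: Callable[[bytes], bytes]


@dataclass(frozen=True)
class ChunkPipeline:
    """Exactly two optional slots, so arbitrary chaining cannot be expressed."""
    packer: Optional[Codec] = None
    compressor: Optional[Codec] = None

    def encode(self, raw: bytes) -> bytes:
        packed = self.packer.encode(raw) if self.packer else raw
        return self.compressor.encode(packed) if self.compressor else packed

    def decode(self, stored: bytes) -> bytes:
        packed = self.compressor.decode(stored) if self.compressor else stored
        return self.packer.decode(packed) if self.packer else packed


pipeline = ChunkPipeline(
    packer=Codec(rle_encode, rle_decode),
    compressor=Codec(zlib.compress, zlib.decompress),
)
chunk = b"\x00" * 1000 + b"\x01" * 24
assert pipeline.decode(pipeline.encode(chunk)) == chunk
```

Because the structure has fixed slots rather than a list, a reader only needs to handle four cases (neither, packer only, compressor only, both) instead of an open-ended chain.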

Thoughts?
