Can we implement a video codec with Zarr? #157

@alxmrs

Description

Here's a thought experiment: Given the new version of the Zarr specification (#149) along with its extension system, could we implement a video codec?

Why is this a useful question? It tests how flexible the new Zarr spec is. Video is a ubiquitous type of data that overlaps with many of Zarr's core concerns, and an enormous amount of engineering effort and thought has gone into video codecs (which is still quite an understatement). If Zarr aims to be a "meta format" for the cloud, it will eventually have to handle difficult cases like this.

Video use cases overlap with scientific data in interesting ways: video is multi-band (images plus sound), bands carry their own metadata, compression is required just to store or work with the data, and data is often streamed in chunk by chunk as it is consumed.

My initial estimate (especially given the last Zarr open discussion) is that the core nuance of video formats is that they require a fundamentally different chunking strategy. Video data is typically compressed across time: keyframes hold full images at specific intervals, and the frames in between are represented as diffs against them.

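To make that concrete, here is a minimal sketch in plain NumPy (not the Zarr extension API; `encode_keyframe_deltas` and the 8-frame interval are illustrative choices, not anything from the spec) of storing a `(time, ...)` array as keyframes plus frame-to-frame diffs:

```python
import numpy as np

def encode_keyframe_deltas(frames: np.ndarray, keyframe_interval: int = 8):
    """Split a (time, ...) array into keyframes plus per-frame diffs.

    Illustrative only: real video codecs use motion compensation and
    entropy coding, not raw element-wise differences.
    """
    keyframes = frames[::keyframe_interval].copy()
    deltas = np.diff(frames, axis=0)  # deltas[t - 1] == frames[t] - frames[t - 1]
    return keyframes, deltas

def decode_keyframe_deltas(keyframes, deltas, keyframe_interval: int = 8):
    """Rebuild the full (time, ...) array from keyframes and diffs."""
    n_frames = deltas.shape[0] + 1
    out = np.empty((n_frames,) + keyframes.shape[1:], dtype=keyframes.dtype)
    for t in range(n_frames):
        if t % keyframe_interval == 0:
            out[t] = keyframes[t // keyframe_interval]   # resync on a keyframe
        else:
            out[t] = out[t - 1] + deltas[t - 1]          # apply the stored diff
    return out
```

For slowly varying time series the diffs are mostly near zero, so a generic byte compressor applied after this transform should do much better than compressing raw frames; the hard part would be expressing something like this as a proper codec/extension under the new spec.
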
If Zarr could be extended to mimic some of these approaches, it could store time-series data efficiently regardless of scientific domain. It could become a kind of MPEG-G (https://mpeg-g.org/) for non-bioinformatics workloads.

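On the storage-layout side, a minimal sketch using the zarr-python API (the shapes and chunk sizes here are arbitrary examples): chunk the time axis so each chunk holds exactly one keyframe interval, analogous to a video "group of pictures" (GOP), and let a keyframe-aware codec act on each chunk independently.

```python
import numpy as np
import zarr

# Toy "video": 240 frames of 128x128 single-band imagery.
frames = np.random.randint(0, 256, size=(240, 128, 128), dtype="uint8")

# One 8-frame "GOP" per chunk: reading any time range only touches the
# chunks (keyframe intervals) that overlap it.
z = zarr.array(frames, chunks=(8, 128, 128))
print(z.chunks, z.nchunks)
```

Random access along time would then cost at most one chunk read plus decoding up to keyframe_interval - 1 diffs, which is exactly the trade-off that video GOP sizes are tuned around.
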
To hypothesize further: the infrastructure that makes video practical includes hardware-accelerated encoding and decoding. Could Zarr implementations eventually take advantage of such optimizations as well?
