Read Zarr Datasets #6019

Merged
merged 48 commits into from
Mar 22, 2022
@fm3 fm3 commented Feb 3, 2022

Adds a first implementation of reading Zarr-encoded datasets.

  • new jzarr package, based on https://github.com/bcdev/jzarr but heavily edited (converted to Scala, refactored a bit, removed writing-related code, added uint64, removed some checks that required too many permissions on the underlying storage)
  • new classes ZarrDataLayer, ZarrMag, ZarrBucketProvider to handle data loading in a way that is compatible with existing abstractions
  • The wkw-specific shard-handle cache is not used; instead, there is a chunk cache inside the ZarrArray class
  • Datasets can be local or remote via https (with optional basicauth) or s3
  • new HttpsFileSystemProvider (partially) implements the Java standard FileSystemProvider interface for https so that jzarr can use it alongside s3
  • new object FileSystemHolder stores FileSystems as created by the FileSystemProviders
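
As a rough illustration of the chunk cache mentioned above, in the bounded LRU form that the TODO list below asks for, here is a minimal Python sketch. It is not the Scala code from this PR: the class name LruChunkCache, the load_chunk callback and the max_entries parameter are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class LruChunkCache:
    """Bounded cache keyed by chunk index; evicts the least recently used entry."""

    def __init__(self, max_entries: int, load_chunk: Callable[[Hashable], bytes]):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        # Hypothetical loader, e.g. a read from local disk, https or s3.
        self._load_chunk = load_chunk
        self._entries = OrderedDict()

    def get(self, chunk_index: Hashable) -> bytes:
        if chunk_index in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(chunk_index)
            return self._entries[chunk_index]
        # Cache miss: load the chunk, store it, then evict if over capacity.
        data = self._load_chunk(chunk_index)
        self._entries[chunk_index] = data
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drops the least recently used
        return data

    def clear(self) -> None:
        """Drop all cached chunk data, e.g. when an array handle is closed."""
        self._entries.clear()

    def __len__(self) -> int:
        return len(self._entries)
```

Bounding by entry count is the simplest variant; a cache bounded by total bytes would need to track the size of each stored chunk as well.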

URL of deployed dev instance (used for testing):

TODO

  • open zarr images locally
  • open zarr images with s3
  • open zarr images with https
  • use FileSystemProvider abstraction
  • Take uri and credentials from datasource-properties.json
  • fix jzarr existence-check problems with s3
  • convert data to bytes depending on dtype
  • basic auth for https
  • automatic axes padding
  • mags
  • Replace infinite chunk cache by LRU cache
  • cache clear for array handles (including chunk data)
  • how to get blosc lib to production?
  • finalize datasource-properties format, coordinate with "Initial Zarr support" (webknossos-libs#627)
  • convert remaining java classes to scala, clean up
  • avoid sbt compile loop (get rid of java sources?)
  • ensure upload (backend-side) works with zarr data
  • uint64
  • isotropic mag path should be “1” rather than “1-1-1”
  • add 64 MB chunk size limit
  • do not pass exceptions (replace Fulls by tryos in ZarrBucketProvider) (can logging be retained?)
  • assertions for too-big chunks etc?
  • fix axis order
  • write follow-up issues from list below:
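
Two of the checklist items above (the isotropic mag path and the 64 MB chunk size limit) can be sketched in a few lines of Python. This is an illustration only, not the PR's Scala code. The function names are hypothetical, and two readings are assumed here: that "64 MB" means 64 * 1024 * 1024 bytes, and that the limit applies to the uncompressed chunk size.

```python
import math

# Assumption: "64 MB" is read as 64 * 1024 * 1024 bytes of uncompressed data.
MAX_CHUNK_BYTES = 64 * 1024 * 1024

# Bytes per element for a few element classes; uint64 is the one added in this PR.
BYTES_PER_ELEMENT = {"uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8}


def mag_path(mag):
    """Return "1" for an isotropic mag such as (1, 1, 1), else e.g. "2-2-1"."""
    x, y, z = mag
    if x == y == z:
        return str(x)
    return f"{x}-{y}-{z}"


def chunk_size_in_bytes(chunk_shape, element_class):
    """Uncompressed size of one chunk: number of elements times element width."""
    return math.prod(chunk_shape) * BYTES_PER_ELEMENT[element_class]


def check_chunk_size(chunk_shape, element_class):
    """Return the chunk size in bytes, or raise if it exceeds the limit."""
    size = chunk_size_in_bytes(chunk_shape, element_class)
    if size > MAX_CHUNK_BYTES:
        raise ValueError(
            f"chunk of shape {tuple(chunk_shape)} ({element_class}) needs "
            f"{size} bytes, limit is {MAX_CHUNK_BYTES}"
        )
    return size
```

Checking the size from the array metadata before loading any chunk keeps an oversized chunk from being allocated in the first place.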

Follow-Up

Steps to test:

  • Example datasource-properties.json for remote sample_l4 in zarr format:
{
  "id" : {
    "name" : "test.zarr",
    "team" : "sample_organization"
  },
  "dataLayers" : [
    {
      "name" : "0",
      "category" : "color",
      "boundingBox" : {
        "topLeft" : [ 0, 0, 0 ],
        "width" : 1024,
        "height": 1024,
        "depth" : 1024
      },
      "elementClass" : "uint8",
      "dataFormat" : "zarr",
      "mags": [
        {
          "mag": [1, 1, 1],
          "path": "https://…/l4_sample_zarr/color.zarr/0/",
          "credentials": {
            "user": "…",
            "password": "…"
          }
        },
        {
          "mag": [2, 2, 1],
          "path": "https://…/l4_sample_zarr/color.zarr/1/",
          "credentials": {
            "user": "…",
            "password": "…"
          }
        }
      ]
    }
  ],
  "scale" : [
    11.239999771118164,
    11.239999771118164,
    28
  ]
}
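
To show how the fields in this example fit together, here is a minimal Python sketch that walks the mags of each zarr layer and derives an HTTP Basic Authorization header from the optional credentials. It is an illustration under assumptions, not the backend's actual parsing code: the function names are hypothetical, and only fields that appear in the example above are read.

```python
import base64
import json


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def remote_mags(datasource_properties_json):
    """Yield (layer name, mag, path, auth header or None) for each zarr mag."""
    properties = json.loads(datasource_properties_json)
    for layer in properties["dataLayers"]:
        if layer.get("dataFormat") != "zarr":
            continue  # only zarr layers carry per-mag paths in this example
        for mag in layer["mags"]:
            credentials = mag.get("credentials")  # optional in the example
            header = None
            if credentials:
                header = basic_auth_header(
                    credentials["user"], credentials["password"]
                )
            yield layer["name"], tuple(mag["mag"]), mag["path"], header
```

Each yielded tuple holds what a reader needs to open one mag: where the array lives and, if credentials are present, which header to send with every https request.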

Issues:


@fm3 fm3 self-assigned this Feb 8, 2022
@fm3 fm3 mentioned this pull request Mar 4, 2022
3 tasks
@fm3 fm3 changed the title [WIP] First Zarr Reading Experiments [WIP] Read Zarr Datasets Mar 8, 2022
@fm3 fm3 changed the title [WIP] Read Zarr Datasets Read Zarr Datasets Mar 17, 2022
@fm3 fm3 marked this pull request as ready for review March 17, 2022 09:36
@fm3 fm3 requested a review from jstriebel March 17, 2022 09:50
@jstriebel jstriebel left a comment

❤️

Development

Successfully merging this pull request may close these issues.

Support Reading Zarr Images
2 participants