Implement default dimension names in open_zarr #8749
#11006
Closed
+169
−41
`_ARRAY_DIMENSIONS` xarray's special zarr attribute #280

Addresses issue #8749 by implementing default dimensions when reading `zarr` stores with missing metadata. With this PR, if dimension names are missing, `xarray` will try to build a `Dataset` from a `zarr` store using default dimension names, `dim_0`, `dim_1`, etc. Note we can only use default dimensions if each variable in the store has a consistent shape, as discussed by @TomNicholas and @etienneschalk in #8749.

Motivating Example
Extending the example of @etienneschalk to both `zarr` 2 and 3 specifications, consider

With this PR, the code above will no longer raise an error, but instead return
General Notes
It appears we have at least 3 zarr conventions considered by `xarray`.

- `xarray` flavoured `zarr` 2, with the optional `.zattrs` (https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html#attributes) used to store dimension names in `_ARRAY_DIMENSIONS`.
- `zarr` 3, with dimension names stored in the optional `dimension_names` metadata attribute (https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html#dimension-names).
- NCZarr (https://docs.unidata.ucar.edu/netcdf/NUG/nczarr_head.html), which stores the dimension names in `dim_refs`.

The `_get_zarr_dims_and_attrs` function tries to get the dimension names by checking all three of these conventions. Perhaps the convention should be handled more explicitly somehow?
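
To make the lookup order and the fallback concrete, here is a minimal sketch in plain Python. It is not the code in this PR and not `xarray`'s implementation: the helpers `resolve_dim_names` and `check_consistent_sizes` are hypothetical, each array is modelled as plain dictionaries rather than a real `zarr` array, and the NCZarr key spelling (`_NCZARR_ARRAY` / `dimrefs`) is an assumption. The second helper illustrates one reading of the "consistent shape" restriction: every variable that ends up with `dim_0` must agree on its size.

```python
from typing import Any, Dict, Mapping, Sequence, Tuple


def resolve_dim_names(
    shape: Sequence[int],
    attrs: Mapping[str, Any],
    metadata: Mapping[str, Any],
) -> Tuple[str, ...]:
    """Return dimension names for one array, trying each convention in turn."""
    # 1. xarray-flavoured zarr 2: names stored in the array attributes.
    names = attrs.get("_ARRAY_DIMENSIONS")
    # 2. zarr 3: names stored in the optional `dimension_names` metadata field.
    if names is None:
        names = metadata.get("dimension_names")
    # 3. NCZarr: references such as "/lat" (key spelling assumed here);
    #    keep only the final path component.
    if names is None:
        refs = metadata.get("_NCZARR_ARRAY", {}).get("dimrefs")
        if refs is not None:
            names = [ref.rsplit("/", 1)[-1] for ref in refs]
    # 4. No convention matched: fall back to dim_0, dim_1, ...
    if names is None:
        names = [f"dim_{i}" for i in range(len(shape))]
    if len(names) != len(shape):
        raise ValueError(f"got {len(names)} names for {len(shape)} dimensions")
    return tuple(names)


def check_consistent_sizes(
    variables: Mapping[str, Tuple[Tuple[str, ...], Sequence[int]]],
) -> Dict[str, int]:
    """Map each dimension name to its size, raising if two variables disagree."""
    sizes: Dict[str, int] = {}
    for var_name, (dims, shape) in variables.items():
        for dim, size in zip(dims, shape):
            # setdefault records the first size seen and returns it afterwards.
            if sizes.setdefault(dim, size) != size:
                raise ValueError(
                    f"{var_name!r}: dimension {dim!r} has size {size}, "
                    f"but an earlier variable has size {sizes[dim]}"
                )
    return sizes


# A toy "store": shape, attributes and array metadata per variable.
store = {
    "a": ((3, 4), {}, {}),                                   # no names at all
    "b": ((3, 4), {"_ARRAY_DIMENSIONS": ["y", "x"]}, {}),    # zarr 2 style
    "c": ((3,), {}, {"dimension_names": ["time"]}),          # zarr 3 style
    "d": ((5,), {}, {"_NCZARR_ARRAY": {"dimrefs": ["/lat"]}}),  # NCZarr style
}
resolved = {
    name: (resolve_dim_names(shape, attrs, meta), shape)
    for name, (shape, attrs, meta) in store.items()
}
sizes = check_consistent_sizes(resolved)
```

In this toy store only `a` falls back to the defaults, so the check passes. Adding a second unnamed variable with shape `(7,)` would give `dim_0` two different sizes, and `check_consistent_sizes` would raise a `ValueError`, which is the situation in which default names cannot be used.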