Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot open NetCDF file with more than 3 dimensions #76

Open
amarandon opened this issue Dec 2, 2024 · 2 comments
Open

Cannot open NetCDF file with more than 3 dimensions #76

amarandon opened this issue Dec 2, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@amarandon
Copy link
Collaborator

amarandon commented Dec 2, 2024

Given the following code:

from pathlib import Path
import xarray as xr
import rioxarray
from eodag import EODataAccessGateway, setup_logging
from rasterio.crs import CRS


def separator():
    print("=" * 72)

setup_logging(3)

dag = EODataAccessGateway()

product_type = "CAMS_EU_AIR_QUALITY_FORECAST"
search_results = dag.search(
    productType=product_type,
    provider="wekeo_ecmwf",
    variable=["ozone"],
    start="2024-01-15",
    end="2024-01-17",
    format="netcdf_zip",
    level=["0"],
)
product = search_results[0]
eodag_da = product.get_data(r".*\.nc")
separator()
print("DataArray returned by eodag-cube:")
print(eodag_da)

product_path = Path(product.location.split("/", maxsplit=2)[2])
asset_path = product_path / "ENS_FORECAST.nc"
xr_da = xr.open_dataarray(asset_path)
separator()
print("DataArray returned by xarray:")
print(xr_da)


separator()
print("Trying to open with rioxarray:")
rioxarray.open_rasterio(asset_path)

We get this output:

========================================================================
DataArray returned by eodag-cube:
<xarray.DataArray (dim_0: 0)> Size: 0B
array([], dtype=float64)
Dimensions without coordinates: dim_0
[...]
========================================================================
DataArray returned by xarray:
<xarray.DataArray 'o3_conc' (time: 3, level: 1, latitude: 420, longitude: 700)> Size: 4MB
[882000 values with dtype=float32]
Coordinates:
  * longitude  (longitude) float32 3kB 335.0 335.1 335.2 ... 44.75 44.85 44.95
  * latitude   (latitude) float32 2kB 71.95 71.85 71.75 ... 30.25 30.15 30.05
  * level      (level) float32 4B 0.0
  * time       (time) timedelta64[ns] 24B 0 days 1 days 2 days
Attributes:
    species:        Ozone
    units:          µg/m3
    value:          hourly values
    standard_name:  mass_concentration_of_ozone_in_air
========================================================================
Trying to open with rioxarray:
Traceback (most recent call last):
  File "/home/amarando/Projects/eodag-cube/t_rasterio_netcdf.py", line 41, in <module>
    rioxarray.open_rasterio(asset_path)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/rioxarray/_io.py", line 1238, in open_rasterio
    result = DataArray(
        data=data, dims=(coord_name, "y", "x"), coords=coords, attrs=attrs, name=da_name
    )
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 479, in __init__
    coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 210, in _infer_coords_and_dims
    _check_coords_dims(shape, new_coords, dims_tuple)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 132, in _check_coords_dims
    raise ValueError(
    ...<3 lines>...
    )
ValueError: coordinate level has dimensions (np.str_('level'),), but these are not a subset of the DataArray dimensions (np.str_('time'), 'y', 'x')

If requesting another level, we get another error:

========================================================================
DataArray returned by eodag-cube:
<xarray.DataArray (dim_0: 0)> Size: 0B
array([], dtype=float64)
Dimensions without coordinates: dim_0
[...]
========================================================================
DataArray returned by xarray:
<xarray.DataArray 'o3_conc' (time: 3, level: 2, latitude: 420, longitude: 700)> Size: 7MB
[1764000 values with dtype=float32]
Coordinates:
  * longitude  (longitude) float32 3kB 335.0 335.1 335.2 ... 44.75 44.85 44.95
  * latitude   (latitude) float32 2kB 71.95 71.85 71.75 ... 30.25 30.15 30.05
  * level      (level) float32 8B 0.0 100.0
  * time       (time) timedelta64[ns] 24B 0 days 1 days 2 days
Attributes:
    species:        Ozone
    units:          µg/m3
    value:          hourly values
    standard_name:  mass_concentration_of_ozone_in_air
========================================================================
Trying to open with rioxarray:
Traceback (most recent call last):
  File "/home/amarando/Projects/eodag-cube/t_rasterio_netcdf.py", line 41, in <module>
    rioxarray.open_rasterio(asset_path)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/rioxarray/_io.py", line 1238, in open_rasterio
    result = DataArray(
        data=data, dims=(coord_name, "y", "x"), coords=coords, attrs=attrs, name=da_name
    )
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 479, in __init__
    coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 210, in _infer_coords_and_dims
    _check_coords_dims(shape, new_coords, dims_tuple)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amarando/venvs/dedl/lib/python3.13/site-packages/xarray/core/dataarray.py", line 140, in _check_coords_dims
    raise ValueError(
    ...<3 lines>...
    )
ValueError: conflicting sizes for dimension np.str_('time'): length 6 on the data but length 3 on coordinate np.str_('time')

Looking at the code of rioxarray, we see that it's hard-coded to support only 3 dimensions:

    result = DataArray(
        data=data, dims=(coord_name, "y", "x"), coords=coords, attrs=attrs, name=da_name
    )
@amarandon amarandon added the bug Something isn't working label Dec 2, 2024
@amarandon
Copy link
Collaborator Author

The root of the issue is in rioxarray: corteva/rioxarray#174

@sbrunato
Copy link
Collaborator

sbrunato commented Dec 4, 2024

related to #78

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants