3 changes: 3 additions & 0 deletions .gitignore
@@ -84,6 +84,9 @@ doc/team-panel.txt
doc/external-examples-gallery.txt
doc/notebooks-examples-gallery.txt
doc/videos-gallery.txt
doc/*.zarr
doc/*.nc
doc/*.h5

# Until we support this properly, excluding from gitignore. (adding it to
# gitignore to make it _easier_ to work with `uv`, not as an indication that I
5 changes: 0 additions & 5 deletions doc/conf.py
@@ -178,11 +178,6 @@
# mermaid config
mermaid_version = "11.6.0"

# sphinx-llm config
# Some jupyter-execute cells are not thread-safe, so we need to build sequentially.
# See https://github.com/pydata/xarray/pull/11003#issuecomment-3641648868
llms_txt_build_parallel = False

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates", sphinx_autosummary_accessors.templates_path]

25 changes: 20 additions & 5 deletions doc/getting-started-guide/quick-overview.rst
@@ -213,17 +213,32 @@ You can directly read and write xarray objects to disk using :py:meth:`~xarray.D

.. jupyter-execute::

ds.to_netcdf("example.nc")
reopened = xr.open_dataset("example.nc")
reopened
filename = "example.nc"

.. jupyter-execute::
:hide-code:

import os
# Ensure the file is located in a unique temporary directory
# so that it doesn't conflict with parallel builds of the
# documentation.

import tempfile
import os.path

tempdir = tempfile.TemporaryDirectory()
filename = os.path.join(tempdir.name, filename)

.. jupyter-execute::

ds.to_netcdf(filename)
reopened = xr.open_dataset(filename)
reopened

.. jupyter-execute::
:hide-code:

reopened.close()
os.remove("example.nc")
tempdir.cleanup()


It is common for datasets to be distributed across multiple files (commonly one file per timestep). Xarray supports this use case by providing the :py:func:`~xarray.open_mfdataset` and :py:func:`~xarray.save_mfdataset` functions. For more, see :ref:`io`.
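
The hidden-cell pattern used above is worth spelling out on its own. Here is a sketch of the same unique-temporary-directory idiom using only the standard library (plain file I/O stands in for ``to_netcdf``; the filename is just an example):

```python
import os.path
import tempfile

# A unique temporary directory per build, so parallel documentation
# builds never collide on the same output file.
tempdir = tempfile.TemporaryDirectory()
filename = os.path.join(tempdir.name, "example.nc")

# Stand-in for ds.to_netcdf(filename).
with open(filename, "wb") as f:
    f.write(b"\x00")

print(os.path.exists(filename))  # True

# cleanup() removes the directory and everything in it.
tempdir.cleanup()
print(os.path.exists(filename))  # False
```

Because each build gets its own directory under the system temp root, no two builds can race on the same path, and cleanup is a single call.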
83 changes: 59 additions & 24 deletions doc/internals/time-coding.rst
@@ -459,59 +459,94 @@ Default Time Unit

The current default time unit of xarray is ``'ns'``. When setting the keyword argument ``time_unit`` to ``'s'`` (the lowest resolution pandas allows), datetimes will be converted to at least ``'s'`` resolution, if possible. The same holds true for ``'ms'`` and ``'us'``.
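
As a stdlib-only illustration (a toy stand-in, not xarray's actual ``CFDatetimeCoder``), decoding CF-style integer values against a ``units`` attribute such as ``"hours since 2000-01-01"`` amounts to splitting the units string and offsetting the reference epoch:

```python
from datetime import datetime, timedelta

def decode_cf_times(values, units):
    # Toy CF decoder: split "hours since 2000-01-01" into a step
    # unit and a reference epoch, then offset the epoch per value.
    unit, _, reference = units.partition(" since ")
    step = {
        "hours": timedelta(hours=1),
        "seconds": timedelta(seconds=1),
        "milliseconds": timedelta(milliseconds=1),
    }[unit]
    epoch = datetime.fromisoformat(reference)
    return [epoch + value * step for value in values]

print(decode_cf_times([0, 1, 2, 3], "hours since 2000-01-01")[1])
# 2000-01-01 01:00:00
```

The real coder additionally chooses a numpy datetime64 resolution for the result, which is what ``time_unit`` controls.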

.. jupyter-execute::

datetimes1_filename = "test-datetimes1.nc"

.. jupyter-execute::
:hide-code:

# Ensure the file is located in a unique temporary directory
# so that it doesn't conflict with parallel builds of the
# documentation.

import tempfile
import os.path

tempdir = tempfile.TemporaryDirectory()
datetimes1_filename = os.path.join(tempdir.name, datetimes1_filename)

.. jupyter-execute::

attrs = {"units": "hours since 2000-01-01"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-datetimes1.nc")
ds.to_netcdf(datetimes1_filename)

.. jupyter-execute::

xr.open_dataset("test-datetimes1.nc")
xr.open_dataset(datetimes1_filename)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes1.nc", decode_times=coder)
xr.open_dataset(datetimes1_filename, decode_times=coder)

If a coarser unit is requested, the datetimes are decoded into their native
on-disk resolution, if possible.
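
The rule can be sketched as follows (an illustration of the behavior described here, not xarray's internal code): the effective unit is the finer of the requested unit and the data's native one, so decoding never invents precision nor truncates it.

```python
# Pandas-style resolutions, ordered coarse -> fine.
RESOLUTIONS = ("s", "ms", "us", "ns")

def effective_unit(requested, native):
    # Pick whichever of the two units is finer: a coarser request
    # falls back to the on-disk resolution, a finer one is honored.
    return max(requested, native, key=RESOLUTIONS.index)

print(effective_unit("s", "ms"))  # "ms": coarser request falls back to native
print(effective_unit("ms", "s"))  # "ms": finer request is honored
```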

.. jupyter-execute::

datetimes2_filename = "test-datetimes2.nc"

.. jupyter-execute::
:hide-code:

datetimes2_filename = os.path.join(tempdir.name, datetimes2_filename)

.. jupyter-execute::

attrs = {"units": "milliseconds since 2000-01-01"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-datetimes2.nc")
ds.to_netcdf(datetimes2_filename)

.. jupyter-execute::

xr.open_dataset("test-datetimes2.nc")
xr.open_dataset(datetimes2_filename)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes2.nc", decode_times=coder)
xr.open_dataset(datetimes2_filename, decode_times=coder)

Similar logic applies for decoding timedelta values. The default resolution is
``"ns"``:

.. jupyter-execute::

timedeltas1_filename = "test-timedeltas1.nc"

.. jupyter-execute::
:hide-code:

timedeltas1_filename = os.path.join(tempdir.name, timedeltas1_filename)

.. jupyter-execute::

attrs = {"units": "hours"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas1.nc")
ds.to_netcdf(timedeltas1_filename)

.. jupyter-execute::
:stderr:

xr.open_dataset("test-timedeltas1.nc")
xr.open_dataset(timedeltas1_filename)

By default, timedeltas will be decoded to the same resolution as datetimes:

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas1.nc", decode_times=coder, decode_timedelta=True)
xr.open_dataset(timedeltas1_filename, decode_times=coder, decode_timedelta=True)

but if one would like to decode timedeltas to a different resolution, one can
provide a coder specifically for timedeltas to ``decode_timedelta``:
@@ -520,32 +555,41 @@ provide a coder specifically for timedeltas to ``decode_timedelta``:

timedelta_coder = xr.coders.CFTimedeltaCoder(time_unit="ms")
xr.open_dataset(
"test-timedeltas1.nc", decode_times=coder, decode_timedelta=timedelta_coder
timedeltas1_filename, decode_times=coder, decode_timedelta=timedelta_coder
)

As with datetimes, if a coarser unit is requested, the timedeltas are decoded
into their native on-disk resolution, if possible:

.. jupyter-execute::

timedeltas2_filename = "test-timedeltas2.nc"

.. jupyter-execute::
:hide-code:

timedeltas2_filename = os.path.join(tempdir.name, timedeltas2_filename)

.. jupyter-execute::

attrs = {"units": "milliseconds"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas2.nc")
ds.to_netcdf(timedeltas2_filename)

.. jupyter-execute::

xr.open_dataset("test-timedeltas2.nc", decode_timedelta=True)
xr.open_dataset(timedeltas2_filename, decode_timedelta=True)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas2.nc", decode_times=coder, decode_timedelta=True)
xr.open_dataset(timedeltas2_filename, decode_times=coder, decode_timedelta=True)

To opt out of timedelta decoding (see issue `Undesired decoding to timedelta64 <https://github.com/pydata/xarray/issues/1621>`_), pass ``False`` to ``decode_timedelta``:

.. jupyter-execute::

xr.open_dataset("test-timedeltas2.nc", decode_timedelta=False)
xr.open_dataset(timedeltas2_filename, decode_timedelta=False)

.. note::
Note that in the future the default value of ``decode_timedelta`` will be
@@ -557,13 +601,4 @@ To opt-out of timedelta decoding (see issue `Undesired decoding to timedelta64 <
:hide-code:

# Cleanup
import os

for f in [
"test-datetimes1.nc",
"test-datetimes2.nc",
"test-timedeltas1.nc",
"test-timedeltas2.nc",
]:
if os.path.exists(f):
os.remove(f)
tempdir.cleanup()
55 changes: 36 additions & 19 deletions doc/internals/zarr-encoding-spec.rst
@@ -90,6 +90,18 @@ with zarr-python.

**Example 1: Zarr V2 Format**

.. jupyter-execute::

zarr_v2_filename = "example_v2.zarr"

.. jupyter-execute::
:hide-code:

import tempfile
import os.path

tempdir = tempfile.TemporaryDirectory()
zarr_v2_filename = os.path.join(tempdir.name, zarr_v2_filename)

.. jupyter-execute::

import os
@@ -98,30 +110,33 @@

# Load tutorial dataset and write as Zarr V2
ds = xr.tutorial.load_dataset("rasm")
ds.to_zarr("rasm_v2.zarr", mode="w", consolidated=False, zarr_format=2)
ds.to_zarr(zarr_v2_filename, mode="w", consolidated=False, zarr_format=2)

# Open with zarr-python and examine attributes
zgroup = zarr.open("rasm_v2.zarr")
zgroup = zarr.open(zarr_v2_filename)
print("Zarr V2 - Tair attributes:")
tair_attrs = dict(zgroup["Tair"].attrs)
for key, value in tair_attrs.items():
print(f" '{key}': {repr(value)}")

**Example 2: Zarr V3 Format**

.. jupyter-execute::
:hide-code:

import shutil
shutil.rmtree("rasm_v2.zarr")
zarr_v3_filename = "example_v3.zarr"

**Example 2: Zarr V3 Format**

.. jupyter-execute::
:hide-code:

zarr_v3_filename = os.path.join(tempdir.name, zarr_v3_filename)

.. jupyter-execute::

# Write the same dataset as Zarr V3
ds.to_zarr("rasm_v3.zarr", mode="w", consolidated=False, zarr_format=3)
ds.to_zarr(zarr_v3_filename, mode="w", consolidated=False, zarr_format=3)

# Open with zarr-python and examine attributes
zgroup = zarr.open("rasm_v3.zarr")
zgroup = zarr.open(zarr_v3_filename)
print("Zarr V3 - Tair attributes:")
tair_attrs = dict(zgroup["Tair"].attrs)
for key, value in tair_attrs.items():
@@ -131,12 +146,6 @@ with zarr-python.
tair_array = zgroup["Tair"]
print(f"\nZarr V3 - dimension_names in metadata: {tair_array.metadata.dimension_names}")

.. jupyter-execute::
:hide-code:

import shutil
shutil.rmtree("rasm_v3.zarr")


Chunk Key Encoding
------------------
@@ -148,6 +157,16 @@ dimension separator in chunk keys.

For example, to specify a custom separator for chunk keys:


.. jupyter-execute::

example_filename = "example.zarr"

.. jupyter-execute::
:hide-code:

example_filename = os.path.join(tempdir.name, example_filename)

.. jupyter-execute::

import xarray as xr
Expand All @@ -161,7 +180,7 @@ For example, to specify a custom separator for chunk keys:
arr = np.ones((42, 100))
ds = xr.DataArray(arr, name="var1").to_dataset()
ds.to_zarr(
"example.zarr",
example_filename,
zarr_format=2,
mode="w",
encoding={"var1": {"chunks": (42, 50), "chunk_key_encoding": enc}},
@@ -179,8 +198,6 @@ when working with tools that expect a particular chunk key format.
chunk key encoding based on the store's format and configuration.
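
As a simplified illustration of what the dimension separator means for the store layout (a sketch, not zarr-python's actual implementation), a chunk's per-dimension indices are joined into a storage key:

```python
def chunk_key(chunk_index, separator=".", v3_style=False):
    # Join the per-dimension chunk indices with the configured separator.
    key = separator.join(str(i) for i in chunk_index)
    if v3_style:
        # Zarr V3's default encoding prefixes chunk keys with "c"
        # and uses "/" as the separator.
        key = "c/" + key
    return key

print(chunk_key((0, 1)))                 # "0.1" (Zarr V2 default)
print(chunk_key((0, 1), separator="/"))  # "0/1"
print(chunk_key((0, 1), separator="/", v3_style=True))  # "c/0/1"
```

With ``"/"`` as the separator, each dimension level becomes a directory in file-based stores, which is what some downstream tools expect.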

.. jupyter-execute::
:hide-code:

import shutil
shutil.rmtree("example.zarr")
tempdir.cleanup()