Merged
28 commits
512e337
ci: restore default ci (#392)
romanc Mar 5, 2026
60f2c43
Add `to_xarray` API to State (#395)
FlorianDeconinck Mar 6, 2026
7eb52db
BREAKING CHANGE: drop support for `X_DIM` and friends, remove depreca…
romanc Mar 6, 2026
ada2d53
pref: improve ochestration transpile/compile times (#396)
romanc Mar 11, 2026
b6a25a3
build: update gt4py to get compiler support (#399)
romanc Mar 12, 2026
38197e1
gt4py update: fix GCC 12/13 compiler flags (#400)
romanc Mar 16, 2026
eb8c618
docs|ci: versioning and release (#401)
romanc Mar 16, 2026
af09f73
build: update gt4py (numpy 2 compatibility and `ipcx` support) (#403)
romanc Mar 23, 2026
469a58a
typing: add types to `ndsl.stencils.testing.grid` (#404)
romanc Mar 23, 2026
86d7d54
build: update gt4py (data dimensions size one) (#406)
romanc Mar 24, 2026
b6e94f1
fix: cleanups in translate test discovery (#405)
romanc Mar 25, 2026
0fff114
Improved translate test logging (#398)
CharlesKrop Mar 25, 2026
ccf2ed3
Improved data_loader (#402)
CharlesKrop Mar 25, 2026
c47852d
fix backend default in translate test conftest (#407)
romanc Mar 27, 2026
30b068c
build: update GT4Py (loop layout fixes) (#410)
romanc Mar 30, 2026
4f5315e
fix: xumpy.random honors dtype now (#412)
romanc Mar 31, 2026
da721f1
refactor: use xumpy for allocation in gt4py_utils (#388)
romanc Mar 31, 2026
db026fc
build: update gt4py (fix scalarization issue with temporaries) (#413)
romanc Mar 31, 2026
a622894
Default to `BuildAndRun` (#408)
FlorianDeconinck Apr 1, 2026
9dfdaf5
build: update submodules (#417)
romanc Apr 1, 2026
4294976
[Update] Upgrade to the `numpy` 2x series (#415)
twicki Apr 1, 2026
f4ba5d6
build: Add support for python 3.13 (#394)
romanc Apr 2, 2026
bfeac8e
[Translate] Fix default shape for KJI / Fortran-aligned backend (#409)
FlorianDeconinck Apr 2, 2026
1815e4f
Translate test dimension fix during comparison (#419)
CharlesKrop Apr 3, 2026
80bd437
restore the ci hooks for shield and pace (#420)
twicki Apr 8, 2026
220f3a5
update gt4py (#421)
twicki Apr 8, 2026
fa6503a
Release: NDSL `2026.03.00` (#422)
twicki Apr 9, 2026
6f1ca86
Revert "Release: NDSL `2026.03.00` (#422)" (#424)
twicki Apr 9, 2026
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE/release-patch.md
Expand Up @@ -29,4 +29,4 @@ What to do to actually release:
What to do after a release:

- [ ] update the pace PR from the pre-commit checklist to include the released version of NDSL and merge it.
- [ ] in NDSL, merge `main` back into `develop` (potentially adding a commit to fix the issue "properly")
- [ ] in NDSL, merge `main` back into `develop` (potentially adding a commit to fix the issue "properly") to have all changes in develop and ensure `setuptools_scm` finds the latest release tag
1 change: 1 addition & 0 deletions .github/PULL_REQUEST_TEMPLATE/release.md
Expand Up @@ -22,5 +22,6 @@ What to do to actually release:

What to do after a release:

- [ ] merge `main` down into `develop` to ensure `setuptools_scm` finds the latest release tag
- [ ] update the pace PR from the pre-commit checklist to include the released version of NDSL and merge it.
- [ ] merge breaking changes in NDSL (e.g. search for deprecation warnings)
5 changes: 1 addition & 4 deletions .github/workflows/fv3_translate_tests.yaml
Expand Up @@ -10,10 +10,7 @@ on:

jobs:
fv3_translate_tests:
# TODO
# restore once NDSL 2026.02.00 is released and pyFV3 is updated.
# uses: NOAA-GFDL/pyFV3/.github/workflows/translate.yaml@develop
uses: romanc/pyFV3/.github/workflows/translate.yaml@noop
uses: NOAA-GFDL/pyFV3/.github/workflows/translate.yaml@develop
with:
component_trigger: true
component_name: NDSL
5 changes: 1 addition & 4 deletions .github/workflows/pace_tests.yaml
Expand Up @@ -10,10 +10,7 @@ on:

jobs:
pace_main_tests:
# TODO
# restore once NDSL 2026.02.00 is released and pace is updated.
# uses: NOAA-GFDL/pace/.github/workflows/main_unit_tests.yaml@develop
uses: romanc/pace/.github/workflows/main_unit_tests.yaml@noop
uses: NOAA-GFDL/pace/.github/workflows/main_unit_tests.yaml@develop
with:
component_trigger: true
component_name: NDSL
5 changes: 1 addition & 4 deletions .github/workflows/shield_tests.yaml
Expand Up @@ -10,10 +10,7 @@ on:

jobs:
shield_translate_tests:
# TODO
# restore once NDSL 2026.02.00 is released and pySHiELD is updated.
# uses: NOAA-GFDL/pySHiELD/.github/workflows/translate.yaml@develop
uses: romanc/pySHiELD/.github/workflows/translate.yaml@noop
uses: NOAA-GFDL/pySHiELD/.github/workflows/translate.yaml@develop
with:
component_trigger: true
component_name: NDSL
2 changes: 1 addition & 1 deletion .github/workflows/unit_tests.yaml
Expand Up @@ -18,7 +18,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.11', '3.12']
python-version: ['3.11', '3.12', '3.13']
name: Python ${{ matrix.python-version }}
steps:
- name: Checkout repository
Expand Down
8 changes: 4 additions & 4 deletions examples/NDSL/03_orchestration_basics.ipynb
Expand Up @@ -35,7 +35,7 @@
" orchestrate,\n",
" QuantityFactory,\n",
")\n",
"from ndsl.constants import X_DIM, Y_DIM, Z_DIM\n",
"from ndsl.constants import I_DIM, J_DIM, K_DIM\n",
"from ndsl.dsl.typing import FloatField, Float\n",
"from ndsl.boilerplate import get_factories_single_tile_orchestrated"
]
Expand Down Expand Up @@ -93,7 +93,7 @@
" domain=grid_indexing.domain_compute(),\n",
" )\n",
" self._tmp_field = quantity_factory.zeros(\n",
" [X_DIM, Y_DIM, Z_DIM], \"n/a\", dtype=dtype\n",
" [I_DIM, J_DIM, K_DIM], \"n/a\", dtype=dtype\n",
" )\n",
" self._n_halo = quantity_factory.sizer.n_halo\n",
"\n",
Expand Down Expand Up @@ -134,9 +134,9 @@
" )\n",
" local_sum = LocalSum(stencil_factory, qty_factory)\n",
"\n",
" in_field = qty_factory.zeros([X_DIM, Y_DIM, Z_DIM], \"n/a\", dtype=dtype)\n",
" in_field = qty_factory.zeros([I_DIM, J_DIM, K_DIM], \"n/a\", dtype=dtype)\n",
" in_field.view[:] = 2.0\n",
" out_field = qty_factory.zeros([X_DIM, Y_DIM, Z_DIM], \"n/a\", dtype=dtype)\n",
" out_field = qty_factory.zeros([I_DIM, J_DIM, K_DIM], \"n/a\", dtype=dtype)\n",
"\n",
" # Run\n",
" local_sum(in_field, out_field)"
Expand Down
2 changes: 1 addition & 1 deletion external/gt4py
Submodule gt4py updated 122 files
19 changes: 10 additions & 9 deletions ndsl/buffer.py
Expand Up @@ -2,9 +2,10 @@

import contextlib
from collections.abc import Callable, Generator, Iterable
from typing import Any

import numpy as np
from numpy.lib.index_tricks import IndexExpression
import numpy.typing as npt

from ndsl.performance.timer import NullTimer, Timer
from ndsl.types import Allocator
Expand All @@ -16,7 +17,7 @@
)


BufferKey = tuple[Callable, Iterable[int], type]
BufferKey = tuple[Callable, Iterable[int], npt.DTypeLike]
BUFFER_CACHE: dict[BufferKey, list["Buffer"]] = {}


Expand All @@ -41,7 +42,7 @@ def __init__(self, key: BufferKey, array: np.ndarray):

@classmethod
def pop_from_cache(
cls, allocator: Allocator, shape: Iterable[int], dtype: type
cls, allocator: Allocator, shape: Iterable[int], dtype: npt.DTypeLike
) -> Buffer:
"""Retrieve or insert then retrieve of buffer from cache.

Expand Down Expand Up @@ -78,8 +79,8 @@ def finalize_memory_transfer(self) -> None:
def assign_to(
self,
destination_array: np.ndarray,
buffer_slice: IndexExpression = np.index_exp[:],
buffer_reshape: IndexExpression = None,
buffer_slice: Any = np.index_exp[:],
buffer_reshape: Any | None = None,
) -> None:
"""Assign internal array to destination_array.

Expand All @@ -95,7 +96,7 @@ def assign_to(
)

def assign_from(
self, source_array: np.ndarray, buffer_slice: IndexExpression = np.index_exp[:]
self, source_array: np.ndarray, buffer_slice: Any = np.index_exp[:]
) -> None:
"""Assign source_array to internal array.

Expand All @@ -107,7 +108,7 @@ def assign_from(

@contextlib.contextmanager
def array_buffer(
allocator: Allocator, shape: Iterable[int], dtype: type
allocator: Allocator, shape: Iterable[int], dtype: npt.DTypeLike
) -> Generator[Buffer, Buffer, None]:
"""
A context manager providing a contiguous array, which may be re-used between calls.
Expand All @@ -132,7 +133,7 @@ def send_buffer(
allocator: Callable,
array: np.ndarray,
timer: Timer | None = None,
) -> np.ndarray:
) -> Generator[np.ndarray]:
"""A context manager ensuring that `array` is contiguous in a context where it is
being sent as data, copying into a recycled buffer array if necessary.

Expand Down Expand Up @@ -166,7 +167,7 @@ def recv_buffer(
allocator: Callable,
array: np.ndarray,
timer: Timer | None = None,
) -> np.ndarray:
) -> Generator[np.ndarray]:
"""A context manager ensuring that array is contiguous in a context where it is
being used to receive data, using a recycled buffer array and then copying the
result into array if necessary.
Expand Down
12 changes: 6 additions & 6 deletions ndsl/comm/communicator.py
Expand Up @@ -2,6 +2,7 @@

import abc
from collections.abc import Mapping, Sequence
from types import ModuleType
from typing import Any, Self, cast

import numpy as np
Expand All @@ -16,7 +17,6 @@
from ndsl.optional_imports import cupy
from ndsl.performance.timer import NullTimer, Timer
from ndsl.quantity import Quantity, QuantityHaloSpec, QuantityMetadata
from ndsl.types import NumpyModule


def to_numpy(array, dtype=None) -> np.ndarray: # type: ignore[no-untyped-def]
Expand Down Expand Up @@ -83,7 +83,7 @@ def size(self) -> int:
"""Total number of ranks in this communicator"""
return self.comm.Get_size()

def _maybe_force_cpu(self, module: NumpyModule) -> NumpyModule:
def _maybe_force_cpu(self, module: ModuleType) -> ModuleType:
"""
Get a numpy-like module depending on configuration and
Quantity original allocator.
Expand Down Expand Up @@ -223,7 +223,7 @@ def _get_gather_recv_quantity(
) -> Quantity:
"""Initialize a Quantity for use when receiving global data during gather"""
recv_quantity = Quantity(
send_metadata.np.zeros(global_extent, dtype=send_metadata.dtype), # type: ignore
send_metadata.np.zeros(global_extent, dtype=send_metadata.dtype),
dims=send_metadata.dims,
units=send_metadata.units,
origin=tuple([0 for dim in send_metadata.dims]),
Expand All @@ -238,7 +238,7 @@ def _get_scatter_recv_quantity(
) -> Quantity:
"""Initialize a Quantity for use when receiving subtile data during scatter"""
recv_quantity = Quantity(
send_metadata.np.zeros(shape, dtype=send_metadata.dtype), # type: ignore
send_metadata.np.zeros(shape, dtype=send_metadata.dtype),
dims=send_metadata.dims,
units=send_metadata.units,
backend=send_metadata.backend,
Expand Down Expand Up @@ -837,7 +837,7 @@ def _get_gather_recv_quantity(
# needs to change the quantity dimensions since we add a "tile" dimension,
# unlike for tile scatter/gather which retains the same dimensions
recv_quantity = Quantity(
metadata.np.zeros(global_extent, dtype=metadata.dtype), # type: ignore
metadata.np.zeros(global_extent, dtype=metadata.dtype),
dims=(constants.TILE_DIM,) + metadata.dims,
units=metadata.units,
origin=(0,) + tuple([0 for dim in metadata.dims]),
Expand All @@ -859,7 +859,7 @@ def _get_scatter_recv_quantity(
# needs to change the quantity dimensions since we remove a "tile" dimension,
# unlike for tile scatter/gather which retains the same dimensions
recv_quantity = Quantity(
metadata.np.zeros(shape, dtype=metadata.dtype), # type: ignore
metadata.np.zeros(shape, dtype=metadata.dtype),
dims=metadata.dims[1:],
units=metadata.units,
backend=metadata.backend,
Expand Down
14 changes: 7 additions & 7 deletions ndsl/constants.py
Expand Up @@ -38,13 +38,13 @@ def _get_constant_version(
# Common constants
#####################

I_DIM = X_DIM = "i"
I_INTERFACE_DIM = X_INTERFACE_DIM = "i_interface"
J_DIM = Y_DIM = "j"
J_INTERFACE_DIM = Y_INTERFACE_DIM = "j_interface"
K_DIM = Z_DIM = "k"
K_INTERFACE_DIM = Z_INTERFACE_DIM = "k_interface"
K_SOIL_DIM = Z_SOIL_DIM = "k_soil"
I_DIM = "i"
I_INTERFACE_DIM = "i_interface"
J_DIM = "j"
J_INTERFACE_DIM = "j_interface"
K_DIM = "k"
K_INTERFACE_DIM = "k_interface"
K_SOIL_DIM = "k_soil"

I_DIMS = (I_DIM, I_INTERFACE_DIM)
J_DIMS = (J_DIM, J_INTERFACE_DIM)
Expand Down
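For downstream code that still imports the removed `X_DIM`-style aliases, the rename in this hunk is mechanical. A minimal sketch of a migration helper follows, assuming a plain text substitution is acceptable for the code base in question. `RENAMES` mirrors the seven aliases dropped above; `migrate_dim_names` is a hypothetical helper and not part of NDSL.

```python
import re

# Old alias -> new name, mirroring the aliases removed from ndsl/constants.py.
RENAMES = {
    "X_DIM": "I_DIM",
    "X_INTERFACE_DIM": "I_INTERFACE_DIM",
    "Y_DIM": "J_DIM",
    "Y_INTERFACE_DIM": "J_INTERFACE_DIM",
    "Z_DIM": "K_DIM",
    "Z_INTERFACE_DIM": "K_INTERFACE_DIM",
    "Z_SOIL_DIM": "K_SOIL_DIM",
}

# Match whole identifiers only, so names such as MAX_DIM are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_dim_names(source: str) -> str:
    """Hypothetical helper: replace removed dimension aliases in source text."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_import = "from ndsl.constants import X_DIM, Y_DIM, Z_DIM"
new_import = migrate_dim_names(old_import)
```

Applied to the notebook import changed earlier in this PR, the helper produces the same `I_DIM, J_DIM, K_DIM` line as the diff. A text substitution like this does not understand scoping, so the result would still need review.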
10 changes: 0 additions & 10 deletions ndsl/dsl/__init__.py
@@ -1,6 +1,5 @@
# Literal precision for both GT4Py & NDSL
import os
import platform
import sys
from typing import Literal

Expand Down Expand Up @@ -36,15 +35,6 @@ def _get_literal_precision(default: Literal["32", "64"] = "64") -> Literal["32",
os.environ["GT4PY_LITERAL_INT_PRECISION"] = str(NDSL_GLOBAL_PRECISION)
os.environ["GT4PY_LITERAL_FLOAT_PRECISION"] = str(NDSL_GLOBAL_PRECISION)

# OpenMP handling

detected_macos = platform.system() == "Darwin"
if detected_macos:
ndsl_log.warning(
"Multithreading is deactivated under MacOS due to apple-clang not handling OpenMP by default."
)
os.environ["GT4PY_CARTESIAN_ENABLE_OPENMP"] = "False" if detected_macos else "True"


# Set cache names for default gt backends workflow
import gt4py.cartesian.config # noqa: E402
Expand Down
21 changes: 13 additions & 8 deletions ndsl/dsl/dace/dace_config.py
Expand Up @@ -7,6 +7,7 @@

import dace.config
from gt4py.cartesian.config import GT4PY_COMPILE_OPT_LEVEL
from gt4py.cartesian.utils.compiler import cxx_compiler_defaults, gpu_configuration

from ndsl import LocalComm
from ndsl.comm.communicator import Communicator
Expand Down Expand Up @@ -226,23 +227,18 @@ def __init__(
else:
dace.config.Config.set("compiler", "build_type", value="Release")

# Required to True for gt4py storage/memory
dace.config.Config.set(
"compiler",
"allow_view_arguments",
value=True,
)
# Resolve "march/mtune" option for GPU
# - turn on numeric-centric SSE by default
# - Neoverse-V2 Grace CPU is too new for GCC 14 and -march=native will fail
# - use alternative march=armv8-a instead
march_cpu = "armv8-a" if is_arm_neoverse else "native"
# Removed --fmath
cxx_defaults = cxx_compiler_defaults(GT4PY_COMPILE_OPT_LEVEL)
dace.config.Config.set(
"compiler",
"cpu",
"args",
value=f"-march={march_cpu} -std=c++17 -fPIC -Wall -Wextra -O{optimization_level}",
value=f"-march={march_cpu} -std=c++17 -fPIC -Wall -Wextra -O{optimization_level} {cxx_defaults.cxx_compile_flags}",
)
# Potentially buggy - deactivate
dace.config.Config.set(
Expand All @@ -257,11 +253,12 @@ def __init__(
# - use alternative mcpu=native instead
march_option = "-mcpu=native" if is_arm_neoverse else "-march=native"
# Removed --fast-math
gpu_config = gpu_configuration(GT4PY_COMPILE_OPT_LEVEL)
dace.config.Config.set(
"compiler",
"cuda",
"args",
value=f"-std=c++14 -Xcompiler -fPIC -O3 -Xcompiler {march_option}",
value=f"-std=c++14 -Xcompiler -fPIC -O{optimization_level} -Xcompiler {march_option} {gpu_config.gpu_compile_flags}",
)

cuda_sm = cp.cuda.Device(0).compute_capability if cp else 60
Expand All @@ -280,6 +277,14 @@ def __init__(
"max_concurrent_streams",
value=-1, # no concurrent streams, every kernel on defaultStream
)

# Required to True for gt4py storage/memory
dace.config.Config.set(
"compiler",
"allow_view_arguments",
value=True,
)

# Speed up built time
dace.config.Config.set(
"compiler",
Expand Down
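The CPU flag string in this file is now assembled from three parts: the architecture switch, the fixed flags, and whatever GT4Py reports as compiler defaults. The sketch below isolates that assembly so it can be checked without DaCe or GT4Py installed. `CxxDefaults` is a hypothetical stand-in for the object returned by `cxx_compiler_defaults` (only the `cxx_compile_flags` attribute seen in the diff is modeled), and `cpu_compile_args` is an illustrative helper, not NDSL code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CxxDefaults:
    """Hypothetical stand-in for the result of gt4py's cxx_compiler_defaults."""

    cxx_compile_flags: str


def cpu_compile_args(
    is_arm_neoverse: bool, optimization_level: int, cxx_defaults: CxxDefaults
) -> str:
    """Build the CPU flag string the way the new f-string in the diff does."""
    # Per the comment in the diff, -march=native fails on Neoverse-V2 with
    # GCC 14, so armv8-a is used there instead.
    march_cpu = "armv8-a" if is_arm_neoverse else "native"
    return (
        f"-march={march_cpu} -std=c++17 -fPIC -Wall -Wextra "
        f"-O{optimization_level} {cxx_defaults.cxx_compile_flags}"
    )


# Placeholder value only; the real defaults depend on the installed GT4Py.
placeholder = CxxDefaults("<gt4py-default-flags>")
arm_args = cpu_compile_args(True, 3, placeholder)
native_args = cpu_compile_args(False, 3, placeholder)
```

Keeping the assembly in one small function makes it easy to see that the GT4Py defaults are appended after `-O{optimization_level}`, which matters if the defaults ever carry their own optimization switch.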
12 changes: 2 additions & 10 deletions ndsl/dsl/dace/orchestration.py
Expand Up @@ -122,7 +122,7 @@ def _to_gpu(sdfg: SDFG) -> None:
def _simplify(
sdfg: SDFG,
*,
validate: bool = True,
validate: bool = False,
validate_all: bool = False,
verbose: bool = False,
) -> dict | None:
Expand All @@ -146,9 +146,6 @@ def _build_sdfg(
backend_name = config.get_backend()

if is_compiling:
with DaCeProgress(config, "Validate original SDFG"):
sdfg.validate()

# Fully specialize all known symbols and then propagate these changes in the simplify
# pass that follows. This is not only a smart idea in general, but also simplifies (haha)
# the schedule tree (optimization) roundtrip.
Expand Down Expand Up @@ -271,9 +268,6 @@ def _build_sdfg(
negative_delp_checker(sdfg)
negative_qtracers_checker(sdfg)

with DaCeProgress(config, "Validate before compile"):
sdfg.validate()

# Compile
with DaCeProgress(config, "Codegen & compile"):
sdfg.compile()
Expand Down Expand Up @@ -646,9 +640,7 @@ def __call__(self, *arg, **kwarg): # type: ignore[no-untyped-def]
return wrapped(*arg, **kwarg)

def __sdfg__(self, *args, **kwargs): # type: ignore[no-untyped-def]
sdfg = wrapped.__sdfg__(*args, **kwargs)
sdfg.validate()
return sdfg
return wrapped.__sdfg__(*args, **kwargs)

def __sdfg_closure__(self, reevaluate=None): # type: ignore[no-untyped-def]
return wrapped.__sdfg_closure__(reevaluate)
Expand Down