
Release version 2025.10.00#289

Merged
romanc merged 88 commits into main from develop
Oct 28, 2025

Conversation

@romanc (Collaborator) commented Oct 28, 2025

Description

This PR brings changes from develop into main for the release of version 2025.10.00. The version will be tagged on release once merged down.

How has this been tested?

All good as long as CI is still green.

Temporary release checklists

Pre-release checklist

  • set up a PR in pace with the submodule updated to the commit that you want to release

Release checklist

  • set up a PR to merge changes from develop into main
  • once merged, create a GitHub release and tag the new version (on main)
    • version format is [year].[month].[patch], e.g. 2025.10.00
    • let GitHub auto-generate release notes from the last tagged version
  • send an announcement on Mattermost
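
The version format from the checklist above can be sketched in a few lines of Python. This is only an illustration: the helper names `format_version` and `parse_version` are hypothetical, and the zero-padding to two digits is an assumption read off the example `2025.10.00`.

```python
import re
from datetime import date

# Assumed pattern for the [year].[month].[patch] scheme: 4-digit year,
# 2-digit month, 2-digit patch (inferred from the example 2025.10.00).
VERSION_RE = re.compile(r"^(\d{4})\.(\d{2})\.(\d{2})$")


def format_version(day: date, patch: int = 0) -> str:
    """Build a version string such as '2025.10.00' from a date and a patch number."""
    return f"{day.year:04d}.{day.month:02d}.{patch:02d}"


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a version string into (year, month, patch); raise on bad input."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a [year].[month].[patch] version: {version!r}")
    year, month, patch = (int(part) for part in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {version!r}")
    return year, month, patch
```

For this release, `format_version(date(2025, 10, 28))` gives the tag name used in the PR title.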

Post-release checklist

romanc and others added 30 commits June 3, 2025 18:31
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
In newer versions, `Dataset.dims` will be changed to return a set of
dimension names. The mapping from dimension names to lengths will move
(and is already available) in `Dataset.sizes`.

This will make a warning go away that we currently see in pyFV3
translate tests.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
This PR updates pre-commit tools to newer versions. Some defaults changed in the black formatter, which affects the styling of a couple files.
…ks (#174)

* Allow for rank != 0 to miss certain variables

* Lint
* Re-enable caching tests
* Type hints in tests/dsl/compilation_config
* more test cleanups
* simplify
* More robust caches tests.
* MPI cleanup

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
* Generating eta files for tests on the fly

* Amended eta testing to use proper teardown methods and pytest fixtures

* Forgot to remove import of generate_eta_files in test_eta

* Removing commented out snippets from test_eta

* Moving fixtures inside of test_eta, and introduction of data and scripts directories

* Forgot to add eta.py and generate_eta_files.py in their new locations

* Adding only netcdf files for eta file read-in testing

* Moved data files for eta tests to tests directory

* Adding README for eta tests

* Amending top level README
* Changes needed for external grid data read-in, needed halo update and fill corners call

* Transpose of read in lon/lat arrays

* Removing transpose of data in from_external MetricTerms class method
…ernization (#159)

* Moving to pyproject.toml and minimal setup.py

* Linting

* Fixing setup.py syntax

* Moving setup.cfg contents to pyproject.toml

* Adding license information for ndsl/viz/fv3/README.md

* Adding private classifier to pyproject.toml

* Removing setup.cfg

* Fixing unit test workflow

* Changing install option from 'develop' to 'dev'

* Adding linting for yamls

* Adding Flake8-pyproject as additional dependency in pre-commit-config.yaml
Update `__init__.py` files to populate the `__all__` variable. That way, we don't need to handle `__init__.py` files separately.
* Align logging default to usage in Pace

* Fix compile flags for GH boxes

* Update ndsl/dsl/dace/dace_config.py

Co-authored-by: Roman Cattaneo <romanc@users.noreply.github.com>

---------

Co-authored-by: Roman Cattaneo <romanc@users.noreply.github.com>
* Extend black formatter to jupyter notebooks
* Redact Chris' home folder path
* Upgrade `mpi4py` to >4 to allow new unified allocator on GH box to not die

* Restrict further to bug fixed 4.1+

* A tad too strict

* Use bare metal Ubuntu for NDSL unit tests
* Add cache for pre-commit environments
* Don't pull submodules (we don't lint external code)
PR GridTools/gt4py#2067 in GT4Py fundamentally changed how the `dace:*` backends behave. In that PR, we changed the strategy to make use of an upcoming DaCe feature called "Schedule Tree", which we will use for optimization purposes. This new "bridge" between GT4Py and DaCe allows for a much cleaner design in which each package handles only what it needs to. A drop in performance is to be expected (especially on CPU) because we have deactivated local caching for now. The very next task is to build on this new platform to enable much more aggressive merging, local caching, and hardware-driven tiling.

This PR updates GT4Py to a version that includes the above mentioned "Schedule tree bridge" and removes NDSL-level optimizations in the orchestration pipeline. As said, this will come with a temporary dip in performance, which we plan to restore with upcoming pull requests.

In addition to updating the GT4Py and DaCe versions, we include the following two changes in this PR

1. Expose compiler optimization level as `GT4PY_COMPILE_OPT_LEVEL`, defaults to `3` as before.
2. Minor change in import style in `ndsl/dsl/dace/orchestration.py`.
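
As a rough illustration of item 1, an environment variable with a default of `3` could be read as below. The function name `compile_opt_level` is hypothetical and this is not the actual GT4Py or NDSL code; only the variable name and its default come from the description above.

```python
import os


def compile_opt_level() -> str:
    """Return the compiler optimization level, falling back to "3" when unset."""
    # GT4PY_COMPILE_OPT_LEVEL and its default of 3 are taken from the PR text;
    # how the value is consumed downstream is not shown here.
    return os.environ.get("GT4PY_COMPILE_OPT_LEVEL", "3")
```

Setting `GT4PY_COMPILE_OPT_LEVEL=0` in the shell before starting Python would then make this helper return `"0"`.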

---------

Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* Update readme with MPI / compiler recommendations
* specify compiler version for macos users
* switch suggested mpi installation from `mpich` to `openmpi` because it works (better) with GEOS
* ci: configure tests to run on merge-queue

* fixup: fix workflow configuration

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* Deprecate PACE_DACE_DEBUG in favor of NDSL_DACE_DEBUG

* Deprecate PACE_LOGLEVEL in favor of NDSL_LOGLEVEL

* Deprecate PACE_FLOAT_PRECISION in favor of NDSL_LITERAL_PRECISION

* Deprecate PACE_CONSTANTS in favor of NDSL_CONSTANTS

* Don't use ndsl_log before it is defined

* Document correctly renamed env variable

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
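
A minimal sketch of how such a rename can stay backwards compatible: prefer the new `NDSL_*` variable, fall back to the deprecated `PACE_*` one, and warn when the old name is used. The helper `read_env` and the default value are hypothetical; the actual NDSL implementation may differ.

```python
import os
import warnings


def read_env(new_name: str, old_name: str, default: str) -> str:
    """Return the value of `new_name`, falling back to the deprecated `old_name`."""
    if new_name in os.environ:
        return os.environ[new_name]
    if old_name in os.environ:
        # The old variable still works, but users are nudged to migrate.
        warnings.warn(
            f"{old_name} is deprecated, use {new_name} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return os.environ[old_name]
    return default


# The default "INFO" is a placeholder for illustration only.
log_level = read_env("NDSL_LOGLEVEL", "PACE_LOGLEVEL", default="INFO")
```

Checking the new name first means that a user who sets both variables gets the new behavior without a warning.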
* Merge queue test

* Continued testing of merge-queue

* Removing file from docs
* Update gt4py: support for literal precision

* Actually forwarding literal precision to gt4py

* Exposing type casts and new math functions

* Documentation update

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
PR #74 introduced better comparison in the `LegacyMetric`.
Unfortunately, this overrides the reference values under certain
conditions. The regression was realized because NaNs were copied into
the reference values where 0 was expected.

This PR fixes the LegacyMetric to copy the reference data before
manipulating the array to avoid division by 0 errors. In case the
reference value is 0 (and thus the denominator would be 0), we set the
denominator to 1 because we expect the computed value to be (close to)
0. In that case (abs(computed - expected) / 1.0) is a good value for the
error being reported.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
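
The fix described above can be sketched with plain Python lists. This is not the `LegacyMetric` code itself: the function name `relative_errors` is hypothetical, and normalizing by the magnitude of the reference is an assumption.

```python
def relative_errors(computed: list[float], reference: list[float]) -> list[float]:
    """Relative error per element, without modifying `reference`."""
    # Copy the reference before touching it, so the caller's data stays intact.
    denominator = list(reference)
    # Where the reference is 0, divide by 1 instead: the reported error is
    # then abs(computed - expected) / 1.0, i.e. the absolute difference.
    denominator = [1.0 if value == 0 else abs(value) for value in denominator]
    return [abs(c - r) / d for c, r, d in zip(computed, reference, denominator)]
```

Because the denominator is built from a copy, zeros in the reference values survive the comparison instead of being overwritten.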
The package `pytest-subtests` is used in `ndsl/stencils/testing` and
thus a required dependency of NDSL. This allows packages depending on
NDSL (e.g. PyFV3) to `pip install NDSL` without the "tests" extra (since
we aren't running NDSL tests) and still get a functioning translate test
system.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
* Initial NDSL Python files documentation via MkDocs

* Incorporated docstring documentation into currently-implemented NDSL documentation

* Add mkdocstrings to docs_requirements

* More linting

* optional dependencies moved to pyproject.toml

* Fix a couple warnings that show on the console when starting the server

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Roman Cattaneo <>
* adding new feature for 2d stencil indexing

* added test, updated method

* unused line

* lint

* updating test

* lint
* input/output netcdf override. added error when overwriting self.outputs

* pre-commit changes

* Added documentation for the netcdf overrides.
This PR updates the GT4Py submodule fixing a bug with `np.bool_` types
when used as externals (missing support for that type in the
ValueInliner). We now also expose `round()` and `round_away_from_zero()`
in gtscript.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* Updating NetCDF4 version to 1.7.2 for use by pyRTE-RRTMGP

* Checking changes for introduction of pyrte_rrtmgp in PySHiELD are no longer breaking

* Testing passes, testing pyshield develop again
romanc and others added 21 commits October 16, 2025 15:43
* WIP: mypy with strict(er) type checking

* remove a bunch of type: ignore statements

* remove redundant cast operation

* WIP moaaaar types

* moar types

* all the types

* the cherry on top

* fix path / str update after rebase

* fix up: allow None type in local_comm._get_buffer()

* fixup: don't rename arguments ;)

* be nice with pyfv3 and loading layouts from namelists

When loading the layout from an f90nml namelist, we end up with a layout of
type `list[int]`, e.g. [1, 1]. The partitioner expects a `tuple[int,
int]`, e.g. (1, 1), and since this is at the interface between pyfv3 and
ndsl, nobody complains (probably because we don't use mypy correctly in
pyfv3).

* readable legacy namelist flag

* more cleanup

* fixing newly added code

* just change "true" -> true

* remove old mypy.ini

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
…odoc generation) (#268)

* Add build docs step to PR workflows

* add docstrings doc for new / moved files

* update checklist for case of adding a new module

---------

Co-authored-by: Roman Cattaneo <>
* modern types in NDSL

* Replace `Union` with ` | `

* replace `Optional[x]` with `x | None`

---------

Co-authored-by: Roman Cattaneo <>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* store nhalo in the quantity

* reviewer's comments
* Temporaries base dataclass
Allow `units` to not be specified
+ unit test

* Lint

* Hide `_transient` flag of Quantity away

* `QuantityFactory` has now a `is_local` option

* Update wording to `Local`, trash `Temporaries` state idea

* lint

* Public API clean up

* Simplify and fix test

* Oops, restore code for `from_array` allocator

* Remove keyword allocation

* Introduce `Local` and `NDSLRuntime`

* Lint new files

* Remove the odd `_transient` and tag transientness correctly in `Local`

* Repeat of the Quantity trick to get a proper type hint

* Lint `local.py`

* Protect against bad init for orchestration
Move all unit test into a `test_ndsl_runtime`

* Lint

* Revert unneeded change to `Quantity`

* Correct type hint for Callable

* Revert orthogonal changes to this PR

* Lint

---------

Co-authored-by: Tobias Wicky-Pfund <tobias.wicky@meteoswiss.ch>
This PR updates GT4Py and DaCe submodules. It brings the last changes
from the M2 branch, namely K iterator access, back to mainline. The DaCe
update brings updates for usage of structures in DaCe.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* propagate halo-information

* pc

* proper halo default usage
* add domain-size checks on FrozenStencil

* missing updated docstring

* reviewer's comments

* add warning

* add warning

* warning is in

* capture extra dimensions right

* store nhalo in the quantity

* reviewer's comments

* go warn

* clean up merge

* fix merge

* get rid of string magic and use for loop magic

* init=False and mocking is odd

* update docstring

* cleanup
* test this versioning

* update build as dev dep

* pc

* Raise `setuptools` to >= 80

---------

Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
* Slicing of data dimensions

* Cleaner type hints

* Type checking trick for optional cupy

* Lint

* Trick for when cupy is not there

* Trick for when cupy is not there

* Flip type hint to NDarray to `numpy.typing` to circumvent `cupy` being optional
This PR deprecates usage of `ndsl.Namelist`. I also tried to deprecate
`ndsl.NamelistDefaults` but since we don't use it as a class (i.e. it's
not a dataclass and we just have it as class to group things), there's
no static constructor to (easily) intercept.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* Add `add|update_data_dimensions`
Deprecate `set_extra_dim_lengths`
+ utest

* Update usage of `set_extra_dim_lengths`

* `GridSizer` and `SubtileGridSizer` rename `extra_dim_lengths` attempt (dependencies might die)

* Verbose comment

* Missed pytest.fixture

* Better doc

* MetricTerms: default constructor take care of data dimensions

* Make `MetricTerms` reentrant
* tools: use flake8-bugbear

Python has a bunch of non-obvious language decisions, like every other
language. One of them is that default function arguments are
instantiated once (when the function is defined) and then re-used
throughout the program's lifetime. For mutable default arguments (e.g.
lists and dicts), this can lead to subtle surprises, e.g. if that default
argument is kept alive and modified within a class. To help with oddities
like this, flake8 has a plugin called "bugbear" that aims to detect these
cases and warn developers.

I've kept all the standard rules, except for rule B019, which states
that `functools.lru_cache` is evil because it interferes with garbage
collection. While true, I'd rather keep `lru_cache` for its advantages,
than outright banning it from the codebase for potential memory savings.
After all, that's the deal with a cache: more memory usage in exchange
for more speed.

* fixup: re-add default value

* SubtileGridSizer.from_tile_params(): no need for defaults

* NullComm: no need to specify defaults

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
GT4Py just dropped version 1.0.10 yesterday. The release includes
everything we need (and two unrelated gt4py.next commits on top). Since
we are on a mission to come back to mainline and be stricter about
releases, we could release NDSL with that version of GT4Py. I think it
would be nice for the baseline numbers. I wouldn't say we always have to
coordinate NDSL and GT4Py releases, but since we have this nice
coincidence, let's use it.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
…ne (#189)

* NASA Team: Milestone 2 "release" branch

This branch is what we use for the NASA team as we start to prepare for
Milestone 2.

Currently uses the following versions of externals:

- GT4Py: follows "milestone2" branch on Roman's fork
- DaCe: whatever GT4Py's uv.lock file says about dace-cartesian

* Expose erf, erfc, round, and new typecasts from ndsl.dsl.gt4py

* gt4py update: abs k and current k in debug backend

This commit updates GT4Py to add support for the experimental features
"absolute k indexing" and "expose current k-level" in the debug backend.

* gt4py update: fix literal precision

* dace|orchestration: Schedule tree roundtrip work (#206)

* Roundtrip sdfg -> stree -> sdfg in orchestration
   with moaar validation
* Remove debug prints and intermediate sdfg saving
* Use default when calling simplify
* Update gt4py/dace submodules (roundtrip work)
   This commit brings the changes needed for stree roundtrips to validate
   with the AI2 data (in the PyFV3 translate tests).
* Update README
* Quick note: skip ScalarToSymbolPromotion for now
   The pass messes up previously valid & validating SDFGs. We can live
   (performance wise) without it for the current milestone. Let's
   re-evaluate once we get back to DaCe mainline (v2).
* Update gt4py & dace submodules (stree/roundtrip)

* update gt4py to milestone2

* Added device_synchronize call to fix GPU/MPI synchronization issue on MPI inplace all_reduce calls.  Note that device_synchronize is Cupy/CUDA specific at the moment.

* Linting

* Linting again

* perf: set build type to release in dace config

* perf: set -march=native flag for cpu

* fix: stencil wrapper field origins with data_dims

Add support for fields with data_dims (or data_dims only fields) in the
stencil wrapper's function to compute field origins.

* Unrelated: no unused arguments in stencil definition

* Update gt4py to latest romanc/milestone2

* tests: Add test case for orchestrated tables

Add a non-trivial test case for orchestrating tables. This is a
mitigation for a gt4py-orchestration-issue that is easiest to reproduce
from NDSL (compared to adding a test in gt4py directly).

* [orchestration] common cast operation replacements

Cherry-picking (parts of) PR #211
into the milestone2 branch.

* FieldBundle memoization fix

* Update gt4py: fix memlets into FrozenSDFG

* update gt4py: tests memlet dimension / fix domain symbols

This gt4py update includes

- tests for the memlet dimension fix
- another fix to ensure that we always define all three cartesian
  symbols (even if we are only passing 2d fields and scalars into the
  stencil).

* cleanup: backends raise if not defined (#234)

No need to assert - `from_backend()` raises a `ValueError` if a requested backend doesn't exist.

* GT4Py update

This GT4Py update includes

- dace fixes: FrozenSDFG fixes, iterator symbols
- feature: `dace:cpu_kfirst` backend
- tests: remove unused test utils
- tests: print cache location at start (not end)
- dace fixes: merge schedule tree roundtrip work
- dace fix: memlet size of data dimensions
- dace fix: use cached SDFGs from disk
- dace perf: align loop structure and data layout
- dace: remove unused tile symbol function
- refactor: invalid backend/frontend raise ValueError

* gt4py update: no major changes in cartesian

this is just to be up to date with the `milestone2` in gt4py, which was
updated as preparation for setting up a PR for absolute k indexing as
experimental feature.

* gt4py update (abs K index fix in debug & dace)

Bring in a fix for the issue that showed when IJ or K fields were used
in combination with absolute K indexing.

* gt4py update: absolute K indexing in mainline

* absolute k indexing is now part of mainline gt4py (experimental)

* Schedule Tree Pipeline + Untested Axis Merge (#251)

* Roundtrip sdfg -> stree -> sdfg in orchestration

- with moaar validation

* Move in code the merge passes + K offset check

* Insert optimization in orchestration

* Conserve correct code-flow and stop merging when hitting non-map node as second candidate

* Debug: Save STREE post opt
Remove still assert

* Split AxisMerge, add scalar tasklet push

* Move algorithms to a 3-step method

* Move up `ndsl_log` in `__init__` stack because it's a standalone file (cut on potential circular imports)

* Working PushIfElse operator (on FvTp2D) dies on D_SW

* Fix `list.index` re-using `_list_index` written by hand
Allow scope operation to look more broadly at `next_node` mergeability

* Remove debug prints and intermediate sdfg saving

* Use default when calling simplify

* Update gt4py/dace submodules (roundtrip work)

This commit brings the changes needed for stree roundtrips to validate
with the AI2 data (in the PyFV3 translate tests).

* Update README

* New algorithms - with revert when failure to merge and more aggressive depth-first merges

* Quick note: skip ScalarToSymbolPromotion for now

The pass messes up previously valid & validating SDFGs. We can live
(performance wise) without it for the current milestone. Let's
re-evaluate once we get back to DaCe mainline (v2).

* Add default ControlFlow behavior (recurse)
Add - deactivated -  AxisIterator name sanitizer
Fix single axis merge test

* Add helper to detect if log is in Debug
Add Release  & march=native into dace compiler flags
Unused orchestration pass
Unused stree pass

* Fix the GT4Py dependency

* Bad merge fix

* Bad merge fix

* Internal in code flag for stree optimizer

* Move helpful "Make Sequential" SDFG transformation

* Lint

* Remove original Roman code that has been harvested for good

* Unit test for roundtrip, proper pipeline setup

* Lint

* Fix to default Pipeline

* Move helper function for SDFG, delete unused code

* Move out tree common operations

* Add mock test for optimization

* Lint

* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>

* [Clean up] Schedule Tree optimizer (WIP) (#255)

* Use `ndsl_log`

* Remove missed `breakpoint` and turn dead code into coding comment

* gt4py update: push forscope down, shiny error messages

* update dace (& gt4py): fixes from v1/maintenance

* fixup: add missing type after merge

* update gt4py: K iteration index

* De-dragon the README

* Rename `dst` to `stree` for moniker of `dace.sdfg.analysis.schedule_tree.treenodes`
Better docs

* Flip `Protocol` base class to the broader and cleaner ABC

* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Roman Cattaneo <>
Co-authored-by: Christopher W. Kung <ckung@gh004.atusrvm.adapt.nccs.nasa.gov>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
* Pass down NDSL types for 2D temporaries parsing

* Update GT4Py to 4caf03c (2D tmps + `cuda` backend removed)

* Simplify test and fix kwargs test check

* Remove reference to "cuda" backend

* Update gt4py to `6afc22` - remove ADR hard check
* Updated NDSL notebooks with clarifications

* fix typos

* cleanup

* whitespace cleanup and typos

* Update headers to be pure markdown

* update error message

* update error message

---------

Co-authored-by: Christopher Kung <christopher.w.kung@nasa.gov>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* ci: cache translate test data

* use create tests from pyshield repo

* fix workflow file name

* specialize concurrency groups per repo

* test

* test also with pyfv3

* change back to NOAA-GFDL org

* remove unused env vars

* consistent comments

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
This PR updates GT4Py to bring in the fix for parsing temporary
annotations.

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
* Add 2D temporary support  for NDSL's `BoolFieldIJ`

* Fix test
@romanc romanc marked this pull request as ready for review October 28, 2025 15:14
@romanc romanc requested a review from fmalatino October 28, 2025 15:17

@FlorianDeconinck (Collaborator) left a comment

🚀

@romanc romanc merged commit 66ca013 into main Oct 28, 2025
16 checks passed
@fmalatino (Contributor)

🥳

@romanc (Collaborator, Author) commented Oct 28, 2025

PR to update submodules in pace: NOAA-GFDL/pace#156


8 participants