[Clean up] Schedule Tree optimizer (WIP)#255

Merged: FlorianDeconinck merged 2 commits into NOAA-GFDL:nasa/milestone2 from FlorianDeconinck:cleanup/stree_opt on Oct 10, 2025

@FlorianDeconinck
Collaborator

Small clean up left from #251, sorry @twicki

@FlorianDeconinck merged commit 00f8326 into NOAA-GFDL:nasa/milestone2 on Oct 10, 2025
5 checks passed
@FlorianDeconinck deleted the cleanup/stree_opt branch on October 10, 2025 16:10
github-merge-queue Bot pushed a commit that referenced this pull request Oct 24, 2025
…ne (#189)

* NASA Team: Milestone 2 "release" branch

This branch is what we use for the NASA team as we start to prepare for
Milestone 2.

Currently uses the following versions of externals:

- GT4Py: follows "milestone2" branch on Roman's fork
- DaCe: whatever GT4Py's uv.lock file says about dace-cartesian

* Expose erf, erfc, round, and new typecasts from ndsl.dsl.gt4py

* gt4py update: abs k and current k in debug backend

This commit updates GT4Py to add support for the experimental features
"absolute k indexing" and "expose current k-level" in the debug backend.

* gt4py update: fix literal precision

* dace|orchestration: Schedule tree roundtrip work (#206)

* Roundtrip sdfg -> stree -> sdfg in orchestration
   with moaar validation
* Remove debug prints and intermediate sdfg saving
* Use default when calling simplify
* Update gt4py/dace submodules (roundtrip work)
   This commit brings the changes needed for stree roundtrips to validate
   with the AI2 data (in the PyFV3 translate tests).
* Update README
* Quick note: skip ScalarToSymbolPromotion for now
   The pass messes up previously valid & validating SDFGs. We can live
   (performance wise) without it for the current milestone. Let's
   re-evaluate once we get back to DaCe mainline (v2).
* Update gt4py & dace submodules (stree/roundtrip)

* update gt4py to milestone2

* Add a device_synchronize call to fix a GPU/MPI synchronization issue on MPI in-place all_reduce calls. Note that device_synchronize is CuPy/CUDA specific at the moment.

* Linting

* Linting again

* perf: set build type to release in dace config

* perf: set -march=native flag for cpu

* fix: stencil wrapper field origins with data_dims

Add support for fields with data_dims (or data_dims-only fields) in the
stencil wrapper's function to compute field origins.

* Unrelated: no unused arguments in stencil definition

* Update gt4py to latest romanc/milestone2

* tests: Add test case for orchestrated tables

Add a non-trivial test case for orchestrating tables. This is a
mitigation for a gt4py-orchestration-issue that is most easily reproduced
from NDSL (compared to adding a test in gt4py directly).

* [orchestration] common cast operation replacements

Cherry-picking (parts of) PR #211
into the milestone2 branch.

* FieldBundle memoization fix

* Update gt4py: fix memlets into FrozenSDFG

* update gt4py: tests memlet dimension / fix domain symbols

This gt4py update includes

- tests for the memlet dimension fix
- another fix to ensure that we always define all three cartesian
  symbols (even if we are only passing 2d fields and scalars into the
  stencil).

* cleanup: backends raise if not defined (#234)

No need to assert - `from_backend()` raises a `ValueError` if a requested backend doesn't exist.
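
A minimal sketch of the pattern described above, assuming a lookup function that raises `ValueError` for unknown backends. The registry `_BACKENDS`, this stand-in `from_backend`, and `make_stencil_config` are hypothetical names for illustration only and are not the actual GT4Py or NDSL code.

```python
# Hypothetical stand-in for a backend registry; not the actual GT4Py API.
_BACKENDS = {"numpy": object(), "debug": object()}


def from_backend(name: str):
    """Return the registered backend or raise ValueError (the assumed behavior)."""
    try:
        return _BACKENDS[name]
    except KeyError:
        raise ValueError(f"Backend '{name}' is not registered.") from None


def make_stencil_config(backend: str) -> dict:
    # No `assert backend in _BACKENDS` needed here: the lookup itself raises.
    return {"backend": backend, "impl": from_backend(backend)}
```

Relying on the raised exception also keeps the check active when Python runs with `-O`, which strips `assert` statements.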

* GT4Py update

This GT4Py update includes

- dace fixes: FrozenSDFG fixes, iterator symbols
- feature: `dace:cpu_kfirst` backend
- tests: remove unused test utils
- tests: print cache location at start (not end)
- dace fixes: merge schedule tree roundtrip work
- dace fix: memlet size of data dimensions
- dace fix: use cached SDFGs from disk
- dace perf: align loop structure and data layout
- dace: remove unused tile symbol function
- refactor: invalid backend/frontend raise ValueError

* gt4py update: no major changes in cartesian

This is just to stay up to date with the `milestone2` branch in gt4py, which was
updated in preparation for setting up a PR for absolute k indexing as an
experimental feature.

* gt4py update (abs K index fix in debug & dace)

Bring in a fix for the issue that showed when IJ or K fields were used
in combination with absolute K indexing.

* gt4py update: absolute K indexing in mainline

* absolute k indexing is now part of mainline gt4py (experimental)

* Schedule Tree Pipeline + Untested Axis Merge (#251)

* Roundtrip sdfg -> stree -> sdfg in orchestration

- with moaar validation

* Move in code the merge passes + K offset check

* Insert optimization in orchestration

* Conserve correct code-flow and stop merging when hitting non-map node as second candidate

* Debug: Save STREE post opt
Remove still assert

* Split AxisMerge, add scalar tasklet push

* Move algorithms to a 3-step method

* Move up `ndsl_log` in `__init__` stack because it's a standalone file (cut on potential circular imports)

* Working PushIfElse operator (on FvTp2D) dies on D_SW

* Fix `list.index` re-using `_list_index` written by hand
Allow scope operation to look more broadly at `next_node` mergeability

* Remove debug prints and intermediate sdfg saving

* Use default when calling simplify

* Update gt4py/dace submodules (roundtrip work)

This commit brings the changes needed for stree roundtrips to validate
with the AI2 data (in the PyFV3 translate tests).

* Update README

* New algorithms - with revert on failure to merge and more aggressive depth-first merges

* Quick note: skip ScalarToSymbolPromotion for now

The pass messes up previously valid & validating SDFGs. We can live
(performance wise) without it for the current milestone. Let's
re-evaluate once we get back to DaCe mainline (v2).

* Add default ControlFlow behavior (recurse)
Add - deactivated - AxisIterator name sanitizer
Fix single axis merge test

* Add helper to detect if log is in Debug
Add Release  & march=native into dace compiler flags
Unused orchestration pass
Unused stree pass

* Fix the GT4Py dependency

* Bad merge fix

* Bad merge fix

* Internal in code flag for stree optimizer

* Move helpful "Make Sequential" SDFG transformation

* Lint

* Remove original Roman code that has been harvested for good

* Unit test for roundtrip, proper pipeline setup

* Lint

* Fix to default Pipeline

* Move helper function for SDFG, delete unused code

* Move out tree common operations

* Add mock test for optimization

* Lint

* Lint
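
One of the commits above mentions conserving code flow by no longer merging once a non-map node is the second candidate. The toy sketch below shows one possible reading of that rule on a flat list of nodes. `MapNode`, `TaskletNode`, and `merge_adjacent_maps` are hypothetical stand-ins, not the DaCe schedule tree classes or the NDSL passes.

```python
from dataclasses import dataclass, field


@dataclass
class MapNode:
    """Hypothetical stand-in for a map scope in a schedule tree."""
    axis_range: tuple
    body: list = field(default_factory=list)


@dataclass
class TaskletNode:
    """Hypothetical stand-in for any non-map node."""
    name: str


def merge_adjacent_maps(nodes: list) -> list:
    """Fuse a map into the directly preceding map when both share a range.

    A non-map node between two maps blocks the merge, so the original
    execution order is preserved. Input nodes are not modified.
    """
    merged = []
    for node in nodes:
        previous = merged[-1] if merged else None
        if (
            isinstance(node, MapNode)
            and isinstance(previous, MapNode)
            and previous.axis_range == node.axis_range
        ):
            previous.body.extend(node.body)
        elif isinstance(node, MapNode):
            # Copy so the caller's node keeps its original body.
            merged.append(MapNode(node.axis_range, list(node.body)))
        else:
            merged.append(node)
    return merged
```

In this sketch, two maps separated by a tasklet stay separate, because moving the second map's body ahead of the tasklet could reorder side effects.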

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>

* [Clean up] Schedule Tree optimizer (WIP) (#255)

* Use `ndsl_log`

* Remove missed `breakpoint` and turn dead code into coding comment

* gt4py update: push forscope down, shiny error messages

* update dace (& gt4py): fixes from v1/maintenance

* fixup: add missing type after merge

* update gt4py: K iteration index

* De-dragon the README

* Rename `dst` to `stree` for moniker of `dace.sdfg.analysis.schedule_tree.treenodes`
Better docs

* Flip `Protocol` base class to the broader and cleaner ABC

* Lint
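
As an illustration of the `Protocol` to ABC switch listed above, here is a minimal sketch of what an ABC base class provides. The names `TreePass`, `DropEmpty`, and `Incomplete` are hypothetical and do not come from the NDSL code base.

```python
from abc import ABC, abstractmethod


class TreePass(ABC):
    """Hypothetical base class for a pass over a list of tree nodes."""

    @abstractmethod
    def apply(self, tree: list) -> list:
        """Return a transformed copy of the tree."""


class DropEmpty(TreePass):
    def apply(self, tree: list) -> list:
        # Keep only truthy nodes; stands in for a real optimization.
        return [node for node in tree if node]


class Incomplete(TreePass):
    """Forgets to implement `apply`, so it cannot be instantiated."""
```

With an ABC, a subclass that misses an abstract method fails at instantiation with `TypeError`, whereas a `typing.Protocol` is primarily checked structurally by static type checkers.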

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Roman Cattaneo <>
Co-authored-by: Christopher W. Kung <ckung@gh004.atusrvm.adapt.nccs.nasa.gov>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>