[Clean up] Schedule Tree optimizer (WIP) #255
Merged
FlorianDeconinck merged 2 commits on Oct 10, 2025
github-merge-queue (bot) pushed a commit that referenced this pull request on Oct 24, 2025
…ne (#189)

* NASA Team: Milestone 2 "release" branch
  This branch is what we use for the NASA team as we start to prepare for Milestone 2. Currently uses the following versions of externals:
  - GT4Py: follows "milestone2" branch on Roman's fork
  - DaCe: whatever GT4Py's uv.lock file says about dace-cartesian
* Expose erf, erfc, round, and new typecasts from ndsl.dsl.gt4py
* gt4py update: abs k and current k in debug backend
  This commit updates GT4Py to add support for the experimental features "absolute k indexing" and "expose current k-level" in the debug backend.
* gt4py update: fix literal precision
* dace|orchestration: Schedule tree roundtrip work (#206)
* Roundtrip sdfg -> stree -> sdfg in orchestration with moaar validation
* Remove debug prints and intermediate sdfg saving
* Use default when calling simplify
* Update gt4py/dace submodules (roundtrip work)
  This commit brings the changes needed for stree roundtrips to validate with the AI2 data (in the PyFV3 translate tests).
* Update README
* Quick note: skip ScalarToSymbolPromotion for now
  The pass messes up previously valid & validating SDFGs. We can live (performance wise) without it for the current milestone. Let's re-evaluate once we get back to DaCe mainline (v2).
* Update gt4py & dace submodules (stree/roundtrip)
* update gt4py to milestone2
* Added device_synchronize call to fix GPU/MPI synchronization issue on MPI inplace all_reduce calls. Note that device_synchronize is Cupy/CUDA specific at the moment.
* Linting
* Linting again
* perf: set build type to release in dace config
* perf: set -march=native flag for cpu
* fix: stencil wrapper field origins with data_dims
  Add support for fields with data_dims (or data_dims only fields) in the stencil wrapper's function to compute field origins.
* Unrelated: no unused arguments in stencil definition
* Update gt4py to latest romanc/milestone2
* tests: Add test case for orchestrated tables
  Add a non-trivial test case for orchestrating tables. This is a mitigation for a gt4py-orchestration-issue that is easiest reproduced from NDSL (compared to adding a test in gt4py directly).
* [orchestration] common cast operation replacements
  Cherry-picking (parts of) PR #211 into the milestone2 branch.
* FieldBundle memoization fix
* Update gt4py: fix memlets into FrozenSDFG
* update gt4py: tests memlet dimension / fix domain symbols
  This gt4py update includes
  - tests for the memlet dimension fix
  - another fix to ensure that we always define all three cartesian symbols (even if we are only passing 2d fields and scalars into the stencil).
* cleanup: backends raise if not defined (#234)
  No need to assert - `from_backend()` raises a `ValueError` if a requested backend doesn't exist.
* GT4Py update
  This GT4Py update includes
  - dace fixes: FrozenSDFG fixes, iterator symbols
  - feature: `dace:cpu_kfirst` backend
  - tests: remove unused test utils
  - tests: print cache location at start (not end)
  - dace fixes: merge schedule tree roundtrip work
  - dace fix: memlet size of data dimensions
  - dace fix: use cached SDFGs from disk
  - dace perf: align loop structure and data layout
  - dace: remove unused tile symbol function
  - refactor: invalid backend/frontend raise ValueError
* gt4py update: no major changes in cartesian
  this is just to be up to date with the `milestone2` in gt4py, which was updated as preparation for setting up a PR for absolute k indexing as experimental feature.
* gt4py update (abs K index fix in debug & dace)
  Bring in a fix for the issue that showed when IJ or K fields were used in combination with absolute K indexing.
* gt4py update: absolute K indexing in mainline
* absolute k indexing is now part of mainline gt4py (experimental)
* Schedule Tree Pipeline + Untested Axis Merge (#251)
* Roundtrip sdfg -> stree -> sdfg in orchestration - with moaar validation
* Move in code the merge passes + K offset check
* Insert optimization in orchestration
* Conserve correct code-flow and stop merging when hitting non-map node as second candidate
* Debug: Save STREE post opt
  Remove still assert
* Split AxisMerge, add scalar tasklet push
* Move algorithms to a 3-step method
* Move up `ndsl_log` in `__init__` stack because it's a standalone file (cut on potential circular imports)
* Working PushIfElse operator (on FvTp2D) dies on D_SW
* Fix `list.index` re-using `_list_index` written by hand
  Allow scope operation to look more broadly at `next_node` mergeability
* Remove debug prints and intermediate sdfg saving
* Use default when calling simplify
* Update gt4py/dace submodules (roundtrip work)
  This commit brings the changes needed for stree roundtrips to validate with the AI2 data (in the PyFV3 translate tests).
* Update README
* New algorithms - with revert when failure to merge and more aggressive depth-first merges
* Quick note: skip ScalarToSymbolPromotion for now
  The pass messes up previously valid & validating SDFGs. We can live (performance wise) without it for the current milestone. Let's re-evaluate once we get back to DaCe mainline (v2).
* Add default ControlFlow behavior (recurse)
  Add - deactivated - AxisIterator name sanitizer
  Fix single axis merge test
* Add helper to detect if log is in Debug
  Add Release & march=native into dace compiler flags
  Unused orchestration pass
  Unused stree pass
* Fix the GT4Py dependency
* Bad merge fix
* Bad merge fix
* Internal in code flag for stree optimizer
* Move helpful "Make Sequential" SDFG transformation
* Lint
* Remove original Roman code that has been harvested for good
* Unit test for roundtrip, proper pipeline setup
* Lint
* Fix to default Pipeline
* Move helper function for SDFG, delete unused code
* Move out tree common operations
* Add mock test for optimization
* Lint
* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>

* [Clean up] Schedule Tree optimizer (WIP) (#255)
* Use `ndsl_log`
* Remove missed `breakpoint` and turn dead code into coding comment
* gt4py update: push forscope down, shiny error messages
* update dace (& gt4py): fixes from v1/maintenance
* fixup: add missing type after merge
* update gt4py: K iteration index
* De-dragon the README
* Rename `dst` to `stree` for moniker of `dace.sdfg.analysis.schedule_tree.treenodes`
  Better docs
* Flip `Protocol` base class to the broader and cleaner ABC
* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Roman Cattaneo <>
Co-authored-by: Christopher W. Kung <ckung@gh004.atusrvm.adapt.nccs.nasa.gov>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
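For readers unfamiliar with the `Protocol` to ABC flip mentioned in the #255 commit list, here is a minimal sketch of what such a change generally looks like in Python. The class names `TreePass` and `DropEmpty` are hypothetical and do not come from the NDSL code base; the sketch relies only on the standard-library `abc` module and is not the actual implementation from this PR.

```python
from abc import ABC, abstractmethod


class TreePass(ABC):  # hypothetical name, for illustration only
    """Base class for a pass that rewrites a list of nodes in place."""

    @abstractmethod
    def apply(self, tree: list) -> None:
        """Mutate `tree`; concrete passes must override this method."""

    def __call__(self, tree: list) -> list:
        # Shared behavior lives on the base class. A class that merely
        # matches a typing.Protocol structurally would not inherit it.
        self.apply(tree)
        return tree


class DropEmpty(TreePass):  # hypothetical concrete pass
    def apply(self, tree: list) -> None:
        # Keep only truthy nodes, modifying the list in place.
        tree[:] = [node for node in tree if node]


cleaned = DropEmpty()([1, 0, 2, None])
```

With an ABC, instantiating `TreePass` directly raises a `TypeError` because `apply` is abstract, so a missing override is caught at construction time rather than at the first call.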
Small cleanup left over from #251, sorry @twicki