
[Orchestration] Common cast operation replacements #211

Merged
FlorianDeconinck merged 5 commits into
NOAA-GFDL:develop from
FlorianDeconinck:feature/orchestration_dace_replacement_ops
Sep 5, 2025

Conversation

@FlorianDeconinck
Collaborator

Replace Float and Int with a DaCe cast node in orchestration to work around the lack of TypeAlias inference in the DaCe parser.

This allows for easier orchestration going forward by automatically parsing the Float / Int keywords both in stencils and in glue code.
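
As a rough illustration of the idea only (this is not the code in `ndsl/dsl/dace/replacements.py`, and it does not use the DaCe frontend), the sketch below uses Python's standard `ast` module to rewrite calls to an alias that a parser cannot resolve into calls to a concrete type name. `ALIAS_CASTS`, `CastAliasRewriter`, and `rewrite_casts` are hypothetical names, and the `float64` / `int64` targets are assumed for the example.

```python
import ast

# Hypothetical mapping from alias names to concrete type names.
# float64 / int64 are assumed here purely for illustration.
ALIAS_CASTS = {"Float": "float64", "Int": "int64"}


class CastAliasRewriter(ast.NodeTransformer):
    """Rewrite `Float(x)` / `Int(x)` calls into calls to a concrete type."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Visit children first so nested casts are rewritten as well.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id in ALIAS_CASTS:
            node.func = ast.Name(id=ALIAS_CASTS[node.func.id], ctx=ast.Load())
        return node


def rewrite_casts(source: str) -> str:
    """Return `source` with alias casts replaced by explicit casts."""
    tree = CastAliasRewriter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


print(rewrite_casts("y = Float(x) + Int(Float(z))"))
```

Calls to any other name are left untouched, so the rewrite only affects the aliases listed in the mapping.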

Collaborator

@romanc romanc left a comment

Looks good to me. Any way we could programmatically test this?

Comment thread ndsl/dsl/dace/replacements.py Outdated
romanc pushed a commit that referenced this pull request Sep 4, 2025
Cherry-picking (parts of) PR #211
into the milestone2 branch.
@FlorianDeconinck
Collaborator Author

FlorianDeconinck commented Sep 4, 2025

> Looks good to me. Any way we could programmatically test this?

Yes and no. I can build an SDFG from a small bit of code that uses Float and check that the cast op has been dropped in properly. But the fix is for cases where that isn't done automatically, which it sometimes is! So not easy...

@FlorianDeconinck
Collaborator Author

Will move this code into M2

@romanc
Collaborator

romanc commented Sep 5, 2025

Let's re-open this and (also) move it into develop, since we are trying to merge as much as we can back from the M2 branch anyway.

> > Any way we could programmatically test this?

> Yes and no. I can build an SDFG from a small bit of code that uses Float and check that the cast op has been dropped in properly. But the fix is for cases where that isn't done automatically, which it sometimes is! So not easy...

Alright - we want to have that for orchestration anyway; with or without tests. This will be implicitly tested by upcoming orchestration work and - who knows - maybe we even get to ramp up the CI in the coming months in which case we'd have another safety net.

Let's just fix the typos and then get this into develop.

@romanc romanc reopened this Sep 5, 2025
@FlorianDeconinck FlorianDeconinck added this pull request to the merge queue Sep 5, 2025
Merged via the queue into NOAA-GFDL:develop with commit c6788d9 Sep 5, 2025
5 checks passed
github-merge-queue Bot pushed a commit that referenced this pull request Oct 24, 2025
…ne (#189)

* NASA Team: Milestone 2 "release" branch

This branch is what we use for the NASA team as we start to prepare for
Milestone 2.

Currently uses the following versions of externals:

- GT4Py: follows "milestone2" branch on Roman's fork
- DaCe: whatever GT4Py's uv.lock file says about dace-cartesian

* Expose erf, erfc, round, and new typecasts from ndsl.dsl.gt4py

* gt4py update: abs k and current k in debug backend

This commit updates GT4Py to add support for the experimental features
"absolute k indexing" and "expose current k-level" in the debug backend.

* gt4py update: fix literal precision

* dace|orchestration: Schedule tree roundtrip work (#206)

* Roundtrip sdfg -> stree -> sdfg in orchestration
   with moaar validation
* Remove debug prints and intermediate sdfg saving
* Use default when calling simplify
* Update gt4py/dace submodules (roundtrip work)
   This commit brings the changes needed for stree roundtrips to validate
   with the AI2 data (in the PyFV3 translate tests).
* Update README
* Quick note: skip ScalarToSymbolPromotion for now
   The pass messes up previously valid & validating SDFGs. We can live
   (performance wise) without it for the current milestone. Let's
   re-evaluate once we get back to DaCe mainline (v2).
* Update gt4py & dace submodules (stree/roundtrip)

* update gt4py to milestone2

* Added device_synchronize call to fix GPU/MPI synchronization issue on MPI in-place all_reduce calls. Note that device_synchronize is CuPy/CUDA specific at the moment.

* Linting

* Linting again

* perf: set build type to release in dace config

* perf: set -march=native flag for cpu

* fix: stencil wrapper field origins with data_dims

Add support for fields with data_dims (or data_dims only fields) in the
stencil wrapper's function to compute field origins.

* Unrelated: no unused arguments in stencil definition

* Update gt4py to latest romanc/milestone2

* tests: Add test case for orchestrated tables

Add a non-trivial test case for orchestrating tables. This is a
mitigation for a gt4py-orchestration-issue that is most easily reproduced
from NDSL (compared to adding a test in gt4py directly).

* [orchestration] common cast operation replacements

Cherry-picking (parts of) PR #211
into the milestone2 branch.

* FieldBundle memoization fix

* Update gt4py: fix memlets into FrozenSDFG

* update gt4py: tests memlet dimension / fix domain symbols

This gt4py update includes

- tests for the memlet dimension fix
- another fix to ensure that we always define all three cartesian
  symbols (even if we are only passing 2d fields and scalars into the
  stencil).

* cleanup: backends raise if not defined (#234)

No need to assert - `from_backend()` raises a `ValueError` if a requested backend doesn't exist.

* GT4Py update

This GT4Py update includes

- dace fixes: FrozenSDFG fixes, iterator symbols
- feature: `dace:cpu_kfirst` backend
- tests: remove unused test utils
- tests: print cache location at start (not end)
- dace fixes: merge schedule tree roundtrip work
- dace fix: memlet size of data dimensions
- dace fix: use cached SDFGs from disk
- dace perf: align loop structure and data layout
- dace: remove unused tile symbol function
- refactor: invalid backend/frontend raise ValueError

* gt4py update: no major changes in cartesian

this is just to be up to date with the `milestone2` in gt4py, which was
updated as preparation for setting up a PR for absolute k indexing as
experimental feature.

* gt4py update (abs K index fix in debug & dace)

Bring in a fix for the issue that showed when IJ or K fields were used
in combination with absolute K indexing.

* gt4py update: absolute K indexing in mainline

* absolute k indexing is now part of mainline gt4py (experimental)

* Schedule Tree Pipeline + Untested Axis Merge (#251)

* Roundtrip sdfg -> stree -> sdfg in orchestration

- with moaar validation

* Move in code the merge passes + K offset check

* Insert optimization in orchestration

* Conserve correct code-flow and stop merging when hitting non-map node as second candidate

* Debug: Save STREE post opt
Remove still assert

* Split AxisMerge, add scalar tasklet push

* Move algorithms to a 3-step method

* Move up `ndsl_log` in `__init__` stack because it's a standalone file (cut on potential circular imports)

* Working PushIfElse operator (on FvTp2D) dies on D_SW

* Fix `list.index` re-using `_list_index` written by hand
Allow scope operation to look more broadly at `next_node` mergeability

* Remove debug prints and intermediate sdfg saving

* Use default when calling simplify

* Update gt4py/dace submodules (roundtrip work)

This commit brings the changes needed for stree roundtrips to validate
with the AI2 data (in the PyFV3 translate tests).

* Update README

* New algorithms - with revert when failure to merge and more aggressive depth-first merges

* Quick note: skip ScalarToSymbolPromotion for now

The pass messes up previously valid & validating SDFGs. We can live
(performance wise) without it for the current milestone. Let's
re-evaluate once we get back to DaCe mainline (v2).

* Add default ControlFlow behavior (recurse)
Add - deactivated - AxisIterator name sanitizer
Fix single axis merge test

* Add helper to detect if log is in Debug
Add Release  & march=native into dace compiler flags
Unused orchestration pass
Unused stree pass

* Fix the GT4Py dependency

* Bad merge fix

* Bad merge fix

* Internal in code flag for stree optimizer

* Move helpful "Make Sequential" SDFG transformation

* Lint

* Remove original Roman code that has been harvested for good

* Unit test for roundtrip, proper pipeline setup

* Lint

* Fix to default Pipeline

* Move helper function for SDFG, delete unused code

* Move out tree common operations

* Add mock test for optimization

* Lint

* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>

* [Clean up] Schedule Tree optimizer (WIP) (#255)

* Use `ndsl_log`

* Remove missed `breakpoint` and turn dead code into coding comment

* gt4py update: push forscope down, shiny error messages

* update dace (& gt4py): fixes from v1/maintenance

* fixup: add missing type after merge

* update gt4py: K iteration index

* De-dragon the README

* Rename `dst` to `stree` for moniker of `dace.sdfg.analysis.schedule_tree.treenodes`
Better docs

* Flip `Protocol` base class to the broader and cleaner ABC

* Lint

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Co-authored-by: Roman Cattaneo <>
Co-authored-by: Christopher W. Kung <ckung@gh004.atusrvm.adapt.nccs.nasa.gov>
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>