2023.07.13 Meeting Notes
- Individual/group updates
- Load balancing strategies
- IO design for non-cell-centered fields (deferred)
- Cyl/Sph coordinates in Parthenon (deferred)
- Review non-WIP PRs
LR
- AMR for non-cell-centered fields is ready for review (no IO yet) with 3 PRs
- Morton indexing (see the sketch at the end of this section)
- Ownership model
- prolong/restrict in one with new generalized operators
- Showed movies of OT and MHD rotor with AMR!!! (using potential-based formulation)
- Question for downstream codes: what to do about EMF correction when doing Athena++-style CT
- machinery in place, but not used/tested yet (ownership model applies, similar to cell-centered flux correction)
- might need to separate out flux correction step from boundary comm
- other items to be discussed: what should be communicated/corrected (only fine/coarse, but not same-same?)
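
For context, here is a minimal sketch of Morton (Z-order) indexing using the standard bit-interleaving construction; it is illustrative only, with made-up helper names, and not Parthenon's actual implementation:

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical helper (not from Parthenon): spread the low 21 bits of x so
// there are two zero bits between consecutive original bits (3 * 21 = 63 bits).
std::uint64_t Spread3(std::uint64_t x) {
  x &= 0x1FFFFF;
  x = (x | (x << 32)) & 0x1F00000000FFFF;
  x = (x | (x << 16)) & 0x1F0000FF0000FF;
  x = (x | (x << 8))  & 0x100F00F00F00F00F;
  x = (x | (x << 4))  & 0x10C30C30C30C30C3;
  x = (x | (x << 2))  & 0x1249249249249249;
  return x;
}

// Morton key: interleave the bits of the logical block location (i, j, k),
// so blocks that are close in 3D space tend to be close in the 1D ordering.
std::uint64_t MortonKey(std::uint64_t i, std::uint64_t j, std::uint64_t k) {
  return Spread3(i) | (Spread3(j) << 1) | (Spread3(k) << 2);
}

int main() {
  // Neighboring blocks map to nearby keys: prints 0 1 3 7.
  std::cout << MortonKey(0, 0, 0) << " " << MortonKey(1, 0, 0) << " "
            << MortonKey(1, 1, 0) << " " << MortonKey(1, 1, 1) << "\n";
}
```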
BP
- chased down a bug (in KHARMA) when creating new containers (should data be copied or not?)
- created PR for PEP1
JM
- Riot is now on `parthenon/develop`
- Riot now entirely based on sparse packs
- some quality of life improvements should be available upstream shortly
- will also push custom load balancing upstream
- Q: what about BiCGStab?
- may or may not be updated (currently lives on separate branch)
- LR more interested in pushing for Multigrid rather than BiCGStab
- BP has backport, will open PR
PM
- also worked on Riot <-> develop
- bug reported last time (fine/coarse round-off error when run on different ranks)
- problem went away after switching to the new comm task
- not sure why it went away (maybe because of local/nonlocal versus any comm, but that doesn't explain a round-off-level error)
- will create PR to add CI machinery to cover multiple ranks and pack sizes
FG
- fixing INCITE runs on Frontier
- working with co-design summer school students (found issue with reflective boundary conditions in phydro)
- looking at sph/cyl coordinates, early PR expected next week
- got AthenaPK compiled on Chicoma (also related to https://github.com/lanl/phoebus/issues/70)
BW
- debugging various Ascent issues
- slice perpendicular to the y-axis when running on GPUs
- ghost zones
- has a couple of open WIP PRs
PG
- still tracking down IO performance issues on Frontier
- discovered that our chunking strategy is not optimal
- working on a best-practice solution and looking for external input (from people with more expert knowledge); see the chunking sketch at the end of this section
- Question on extra variables for restart (rst) outputs. No objections (though the parameter name should probably differ from the one used for normal outputs)
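
For reference, here is a minimal standalone sketch of controlling the chunk layout via HDF5's dataset creation property list; the file name, dataset name, and sizes are made-up placeholders, not Parthenon's actual output settings:

```cpp
#include <hdf5.h>
#include <vector>

int main() {
  // Made-up extents: a single 64^3 double dataset.
  const hsize_t dims[3] = {64, 64, 64};
  // The chunk shape is the tuning knob: too small inflates metadata and
  // per-chunk overhead, too large hurts partial access, and on parallel
  // filesystems (e.g. Lustre on Frontier) alignment with stripes matters.
  const hsize_t chunk[3] = {16, 64, 64};

  hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
  hid_t space = H5Screate_simple(3, dims, nullptr);

  // Chunking is requested through the dataset creation property list.
  hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
  H5Pset_chunk(dcpl, 3, chunk);

  hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, space, H5P_DEFAULT,
                          dcpl, H5P_DEFAULT);

  std::vector<double> buf(64 * 64 * 64, 1.0);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf.data());

  H5Dclose(dset);
  H5Pclose(dcpl);
  H5Sclose(space);
  H5Fclose(file);
  return 0;
}
```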
AJ
- Results from load balancing work over past months:
- Can now assign arbitrary blocks to arbitrary ranks
- Test setup (30k timesteps, 16^3 blocks, 512 ranks, spherical blast with Phoebus, so work per block varies)
- Implemented different load balancing and comm locality policies
- contiguous (good locality, poor load balance, given varying work per block)
- longest processing time (good load balance, poor locality; see the sketch at the end of these notes)
- contiguous-improved (dynamic programming to evaluate balance)
- contiguous-improved-iterative (iteratively improve on the previous solution)
- Currently, load imbalance per block is ~30-40%. Gains should be much higher with larger imbalance.
- Will look at comparing to Riot LB (see above)
- Other interesting outcomes
- (de)refinements oscillate (and are quite costly), so reducing the number of derefinements improves performance
- For a given setup, even the per-block compute load evolves with time, which is not naturally captured by standard LB. Enforcing LB helps reduce runtime.
- next
- additional input decks
- GPU vs CPU
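
As referenced above, here is a sketch of the "longest processing time" policy, assuming the textbook greedy LPT heuristic (illustrative only, not the actual Parthenon/Phoebus code):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Greedy LPT: visit blocks in order of decreasing cost and assign each to the
// currently least-loaded rank. This balances load well but scatters spatially
// adjacent blocks across ranks, hence the poor communication locality.
std::vector<int> LongestProcessingTime(const std::vector<double> &block_cost,
                                       int nranks) {
  std::vector<std::size_t> order(block_cost.size());
  for (std::size_t b = 0; b < order.size(); ++b) order[b] = b;
  std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return block_cost[a] > block_cost[b];
  });

  // Min-heap of (accumulated cost, rank) pairs.
  using Load = std::pair<double, int>;
  std::priority_queue<Load, std::vector<Load>, std::greater<Load>> heap;
  for (int r = 0; r < nranks; ++r) heap.push({0.0, r});

  std::vector<int> rank_of_block(block_cost.size());
  for (std::size_t b : order) {
    auto [load, rank] = heap.top();
    heap.pop();
    rank_of_block[b] = rank;
    heap.push({load + block_cost[b], rank});
  }
  return rank_of_block;
}

int main() {
  // Made-up per-block costs (work varies per block, as in the blast setup).
  const std::vector<double> cost = {5, 1, 1, 1, 4, 3, 2, 2};
  for (int r : LongestProcessingTime(cost, /*nranks=*/3)) std::cout << r << " ";
  std::cout << "\n";
}
```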