
2023.10.05 Meeting Notes

Philipp Grete edited this page Oct 19, 2023 · 3 revisions

Agenda

  • Individual/group updates
  • IO at scale
  • Buffer packing
  • Next developer/user(?) meeting
  • review non-WIP PRs

Updates

LR

  • finalized MG PR including adaptive meshes with precond.
  • works on a Poisson problem with a spatially varying diffusion coefficient
  • ready to review now
  • in the process of adding a cleaned-up, user-friendly interface for downstream codes

RC

  • added support for nodal fields
  • most changes went into xdmf
  • should be fairly straightforward to extend to face- and edge-centered fields (BP volunteered to look into this)

BP

  • made minor changes to PEP1
  • found more issues (hang) wrt MPI_Finalize on Mesh destruction (related to reduction objects)
    • PR is open as WIP, as it's not clear whether this is a broader issue (it only showed up on a limited number of CI runs)

PM

  • Found bug wrt missing comm object with ForceRemeshComm. Bugfix already merged.
  • Q: Discovered intermittent bug with -d on multi-rank runs on a cluster. Did anyone else encounter this?
    • Yes, BP did. There might be a race condition with create_directory
    • PM will check and potentially fix
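
A minimal sketch of one way to make directory creation tolerant of such a race, assuming the problem is that several ranks try to create the same directory at the same time. `EnsureDirectory` is a hypothetical helper for illustration, not an existing Parthenon function, and the actual fix may look different.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of Parthenon): make sure `dir` exists, even if
// another rank or process creates it at the same moment.
bool EnsureDirectory(const fs::path &dir) {
  std::error_code ec;
  // Non-throwing overload: a failure is reported through `ec` instead of an
  // exception, so losing the race to another rank does not abort the run.
  fs::create_directories(dir, ec);
  // Regardless of who created it, success means the directory exists now.
  std::error_code ec_check;
  return fs::is_directory(dir, ec_check);
}
```

Every rank could call `EnsureDirectory(output_dir)` and treat only a `false` return as an error. An alternative design is to let a single rank create the directory and have all other ranks wait at a barrier before touching it.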

FG

  • working on coordinate systems including yt frontend
  • Keplerian disk test in AthenaPK is not stable yet; still searching for the bug

PG

  • looked into buffer packing performance: there seems to be a performance regression versus "a year ago" whose cause is still to be identified, but the suggested changes are definitely an improvement, also for AthenaPK
  • discovered another performance regression related to chunking in HDF5, see IO below

IO

  • HDF5 (especially with chunking but also without) has been performing below expectation at scale (10000+ ranks), see https://github.com/parthenon-hpc-lab/parthenon/pull/905 for more detailed numbers
  • BW brought up ADIOS2 and OpenPMD as alternative IO options, as they seem to perform quite well for AMReX-based codes
  • RC has some experience and enjoyed the clean interface
  • path forward: try to implement the new IO side-by-side with the existing one
  • if it turns out to broadly solve the IO issue, then we might consider replacing the original one (as ADIOS2 also has an HDF5 backend, if required)

Buffer packing

  • PG brought up question on barriers in new buffer packing kernels
    • probably not needed as all inner loops go over the same indices
  • LR experimented more with separate kernels for sparse and non-sparse variables, as PG noticed a performance regression (versus an old version that had some in-kernel compile-time switch)
    • didn't change performance much for Riot, but LR will make a separate branch available for PG to test
  • After that test, one of those versions gets merged (as it's a general win, noting that there's likely still room for improvement in the future)
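
For readers unfamiliar with the idea of a compile-time switch, the following is a purely illustrative sketch in plain C++, not the actual Parthenon packing kernels. `Var` and `Pack` are hypothetical names. One templated loop is instantiated twice, so the non-sparse instantiation contains no allocation check at all.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a variable to be packed; not a Parthenon type.
struct Var {
  std::vector<double> data;
  bool allocated;  // a sparse variable may be unallocated on a given block
};

// One loop body, specialized at compile time on whether the allocation
// status of each variable has to be checked.
template <bool HANDLE_SPARSE>
std::size_t Pack(const std::vector<Var> &vars, std::vector<double> &buf) {
  buf.clear();
  for (const auto &v : vars) {
    if constexpr (HANDLE_SPARSE) {
      // This branch only exists in the Pack<true> instantiation.
      if (!v.allocated) continue;
    }
    for (double x : v.data) buf.push_back(x);
  }
  return buf.size();  // number of packed values
}
```

Calling `Pack<true>(vars, buf)` skips unallocated variables, while `Pack<false>(vars, buf)` packs everything without the check. Whether this or two hand-written kernels performs better on a given machine is exactly what the test mentioned above is meant to determine.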

Next developer/user(?) meeting

  • deferred to when JD and JM are present

Next meeting tentatively 19 Oct
