Kathryn Mohror edited this page Mar 19, 2016 · 2 revisions

Participants

  • Jean-Baptiste Besnard
  • Ralph Castain
  • John DelSignore
  • Marc-Andre Hermanns
  • Kathryn Mohror
  • Jeff Squyres
  • Anh Vo

Notes

MPI Handles - Status update

  • Programs currently hang when TotalView (TV) attaches
    • Open MPI has three parallel paths for registering error handles; one was missed and needs fixing
    • The fix will probably be done next week

Issue about attach and Open MPI

  • Issue raised in CORAL meeting that when launching an Open MPI job with a debugger, the attach happens later than would be liked
    • The attach happens in MPI_Init rather than at execv
    • Apparently no one has raised this issue before
    • Nothing blocks this; it just needs to be enabled/fixed
    • Using PMIx to launch for Aurora, so they need an option to stop on exec
    • Support for starting in a suspended state is not as clean as you would expect on Linux, which is why it wasn't done before ("buncha nonsense involved")
    • It is easier on newer versions of Linux with support for cloning, but that support is not uniformly implemented across all versions of Linux
      • Also, HPC systems don't usually run the newest versions of Linux
    • On Windows it's easy: just a flag
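
As a rough illustration of the "start suspended" idea discussed above, the following is a minimal POSIX sketch, not the mechanism Open MPI or PMIx uses. It assumes the launcher forks each process itself; the function names `spawn_stopped` and `resume_and_wait` are hypothetical. The child stops itself just before calling execv, which only approximates a true stop-on-exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that stops itself just before execv.
 * Returns the child's pid once the parent has observed it stopped,
 * or -1 on failure. */
pid_t spawn_stopped(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        raise(SIGSTOP);     /* child parks here until it receives SIGCONT */
        execv(path, argv);  /* reached only after the child is resumed */
        _exit(127);         /* execv failed */
    }

    /* Parent: wait until the child is reported as stopped. */
    int status;
    if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
        return -1;
    return pid;
}

/* Hypothetical helper: resume a stopped child and wait for it to exit.
 * Returns the child's exit code, or -1 on failure or abnormal exit. */
int resume_and_wait(pid_t pid)
{
    int status;
    if (kill(pid, SIGCONT) < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A debugger could attach to the returned pid while the child is stopped, before `resume_and_wait` is called. Because the stop happens in the launcher's own code, the debugger sees the process slightly before the exec rather than exactly at it.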

PMIx debugger launch

  • Idea is to trigger a debugger launch based on some user-specified condition in the execution
    • E.g., when the application sees a questionable variable value, it makes an API call to launch the debugger
    • Need a way to notify the user that the debugger is waiting (e.g., tweet, email)
  • From a debugger standpoint, this should be straightforward
  • Cray is thought to have a similar capability (perhaps not supported in TV): it can start a debugger if a job is in "trouble"
  • Working on getting Open MPI to support it
  • PMIx folks want to know what a debugger needs to be launched in this way
    • How to specify which node the debugger is launched on? How to satisfy the debugger's environment requirements, e.g. X?
    • TV needs X Windows (login or frontend node)
      • Access to the starter program (attach to the starter on its node, or tell TV which node to remote attach to)
      • Remote attach requires rsh/ssh access to the compute nodes
    • From that point, it's just a regular attach
  • What does the MPI implementation need for this?
    • For Open MPI testing, they are using orte as a representation of the runtime
    • Getting it set up as an example, so people can see what they need to do in their runtimes to support it
    • Issue a pause or "prepare for debugger" command to all daemons
    • Notify the user that the debugger is waiting
  • Is it necessary to mess with the MPI processes at all?
    • Will some processes run ahead and cause a problem?
    • In our experience with MPI programs, this is not a big problem
      • Either it's a hung job and that's easy, or
      • Once one process gets in trouble/blocks, all the rest soon block anyway
      • Depends on the app of course
    • Do we need to provide an info key, something like "stop all processes"?
    • It will be good to have this Open MPI prototype so we can test out what is needed here
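
The condition-triggered launch described above might look roughly like the following from the application's side. This is only a sketch under stated assumptions: it does not use any real PMIx or Open MPI call, and every name in it (`debug_trigger`, `debug_trigger_check`, `notify_stderr`, `notify_count`) is hypothetical. The callback stands in for whatever the runtime would do to pause the daemons and notify the user.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical callback through which a runtime would be told that a
 * debugger is wanted. Not a PMIx or Open MPI API. */
typedef void (*debugger_request_fn)(int rank, const char *reason, void *ctx);

/* Hypothetical per-process trigger state. */
struct debug_trigger {
    debugger_request_fn notify; /* how to pass the request on */
    void *ctx;                  /* opaque data handed to notify */
    int fired;                  /* nonzero once a request has been made */
};

/* Check a user-specified condition. On the first time it holds, forward
 * one request through the callback. Returns 1 if a request was made by
 * this call, 0 otherwise. */
int debug_trigger_check(struct debug_trigger *t, int rank,
                        int condition, const char *reason)
{
    assert(t != NULL);
    if (!condition || t->fired)
        return 0;
    t->fired = 1;
    if (t->notify != NULL)
        t->notify(rank, reason, t->ctx);
    return 1;
}

/* Example notifier: report the request on stderr. */
void notify_stderr(int rank, const char *reason, void *ctx)
{
    (void)ctx;
    fprintf(stderr, "rank %d requests a debugger: %s\n", rank, reason);
}

/* Example notifier: count requests in an int pointed to by ctx. */
void notify_count(int rank, const char *reason, void *ctx)
{
    (void)rank;
    (void)reason;
    ++*(int *)ctx;
}
```

An application would call something like `debug_trigger_check(&t, rank, residual != residual, "residual is NaN")` inside its main loop. Firing at most once per process is a design choice made for this sketch; whether the other processes also need to be stopped is the open question raised in the notes.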

MPIR Upgrade

  • Where do we look for the DLL?
    • An argv-style array of DLL locations
    • Why wouldn't we have this array returned by a function?
    • Not sure how this would work in practice
      • A known variable name is more reliable than a function call
      • It's not always safe to call a function in the target
      • This is a reliability and robustness issue for debuggers
  • How do we indicate that the DLL locations array is ready?
    • Set it atomically: it is either NULL or fully filled in, with no in-between state
  • When can the debugger attach then? Is there a time when the application would say it is not ready?
    • Not sure
    • Perhaps instead of an "it's okay to attach" event, have a "the DLL vector has changed" event?
      • This could happen if the set of DLLs has changed
        • In light of the Sessions proposal from the last Forum meeting, DLL loading might be dynamic
      • Or perhaps if the number of processes has changed (grown or shrunk)
  • Rest of the notes are in Anh's slides
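
The "known variable, set atomically" idea for the DLL locations discussed above can be sketched as follows. This is an illustration only: the symbol name `example_dll_locations` and the helper functions are hypothetical, since the notes do not fix a name or layout. It assumes a C11 compiler and a platform where an atomic pointer has the same representation as a plain pointer, so that a debugger reading target memory sees either NULL or a complete vector.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical well-known symbol a debugger would look up by name.
 * NULL means "not ready"; otherwise it points to a NULL-terminated,
 * argv-style vector of DLL paths. */
_Atomic(char **) example_dll_locations = NULL;

/* Build the whole vector privately, then publish it with one atomic
 * store, so no reader can observe a half-filled array.
 * Returns 0 on success, -1 on allocation failure. */
int publish_dll_locations(const char *const paths[], size_t n)
{
    assert(paths != NULL || n == 0);

    char **vec = calloc(n + 1, sizeof *vec); /* last slot stays NULL */
    if (vec == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(paths[i]) + 1;
        vec[i] = malloc(len);
        if (vec[i] == NULL) {
            while (i > 0)
                free(vec[--i]);
            free(vec);
            return -1;
        }
        memcpy(vec[i], paths[i], len);
    }

    /* A previously published vector is deliberately not freed here:
     * a debugger might still be reading it. */
    atomic_store_explicit(&example_dll_locations, vec, memory_order_release);
    return 0;
}

/* Count the published entries; returns 0 while nothing is published. */
size_t count_dll_locations(void)
{
    char **vec = atomic_load_explicit(&example_dll_locations,
                                      memory_order_acquire);
    size_t n = 0;
    if (vec != NULL)
        while (vec[n] != NULL)
            n++;
    return n;
}
```

Publishing a fresh vector rather than editing the old one in place would also fit the "the DLL vector has changed" event: the debugger simply re-reads the variable whenever it is told about a change.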