Iterative tasking error in particles example #1155

Open
brryan opened this issue Aug 20, 2024 · 1 comment


brryan commented Aug 20, 2024

I tried switching the particles example to iterative tasking. The resulting example code is much cleaner, but I appear to be hitting two separate bugs when I use it:

  1. If the iterative sublist's only dependency is none, the sublist is skipped entirely during execution.

  2. If I add a dummy dependency, then after the first successful iteration only one meshblock's sublist iterates again. The particles example then hangs, because that meshblock's Receive call never receives MPI messages from the other meshblocks' Send calls.

Worth noting: I am using iterative tasking per meshblock, which is somewhat unusual and may be related to the bug.

It is also possible that I am simply setting up the iterative task list incorrectly, for example with the wrong TaskQualifiers.
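To make the second symptom concrete, here is a toy model of why a per-meshblock decision to iterate can leave a blocking Receive waiting forever. It is only a sketch and not Parthenon code: the function names and the `remaining` vector (iterations each block still needs) are hypothetical, and it assumes every block must send in an iteration for any block's receive in that iteration to finish.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model, not Parthenon code. remaining[b] is the number of transport
// iterations meshblock b still needs before all of its particles are done.

// Blocks taking part in iteration `iter` (0-based) when each block decides
// on its own whether to iterate again.
std::size_t ParticipantsLocalDecision(const std::vector<int> &remaining, int iter) {
  return static_cast<std::size_t>(
      std::count_if(remaining.begin(), remaining.end(),
                    [iter](int r) { return r > iter; }));
}

// Blocks taking part in iteration `iter` when the decision is shared:
// every block keeps iterating until the slowest block is finished.
std::size_t ParticipantsGlobalDecision(const std::vector<int> &remaining, int iter) {
  if (remaining.empty()) return 0;
  const int slowest = *std::max_element(remaining.begin(), remaining.end());
  return iter < slowest ? remaining.size() : 0;
}

// Assumption of this model: a receive only finishes if all blocks send in
// the same iteration, so a partially attended iteration never completes.
bool IterationHangs(std::size_t participants, std::size_t nblocks) {
  return participants > 0 && participants < nblocks;
}
```

With `remaining = {1, 3, 1}`, the local decision leaves a single block in iteration 1, which is the hang described above, while the shared decision keeps all three blocks iterating together.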


brryan commented Aug 26, 2024

The code snippet to be used in the particles example going forward:

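// Returns TaskStatus::iterate while any active particle has not yet reached
// the final time tf, and TaskStatus::complete otherwise.
TaskStatus CheckCompletion(MeshBlock *pmb, const Real tf) {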
  auto swarm = pmb->meshblock_data.Get()->GetSwarmData()->Get("my_particles");

  int max_active_index = swarm->GetMaxActiveIndex();

  auto &t = swarm->Get<Real>("t").Get();

  auto swarm_d = swarm->GetDeviceContext();

  int num_unfinished = 0;
  parthenon::par_reduce(
      PARTHENON_AUTO_LABEL, 0, max_active_index,
      KOKKOS_LAMBDA(const int n, int &num_unfinished) {
        if (swarm_d.IsActive(n)) {
          if (t(n) < tf) {
            num_unfinished++;
          }
        }
      },
      Kokkos::Sum<int>(num_unfinished));

  if (num_unfinished > 0) {
    return TaskStatus::iterate;
  } else {
    return TaskStatus::complete;
  }
}

TaskCollection ParticleDriver::MakeParticlesTransportTaskCollection() const {
  using TQ = TaskQualifier;

  TaskCollection tc;

  TaskID none(0);
  BlockList_t &blocks = pmesh->block_list;

  const int max_transport_iterations = 1000;

  const Real t0 = tm.time;
  const Real dt = tm.dt;

  auto &reg = tc.AddRegion(blocks.size());

  for (int i = 0; i < blocks.size(); i++) {
    auto &pmb = blocks[i];
    auto &sc = pmb->meshblock_data.Get()->GetSwarmData();
    auto &tl = reg[i];

    // Add task sublist
    auto [itl, push] = tl.AddSublist(none, {i, max_transport_iterations});
    auto transport = itl.AddTask(none, TransportParticles, pmb.get(), t0, dt);
    auto reset_comms =
        itl.AddTask(transport, &SwarmContainer::ResetCommunication, sc.get());
    auto send = itl.AddTask(reset_comms, &SwarmContainer::Send, sc.get(),
                            BoundaryCommSubset::all);
    auto receive =
        itl.AddTask(send, &SwarmContainer::Receive, sc.get(), BoundaryCommSubset::all);

    auto complete = itl.AddTask(TQ::global_sync | TQ::completion, receive,
                                CheckCompletion, pmb.get(), t0 + dt);
  }

  return tc;
}
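For readers without a Parthenon build at hand, the iterate-until-complete pattern in the snippet above can be sketched in plain serial C++. This is a simplified stand-in under stated assumptions, not Parthenon's scheduler: `Status`, `Particle`, `CountUnfinished`, `CheckCompletionHost`, and `RunUntilComplete` are hypothetical names, the loop over a `std::vector` replaces the Kokkos reduction, and the iteration cap only mirrors the role of `max_transport_iterations`.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration; these are not Parthenon types.
enum class Status { iterate, complete };

struct Particle {
  bool active;  // inactive slots are ignored, like swarm_d.IsActive(n) == false
  double t;     // current particle time
};

// Serial equivalent of the reduction: count active particles with t < tf.
int CountUnfinished(const std::vector<Particle> &particles, double tf) {
  int num_unfinished = 0;
  for (const auto &p : particles) {
    if (p.active && p.t < tf) ++num_unfinished;
  }
  return num_unfinished;
}

// Same decision as CheckCompletion: iterate again while anything is unfinished.
Status CheckCompletionHost(const std::vector<Particle> &particles, double tf) {
  return CountUnfinished(particles, tf) > 0 ? Status::iterate : Status::complete;
}

// Apply `step` repeatedly until the completion check passes or the cap is
// reached. Returns the number of iterations actually performed.
template <class Step>
int RunUntilComplete(std::vector<Particle> &particles, double tf,
                     int max_iterations, Step step) {
  int iterations = 0;
  while (iterations < max_iterations) {
    step(particles);
    ++iterations;
    if (CheckCompletionHost(particles, tf) == Status::complete) break;
  }
  return iterations;
}
```

The step callable plays the role of the transport, send, and receive tasks. Keeping the completion check as a separate function mirrors how the task list ends each pass with a single task that decides whether to iterate.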
