Try to speed up compilation of params under nvhpc #1007

Merged on Mar 1, 2024 (13 commits)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@
- [[PR978]](https://github.com/parthenon-hpc-lab/parthenon/pull/978) remove erroneous sparse check

### Infrastructure (changes irrelevant to downstream codes)
- [[PR 1007]](https://github.com/parthenon-hpc-lab/parthenon/pull/1007) Split template instantiations for HDF5 Read/Write attributes to speed up compile times
- [[PR 990]](https://github.com/parthenon-hpc-lab/parthenon/pull/990) Partial refactor of HDF5 I/O code for readability/extendability
- [[PR 982]](https://github.com/parthenon-hpc-lab/parthenon/pull/982) add some gut check testing for parthenon-VIBE

12 changes: 6 additions & 6 deletions doc/sphinx/src/tasks.rst
@@ -138,9 +138,9 @@ with one thread.
TaskQualifier
-------------

``TaskQualifier``s provide a mechanism for downstream codes to alter the default
``TaskQualifier`` s provide a mechanism for downstream codes to alter the default
behavior of specific tasks in certain ways. The qualifiers are described below:
- ``TaskQualifier::local_sync``: Tasks marked with ``local_sync`` synchronize across
- ``TaskQualifier::local_sync`` : Tasks marked with ``local_sync`` synchronize across
lists in a region on a given MPI rank. Tasks that depend on a ``local_sync``
marked task gain dependencies from the corresponding task on all lists within
a region. A typical use for this qualifier is to do a rank-local reduction, for
@@ -149,22 +149,22 @@ per rank, not once per ``TaskList``). Note that Parthenon links tasks across
lists in the order they are added to each list, i.e. the ``n``th ``local_sync`` task
in a list is assumed to be associated with the ``n``th ``local_sync`` task in all
lists in the region.
- ``TaskQualifier::global_sync``: Tasks marked with ``global_sync`` implicitly have
- ``TaskQualifier::global_sync`` : Tasks marked with ``global_sync`` implicitly have
the same semantics as ``local_sync``, but additionally do a global reduction on the
``TaskStatus`` to determine if/when execution can proceed on to dependent tasks.
- ``TaskQualifier::completion``: Tasks marked with ``completion`` can lead to exiting
- ``TaskQualifier::completion`` : Tasks marked with ``completion`` can lead to exiting
execution of the owning ``TaskList``. If these tasks return ``TaskStatus::complete``
and the minimum number of iterations of the list have been completed, the remainder
of the task list will be skipped (or the iteration stopped). Returning
``TaskList::iterate`` leads to continued execution/iteration, unless the maximum
number of iterations has been reached.
- ``TaskQualifier::once_per_region``: Tasks with the ``once_per_region`` qualifier
- ``TaskQualifier::once_per_region`` : Tasks with the ``once_per_region`` qualifier
will only execute once (per iteration, if relevant) regardless of the number of
``TaskList``s in the region. This can be useful when, for example, doing MPI
reductions, printing out some rank-wide state, or calling a ``completion`` task
that depends on some global condition where all lists would evaluate identical code.

``TaskQualifier``s can be combined via the ``|`` operator and all combinations are
``TaskQualifier`` s can be combined via the ``|`` operator and all combinations are
supported. For example, you might mark a task ``global_sync | completion | once_per_region``
if it were a task to determine whether an iteration should continue that depended
on some previously reduced quantity.
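
The `|` combination described in the documentation hunk above behaves like a set of bit flags. A minimal sketch of that pattern follows. `Qualifier`, its constants, `Has`, and `IterationCheckQualifier` are hypothetical names chosen for illustration and are not taken from Parthenon's `TaskQualifier` implementation. The sketch also does not model the rule that `global_sync` implies `local_sync` semantics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a task qualifier; not Parthenon's actual class.
class Qualifier {
 public:
  constexpr explicit Qualifier(std::uint64_t bits) : bits_(bits) {}

  // Combining two qualifiers ORs their bits, so any combination is valid.
  friend constexpr Qualifier operator|(Qualifier a, Qualifier b) {
    return Qualifier(a.bits_ | b.bits_);
  }

  // True if every bit set in `other` is also set in this qualifier.
  constexpr bool Has(Qualifier other) const {
    return (bits_ & other.bits_) == other.bits_;
  }

 private:
  std::uint64_t bits_;
};

// Illustrative flag values, one bit per qualifier (assumed, not from the source).
constexpr Qualifier kLocalSync{1u << 0};
constexpr Qualifier kGlobalSync{1u << 1};
constexpr Qualifier kCompletion{1u << 2};
constexpr Qualifier kOncePerRegion{1u << 3};

// Builds the combination used as the example in the documentation text.
inline Qualifier IterationCheckQualifier() {
  const Qualifier q = kGlobalSync | kCompletion | kOncePerRegion;
  assert(q.Has(kGlobalSync) && q.Has(kCompletion) && q.Has(kOncePerRegion));
  return q;
}
```

Because each flag occupies its own bit, a consumer can test for any single qualifier, or any subset, without caring which others were combined with it.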
7 changes: 7 additions & 0 deletions src/CMakeLists.txt
@@ -131,6 +131,7 @@ add_library(parthenon
interface/mesh_data.hpp
interface/meshblock_data.cpp
interface/meshblock_data.hpp
interface/swarm_comms.cpp
interface/swarm_container.cpp
interface/swarm.cpp
interface/swarm.hpp
@@ -183,6 +184,10 @@ add_library(parthenon
outputs/outputs.cpp
outputs/outputs.hpp
outputs/parthenon_hdf5.cpp
outputs/parthenon_hdf5_attributes.cpp
outputs/parthenon_hdf5_attributes_read.cpp
outputs/parthenon_hdf5_attributes_write.cpp
outputs/parthenon_hdf5_types.hpp
outputs/parthenon_xdmf.cpp
outputs/parthenon_hdf5.hpp
outputs/parthenon_xdmf.hpp
@@ -264,6 +269,8 @@ add_library(parthenon
parameter_input.cpp
parameter_input.hpp
parthenon_array_generic.hpp
parthenon_arrays.cpp
parthenon_arrays.hpp
parthenon_manager.cpp
parthenon_manager.hpp
parthenon_mpi.hpp
17 changes: 2 additions & 15 deletions src/interface/params.cpp
@@ -26,19 +26,6 @@

namespace parthenon {

// JMM: This could probably be done with template magic but I think
// using a macro is honestly the simplest and cleanest solution here.
// Template solution would be to define a variatic class to conain the
// list of types and then a hierarchy of structs/functions to turn
// that into function calls. Preprocessor seems easier, given we're
// not manipulating this list in any way.
#define VALID_VEC_TYPES(T) \
T, std::vector<T>, ParArray1D<T>, ParArray2D<T>, ParArray3D<T>, ParArray4D<T>, \
ParArray5D<T>, ParArray6D<T>, ParArray7D<T>, ParArray8D<T>, HostArray1D<T>, \
HostArray2D<T>, HostArray3D<T>, HostArray4D<T>, HostArray5D<T>, HostArray6D<T>, \
HostArray7D<T>, Kokkos::View<T *>, Kokkos::View<T **>, ParArrayND<T>, \
ParArrayHost<T>

#ifdef ENABLE_HDF5

template <typename T>
@@ -63,7 +50,7 @@ void Params::WriteToHDF5AllParamsOfMultipleTypes(const std::string &prefix,
template <typename T>
void Params::WriteToHDF5AllParamsOfTypeOrVec(const std::string &prefix,
const HDF5::H5G &group) const {
WriteToHDF5AllParamsOfMultipleTypes<VALID_VEC_TYPES(T)>(prefix, group);
WriteToHDF5AllParamsOfMultipleTypes<PARTHENON_ATTR_VALID_VEC_TYPES(T)>(prefix, group);
}

template <typename T>
@@ -91,7 +78,7 @@ void Params::ReadFromHDF5AllParamsOfMultipleTypes(const std::string &prefix,
template <typename T>
void Params::ReadFromHDF5AllParamsOfTypeOrVec(const std::string &prefix,
const HDF5::H5G &group) {
ReadFromHDF5AllParamsOfMultipleTypes<VALID_VEC_TYPES(T)>(prefix, group);
ReadFromHDF5AllParamsOfMultipleTypes<PARTHENON_ATTR_VALID_VEC_TYPES(T)>(prefix, group);
}

void Params::WriteAllToHDF5(const std::string &prefix, const HDF5::H5G &group) const {
Expand Down
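
The type-list macro used in `params.cpp` expands to a comma-separated list of types that is passed straight into a variadic template, and the changelog entry describes splitting the resulting instantiations to reduce compile time. The sketch below illustrates both ideas in a single file, under stated assumptions: `EXAMPLE_VALID_VEC_TYPES`, `VisitOneType`, `VisitMultipleTypes`, `VisitTypeOrVec`, and `SizesForDouble` are hypothetical names, the per-type work is a placeholder for the real HDF5 attribute I/O, and the explicit instantiation stands in for what would, in a multi-file layout, live in its own translation unit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type-list macro: expands to "T, std::vector<T>".
#define EXAMPLE_VALID_VEC_TYPES(T) T, std::vector<T>

// Placeholder for per-type work (the real code reads/writes HDF5 attributes).
template <typename T>
void VisitOneType(std::vector<std::size_t> &sizes) {
  sizes.push_back(sizeof(T));
}

// Calls VisitOneType once per type in the pack, in order.
template <typename... Ts>
void VisitMultipleTypes(std::vector<std::size_t> &sizes) {
  // A braced-init-list evaluates its elements left to right; a C++17 fold
  // expression over the comma operator would do the same job.
  const int expand[] = {0, (VisitOneType<Ts>(sizes), 0)...};
  (void)expand;
}

// The macro supplies the template argument list for a given element type.
template <typename T>
void VisitTypeOrVec(std::vector<std::size_t> &sizes) {
  VisitMultipleTypes<EXAMPLE_VALID_VEC_TYPES(T)>(sizes);
}

// Explicit instantiation definition: forces the code for this type to be
// generated here, which is what allows instantiations to be split across
// separate source files.
template void VisitTypeOrVec<double>(std::vector<std::size_t> &);

// Small driver: records the sizes of double and std::vector<double>.
inline std::vector<std::size_t> SizesForDouble() {
  std::vector<std::size_t> sizes;
  VisitTypeOrVec<double>(sizes);
  assert(sizes.size() == 2);
  return sizes;
}
```

In a multi-file layout, a header would typically pair this with an `extern template` declaration so that other translation units do not instantiate the same code implicitly.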
2 changes: 1 addition & 1 deletion src/interface/params.hpp
@@ -26,7 +26,7 @@
#include "utils/error_checking.hpp"

#ifdef ENABLE_HDF5
#include "outputs/parthenon_hdf5.hpp"
#include "outputs/parthenon_hdf5_types.hpp"
#endif

namespace parthenon {