Try to speed up compilation of params under nvhpc #1007

Merged (13 commits) on Mar 1, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
- [[PR978]](https://github.com/parthenon-hpc-lab/parthenon/pull/978) remove erroneous sparse check

### Infrastructure (changes irrelevant to downstream codes)
- [[PR 1007]](https://github.com/parthenon-hpc-lab/parthenon/pull/1007) Split template instantiations for HDF5 Read/Write attributes to speed up compile times
- [[PR 990]](https://github.com/parthenon-hpc-lab/parthenon/pull/990) Partial refactor of HDF5 I/O code for readability/extendability
- [[PR 982]](https://github.com/parthenon-hpc-lab/parthenon/pull/982) add some gut check testing for parthenon-VIBE

Expand Down
1 change: 1 addition & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ option(PARTHENON_DISABLE_HDF5 "HDF5 is enabled by default if found, set this to
option(PARTHENON_DISABLE_HDF5_COMPRESSION "HDF5 compression is enabled by default, set this to True to disable compression in HDF5 output/restart files" OFF)
option(PARTHENON_DISABLE_SPARSE "Sparse capability is enabled by default, set this to True to compile-time disable all sparse capability" OFF)
option(PARTHENON_ENABLE_ASCENT "Enable Ascent for in situ visualization and analysis" OFF)
option(PARTHENON_PRE_INSTANTIATE_KOKKOS_VIEWS "Pre-instantiate kokkos views. May speed up compile times." OFF)
option(PARTHENON_LINT_DEFAULT "Linting is turned off by default, use the \"lint\" target or set \
this to True to enable linting in the default target" OFF)
option(PARTHENON_COPYRIGHT_CHECK_DEFAULT "Copyright check is turned off by default, use the \
Expand Down
73 changes: 37 additions & 36 deletions doc/sphinx/src/building.rst

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions doc/sphinx/src/tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -138,9 +138,9 @@ with one thread.
TaskQualifier
-------------

``TaskQualifier``s provide a mechanism for downstream codes to alter the default
``TaskQualifier`` s provide a mechanism for downstream codes to alter the default
behavior of specific tasks in certain ways. The qualifiers are described below:
- ``TaskQualifier::local_sync``: Tasks marked with ``local_sync`` synchronize across
- ``TaskQualifier::local_sync`` : Tasks marked with ``local_sync`` synchronize across
lists in a region on a given MPI rank. Tasks that depend on a ``local_sync``
marked task gain dependencies from the corresponding task on all lists within
a region. A typical use for this qualifier is to do a rank-local reduction, for
Expand All @@ -149,22 +149,22 @@ per rank, not once per ``TaskList``). Note that Parthenon links tasks across
lists in the order they are added to each list, i.e. the ``n``th ``local_sync`` task
in a list is assumed to be associated with the ``n``th ``local_sync`` task in all
lists in the region.
- ``TaskQualifier::global_sync``: Tasks marked with ``global_sync`` implicitly have
- ``TaskQualifier::global_sync`` : Tasks marked with ``global_sync`` implicitly have
the same semantics as ``local_sync``, but additionally do a global reduction on the
``TaskStatus`` to determine if/when execution can proceed on to dependent tasks.
- ``TaskQualifier::completion``: Tasks marked with ``completion`` can lead to exiting
- ``TaskQualifier::completion`` : Tasks marked with ``completion`` can lead to exiting
execution of the owning ``TaskList``. If these tasks return ``TaskStatus::complete``
and the minimum number of iterations of the list have been completed, the remainder
of the task list will be skipped (or the iteration stopped). Returning
``TaskStatus::iterate`` leads to continued execution/iteration, unless the maximum
number of iterations has been reached.
- ``TaskQualifier::once_per_region``: Tasks with the ``once_per_region`` qualifier
- ``TaskQualifier::once_per_region`` : Tasks with the ``once_per_region`` qualifier
will only execute once (per iteration, if relevant) regardless of the number of
``TaskList``s in the region. This can be useful when, for example, doing MPI
reductions, printing out some rank-wide state, or calling a ``completion`` task
that depends on some global condition where all lists would evaluate identical code.

``TaskQualifier``s can be combined via the ``|`` operator and all combinations are
``TaskQualifier`` s can be combined via the ``|`` operator and all combinations are
supported. For example, you might mark a task ``global_sync | completion | once_per_region``
if it were a task to determine whether an iteration should continue that depended
on some previously reduced quantity.
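The way qualifiers combine with ``|`` can be illustrated with a small bit-flag sketch. This is not Parthenon's actual `TaskQualifier` implementation: the enum `Qual` and the helpers `Has` and `SyncsLocally` are hypothetical names chosen for illustration, and the only behavior modeled is what the text above describes (flags combine freely, and `global_sync` carries `local_sync` semantics).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a qualifier type; each qualifier is one bit.
enum class Qual : std::uint8_t {
  none = 0,
  local_sync = 1,
  global_sync = 2,
  completion = 4,
  once_per_region = 8
};

// Combine qualifiers by OR-ing their bits.
constexpr Qual operator|(Qual a, Qual b) {
  return static_cast<Qual>(static_cast<std::uint8_t>(a) |
                           static_cast<std::uint8_t>(b));
}

// True if the given flag bit is set in q.
constexpr bool Has(Qual q, Qual flag) {
  return (static_cast<std::uint8_t>(q) & static_cast<std::uint8_t>(flag)) != 0;
}

// Per the description above, global_sync implies local_sync semantics.
constexpr bool SyncsLocally(Qual q) {
  return Has(q, Qual::local_sync) || Has(q, Qual::global_sync);
}
```

With these definitions, `Qual::global_sync | Qual::completion | Qual::once_per_region` yields a value for which `Has(q, Qual::completion)` and `SyncsLocally(q)` both hold, even though the `local_sync` bit itself was never set.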
7 changes: 7 additions & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,7 @@ add_library(parthenon
interface/mesh_data.hpp
interface/meshblock_data.cpp
interface/meshblock_data.hpp
interface/swarm_comms.cpp
interface/swarm_container.cpp
interface/swarm.cpp
interface/swarm.hpp
Expand Down Expand Up @@ -183,6 +184,10 @@ add_library(parthenon
outputs/outputs.cpp
outputs/outputs.hpp
outputs/parthenon_hdf5.cpp
outputs/parthenon_hdf5_attributes.cpp
outputs/parthenon_hdf5_attributes_read.cpp
outputs/parthenon_hdf5_attributes_write.cpp
outputs/parthenon_hdf5_types.hpp
outputs/parthenon_xdmf.cpp
outputs/parthenon_hdf5.hpp
outputs/parthenon_xdmf.hpp
Expand Down Expand Up @@ -264,6 +269,8 @@ add_library(parthenon
parameter_input.cpp
parameter_input.hpp
parthenon_array_generic.hpp
parthenon_arrays.cpp
parthenon_arrays.hpp
parthenon_manager.cpp
parthenon_manager.hpp
parthenon_mpi.hpp
Expand Down
3 changes: 3 additions & 0 deletions src/config.hpp.in
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,9 @@
// define PARTHENON_ENABLE_ASCENT or not at all
#cmakedefine PARTHENON_ENABLE_ASCENT

// define PARTHENON_PRE_INSTANTIATE_KOKKOS_VIEWS
#cmakedefine PARTHENON_PRE_INSTANTIATE_KOKKOS_VIEWS

// Default loop patterns for MeshBlock par_for() wrappers,
// see kokkos_abstraction.hpp for available tags.
// Kokkos tight loop layout
Expand Down
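One common way a `#cmakedefine`d switch like this is used is to gate explicit template instantiation, so that a heavily used template is compiled once rather than in every translation unit. The sketch below shows that general technique only; the diff shown here does not reveal how `parthenon_arrays.hpp`/`.cpp` actually use the macro. `SKETCH_PRE_INSTANTIATE`, `Array1D`, and `SumFirstN` are hypothetical names, and the header and source parts are placed in one file purely so the sketch is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro standing in for a switch generated by #cmakedefine.
#define SKETCH_PRE_INSTANTIATE

// Hypothetical array template standing in for an expensive-to-compile type.
template <typename T>
class Array1D {
 public:
  explicit Array1D(std::size_t n) : data_(n) {}
  T &operator()(std::size_t i) { return data_[i]; }
  std::size_t size() const { return data_.size(); }

 private:
  std::vector<T> data_;
};

#ifdef SKETCH_PRE_INSTANTIATE
// "Header" part: declare that Array1D<double> is instantiated elsewhere,
// so including translation units need not instantiate it themselves.
extern template class Array1D<double>;

// "Source" part (would normally live in exactly one .cpp file):
// the single explicit instantiation definition.
template class Array1D<double>;
#endif

// Uses the (possibly pre-instantiated) template; sums 0 + 1 + ... + (n - 1).
inline double SumFirstN(std::size_t n) {
  Array1D<double> a(n);
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    a(i) = static_cast<double>(i);
    sum += a(i);
  }
  return sum;
}
```

Whether this pays off depends on the compiler and on how many translation units use the type, which is consistent with the option's own wording that it "may" speed up compile times.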
17 changes: 2 additions & 15 deletions src/interface/params.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,19 +26,6 @@

namespace parthenon {

// JMM: This could probably be done with template magic but I think
// using a macro is honestly the simplest and cleanest solution here.
// Template solution would be to define a variadic class to contain the
// list of types and then a hierarchy of structs/functions to turn
// that into function calls. Preprocessor seems easier, given we're
// not manipulating this list in any way.
#define VALID_VEC_TYPES(T) \
T, std::vector<T>, ParArray1D<T>, ParArray2D<T>, ParArray3D<T>, ParArray4D<T>, \
ParArray5D<T>, ParArray6D<T>, ParArray7D<T>, ParArray8D<T>, HostArray1D<T>, \
HostArray2D<T>, HostArray3D<T>, HostArray4D<T>, HostArray5D<T>, HostArray6D<T>, \
HostArray7D<T>, Kokkos::View<T *>, Kokkos::View<T **>, ParArrayND<T>, \
ParArrayHost<T>

#ifdef ENABLE_HDF5

template <typename T>
Expand All @@ -63,7 +50,7 @@ void Params::WriteToHDF5AllParamsOfMultipleTypes(const std::string &prefix,
template <typename T>
void Params::WriteToHDF5AllParamsOfTypeOrVec(const std::string &prefix,
const HDF5::H5G &group) const {
WriteToHDF5AllParamsOfMultipleTypes<VALID_VEC_TYPES(T)>(prefix, group);
WriteToHDF5AllParamsOfMultipleTypes<PARTHENON_ATTR_VALID_VEC_TYPES(T)>(prefix, group);
}

template <typename T>
Expand Down Expand Up @@ -91,7 +78,7 @@ void Params::ReadFromHDF5AllParamsOfMultipleTypes(const std::string &prefix,
template <typename T>
void Params::ReadFromHDF5AllParamsOfTypeOrVec(const std::string &prefix,
const HDF5::H5G &group) {
ReadFromHDF5AllParamsOfMultipleTypes<VALID_VEC_TYPES(T)>(prefix, group);
ReadFromHDF5AllParamsOfMultipleTypes<PARTHENON_ATTR_VALID_VEC_TYPES(T)>(prefix, group);
}

void Params::WriteAllToHDF5(const std::string &prefix, const HDF5::H5G &group) const {
Expand Down
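The pattern in this hunk, a macro that expands to a comma-separated type list which is then fed to a variadic template, can be sketched in isolation as follows. The names `SKETCH_VALID_VEC_TYPES`, `HandleMultipleTypes`, and `HandleTypeOrVec` are hypothetical, the type list is shortened to two entries, and the per-type action (recording `sizeof`) merely stands in for the real HDF5 read/write calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, shortened analogue of the type-list macro in the diff:
// expands to a comma-separated list of types built from T.
#define SKETCH_VALID_VEC_TYPES(T) T, std::vector<T>

// Runs one action per type in the pack and returns the number of types.
template <typename... Ts>
std::size_t HandleMultipleTypes(std::vector<std::size_t> &sizes) {
  // Pack expansion inside a braced initializer evaluates left to right.
  // (A C++17 fold expression over the comma operator would also work.)
  int expand[] = {0, (sizes.push_back(sizeof(Ts)), 0)...};
  (void)expand;  // silence unused-variable warnings
  return sizeof...(Ts);
}

// Forwards a single element type to the variadic handler via the macro.
template <typename T>
std::size_t HandleTypeOrVec(std::vector<std::size_t> &sizes) {
  return HandleMultipleTypes<SKETCH_VALID_VEC_TYPES(T)>(sizes);
}
```

Calling `HandleTypeOrVec<int>(sizes)` visits `int` and then `std::vector<int>`. Because the macro is a plain token list, it can be moved to a shared header and reused by several translation units, which appears to be the role of the renamed `PARTHENON_ATTR_VALID_VEC_TYPES`.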
2 changes: 1 addition & 1 deletion src/interface/params.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
#include "utils/error_checking.hpp"

#ifdef ENABLE_HDF5
#include "outputs/parthenon_hdf5.hpp"
#include "outputs/parthenon_hdf5_types.hpp"
#endif

namespace parthenon {
Expand Down