Parthenon Enhancement Proposal 1 -- Support Subclassing StateDescriptor #816
Comments
Thanks for submitting this! Exciting to see the first PEP ;) Hmm this proposal seems to be entangling two separate things, I think:
I am 100% in favor of the former... although I don't know if I have a strong preference between an internal map/list-based approach vs. a type-based approach. The type-based approach makes more sense if the desire is to extend As far as extending
This would be a cool feature 👍
Does this not require some additional decoration by the author of said function? Also I think we have this. If you put your function inside a
Maybe my use case will shed some light here. In KHARMA, for example, I want to add about 8 or 10 callbacks to the Parthenon list:
class KHARMAPackage : public StateDescriptor {
 public:
  KHARMAPackage(std::string name) : StateDescriptor(name) {}
  // PHYSICS
  // Recovery of primitive variables from conserved.
  // These can be host-side functions because they are not called from the Uberkernel --
  // rather, they are called on zone center values once per step only.
  // Called by various Flux::*UtoP*
  std::function<void(MeshBlockData<Real>*, IndexDomain, bool)> BlockUtoP = nullptr;
  std::function<void(MeshData<Real>*, IndexDomain, bool)> MeshUtoP = nullptr;
  // Maybe at some point we'll have
  // Since Flux::prim_to_flux must cover everything, it's not worth splitting now
  //std::function<void(MeshBlockData<Real>*, IndexDomain, bool)> BlockPtoU = nullptr;
  // Source term to add to the conserved variables during each step
  std::function<void(MeshData<Real>*, MeshData<Real>*)> AddSource = nullptr;
  // Source term to apply to primitive variables, needed for some problems in order
  // to control dissipation (Hubble, turbulence).
  // Must be applied over entire domain!
  std::function<void(MeshBlockData<Real>*)> BlockApplyPrimSource = nullptr;
  // Apply any fixes after the initial fluxes are calculated
  std::function<void(MeshData<Real>*)> FixFlux = nullptr;
  // Apply any floors or limiters specific to the package (that is, on the package's variables)
  // Called by Floors::*ApplyFloors
  std::function<void(MeshBlockData<Real>*, IndexDomain)> BlockApplyFloors = nullptr;
  std::function<void(MeshData<Real>*, IndexDomain)> MeshApplyFloors = nullptr;
  // CONVENIENCE
  // Anything to be done before each step begins
  std::function<void(Mesh*, ParameterInput*, const SimTime&)> MeshPreStepUserWorkInLoop = nullptr;
  // Anything to be done after every step is fully complete -- usually reductions or preservation of variables
  std::function<void(Mesh*, ParameterInput*, const SimTime&)> MeshPostStepUserWorkInLoop = nullptr;
  std::function<void(MeshBlock*, ParameterInput*)> BlockUserWorkBeforeOutput = nullptr;
};
I then create a new namespace
I should say, upon reflection adding callbacks doesn't require subclassing
Re: print flags: If Kokkos regions allow printing during runtime on entry and exit, that kills two birds with one stone, and KHARMA should definitely adopt regions too. What I was proposing, though, was definitely more involved than the existing regions in Parthenon: placing a print flag (or region markers) around each
Ok I see. Is the use-case here that you have multiple packages that define, e.g., a con2prim because they work on different variables? And you want to loop over all of them? That makes sense to me and I'm not opposed to supporting that. One related thing that's been on my mind has been pgens and related things. But maybe that should wait for now.
I see. I think profiling regions basically gets you that, since the kernel logger will tell you everything so long as you add regions where you need them, and you get profiling for free. That said, pushing and popping profiling regions is a little manual. So maybe we should think about a way to do that more automatically.
Yeah, that's the idea. Incidentally, I'm imagining down the line there could be a "
I think having Parthenon set a Kokkos profiling region per (nontrivial) call into a user function would be fantastic, and save the users a lot of
Re: approaches, I'm increasingly thinking we should do both. There's the (probably very common) case where a user has not subclassed
If a registry and type list seems like too much, we could always subclass
So, I feel a bit silly. It turns out the "deep type incantation" is:
if (MyPackage *my_package = dynamic_cast<MyPackage*>(package.second.get())) {
  my_package->MyCallback();
}
Coupled with a virtual destructor in virtual ~StateDescriptor() = default; No errors, no mess. Marking the destructor virtual makes This makes
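For anyone who wants to try the cast outside Parthenon, here is a minimal standalone sketch. The StateDescriptor below is a stand-in with only a name and a virtual destructor, not the real Parthenon class; MyPackage, its calls counter, and CallAllMyCallbacks are hypothetical names for illustration. It assumes, as the proposal states, that the package list maps names to shared_ptr<StateDescriptor>, hence the .get() before the cast.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for parthenon::StateDescriptor (assumption: not the real class).
// The virtual destructor makes the type polymorphic, which dynamic_cast needs.
class StateDescriptor {
 public:
  explicit StateDescriptor(std::string name) : name_(std::move(name)) {}
  virtual ~StateDescriptor() = default;
  const std::string &label() const { return name_; }

 private:
  std::string name_;
};

// Hypothetical user package with one extra callback.
class MyPackage : public StateDescriptor {
 public:
  using StateDescriptor::StateDescriptor;
  int calls = 0;
  void MyCallback() { ++calls; }
};

using Packages_t = std::map<std::string, std::shared_ptr<StateDescriptor>>;

// Call MyCallback on every entry that really is a MyPackage and skip
// base-class entries. Returns how many callbacks were invoked.
int CallAllMyCallbacks(const Packages_t &packages) {
  int n_called = 0;
  for (const auto &package : packages) {
    assert(package.second != nullptr);
    // dynamic_cast yields nullptr when the object is not a MyPackage.
    if (auto *my_package = dynamic_cast<MyPackage *>(package.second.get())) {
      my_package->MyCallback();
      ++n_called;
    }
  }
  return n_called;
}
```

Without the virtual destructor (or some other virtual member) in the base class, the dynamic_cast line does not compile, because the source type is not polymorphic.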
👍 for the first PEP! In general, I see the use/need for this kind of pattern -- especially with respect to
This looks like a very clean pattern to me.
Just to be clear: How would that translate to the calls within Parthenon itself? Would all places where we iterate over packages be split into iterations over Parthenon and User Packages? And regarding the snippet above with the
Agreed (in fact, agreed for a long time #397).
Introduction
The Parthenon "Package" is a powerful way to split up code and data/parameters by function, keeping parameters and fields associated with the functions that handle them. However, current limitations make it difficult for users to extend the definition of a "Package" to better fit their needs. This "Parthenon Enhancement Proposal" (PEP) explains the desired design pattern, and outlines ways to accommodate it in Parthenon if this is desired.
I partially also intend this as a template for future PEPs, and would welcome format feedback. Ideally future submissions will not be so verbose.
Relevant background and current status
A Parthenon package consists of three things:
StateDescriptor object, which is basically a host for a Parameters object, which is more or less a std::map<std::string,std::any>. A list of StateDescriptor objects is then accessible as a member of any Mesh or MeshBlock object, i.e., from nearly any function in the code
StateDescriptor object and called at specific times during a Parthenon step (some of them only if one subclasses a particular Driver)
This is a neat paradigm! It makes nearly arbitrary data & parameters automatically accessible nearly anywhere in a code from the Mesh/MeshBlock objects, while still hierarchically organized. The function pointer or "callback" structure additionally allows implementing new features in a way that can be enabled & disabled cleanly, without a bunch of if (feature_enabled) constructs, which are hard to read and easy to get wrong (e.g. reading options & arrays that don't exist, running an extra op or fix that should have been disabled, etc.)
The full list of current callbacks is (as documented somewhat obliquely in metadata.md):
This list of callbacks is long, yet limiting: only a few of the things the code actually needs to do during a simulation step (notably and understandably not any physics) can be in this list. Even as additions are made, it will not (and should not) cover every part of any given downstream algorithm.
The package pattern has proven a very useful way of splitting up code for handling different physics or interchangeable feature implementations, at least for KHARMA. However, this encapsulation is broken when the MakeTaskList function must repeatedly query which packages are loaded, and special-case on their inclusion to form the task list. There are often a number of functionally similar operations which must be completed by different packages, each for its own variables -- these must simply be listed out and special-cased, with new entries added manually to all Driver implementations.
For example, operations common to many packages for their particular fields, like "compute the primitive variables" or "add any source terms" must be treated like completely different operations in every package and Driver, which leads to code duplication and mistakes, not to mention a lot of cruft checking for package enablement before every possible operation, or on every function call inside a package.
Proposed solution
By writing a subclass of StateDescriptor, a user can add arbitrary function pointers, making a "user package" specific to the case of their code (e.g., conservative finite-volume schemes in curved space, to choose a random example). Extending this pattern is a powerful way to ensure that enabled features are actually enabled, and disabled features disabled, without either writing enablement guards into every package function, or breaking encapsulation in MakeTaskList by querying packages for presence and enablement before every call.
This pattern does not heavily involve changing Parthenon, though I will suggest a few small key additions. Instead, the proposed changes are mostly to support this as a use case, and gradually see it implemented in downstreams and perhaps more complex examples. As the documentation of packages develops, it could be a note in the relevant documentation with a short example snippet implementing the parent function, which just iterates through packages calling the member function.
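As a rough illustration of the "parent function" idea, the sketch below shows a user package with one optional callback and a single function that the driver would call in its place. Everything here is a stand-in: StateDescriptor is a toy class rather than Parthenon's, UserPackage and the Packages namespace are hypothetical, and a std::vector<double> substitutes for real mesh data. It covers the simple case where every entry in the list is already of the user type, so no casting is needed.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for parthenon::StateDescriptor (assumption).
struct StateDescriptor {
  explicit StateDescriptor(std::string name) : name(std::move(name)) {}
  virtual ~StateDescriptor() = default;
  std::string name;
};

// Hypothetical user package: one optional physics callback.
struct UserPackage : StateDescriptor {
  using StateDescriptor::StateDescriptor;
  // Adds this package's source term to dudt; stays empty if there is none.
  std::function<void(std::vector<double> &)> AddSource = nullptr;
};

namespace Packages {
// Parent function: the task list calls this once. Each package that
// registered a callback is invoked; packages without one are skipped,
// so no per-package enablement checks are needed in the driver.
inline void AddSource(const std::vector<std::shared_ptr<UserPackage>> &packages,
                      std::vector<double> &dudt) {
  for (const auto &pkg : packages) {
    assert(pkg != nullptr);
    if (pkg->AddSource) pkg->AddSource(dudt);
  }
}
}  // namespace Packages
```

A disabled feature is then just a package that was never loaded, or one whose callback was left as nullptr; the driver code does not change either way.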
The main proposed code change, for now, would be to address a pain point when using the package list (Packages_t) with a user package type. With a user type, the list now contains base StateDescriptor objects, such as the "Refinement" package, and user package objects of some subclass type. Iterating through the list for user callbacks therefore becomes cumbersome due to casting, as packages in the list lose their type information. That is, the problem is that the pattern:
will throw errors if there are members of AllPackages which are not of type MyPackage, but instead of the base StateDescriptor type. There are two possible solutions, and I propose implementing one of them in Parthenon as a function pmesh->AllUserPackages(), or better yet pmesh->AllPackagesOfType<Type>().
Packages_t object. I think the list itself erases type information, as package.second always returns a shared_ptr cast to StateDescriptor. However, there may be a list or pattern which preserves the original type information but otherwise behaves similarly. Alternatively, the type information could be carried in a separate list member of Packages_t.
Note that another possible solution to this problem would be to eliminate the "Refinement" package, and make the package list entirely user-controlled. This wouldn't be my preferred solution, as I'd like to see parts of Parthenon make use of the package pattern more fully (especially the code for outputs including/especially reductions, and possibly the code for drivers as well).
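One possible shape for the proposed AllPackagesOfType<Type>() is sketched here as a free function over a stand-in package map. It is not Parthenon code: StateDescriptor and KHARMAPackage are toy types, and the real accessor would presumably be a member of Mesh and MeshBlock. The sketch assumes the base class is polymorphic (it has a virtual destructor) and uses std::dynamic_pointer_cast, so the returned map shares ownership with the original list.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins (assumptions), polymorphic via the virtual destructor.
struct StateDescriptor {
  virtual ~StateDescriptor() = default;
};
struct KHARMAPackage : StateDescriptor {
  int n_callbacks = 0;
};

using Packages_t = std::map<std::string, std::shared_ptr<StateDescriptor>>;

// Return only the packages whose dynamic type is T (or derived from T),
// already cast, so callers can use T's members without further casting.
template <typename T>
std::map<std::string, std::shared_ptr<T>> AllPackagesOfType(const Packages_t &all) {
  std::map<std::string, std::shared_ptr<T>> typed;
  for (const auto &entry : all) {
    assert(entry.second != nullptr);
    // Yields an empty shared_ptr when the entry is not a T.
    if (auto pkg = std::dynamic_pointer_cast<T>(entry.second)) {
      typed.emplace(entry.first, std::move(pkg));
    }
  }
  return typed;
}
```

Building a filtered map on every call costs an allocation per package; if that matters, the same cast could be done lazily inside the loop that consumes the packages.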
Corollary considerations
It has proven useful in KHARMA, and I think would be useful generally, to add a compile flag which enables printing exactly what functions are being executed, as they are called. Even in Parthenon's current state, callbacks obfuscate which functions from which packages are actually being called, and this is a significant downside vs an explicit C code, especially when debugging. Parthenon already defines Kokkos profiling regions for many of its own functions, and these can be used with the kokkos-tools kernel logger to print out exactly what the code is doing. Pushing a new region with the package name for every non-trivial callback would make this functionality complete for user code.
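To make the region-per-callback idea concrete, here is a small sketch of a scope guard plus a wrapper that invokes a callback inside a named region. PushRegion and PopRegion are stand-ins that only record events in a log so the example runs without Kokkos; in a real build they would be replaced by Kokkos::Profiling::pushRegion and Kokkos::Profiling::popRegion. ScopedRegion, CallInRegion, and the "package::callback" naming scheme are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in event log (assumption); replace the two functions below with
// Kokkos::Profiling::pushRegion / popRegion in a real build.
inline std::vector<std::string> &RegionLog() {
  static std::vector<std::string> log;
  return log;
}
inline int &RegionDepth() {
  static int depth = 0;
  return depth;
}
inline void PushRegion(const std::string &name) {
  ++RegionDepth();
  RegionLog().push_back("enter " + name);
}
inline void PopRegion() {
  assert(RegionDepth() > 0);  // a pop must match an earlier push
  --RegionDepth();
  RegionLog().push_back("exit");
}

// RAII guard: the region is popped on scope exit, including early returns.
class ScopedRegion {
 public:
  explicit ScopedRegion(const std::string &name) { PushRegion(name); }
  ~ScopedRegion() { PopRegion(); }
  ScopedRegion(const ScopedRegion &) = delete;
  ScopedRegion &operator=(const ScopedRegion &) = delete;
};

// Invoke a package callback inside a region named "<package>::<callback>".
// Unset callbacks are skipped and produce no region at all.
template <typename Sig, typename... Args>
void CallInRegion(const std::string &package, const std::string &callback,
                  const std::function<Sig> &fn, Args &&...args) {
  if (!fn) return;
  ScopedRegion region(package + "::" + callback);
  fn(std::forward<Args>(args)...);
}
```

Because the wrapper sits where Parthenon would call into user code, package authors would get region markers without adding push/pop pairs to each function by hand.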
Additionally, if we encourage splitting code into many small packages, it would be very useful to provide a way to register package dependencies, and allow Parthenon to load dependent packages automatically. This would ideally mean registering all available packages in a map from package names to initializer functions, then providing a list of dependency names as a part of the StateDescriptor object created by each package.
Implementation
Implementation would consist of adding either Mesh::AllUserPackages(), Mesh::AllPackagesOfType<Type>(), or both functions (with accompanying versions in MeshBlock).
Mesh::AllUserPackages() will additionally require registering all Parthenon packages by name, and returning the package list sans those entries -- I could do this.
Mesh::AllPackagesOfType<Type>() will require either deep incantations to the C++ type system beyond my ken, or somehow adding type info to Packages_t, perhaps as a separate list or map alongside the usual std::map<std::string,shared_ptr<StateDescriptor>>.
I am happy to work on the corollaries and provide prototype PRs if this is accepted, or to workshop them in other issues. I don't personally think they deserve PEPs.
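As a starting point for the dependency corollary, a prototype could look something like the sketch below: a registry mapping package names to initializer functions, and a loader that follows each package's dependency list recursively. None of these names exist in Parthenon. The StateDescriptor here is a toy struct with a hypothetical dependencies member, and Registry, Initializer, LoadWithDependencies, and LoadPackages are made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in with a hypothetical dependency list (assumption).
struct StateDescriptor {
  std::string name;
  std::vector<std::string> dependencies;
};

using Initializer = std::function<std::shared_ptr<StateDescriptor>()>;
using Registry = std::map<std::string, Initializer>;
using Packages_t = std::map<std::string, std::shared_ptr<StateDescriptor>>;

// Load `name` and, recursively, everything it depends on.
// `in_progress` holds the packages currently being resolved, to catch cycles.
inline void LoadWithDependencies(const std::string &name, const Registry &registry,
                                 Packages_t &loaded, std::set<std::string> &in_progress) {
  if (loaded.count(name) > 0) return;  // already loaded via another path
  const auto it = registry.find(name);
  if (it == registry.end()) throw std::runtime_error("Unknown package: " + name);
  if (!in_progress.insert(name).second) {
    throw std::runtime_error("Dependency cycle at package: " + name);
  }
  // The dependency list lives on the descriptor, so create it first.
  std::shared_ptr<StateDescriptor> pkg = it->second();
  assert(pkg != nullptr);
  for (const auto &dep : pkg->dependencies) {
    LoadWithDependencies(dep, registry, loaded, in_progress);
  }
  in_progress.erase(name);
  loaded[name] = pkg;
}

// Load every requested package plus its transitive dependencies.
inline Packages_t LoadPackages(const std::vector<std::string> &requested,
                               const Registry &registry) {
  Packages_t loaded;
  std::set<std::string> in_progress;
  for (const auto &name : requested) {
    LoadWithDependencies(name, registry, loaded, in_progress);
  }
  return loaded;
}
```

One open design question this sketch glosses over: an initializer that needs parameters from its dependencies would have to run after they are loaded, which means the dependency list would need to be available before the descriptor is constructed.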
Documentation of packages generally is a separate issue, and I think anyone who's read this PEP can adequately drop in a paragraph or two illustrating the new-callback pattern to the eventual package documentation.
Implementation in downstreams and examples is a separate issue.