Reducing boilerplate / adding high level functions #122
Comments
Hi @samvanstroud! I agree, there is quite some boilerplate in the examples. However, we intended to not use high-level functions in We have similar high-level plotting functions in So I'd prefer to keep the overall structure like this and keep
Thanks for the reply, and thank you for pointing to these features in For me though it would be nice to
I agree with that. We can have a small separate API within puma to produce these plots quickly. We could do something similar to this: https://gitlab.cern.ch/mguth/ftag-plotting-utils (I just stitched them together)
@manuguth, absolutely, that interface looks very much along the lines I was thinking!
I'd support having the high-level stuff in a small separate API. Puma is not written only for ATLAS, so removing electron jets is rather ATLAS-specific. I would therefore leave puma as it is and add this to the small API.
@afroch I agree it's nice to keep the code experiment-agnostic, but since all variable names etc. are also specific to ATLAS, we can just implement this as a generic jet selection function as in umami, and remove electrons and apply kinematic selections via some config.
But we don't have ATLAS-specific variables here in puma. The only thing that is ATLAS-specific is the global config where we define the jet categories, but this is mainly for the colours. You can easily change that. In the examples, I think we used an ATLAS-specific variable, but there is no variable hardcoded here.
Closing as this is implemented |
The examples have a fair amount of boilerplate in them. We could consider reducing this by adding higher level functions that compute things like the discriminants and rejections for the user.
For example we could define a
Tagger(h5_path, name, label, n_jets, classes=['b', 'c', 'l'])
which reads the relevant arrays from an h5 file to a dataframe. Functions to run selections on jets, compute discriminants, compute rejections, etc. can then be provided.
When it comes to plotting we could just do
Which under the hood calls
plot_roc.add_roc
for each background the tagger has, using all the info from the tagger object.
When writing the plotting code in the GNN repo, the idea was to have all the plotting configurable from YAML files using a list of taggers and a list of plots. Then we can, for example, maintain a list of suggested taggers that people can add to their plots with a single line of config. Do you think the same approach would be useful here?
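To make the proposal a bit more concrete, here is a rough sketch of what such a tagger object might look like. It is only an illustration under some assumptions, not puma's actual API: the `Tagger` fields, the `discriminant` and `rejection` method names, and the default `f_c = 0.05` are all made up for this example, and it takes in-memory lists rather than reading an h5 file so that it runs with the standard library alone. The discriminant is the usual log-likelihood ratio D_b = log(p_b / (f_c * p_c + (1 - f_c) * p_l)), and the rejection is one over the background efficiency at a chosen signal efficiency.

```python
import math
from dataclasses import dataclass


@dataclass
class Tagger:
    """Hypothetical container for one tagger's per-jet outputs (not puma's API)."""

    name: str
    label: str
    probs: dict  # per-jet class probabilities, keyed by "b", "c", "l"
    truth: list  # per-jet truth class, same length as each probability list
    f_c: float = 0.05  # assumed charm fraction, purely illustrative

    def discriminant(self):
        """Return D_b = log(p_b / (f_c * p_c + (1 - f_c) * p_l)) for every jet."""
        return [
            math.log(pb / (self.f_c * pc + (1.0 - self.f_c) * pl))
            for pb, pc, pl in zip(self.probs["b"], self.probs["c"], self.probs["l"])
        ]

    def rejection(self, background, sig_eff, signal="b"):
        """Return 1 / background efficiency at roughly the given signal efficiency."""
        disc = self.discriminant()
        sig = sorted((d for d, t in zip(disc, self.truth) if t == signal), reverse=True)
        bkg = [d for d, t in zip(disc, self.truth) if t == background]
        # Cut value that keeps roughly `sig_eff` of the signal jets.
        n_keep = max(1, round(sig_eff * len(sig)))
        cut = sig[n_keep - 1]
        n_pass = sum(d >= cut for d in bkg)
        # No background jet passing the cut means the rejection is unbounded.
        return len(bkg) / n_pass if n_pass else math.inf


# Toy inputs: two b-jets, two c-jets and two light jets.
toy = Tagger(
    name="toy",
    label="Toy tagger",
    probs={
        "b": [0.8, 0.6, 0.7, 0.1, 0.1, 0.65],
        "c": [0.1, 0.2, 0.15, 0.45, 0.45, 0.175],
        "l": [0.1, 0.2, 0.15, 0.45, 0.45, 0.175],
    },
    truth=["b", "b", "c", "c", "l", "l"],
)
rejections = {bkg: toy.rejection(bkg, sig_eff=1.0) for bkg in ("c", "l")}
print(rejections)
```

A plotting helper could then loop over the tagger's backgrounds and pass each result on to `plot_roc.add_roc`, and a YAML list of taggers would map naturally onto a list of such objects.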