Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing boilerplate / adding high level functions #122

Closed
samvanstroud opened this issue Sep 15, 2022 · 8 comments
Closed

Reducing boilerplate / adding high level functions #122

samvanstroud opened this issue Sep 15, 2022 · 8 comments
Assignees

Comments

@samvanstroud
Copy link
Contributor

The examples have a fair amount of boilerplate in them. We could consider reducing this by adding higher level functions that compute things like the discriminants and rejections for the user.

For example we could define a Tagger(h5_path, name, label, n_jets, classes=['b', 'c', 'l']) which reads the relevant arrays from an h5 file to a dataframe. Functions to run selections on jets, compute discriminants, compute rejections, etc can then be provided.

When it comes to plotting we could just do

for tagger in taggers:
    plot_roc.add_tagger(tagger)

Which under the hood calls plot_roc.add_roc for each background the tagger has, using all the info from the tagger object.

When writing the plotting code in the GNN repo, the idea was to have all the plotting configurable from yaml files using a list of taggers and a list of plots. Then we can for example maintain a list of suggested taggers that people can add to their plots with a single line of config. Do you think the same approach would be useful here?

@joschkabirk
Copy link
Contributor

Hi @samvanstroud!

I agree, there is a quite some boilerplate in the examples. However, we intended to not use high-level functions in puma, and instead implemented such functionality in umami. The intention was to keep puma rather "low-level" but therefore highly configurable.

We have similar high-level plotting functions in umami, which use puma under the hood. There you can also define the plots in yaml files as for example shown here.

So I'd prefer to keep the overall structure like this and keep puma as general as possible.

@samvanstroud
Copy link
Contributor Author

Thanks for the reply, and thank you for pointing to these features in umami, I didn't realise a higher level interface was implemented there.

For me though it would be nice to pip install this package and be able to use a high level interface to produce plots directly. As the high level functions can be added here without reducing access to more configurable low level functions, I think it makes sense for all the plotting code to go into a single repository. The user can then decide to go with the high or low level interface. I don't see a reason to include the higher level functions in a different repository. I appreciate it would take some work to transfer this functionality though!

@manuguth
Copy link
Contributor

Hi @samvanstroud

I agree with that. we can have a small separate API within puma to perform these plots fast.
I would say this then directly links to your other issue #121 which we could combine with such a jet selection (to leave the low level code of puma as it is).

We could do something similar to this https://gitlab.cern.ch/mguth/ftag-plotting-utils (I just stitched them together)

@samvanstroud
Copy link
Contributor Author

@manuguth, absolutely, that interface looks very much along the lines I was thinking!

@afroch
Copy link
Contributor

afroch commented Sep 15, 2022

I'd support that we have the high-level stuff in a small separate API. Puma is not only written for ATLAS and therefore removing electron jets is rather ATLAS-specific. Therefore I would also leave puma as it is and add this in the small API

@samvanstroud
Copy link
Contributor Author

@afroch I agree it's nice to keep the code experiment agnostic, but since all variable names etc are also specific to ATLAS we can just implement this as generic jet selection function as in umami, and remove electrons and apply kinematic selections via some config.

@afroch
Copy link
Contributor

afroch commented Sep 15, 2022

But we don't have ATLAS specific variables here in puma. The only think which is ATLAS specific is the global config where we define the jet categories but this is mainly for the colours. You can easily change that. In the examples, I think we used a ATLAS specific variable but there is no variable hardcoded here.

@manuguth manuguth self-assigned this Sep 19, 2022
@samvanstroud
Copy link
Contributor Author

Closing as this is implemented

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants