Reducing boilerplate / adding high level functions #122

samvanstroud · 2022-09-15T09:35:01Z

The examples have a fair amount of boilerplate in them. We could consider reducing this by adding higher level functions that compute things like the discriminants and rejections for the user.

For example we could define a Tagger(h5_path, name, label, n_jets, classes=['b', 'c', 'l']) which reads the relevant arrays from an h5 file to a dataframe. Functions to run selections on jets, compute discriminants, compute rejections, etc can then be provided.

When it comes to plotting we could just do

for tagger in taggers:
    plot_roc.add_tagger(tagger)

Which under the hood calls plot_roc.add_roc for each background the tagger has, using all the info from the tagger object.

When writing the plotting code in the GNN repo, the idea was to have all the plotting configurable from yaml files using a list of taggers and a list of plots. Then we can for example maintain a list of suggested taggers that people can add to their plots with a single line of config. Do you think the same approach would be useful here?

The text was updated successfully, but these errors were encountered:

joschkabirk · 2022-09-15T10:42:39Z

Hi @samvanstroud!

I agree, there is a quite some boilerplate in the examples. However, we intended to not use high-level functions in puma, and instead implemented such functionality in umami. The intention was to keep puma rather "low-level" but therefore highly configurable.

We have similar high-level plotting functions in umami, which use puma under the hood. There you can also define the plots in yaml files as for example shown here.

So I'd prefer to keep the overall structure like this and keep puma as general as possible.

samvanstroud · 2022-09-15T10:59:34Z

Thanks for the reply, and thank you for pointing to these features in umami, I didn't realise a higher level interface was implemented there.

For me though it would be nice to pip install this package and be able to use a high level interface to produce plots directly. As the high level functions can be added here without reducing access to more configurable low level functions, I think it makes sense for all the plotting code to go into a single repository. The user can then decide to go with the high or low level interface. I don't see a reason to include the higher level functions in a different repository. I appreciate it would take some work to transfer this functionality though!

manuguth · 2022-09-15T11:11:44Z

Hi @samvanstroud

I agree with that. we can have a small separate API within puma to perform these plots fast.
I would say this then directly links to your other issue #121 which we could combine with such a jet selection (to leave the low level code of puma as it is).

We could do something similar to this https://gitlab.cern.ch/mguth/ftag-plotting-utils (I just stitched them together)

samvanstroud · 2022-09-15T11:15:01Z

@manuguth, absolutely, that interface looks very much along the lines I was thinking!

afroch · 2022-09-15T11:54:06Z

I'd support that we have the high-level stuff in a small separate API. Puma is not only written for ATLAS and therefore removing electron jets is rather ATLAS-specific. Therefore I would also leave puma as it is and add this in the small API

samvanstroud · 2022-09-15T12:00:34Z

@afroch I agree it's nice to keep the code experiment agnostic, but since all variable names etc are also specific to ATLAS we can just implement this as generic jet selection function as in umami, and remove electrons and apply kinematic selections via some config.

afroch · 2022-09-15T12:20:54Z

But we don't have ATLAS specific variables here in puma. The only think which is ATLAS specific is the global config where we define the jet categories but this is mainly for the colours. You can easily change that. In the examples, I think we used a ATLAS specific variable but there is no variable hardcoded here.

samvanstroud · 2023-03-14T16:00:29Z

Closing as this is implemented

samvanstroud mentioned this issue Sep 15, 2022

Treatment of electron jets #121

Open

manuguth mentioned this issue Sep 19, 2022

Adding high level plotting API #128

Merged

5 tasks

manuguth self-assigned this Sep 19, 2022

samvanstroud closed this as completed Mar 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reducing boilerplate / adding high level functions #122

Reducing boilerplate / adding high level functions #122

samvanstroud commented Sep 15, 2022

joschkabirk commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

manuguth commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

afroch commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

afroch commented Sep 15, 2022

samvanstroud commented Mar 14, 2023

Reducing boilerplate / adding high level functions #122

Reducing boilerplate / adding high level functions #122

Comments

samvanstroud commented Sep 15, 2022

joschkabirk commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

manuguth commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

afroch commented Sep 15, 2022

samvanstroud commented Sep 15, 2022

afroch commented Sep 15, 2022

samvanstroud commented Mar 14, 2023