Adding high level plotting API #128

Merged
merged 41 commits into from
Dec 9, 2022
Commits
b8fd3e0
adding high level plotting api
manuguth Sep 15, 2022
e576e5d
Merge branch 'main' of github.com:umami-hep/puma into manuguth-hlapi
manuguth Sep 19, 2022
0046d3f
adding intermediate implementation
manuguth Oct 20, 2022
d26c669
adding user ignore to git ignore
manuguth Oct 24, 2022
7a2489d
removing unit test diffs
manuguth Oct 24, 2022
9f72c83
revert custom changes
manuguth Oct 24, 2022
dc2b44a
Merge branch 'main' of github.com:umami-hep/puma into manuguth-hlapi
manuguth Oct 25, 2022
6d23ac1
small fixes
manuguth Nov 14, 2022
246be1f
Merge branch 'main' of github.com:umami-hep/puma into manuguth-hlapi
manuguth Nov 14, 2022
537bd5c
adding simple discriminant functions
manuguth Nov 15, 2022
623a335
fixing circular import
manuguth Nov 15, 2022
33c2e5c
updating to a new version of tagger class
manuguth Nov 15, 2022
ef0900c
updating tagger class and its tests
manuguth Nov 17, 2022
ee350d2
adding a more general discriminant calculation function
manuguth Nov 18, 2022
e2d0ddf
updating tagger plotting
manuguth Nov 21, 2022
1e4cead
update requirements in package install
manuguth Nov 22, 2022
9201abf
fixing folder for plotting
manuguth Nov 22, 2022
b9faf97
Merge branch 'main' of github.com:umami-hep/puma into manuguth-hlapi
manuguth Nov 22, 2022
444168c
fixing darglint
manuguth Nov 22, 2022
5c6471e
darglint fix
manuguth Nov 22, 2022
2cd9d71
Update puma/hlplots/results.py
manuguth Nov 22, 2022
d631eed
fix linting
manuguth Nov 22, 2022
708304f
Update puma/hlplots/results.py
manuguth Nov 22, 2022
f826a63
improve formatting
manuguth Nov 22, 2022
b954b4d
Update puma/hlplots/tagger.py
manuguth Nov 22, 2022
a11c16e
adding catch for inf values in discriminant calculation
manuguth Nov 22, 2022
af44eb6
adding small epsilon to discriminant calculation
manuguth Nov 25, 2022
fd7fce6
Merge branch 'manuguth-hlapi' of github.com:umami-hep/puma into manug…
manuguth Nov 25, 2022
ed04866
removing inheritance from Tagger and Results classes
manuguth Nov 25, 2022
2d9a9ad
fixing typos
manuguth Nov 25, 2022
3a79c1f
fixing doc string
manuguth Nov 25, 2022
0bae49f
adding improvements to hlevel api
manuguth Dec 9, 2022
e16b867
small gitignore improvement
manuguth Dec 9, 2022
e564131
fixing hard coded values
manuguth Dec 9, 2022
e93635c
fixing pylint issues
manuguth Dec 9, 2022
4199ec5
fix darglint issue
manuguth Dec 9, 2022
d0a2528
adding changelog
manuguth Dec 9, 2022
f155efe
Merge branch 'main' of github.com:umami-hep/puma into manuguth-hlapi
manuguth Dec 9, 2022
e2d516e
added warning in code when using 2 data frames
manuguth Dec 9, 2022
e075361
adding high level API to docs
manuguth Dec 9, 2022
32421c2
Update examples/high_level_plots.py
manuguth Dec 9, 2022
5 changes: 3 additions & 2 deletions .gitignore
@@ -139,6 +139,7 @@ dmypy.json
# VSCode config files
.vscode


# package specific excludes
examples/*.png
examples/*.png
# user specific files
*user*
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -14,7 +14,7 @@ repos:
language: system
entry: black
types: [python]
- args: ["--experimental-string-processing"]
+ args: ["--preview"]
- id: flake8
name: flake8
stages: [commit]
2 changes: 2 additions & 0 deletions changelog.md
@@ -2,6 +2,8 @@

### [Latest]

- Adding new high level API [#128](https://github.com/umami-hep/puma/pull/128)

### [v0.1.9] (2022/11/30)

- Adding boosted categories for Xbb to utils [!138](https://github.com/umami-hep/puma/pull/138)
72 changes: 72 additions & 0 deletions docs/source/examples/high_level_api.md
@@ -0,0 +1,72 @@
# High level API

To set up the inputs for the plots, have a look [here](./index.md).

The following examples use the dummy data described [here](./dummy_data.md).

The previous examples show how to produce individual plots, which often requires
a fair amount of code, for instance to produce ROC curves.

The high level API bundles several of these steps and is designed to quickly produce
_b_- and _c_-jet performance plots.


## Initialising the taggers

```py
§§§examples/high_level_plots.py:1:55§§§
```
WARNING: when using 2 different data frames, a single `tagger_args` is not sufficient. You need
one per data frame, each defining the flavour classes and the performance variable of that data frame.
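
As a rough illustration of this warning, one set of arguments can be built per data set. The helper `make_tagger_args` below is hypothetical and not part of puma, and it works on plain lists instead of data frames. Only the keys `perf_var`, `is_light`, `is_c`, `is_b` and the label values 0, 4 and 5 are taken from the example script.

```python
def make_tagger_args(labels, pt_mev):
    """Build one argument dict for one data set (hypothetical helper, not puma API)."""
    return {
        "perf_var": [pt / 1e3 for pt in pt_mev],  # MeV -> GeV, as in the example
        "is_light": [label == 0 for label in labels],
        "is_c": [label == 4 for label in labels],
        "is_b": [label == 5 for label in labels],
    }


# Two data sets with different jets, and therefore different lengths
labels_a, pt_a = [0, 4, 5, 5], [25e3, 40e3, 60e3, 120e3]
labels_b, pt_b = [5, 0], [30e3, 200e3]

args_a = make_tagger_args(labels_a, pt_a)
args_b = make_tagger_args(labels_b, pt_b)
```

Each dict would then presumably be passed as the `template` of the taggers evaluated on the matching data set, so that the masks always have the same length as the scores they are applied to.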


## Discriminant plots
To plot the discriminant, you only need to call one function and everything else is handled automatically,
here for the _b_-jet discriminant:
```py
§§§examples/high_level_plots.py:56:58§§§
```

<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_disc_b.png width=500>

and similarly for the _c_-jet discriminant:
```py
§§§examples/high_level_plots.py:59§§§
```

<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_disc_c.png width=500>
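
The discriminant itself is not spelled out on this page. The sketch below assumes the log-likelihood-ratio definition commonly used in flavour tagging, which the `f_c` and `f_b` attributes in the example suggest but do not confirm. The function names and the `epsilon` default are illustrative and are not puma's API. The small `epsilon` keeps the logarithm finite when a probability is exactly zero.

```python
import math


def b_discriminant(p_light, p_c, p_b, f_c, epsilon=1e-10):
    """Assumed b-tagging discriminant: log(p_b / (f_c * p_c + (1 - f_c) * p_light))."""
    return math.log((p_b + epsilon) / ((1.0 - f_c) * p_light + f_c * p_c + epsilon))


def c_discriminant(p_light, p_c, p_b, f_b, epsilon=1e-10):
    """Assumed c-tagging discriminant: log(p_c / (f_b * p_b + (1 - f_b) * p_light))."""
    return math.log((p_c + epsilon) / ((1.0 - f_b) * p_light + f_b * p_b + epsilon))


# A b-like jet gets a positive score, a light-like jet a negative one
d_b_like = b_discriminant(p_light=0.05, p_c=0.05, p_b=0.90, f_c=0.005)
d_light_like = b_discriminant(p_light=0.90, p_c=0.05, p_b=0.05, f_c=0.005)
```

With this definition, a larger `f_c` makes the _b_-jet discriminant penalise _c_-like jets more strongly, at the expense of light-jet rejection.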


## ROC plots

In the same manner you can plot ROC curves, here for the _b_-tagging performance:
```py
§§§examples/high_level_plots.py:62:64§§§
```
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_roc_b.png width=500>

and similarly for the _c_-tagging performance:
```py
§§§examples/high_level_plots.py:65§§§
```

<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_roc_c.png width=500>
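
To make the quantity on such a ROC curve concrete, here is a minimal sketch of one point of the curve, assuming that rejection is defined as the inverse of the background efficiency at the cut that yields the requested signal efficiency. The function names are hypothetical and the scores are toy integers rather than real discriminant values.

```python
def cut_for_efficiency(signal_scores, efficiency):
    """Return the cut that keeps roughly `efficiency` of the signal jets."""
    ordered = sorted(signal_scores, reverse=True)
    n_pass = max(1, round(efficiency * len(ordered)))
    return ordered[n_pass - 1]


def rejection(background_scores, cut):
    """Inverse background efficiency: number of jets divided by number passing the cut."""
    n_pass = sum(score >= cut for score in background_scores)
    return float("inf") if n_pass == 0 else len(background_scores) / n_pass


signal = list(range(1, 11))  # toy discriminant values for signal jets
background = [0, 0, 1, 1, 2, 2, 3, 3, 4, 6]  # toy values for background jets

cut = cut_for_efficiency(signal, 0.7)  # keeps 7 of the 10 signal jets
light_rejection = rejection(background, cut)  # 10 jets, 2 pass the cut
```

Repeating this for every value in `sig_eff` gives the points of one rejection curve.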


## Performance vs a variable
In this case we plot the performance as a function of the jet pT with the same syntax as above, using a single 70% working point for all bins:
```py
§§§examples/high_level_plots.py:69:82§§§
```
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_pt_b_eff.png width=500>
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_pt_c_rej.png width=500>
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_pt_light_rej.png width=500>

and similarly with the 70% working point applied in each bin separately (`fixed_eff_bin=True`):
```py
§§§examples/high_level_plots.py:84:94§§§
```

<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_fixed_per_bin_pt_b_eff.png width=500>
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_fixed_per_bin_pt_c_rej.png width=500>
<img src=https://github.com/umami-hep/puma/raw/examples-material/hlplots_dummy_tagger_fixed_per_bin_pt_light_rej.png width=500>
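
The difference between the two sets of plots can be sketched as follows, assuming that `fixed_eff_bin=False` applies one global cut to all bins while `fixed_eff_bin=True` recomputes the cut in every bin. This is inferred from the "70% WP per bin" label in the example and is not puma's implementation. All function names below are hypothetical.

```python
from bisect import bisect_right


def cut_for_efficiency(scores, efficiency):
    """Return the cut that keeps roughly `efficiency` of the given jets."""
    ordered = sorted(scores, reverse=True)
    n_pass = max(1, round(efficiency * len(ordered)))
    return ordered[n_pass - 1]


def efficiency_per_bin(pt, scores, edges, working_point, fixed_eff_bin):
    """Signal efficiency in each pT bin for a global or a per-bin cut."""
    n_bins = len(edges) - 1
    binned = [[] for _ in range(n_bins)]
    for value, score in zip(pt, scores):
        idx = bisect_right(edges, value) - 1
        if 0 <= idx < n_bins:  # ignore jets outside the binning
            binned[idx].append(score)

    global_cut = cut_for_efficiency(scores, working_point)
    efficiencies = []
    for bin_scores in binned:
        if not bin_scores:  # empty bin: no efficiency defined
            efficiencies.append(None)
            continue
        cut = cut_for_efficiency(bin_scores, working_point) if fixed_eff_bin else global_cut
        efficiencies.append(sum(s >= cut for s in bin_scores) / len(bin_scores))
    return efficiencies


# Toy signal jets: low-pT jets have lower scores than high-pT jets
pt = [30] * 5 + [100] * 5
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
edges = [20, 60, 250]

eff_global = efficiency_per_bin(pt, scores, edges, 0.6, fixed_eff_bin=False)
eff_per_bin = efficiency_per_bin(pt, scores, edges, 0.6, fixed_eff_bin=True)
```

With a global cut the efficiency varies from bin to bin, whereas with a per-bin cut it is flat by construction and the interesting information moves to the rejection plots.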
5 changes: 3 additions & 2 deletions docs/source/examples/index.rst
@@ -6,15 +6,16 @@ can be used as a starting point for your plotting scripts.

.. toctree::
:maxdepth: 2

dummy_data.md
histograms.md
rocs.md
var_vs_eff.md
fraction_scan.md
pie_charts.md
line_plots.md

high_level_api.md


.. include:: index.md
:parser: myst_parser.sphinx_
94 changes: 94 additions & 0 deletions examples/high_level_plots.py
@@ -0,0 +1,94 @@
"""Produce roc curves from tagger output and labels."""
# from pathlib import Path

# import h5py
import numpy as np

from puma.hlplots import Results, Tagger
from puma.utils import get_dummy_2_taggers, logger

# The line below generates dummy data which is similar to a NN output
df = get_dummy_2_taggers(add_pt=True)
class_ids = [0, 4, 5]
# Remove all jets which are not trained on
df.query(f"HadronConeExclTruthLabelID in {class_ids}", inplace=True)
df.query("pt < 250e3", inplace=True)

logger.info("Start plotting")

# WARNING: if you use 2 different data frames, you need to specify `is_light`,
# `is_c` and `is_b` for each data frame separately, so you cannot reuse the same
# args for every tagger. The same applies to `perf_var`.
tagger_args = {
    "perf_var": df["pt"] / 1e3,
    "is_light": df["HadronConeExclTruthLabelID"] == 0,
    "is_c": df["HadronConeExclTruthLabelID"] == 4,
    "is_b": df["HadronConeExclTruthLabelID"] == 5,
}


dips = Tagger("dips", template=tagger_args)
dips.label = "dummy DIPS ($f_{c}=0.005$)"
dips.f_c = 0.005
dips.f_b = 0.04
dips.colour = "#AA3377"
dips.extract_tagger_scores(df)

rnnip = Tagger("rnnip", template=tagger_args)
rnnip.label = "dummy RNNIP ($f_{c}=0.07$)"
rnnip.f_c = 0.07
rnnip.f_b = 0.04
rnnip.colour = "#4477AA"
rnnip.reference = True
rnnip.extract_tagger_scores(df)


results = Results()
results.add(dips)
results.add(rnnip)


results.sig_eff = np.linspace(0.6, 0.95, 20)
results.atlas_second_tag = (
    "$\\sqrt{s}=13$ TeV, dummy jets \n$t\\bar{t}$, $20<p_{T}<250$ GeV"
)

# tagger discriminant plots
logger.info("Plotting tagger discriminant plots.")
results.plot_discs("hlplots_disc_b.png")
results.plot_discs("hlplots_disc_c.png", signal_class="cjets")


logger.info("Plotting ROC curves.")
# ROC curves as a function of the b-jet efficiency
results.plot_rocs("hlplots_roc_b.png")
# ROC curves as a function of the c-jet efficiency
results.plot_rocs("hlplots_roc_c.png", signal_class="cjets")


logger.info("Plotting efficiency/rejection vs pT curves.")
# eff/rej vs. variable plots
results.atlas_second_tag = "$\\sqrt{s}=13$ TeV, dummy jets \n$t\\bar{t}$\n70% WP"
# you can either specify a WP per tagger
# dips.working_point = 0.7
# rnnip.working_point = 0.7
# or alternatively also pass the argument `working_point` to the plot_var_perf function.
# to specify the `disc_cut` per tagger is also possible.
results.plot_var_perf(
    plot_name="hlplots_dummy_tagger",
    working_point=0.7,
    bins=[20, 30, 40, 60, 85, 110, 140, 175, 250],
    fixed_eff_bin=False,
)

results.atlas_second_tag = (
    "$\\sqrt{s}=13$ TeV, dummy jets \n$t\\bar{t}$\n70% WP per bin"
)
results.plot_var_perf(
    plot_name="hlplots_dummy_tagger_fixed_per_bin",
    bins=[20, 30, 40, 60, 85, 110, 140, 175, 250],
    fixed_eff_bin=True,
    working_point=0.7,
    h_line=0.7,
    disc_cut=None,
)
6 changes: 6 additions & 0 deletions puma/hlplots/__init__.py
@@ -0,0 +1,6 @@
"""High level plotting API within puma, to avoid code duplication."""
# flake8: noqa
# pylint: skip-file

from puma.hlplots.results import Results
from puma.hlplots.tagger import Tagger