Skip to content

Pygrb offline workflow #4288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 47 commits into from
Jul 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
06416d5
Fixed indexing and autochisq argument in pycbc_multi_inspiral
pannarale Feb 5, 2023
1e0713c
Only Hanford is labelled with a 1: avoiding pycbc_pygrb_plot_snr_time…
pannarale Feb 26, 2023
af50dee
Simplified and update pycbc_pygrb_plot_injs_results; adapted pycbc_py…
pannarale Feb 26, 2023
ef79530
Merge remote-tracking branch 'upstream/master' into pygrb_timeslides_…
pannarale Feb 26, 2023
d451505
Forgotten comma
pannarale Feb 26, 2023
611ad0c
Merge branch 'gwastro:master' into pygrb_timeslides_fixes
pannarale Mar 9, 2023
0318a45
Merge branch 'gwastro:master' into pygrb_timeslides_fixes
pannarale Mar 13, 2023
d7dc6bd
Massive commit that stitches together pycbc_make_offline_grb_workflow…
pannarale Mar 13, 2023
05fc4a6
Massive commit that stitches together pycbc_make_offline_grb_workflow…
pannarale Mar 13, 2023
ed44cf0
Code-climate fixes
pannarale Mar 13, 2023
cefe678
More code-climate
pannarale Mar 13, 2023
145c17f
Import error flagged by code-climate
pannarale Mar 13, 2023
08cbb93
Forgotten import fixes
pannarale Mar 13, 2023
1dcf643
Minor cleanup of comments or commented out code lines
pannarale Mar 13, 2023
4099487
Syntax fix
pannarale Mar 14, 2023
fa5b123
Defining 2 empty file lists
pannarale Mar 14, 2023
dc0411e
Merge branch 'gwastro:master' into pygrb_timeslides_fixes
pannarale Mar 15, 2023
32a1f56
First part of PR 4288 review comments
pannarale Mar 15, 2023
7fe1c2e
One more TODO
pannarale Mar 15, 2023
76bd4d2
opt_to_file is now configparser_value_to_file in pycbc/workflow/core.py
pannarale Mar 24, 2023
22533f7
Upgraded configparser_value_to_file with attrs keyword argument
pannarale Mar 24, 2023
9333bd7
Using configparser_value_to_file throughout the PyGRB workflow generator
pannarale Mar 24, 2023
be1aa83
Removing 2 unused PyGRB functions
pannarale Mar 24, 2023
3b48f53
Removed LigolwCBCAlignTotalSpinExecutable, LigolwCBCJitterSkylocExecu…
pannarale Mar 24, 2023
d3ebdec
Codeclimate
pannarale Mar 24, 2023
d3625b5
bank_veto_file variable needs to be a FileList, not a File
pannarale Mar 31, 2023
770e1dc
Merge branch 'gwastro:master' into pygrb_timeslides_fixes
pannarale Jun 25, 2023
dfac88e
Bug with tags for single IFO SNR plots workflow generator
pannarale Jun 26, 2023
983c0d8
Fixing pycbc_pygrb_plot_coh_ifosnr
pannarale Jun 26, 2023
f2d7a74
Fixing layout of chi-square plots and section label for individual de…
pannarale Jun 27, 2023
00bbd2b
Codeclimate
pannarale Jun 28, 2023
08499d0
Codeclimate
pannarale Jun 28, 2023
0d6b4ec
Codeclimate
pannarale Jun 28, 2023
9685852
Removed FIXME for testing purposes
pannarale Jun 28, 2023
cbb62f4
First changes following up on PR review
pannarale Jul 3, 2023
badf31c
Second round of changes following up on PR review
pannarale Jul 3, 2023
177dbec
Flagging with an underscore grb utility functions not used outside th…
pannarale Jul 4, 2023
673e26b
Flagging with an underscore grb utility functions not used outside th…
pannarale Jul 4, 2023
31b666e
Removed _get_antenna_factors from PyGRB utily file
pannarale Jul 4, 2023
e2b8fd8
Removed commented out function from pycbc_pygrb_plot_injs_results
pannarale Jul 4, 2023
7b4e422
Substituted old unused code with a TODO
pannarale Jul 4, 2023
707717b
Adding injection set name as an option in pycbc_pygrb_efficiency
pannarale Jul 4, 2023
0cd748b
extend --> append for single files
pannarale Jul 4, 2023
6c5481d
Documenting setup_pygrb_pp_workflow more appropriately
pannarale Jul 4, 2023
2b10fa8
Reverted name of _get_antenna_responses to get_antenna_responses; cle…
pannarale Jul 5, 2023
908eb83
Line shortened
pannarale Jul 5, 2023
8539b04
Merge branch 'gwastro:master' into pygrb_timeslides_fixes
pannarale Jul 5, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion bin/pycbc_multi_inspiral
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,8 @@ parser.add_argument("--cluster-method", choices=["template", "window"],
"chirp length.")
parser.add_argument("--cluster-window", type=float, default=0,
help="Length of clustering window in seconds.")
parser.add_argument("--bank-veto-bank-file", type=str)
parser.add_argument("--bank-veto-bank-file", type=str, help="Path to the "
"bank file used to compute the the bank chi-square veto.")
parser.add_argument("--chisq-bins", default=0)
# Commenting out options which are not yet implemented
# parser.add_argument("--chisq-threshold", type=float, default=0)
Expand Down
206 changes: 89 additions & 117 deletions bin/pygrb/pycbc_make_offline_grb_workflow

Large diffs are not rendered by default.

19 changes: 7 additions & 12 deletions bin/pygrb/pycbc_pygrb_efficiency
Original file line number Diff line number Diff line change
Expand Up @@ -98,24 +98,20 @@ parser.add_argument("-g", "--glitch-check-factor", action="store",
"exclusion efficiencies this value is multiplied " +
"to the offsource around the injection trigger to " +
"determine if it is just a loud glitch.")
parser.add_argument("--injection-set-name", action="store", type=str,
default="", help="Name of the injection set to be " +
"used in the plot title.")
parser.add_argument("-C", "--cluster-window", action="store", type=float,
default=0.1, help="The cluster window used " +
"to cluster triggers in time.")
ppu.pygrb_add_missed_injs_input_opt(parser)
ppu.pygrb_add_injmc_opts(parser)
ppu.pygrb_add_bestnr_opts(parser)
opts = parser.parse_args()

init_logging(opts.verbose, format="%(asctime)s: %(levelname)s: %(message)s")

# Check options
if (opts.found_file is None) and (opts.missed_file is None):
do_injections = False
elif (opts.found_file) and opts.missed_file:
do_injections = True
else:
err_msg = "Must provide both found and missed file if running injections."
parser.error(err_msg)
do_injections = opts.found_missed_file is not None

if not opts.newsnr_threshold:
opts.newsnr_threshold = opts.snr_threshold
Expand All @@ -124,9 +120,8 @@ if not opts.newsnr_threshold:
outdir = os.path.split(os.path.abspath(opts.background_output_file))[0]
trig_file = opts.offsource_file
onsource_file = opts.onsource_file
found_file = opts.found_file
missed_file = opts.missed_file
inj_set_name = os.path.split(os.path.abspath(missed_file))[1].split('INSPINJ')[1].split('_')[1]
found_missed_file = opts.found_missed_file
inj_set_name = opts.injection_set_name
chisq_index = opts.chisq_index
chisq_nhigh = opts.chisq_nhigh
wf_err = opts.waveform_error
Expand Down Expand Up @@ -376,7 +371,7 @@ if do_injections:
logging.info("%d found injections analysed.", len(found_injs))

# Missed injections (ones not recovered at all)
missed_injs = ppu.load_injections(missed_file, vetoes, sim_table=True)
missed_injs = ppu.load_injections(found_missed_file, vetoes, sim_table=True)

# Process missed injections 'missed_inj'
missed_inj = {}
Expand Down
9 changes: 5 additions & 4 deletions bin/pygrb/pycbc_pygrb_minifollowups
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,9 @@ import pycbc.workflow.minifollowups as mini
import pycbc.version
import pycbc.events
from pycbc.results import layout
from pycbc.results import pygrb_postprocessing_utils as ppu
from pycbc.results.pygrb_postprocessing_utils import extract_ifos
from pycbc.workflow.plotting import PlotExecutable
from pycbc.workflow.grb_utils import build_veto_filelist, build_segment_filelist

__author__ = "Francesco Pannarale <[email protected]>"
__version__ = pycbc.version.git_verbose_msg
Expand Down Expand Up @@ -72,10 +73,10 @@ def make_timeseries_plot(workflow, trig_file, snr_type, central_time,
tags=tags+extra_tags).create_node()
node.add_input_opt('--trig-file', trig_file)
# Pass the veto files
veto_files = ppu.build_veto_filelist(workflow)
veto_files = build_veto_filelist(workflow)
node.add_input_list_opt('--veto-files', veto_files)
# Pass the segment files
seg_files = ppu.build_segment_filelist(workflow)
seg_files = build_segment_filelist(workflow)
node.add_input_list_opt('--seg-files', seg_files)
# Other shared tuning values
for opt in ['chisq-index', 'chisq-nhigh', 'null-snr-threshold',
Expand Down Expand Up @@ -155,7 +156,7 @@ num_events = min(num_events, len(fp['BestNR'][:]))

# Determine ifos used in the analysis
trig_file = resolve_url_to_file(os.path.abspath(args.trig_file))
ifos = ppu.extract_ifos(os.path.abspath(args.trig_file))
ifos = extract_ifos(os.path.abspath(args.trig_file))
num_ifos = len(ifos)

# (Loudest) off/on-source events are on time-slid data so the
Expand Down
52 changes: 41 additions & 11 deletions bin/pygrb/pycbc_pygrb_page_tables
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,38 @@ __program__ = "pycbc_pygrb_page_tables"
# =============================================================================
# Functions
# =============================================================================
# Function to load trigger data
def load_data(input_file, vetoes, ifos, opts, injections=False):
"""Load data from a trigger/injection file"""

# Initialize the dictionary
data = {}
data['trig_time'] = None
data['coherent'] = None
data['reweighted_snr'] = None

if input_file:
if injections:
logging.info("Loading injections...")
# This will eventually become ppu.load_injections()
trigs_or_injs = ppu.load_triggers(input_file, vetoes)
else:
logging.info("Loading triggers...")
trigs_or_injs = ppu.load_triggers(input_file, vetoes)

data['trig_time'] = trigs_or_injs['network/end_time_gc'][:]
num_trigs_or_injs = len(data['trig_time'])
data['coherent'] = trigs_or_injs['network/coherent_snr'][:]
data['reweighted_snr'] = trigs_or_injs['network/reweighted_snr'][:]

if injections:
logging.info("%d injections found.", num_trigs_or_injs)
else:
logging.info("%d triggers found.", num_trigs_or_injs)

return data


def get_column(table, column_name):
"""Wrapper for glue.ligolw.lsctables.MultiInspiralTable.get_column
method. Easier for h5py switch. Note: Must still replace
Expand Down Expand Up @@ -190,7 +222,6 @@ parser.add_argument("-F", "--offsource-file", action="store", required=True,
# As opposed to offsource-file and trig-file, this only contains onsource
parser.add_argument("--onsource-file", action="store", default=None,
help="Location of on-source trigger file.")
ppu.pygrb_add_missed_injs_input_opt(parser)
ppu.pygrb_add_injmc_opts(parser)
ppu.pygrb_add_bestnr_opts(parser)
parser.add_argument("--num-loudest-off-trigs", action="store",
Expand All @@ -200,6 +231,7 @@ parser.add_argument("--quiet-found-injs-output-file", default=None, #required=Tr
help="Quiet-found injections html output file.")
parser.add_argument("--missed-found-injs-output-file", default=None, #required=True,
help="Missed-found injections html output file.")
# TODO: group hdf5 files into a single one
parser.add_argument("--quiet-found-injs-h5-output-file", default=None, #required=True,
help="Quiet-found injections h5 output file.")
parser.add_argument("--loudest-offsource-trigs-output-file", default=None, #required=True,
Expand Down Expand Up @@ -241,9 +273,9 @@ if output_files.count(None) == len(output_files):
if opts.quiet_found_injs_output_file or opts.missed_found_injs_output_file or\
opts.quiet_found_injs_h5_output_file:
do_injections = True
if (opts.found_file is None) and (opts.missed_file is None):
err_msg = "Must provide both found and missed injections file "
err_msg += "locations if processing injections."
if opts.found_missed_file is None:
err_msg = "Must provide found-missed injections file "
err_msg += "location if processing injections."
parser.error(err_msg)
else:
do_injections = False
Expand All @@ -260,8 +292,7 @@ if not opts.newsnr_threshold:
# Store options used multiple times in local variables
trig_file = opts.offsource_file
onsource_file = opts.onsource_file
found_file = opts.found_file
missed_file = opts.missed_file
found_missed_file = opts.found_missed_file
chisq_index = opts.chisq_index
chisq_nhigh = opts.chisq_nhigh
wf_err = opts.waveform_error
Expand Down Expand Up @@ -316,8 +347,7 @@ ifos, vetoes = ppu.extract_ifos_and_vetoes(trig_file, opts.veto_files,

# Load triggers, time-slides, and segment dictionary
logging.info("Loading triggers.")
trigs = ppu.load_xml_table(trig_file, lsctables.MultiInspiralTable.tableName)
logging.info("%d triggers loaded.", len(trigs))
trig_data = load_data(trig_file, vetoes, ifos, opts)
logging.info("Loading timeslides.")
slide_dict = ppu.load_time_slides(trig_file)
logging.info("Loading segments.")
Expand Down Expand Up @@ -587,8 +617,8 @@ else:
# =======================
if do_injections:

found_trig_table = ppu.load_injections(found_file, vetoes)
found_inj_table = ppu.load_injections(found_file, vetoes, sim_table=True)
found_trig_table = ppu.load_injections(found_missed_file, vetoes)
found_inj_table = ppu.load_injections(found_missed_file, vetoes, sim_table=True)

logging.info("Missed/found injections/triggers loaded.")

Expand Down Expand Up @@ -618,7 +648,7 @@ if do_injections:
logging.info("%d found injections analysed.", len(found_injs['mchirp']))

# Missed injections (ones not recovered at all)
missed_inj_table = ppu.load_injections(missed_file, vetoes, sim_table=True)
missed_inj_table = ppu.load_injections(found_missed_file, vetoes, sim_table=True)
missed_injs = lsctable_to_dict(missed_inj_table, ifos, opts)

# Avoids a problem with formatting in the non-static html output file
Expand Down
70 changes: 48 additions & 22 deletions bin/pygrb/pycbc_pygrb_plot_chisq_veto
Original file line number Diff line number Diff line change
Expand Up @@ -76,8 +76,12 @@ def load_data(input_file, vetoes, ifos, opts, injections=False):
if snr_type in ['coherent', 'null', 'reweighted']:
data[snr_type] = trigs_or_injs['network/%s_snr' % snr_type][:]
elif snr_type == 'single':
att = opts.ifo[0].lower()
data[snr_type] = trigs_or_injs['%s/snr_%s' % (opts.ifo, att)][:]
key = opts.ifo + '/snr_' + opts.ifo.lower()
# TODO: Undoes current L710 in EventManagerCoherent.write_to_hdf
# and is ugly hardcoded!
if opts.ifo.lower() != 'h1':
key = key[:-1]
data[snr_type] = trigs_or_injs[key][:]

# Calculate coincident SNR
elif snr_type == 'coincident':
Expand All @@ -88,20 +92,38 @@ def load_data(input_file, vetoes, ifos, opts, injections=False):
'bank': 'bank_chisq',
'auto': 'cont_chisq'}

data[veto_type] = \
trigs_or_injs['%s/%s' % (opts.ifo, veto_tags[veto_type])][:]
# FIXME: network data contains my_network_chisq
chisq_key = 'network/my_network_chisq' if injections \
else opts.ifo + '/' + veto_tags[veto_type]
# Will not work until single IFO data is in found_missed_file
data[veto_type] = trigs_or_injs[chisq_key][:]

# Floor single IFO chi-square at 0.005
numpy.putmask(data[veto_type], data[veto_type] == 0, 0.005)

data['dof'] = numpy.unique(
trigs_or_injs['%s/%s_dof' % (opts.ifo, veto_tags[veto_type])][:])
if not injections:
dof_key = '%s/%s_dof' % (opts.ifo, veto_tags[veto_type])
data['dof'] = numpy.unique(
trigs_or_injs[dof_key][:])

logging.info("%d triggers found.", num_trigs_or_injs)
label = "injections" if injections else "triggers"

logging.info("{0} {1} found.".format(num_trigs_or_injs, label))

return data


# Function to calculate chi-square weight for the reweighted SNR
def new_snr_chisq(snr, new_snr, chisq_dof, chisq_index=4.0, chisq_nhigh=3.0):
"""Returns the chi-square value needed to weight SNR into new SNR"""

chisqnorm = (snr/new_snr)**chisq_index
if chisqnorm <= 1:
return 1E-20

return chisq_dof * (2*chisqnorm - 1)**(chisq_nhigh/chisq_index)
Comment on lines +116 to +124
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function duplicating code in https://github.com/gwastro/pycbc/blob/master/pycbc/events/ranking.py? If not, does it make sense to move it there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GRB "new_SNR" is the same as all_sky, except the DOF is sometimes different (for the same number of bins), and the "index" and "nhigh" parameters were different.



# Function that produces the contrours to be plotted
def calculate_contours(trig_data, opts, veto_type, new_snrs=None):
"""Generate the contours for the veto plots"""
Expand Down Expand Up @@ -130,9 +152,9 @@ def calculate_contours(trig_data, opts, veto_type, new_snrs=None):
# Loop over SNR values and calculate chisq variable needed
for j, snr in enumerate(snr_vals):
for i, new_snr in enumerate(new_snrs):
contours[i][j] = plu.new_snr_chisq(snr, new_snr, dof,
opts.chisq_index,
opts.chisq_nhigh)
contours[i][j] = new_snr_chisq(snr, new_snr, dof,
opts.chisq_index,
opts.chisq_nhigh)

return contours, snr_vals, cont_value

Expand Down Expand Up @@ -161,14 +183,17 @@ parser = ppu.pygrb_initialize_plot_parser(description=__doc__,
parser.add_argument("-t", "--trig-file", action="store",
default=None, required=True,
help="The location of the trigger file")
parser.add_argument("--found-missed-file",
help="The hdf injection results file", required=False)
parser.add_argument("-z", "--zoom-in", default=False, action="store_true",
help="Output file a zoomed in version of the plot.")
parser.add_argument("-y", "--y-variable", required=True,
choices=['standard', 'bank', 'auto'],
help="Quantity to plot on the vertical axis.")
parser.add_argument("--snr-type", default=None,
choices=['coherent', 'single', 'coincident', 'null',
'reweighted'], help="SNR value to plot on x-axis. Can be " +
# TODO: get 'single' to work
parser.add_argument("--snr-type", default='coherent',
choices=['coherent', 'coincident', 'null', 'reweighted',
'single'], help="SNR value to plot on x-axis. Can be " +
"coherent, coincident, null, reweighted, or single IFO SNR")
ppu.pygrb_add_bestnr_opts(parser)
opts = parser.parse_args()
Expand All @@ -177,7 +202,7 @@ init_logging(opts.verbose, format="%(asctime)s: %(levelname)s: %(message)s")

# Check options
trig_file = os.path.abspath(opts.trig_file)
found_file = os.path.abspath(opts.found_file) if opts.found_file else None
found_missed_file = os.path.abspath(opts.found_missed_file) if opts.found_missed_file else None
zoom_in = opts.zoom_in
veto_type = opts.y_variable
ifo = opts.ifo
Expand All @@ -186,13 +211,14 @@ snr_type = opts.snr_type
# otherwise the single IFO SNR is used
if snr_type == 'single':
if ifo is None:
err_msg = "--ifo must be given when using --use-sngl-ifo-snr"
err_msg = "--ifo must be given to plot single IFO SNR veto"
parser.error(err_msg)

# Veto is intended as a single IFO quantity. Network chisq will be obsolete.
if ifo is None:
err_msg = "--ifo must be given to plot single IFO SNR veto"
parser.error(err_msg)
# TODO: fix vetoes
#if ifo is None:
# err_msg = "--ifo must be given to plot single IFO SNR veto"
# parser.error(err_msg)

# Prepare plot title and caption
veto_labels = {'standard': "Chi Square Veto",
Expand All @@ -211,7 +237,7 @@ if opts.plot_caption is None:
"Black line: veto line. " +
"Gray shaded region: Vetoed area. " +
"Yellow lines: contours of new SNR.")
if found_file:
if found_missed_file:
opts.plot_caption = ("Red crosses: injections triggers. ") +\
opts.plot_caption

Expand All @@ -235,7 +261,7 @@ if ifo and ifo not in ifos:
trig_data = load_data(trig_file, vetoes, ifos, opts)

# Extract (or initialize) injection data
inj_data = load_data(found_file, vetoes, ifos, opts, injections=True)
inj_data = load_data(found_missed_file, vetoes, ifos, opts, injections=True)

# Sanity checks
if trig_data[snr_type] is None and inj_data[snr_type] is None:
Expand All @@ -259,14 +285,14 @@ if snr_type == 'coincident':

# Determine the minumum and maximum SNR value we are dealing with
x_min = opts.sngl_snr_threshold
x_max = 1.1*plu.axis_max_value(trig_data[snr_type], inj_data[snr_type], found_file)
x_max = 1.1*plu.axis_max_value(trig_data[snr_type], inj_data[snr_type], found_missed_file)

# Determine y-axis minimum value and label
y_label = "%s Single %s" % (ifo, veto_labels[veto_type])
y_min = 1

# Determine the maximum bank veto value we are dealing with
y_max = plu.axis_max_value(trig_data[veto_type], inj_data[veto_type], found_file)
y_max = plu.axis_max_value(trig_data[veto_type], inj_data[veto_type], found_missed_file)

# Determine contours for plots
conts = None
Expand Down
Loading