-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
harness: Detector only #833
base: main
Are you sure you want to change the base?
harness: Detector only #833
Conversation
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅ |
I have read the DCO Document and I hereby sign the DCO |
recheck |
Thanks, will take a look! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some preliminary thoughts based on the assumption a configurable harness is viable here.
Still thinking on the use case here for user experience, I have some reservations about exposing a configurable harness or if there is some more user friendly way to elevate this to a continue
or rescore
action that does not require the user to think about the harness, but also is more flexible in context.
The current flag --detector_only
is a bit specific to be a top level config option.
garak/cli.py
Outdated
if not _config.plugins.detector_spec: | ||
logging.error("Detector(s) not specified. Use --detectors") | ||
raise ValueError("use --detectors to specify some detectors") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default the detectors to use should probably be extracted from the start_run setup
entry in the provided report file with the command line option being an override
to allow reprocessing results against a different detector.
garak/cli.py
Outdated
parser.add_argument( | ||
"--detector_only", | ||
action="store_true", | ||
help="run detector on jsonl report" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this might shift to a harness
type options to mimic generator_options
and probe_options
?
--harness_options
for inline json
--harness_options_file
that could take a json config file
Some validation may be need on the object received to ensure options provided are for a valid harness type and meet the requirements for launching the harness.
This would then remove the need to also add --probed_report_path
as that is currently only used when this option is set and json or file config aligns with other plugins.
{
"DetectorOnly":
{
"report_path": "file.report.jsonl"
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly sure if continue
or rescore
has been implemented yet (or maybe in some other branch?). But I agree with creating harness_options
instead of exposing a lot of unnecessary higher-level options. I have incorporated the idea of harness_options
in the new changes.
We don't have continue / rescore anywhere yet. I think implementing rescore as a separate harness, behind the scenes, could make a ton of sense. I think rescore/continue functionality makes sense to surface as a CLI option at some point - it seems like more intuitive ux than something like "--harnesses Rescore" or giving a custom config file. |
@classmethod | ||
def from_dict(cls, dicti): | ||
"""Initializes an attempt object from dictionary""" | ||
attempt_obj = cls() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this skip the attempt constructor? Can we add an explicit type signature to signal what cls
is expected to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls
is the callable for the class
which will be an Attempt
. This will call the __init__()
method with all defaults.
Due to the current overrides in the class attempt_obj.outputs
below may not produce the same in memory object for a multi-turn conversation attempt since the existing as_dict()
method serialized outputs
into the log and not the full messages history.
For the purposes of this PR I suspect this is acceptable, however it is worth noting.
@@ -105,6 +105,24 @@ def as_dict(self) -> dict: | |||
"messages": self.messages, | |||
} | |||
|
|||
@classmethod | |||
def from_dict(cls, dicti): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- out of scope for here, but we should implement serialization/deserialization for Attempt
s
garak/cli.py
Outdated
if parsed_specs["detector"] == []: | ||
_config.plugins.harnesses["Probewise"] = {} | ||
else: | ||
_config.plugins.harnesses["Pxd"] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just run me through the reasoning here? would this clobber harness config loaded from global / site / cli-specific config YAML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this could be avoided. We should not clobber config
without an explicit override from a command line flag.
If no specific harness was provided via config
and no detectors were provided per parsed_spec
that is the determining factor on which default harness to load when _config.plugins.harnesses
does not contain any configuration data for a specific harness. This does expose that there may be a missing top level parameter to select a specific harness if default config were to provide for various harness types. Currently, finding config for DetectorOnly
to be a selection criteria seems a bit brittle.
If there is a desire to consolidate harness selection I maybe something like:
harness_command = command.pxd_run
if not _config.plugins.harnesses:
if parsed_specs["detector"] == []:
harness_command = command.probewise_run
elif "detectoronly" in _config.plugins.harnesses:
harness_command = command.detector_only_run
match harness_command:
case command.detector_only_run:
harness_command()
case command.probewise_run:
harness_command(
generator, parsed_specs["probe"], evaluator, parsed_specs["buff"]
)
case command.pxd_run:
harness_command(
generator,
parsed_specs["probe"],
parsed_specs["detector"],
evaluator,
parsed_specs["buff"],
parsed_specs["buff"],
)
case _: # base case for invalid callable, currently not a reachable case
logging.warn("no valid harness selected")
command.end_run()
There is probably more abstraction possible here but this offers an idea.
garak/command.py
Outdated
@@ -255,3 +255,48 @@ def write_report_digest(report_filename, digest_filename): | |||
digest = report_digest.compile_digest(report_filename) | |||
with open(digest_filename, "w", encoding="utf-8") as f: | |||
f.write(digest) | |||
|
|||
def detector_only_run(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmartin-tech What do you think this is telling us about where responsibility for orchestrating runs lies? Is the existing Harness
interface just too inflexible to make invoking novel things like DetectorOnly
from garak.cli
?
print(msg) | ||
raise ValueError(msg) | ||
|
||
super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this work?
super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe. | |
self.run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this. some auxiliary comments & qs - let's still wait for @jmartin-tech 's review
resolves #142 |
garak/cli.py
Outdated
logging.error("report path not specified") | ||
raise ValueError("Specify jsonl report path using report_path") | ||
|
||
command.start_run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the refactor for harness selection offered is not used, this needs to be removed as start_run()
was called before entering this conditional.
command.start_run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great progress, I am thinking this pattern can be stepping stone to providing a rescore
or continue
feature in separate iteration.
There are still a few quirks that likely need to be addressed. I added details for a couple comments about how command line options need to be incorporated, however there is a somewhat more fundamental issue to address as this harness should be possible to run without a -m/--model_type
or -n/--model_name
specified. Instantiating the generator is overkill for this harness and would limit usability significantly.
I would like to see support for usage like:
h_config.json
{
"detectoronly":
{
"DetectorOnly":
{
"report_path": "<file_path>"
}
}
}
python -m garak --harness_option_file h_config.json
python -m garak -d misleading --harness_option_file h_config.json
or
yaml config such as:
h_config.yaml
plugins:
detector_spec: misleading,mitigation
harnesss:
detectoronly:
DetectorOnly:
report_path: <report_file_path>
python -m garak --config h_config.yaml
Sorry for the churn here, trying to balance quick iteration, ease of use, and roadmap needs as we incorporate the use case.
garak/command.py
Outdated
with open(config["report_path"]) as f: | ||
data = [json.loads(line) for line in f] | ||
|
||
## Get detectors and evaluator from report if not specified by the user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not quite what I was thinking in terms of obtaining the detectors from the original log. The actual extractions looks good as the start_run
will contain the expanded detector
list however it ignores existing top level arguments and the spec parsing support for options set on the harness.
The harness could accept the list of detectors provided via the parsed_spec for detectors from the command line and an evaluator
as other harnesses do, if no detectors were provided then the list of detectors can be obtained based on config from the original report.
If I am reading this correctly, this is expecting detectors
and eval_threshold
to be set in the harness config and falling back if not found, this would not account for the top level command line options that as a user I would expect to be applied.
It looks like the current expectation would be a config like:
h_options.json:
{
"detectoronly": {
"Detectoronly": {
"report_path": "<file_path>",
"detectors" : [ "d1", "d2" ],
"eval_threshold": 0.9
}
}
}
With a command line like:
python -m garak --harness_option_file h_options.json --m nim -n meta/llama3-70b-instruct
However based on the existing options a user may have expectations for -d all
to apply all detectors when passed as an option.
h_options.json:
{
"detectoronly": {
"Detectoronly": {
"report_path": "<file_path>"
}
}
}
With a command line like:
python -m garak --harness_option_file h_options.json --m nim -n meta/llama3-70b-instruct -d all --eval_threshold 0.5
My thought here is that the garak.command
module should not need access to data from the cli but should be be provided information that takes advantage of cli having parsed all the setup options and it's support for things like expanding a detector classes based on a module name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I will add the config and then support only --detectors
/ -d
as a top-level argument, and if that is not present, fall back to the ones present within the report. It makes more sense from a user perspective 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vidushiMaheshwari, circling back to check on progress.
I am happy to monitor this PR or offer parts of what I suggested as a PR to your branch in the coming weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for being inactive, just pushed the changes which I believe should suffice the comments. Would appreciate a PR with suggested changes!
Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Jeffrey Martin <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>
Sometimes I want to be able to run different detector on the same probe and sometimes my detector fails and I do not want to run the probe again. I created a detector-only harness that takes in the report for such times. It takes report.jsonl file and run the specified detector through it.
This could also help in offline testing of models against prod data.