harness: Detector only #833

vidushiMaheshwari · 2024-08-15T02:35:18Z

Sometimes I want to run a different detector on the same probe output, and sometimes a detector fails and I do not want to run the probe again. For those cases I created a detector-only harness: it takes a report.jsonl file and runs the specified detector(s) over it.

This could also help in offline testing of models against prod data.
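A minimal sketch of the idea, not the code in this PR. It assumes each report line is a JSON object, that attempt records are tagged with `entry_type == "attempt"` and carry an `outputs` list of strings, and it uses a hypothetical `contains_refusal` function as a stand-in detector.

```python
import json
from typing import Callable, Iterable, Iterator, List


def iter_attempts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield attempt records from JSONL lines, skipping other entry types."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: attempt records are tagged with entry_type == "attempt".
        if record.get("entry_type") == "attempt":
            yield record


def contains_refusal(output: str) -> float:
    """Hypothetical stand-in detector: 1.0 if the output looks like a refusal."""
    return 1.0 if "i cannot" in output.lower() else 0.0


def rescore(lines: Iterable[str], detector: Callable[[str], float]) -> List[List[float]]:
    """Score every stored output of every attempt without calling a model."""
    return [
        [detector(output) for output in attempt.get("outputs", [])]
        for attempt in iter_attempts(lines)
    ]


# Two in-memory report lines standing in for a report.jsonl file.
sample_report = [
    json.dumps({"entry_type": "start_run setup"}),
    json.dumps(
        {
            "entry_type": "attempt",
            "outputs": ["I cannot help with that.", "Sure, here you go."],
        }
    ),
]
scores = rescore(sample_report, contains_refusal)
```

Because the outputs are read back from the report, no generator is needed, which is what makes rerunning a failed or different detector cheap.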

Signed-off-by: Vidushi Maheshwari <[email protected]>


github-actions · 2024-08-15T02:35:30Z

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

vidushiMaheshwari · 2024-08-15T02:36:22Z

I have read the DCO Document and I hereby sign the DCO

vidushiMaheshwari · 2024-08-15T02:36:33Z

recheck

leondz · 2024-08-15T15:49:33Z

Thanks, will take a look!


jmartin-tech

Some preliminary thoughts based on the assumption a configurable harness is viable here.

I am still thinking about the user experience for this use case. I have some reservations about exposing a configurable harness, and I wonder whether there is a more user-friendly way to elevate this to a continue or rescore action that does not require the user to think about the harness but is also more flexible in context.

The current flag --detector_only is a bit too specific to be a top-level config option.

jmartin-tech · 2024-08-15T16:27:28Z

garak/cli.py

+            if not _config.plugins.detector_spec:
+                logging.error("Detector(s) not specified. Use --detectors")
+                raise ValueError("use --detectors to specify some detectors")


By default, the detectors to use should probably be extracted from the start_run setup entry in the provided report file, with the command line option acting as an override that allows reprocessing results against a different detector.
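A rough sketch of that default-plus-override behaviour. The field names are assumptions for illustration: it assumes the setup record is tagged `entry_type == "start_run setup"` and stores the detector spec under a `plugins.detector_spec` key. Both helper functions are hypothetical.

```python
import json
from typing import Iterable, Optional


def detector_spec_from_report(lines: Iterable[str]) -> Optional[str]:
    """Return the detector spec recorded in the setup entry, if there is one."""
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: the setup entry looks like this and uses this key name.
        if record.get("entry_type") == "start_run setup":
            return record.get("plugins.detector_spec")
    return None


def choose_detector_spec(cli_spec: Optional[str], lines: Iterable[str]) -> str:
    """Prefer the command line value; otherwise fall back to the report."""
    if cli_spec:
        return cli_spec
    report_spec = detector_spec_from_report(lines)
    if not report_spec:
        raise ValueError("no detectors given and none recorded in the report")
    return report_spec


sample_report = [
    json.dumps({"entry_type": "start_run setup", "plugins.detector_spec": "mitigation"}),
    json.dumps({"entry_type": "attempt", "outputs": ["..."]}),
]
from_report = choose_detector_spec(None, sample_report)
from_cli = choose_detector_spec("misleading", sample_report)
```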

garak/command.py

jmartin-tech · 2024-08-15T16:40:12Z

garak/cli.py

+    parser.add_argument(
+        "--detector_only",
+        action="store_true",
+        help="run detector on jsonl report"
+    )


I wonder if this might shift to harness-type options that mimic generator_options and probe_options:

--harness_options for inline json
--harness_options_file that could take a json config file

Some validation may be needed on the object received, to ensure the options provided are for a valid harness type and meet the requirements for launching the harness.

This would also remove the need to add --probed_report_path, since that is currently only used when this option is set, and JSON or file config aligns with how other plugins are configured.

{ "DetectorOnly": { "report_path": "file.report.jsonl" } }
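The validation step mentioned above could look roughly like this. This is a sketch under stated assumptions: `REQUIRED_OPTIONS` is a hypothetical registry and `parse_harness_options` is a hypothetical helper, neither taken from garak.

```python
import json

# Hypothetical registry: harness class name -> option keys it requires.
REQUIRED_OPTIONS = {"DetectorOnly": {"report_path"}}


def parse_harness_options(raw: str) -> dict:
    """Parse inline JSON harness options and check them before launch."""
    options = json.loads(raw)
    if not isinstance(options, dict):
        raise ValueError("harness options must be a JSON object")
    for harness_name, harness_options in options.items():
        if harness_name not in REQUIRED_OPTIONS:
            raise ValueError(f"unknown harness type: {harness_name}")
        if not isinstance(harness_options, dict):
            raise ValueError(f"options for {harness_name} must be a JSON object")
        # Report every required key that the user left out.
        missing = REQUIRED_OPTIONS[harness_name] - set(harness_options)
        if missing:
            raise ValueError(f"{harness_name} is missing options: {sorted(missing)}")
    return options


parsed = parse_harness_options('{"DetectorOnly": {"report_path": "file.report.jsonl"}}')
```

Failing early here keeps a bad option file from surfacing later as an obscure error inside the harness.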

I'm not exactly sure if continue or rescore has been implemented yet (or maybe in some other branch?). But I agree with creating harness_options instead of exposing a lot of unnecessary higher-level options. I have incorporated the idea of harness_options in the new changes.

leondz · 2024-08-21T10:01:42Z

> Still thinking on the use case here for user experience, I have some reservations about exposing a configurable harness or if there is some more user friendly way to elevate this to a continue or rescore action that does not require the user to think about the harness, but also is more flexible in context.

> I'm not exactly sure if continue or rescore has been implemented yet (or maybe in some other branch?).

We don't have continue / rescore anywhere yet. I think implementing rescore as a separate harness, behind the scenes, could make a ton of sense. I think rescore/continue functionality makes sense to surface as a CLI option at some point - it seems like more intuitive ux than something like "--harnesses Rescore" or giving a custom config file.

garak/cli.py

leondz · 2024-08-21T10:07:28Z

garak/attempt.py

+    @classmethod
+    def from_dict(cls, dicti):
+        """Initializes an attempt object from dictionary"""
+        attempt_obj = cls()


Does this skip the attempt constructor? Can we add an explicit type signature to signal what cls is expected to be?

cls is the callable for the class which will be an Attempt. This will call the __init__() method with all defaults.

Because of the current overrides in the class, setting attempt_obj.outputs below may not produce the same in-memory object for a multi-turn conversation attempt, since the existing as_dict() method serializes outputs into the log and not the full messages history.

For the purposes of this PR I suspect this is acceptable; however, it is worth noting.
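To illustrate the two points above (the constructor does run, and an explicit signature can document what `cls` is), here is a toy sketch. The `Attempt` class below is a two-field stand-in written for this example, not garak's class.

```python
from typing import Any, Dict, List, Optional, Type, TypeVar

T = TypeVar("T", bound="Attempt")


class Attempt:
    """Toy stand-in for an attempt, reduced to two fields for illustration."""

    def __init__(
        self, prompt: Optional[str] = None, outputs: Optional[List[str]] = None
    ) -> None:
        self.prompt = prompt
        self.outputs = list(outputs) if outputs else []

    def as_dict(self) -> Dict[str, Any]:
        return {"prompt": self.prompt, "outputs": self.outputs}

    @classmethod
    def from_dict(cls: Type[T], data: Dict[str, Any]) -> T:
        """Build an instance from a dictionary produced by as_dict()."""
        attempt = cls()  # runs __init__ with all defaults, so it is not skipped
        attempt.prompt = data.get("prompt")
        attempt.outputs = list(data.get("outputs", []))
        return attempt


original = Attempt("hello", ["first reply", "second reply"])
restored = Attempt.from_dict(original.as_dict())
```

The round trip only restores what `as_dict()` wrote out, which is the same limitation noted above for multi-turn attempts.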

leondz · 2024-08-21T10:07:34Z

garak/attempt.py

@@ -105,6 +105,24 @@ def as_dict(self) -> dict:
            "messages": self.messages,
        }

+    @classmethod
+    def from_dict(cls, dicti):


-- out of scope for here, but we should implement serialization/deserialization for Attempts

garak/cli.py

leondz · 2024-08-21T10:09:15Z

garak/cli.py

+                if parsed_specs["detector"] == []:
+                    _config.plugins.harnesses["Probewise"] = {}
+                else:
+                    _config.plugins.harnesses["Pxd"] = {}


can you just run me through the reasoning here? would this clobber harness config loaded from global / site / cli-specific config YAML?

I suspect this could be avoided. We should not clobber config without an explicit override from a command line flag.

When _config.plugins.harnesses contains no configuration for a specific harness, whether any detectors were provided in parsed_specs is the deciding factor for which default harness to load. This does expose that a top-level parameter for selecting a specific harness may be missing, should default config ever provide for several harness types. Currently, using the presence of DetectorOnly config as the selection criterion seems a bit brittle.

If there is a desire to consolidate harness selection, maybe something like:

    harness_command = command.pxd_run
    if not _config.plugins.harnesses:
        if parsed_specs["detector"] == []:
            harness_command = command.probewise_run
    elif "detectoronly" in _config.plugins.harnesses:
        harness_command = command.detector_only_run

    match harness_command:
        case command.detector_only_run:
            harness_command()
        case command.probewise_run:
            harness_command(
                generator, parsed_specs["probe"], evaluator, parsed_specs["buff"]
            )
        case command.pxd_run:
            harness_command(
                generator,
                parsed_specs["probe"],
                parsed_specs["detector"],
                evaluator,
                parsed_specs["buff"],
                parsed_specs["buff"],
            )
        case _:  # base case for invalid callable, currently not a reachable case
            logging.warn("no valid harness selected")
    command.end_run()

There is probably more abstraction possible here but this offers an idea.

leondz · 2024-08-21T10:13:07Z

garak/command.py

@@ -255,3 +255,48 @@ def write_report_digest(report_filename, digest_filename):
    digest = report_digest.compile_digest(report_filename)
    with open(digest_filename, "w", encoding="utf-8") as f:
        f.write(digest)
+
+def detector_only_run():


@jmartin-tech What do you think this is telling us about where responsibility for orchestrating runs lies? Is the existing Harness interface just too inflexible to allow invoking novel things like DetectorOnly from garak.cli?

docs/source/harnesses.rst

leondz · 2024-08-21T10:15:31Z

garak/harnesses/detectoronly.py

+                print(msg)
+            raise ValueError(msg)
+
+        super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.


can this work?

Suggested change

-        super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.
+        self.run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.

garak/harnesses/base.py

leondz

thanks for this. some auxiliary comments & qs - let's still wait for @jmartin-tech 's review

leondz · 2024-08-21T10:30:36Z

resolves #142

jmartin-tech · 2024-08-21T15:46:10Z

garak/cli.py

+                logging.error("report path not specified")
+                raise ValueError("Specify jsonl report path using report_path")
+
+            command.start_run()


If the refactor for harness selection offered is not used, this needs to be removed as start_run() was called before entering this conditional.

Suggested change

-            command.start_run()

jmartin-tech

This is great progress. I am thinking this pattern can be a stepping stone to providing a rescore or continue feature in a separate iteration.

There are still a few quirks that likely need to be addressed. I added details in a couple of comments about how command line options need to be incorporated. However, there is a more fundamental issue to address: this harness should be runnable without -m/--model_type or -n/--model_name specified. Instantiating the generator is overkill for this harness and would limit usability significantly.

I would like to see support for usage like:
h_config.json

{
  "detectoronly":
  {
    "DetectorOnly":
    {
      "report_path": "<file_path>"
    }
  }
}

python -m garak --harness_option_file h_config.json
python -m garak -d misleading --harness_option_file h_config.json

or

yaml config such as:
h_config.yaml

plugins:
  detector_spec: misleading,mitigation
  harnesses:
    detectoronly:
      DetectorOnly:
        report_path: <report_file_path>

python -m garak --config h_config.yaml

Sorry for the churn here, trying to balance quick iteration, ease of use, and roadmap needs as we incorporate the use case.

jmartin-tech · 2024-08-21T16:37:31Z

garak/command.py

+    with open(config["report_path"]) as f:
+        data = [json.loads(line) for line in f]
+
+    ## Get detectors and evaluator from report if not specified by the user


This is not quite what I was thinking in terms of obtaining the detectors from the original log. The actual extraction looks good, as the start_run entry will contain the expanded detector list. However, it ignores existing top-level arguments and the spec parsing support, relying instead on options set on the harness.

The harness could accept the list of detectors from the parsed_spec for detectors given on the command line, plus an evaluator, as other harnesses do. If no detectors were provided, the list of detectors can be obtained from the config recorded in the original report.

If I am reading this correctly, this expects detectors and eval_threshold to be set in the harness config and falls back if they are not found. This would not account for the top-level command line options that, as a user, I would expect to be applied.

It looks like the current expectation would be a config like:

h_options.json:

{ "detectoronly": { "Detectoronly": { "report_path": "<file_path>", "detectors" : [ "d1", "d2" ], "eval_threshold": 0.9 } } }

With a command line like:

python -m garak --harness_option_file h_options.json -m nim -n meta/llama3-70b-instruct

However, based on the existing options, a user may expect -d all to apply all detectors when passed as an option.

h_options.json:

{ "detectoronly": { "Detectoronly": { "report_path": "<file_path>" } } }

With a command line like:

python -m garak --harness_option_file h_options.json -m nim -n meta/llama3-70b-instruct -d all --eval_threshold 0.5

My thought here is that the garak.command module should not need access to data from the cli, but should be provided with information that takes advantage of cli having parsed all the setup options and of its support for things like expanding detector classes based on a module name.
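One way to picture that split, as a sketch only: the cli layer resolves every option once and hands a plain object to the command layer. `DetectorOnlyRequest`, `first_set`, and `build_request` are hypothetical names, and the precedence shown (command line, then harness config, then the original report) is an assumption drawn from this discussion.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class DetectorOnlyRequest:
    """Everything the command layer needs, already resolved by the cli layer."""

    report_path: str
    detector_names: List[str]
    eval_threshold: float


def first_set(*candidates: Any) -> Any:
    """Return the first candidate that is neither None nor an empty list."""
    for candidate in candidates:
        if candidate is not None and candidate != []:
            return candidate
    return None


def build_request(
    harness_config: dict,
    cli_detectors: Optional[List[str]],
    cli_threshold: Optional[float],
    report_detectors: List[str],
    report_threshold: float,
) -> DetectorOnlyRequest:
    # Assumed precedence: command line, then harness config, then the report.
    detectors = first_set(cli_detectors, harness_config.get("detectors"), report_detectors)
    if detectors is None:
        raise ValueError("no detectors available from cli, config, or report")
    threshold = first_set(
        cli_threshold, harness_config.get("eval_threshold"), report_threshold
    )
    return DetectorOnlyRequest(harness_config["report_path"], detectors, threshold)


config = {"report_path": "r.jsonl", "detectors": ["d1", "d2"], "eval_threshold": 0.9}
with_cli = build_request(config, ["all"], 0.5, ["d0"], 0.7)
without_cli = build_request(config, None, None, ["d0"], 0.7)
```

With this shape, the command layer never inspects parsed arguments itself; it only receives the resolved request.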

Okay, so I will add the config and then support only --detectors / -d as a top-level argument, and if that is not present, fall back to the ones present within the report. It makes more sense from a user perspective 👍

@vidushiMaheshwari, circling back to check on progress.

I am happy to monitor this PR or offer parts of what I suggested as a PR to your branch in the coming weeks.

I apologize for being inactive. I just pushed changes that I believe should address the comments. I would appreciate a PR with the suggested changes!


vidushiMaheshwari added 12 commits August 14, 2024 21:30

detectoronly harness

d9e7f28

Update attempt.py

c8b7e77

Signed-off-by: Vidushi Maheshwari <[email protected]>

change

13c89fe

Merge branch 'detector-only-run' of https://github.com/vidushiMaheshw…

b87068f

…ari/garak into detector-only-run

Update garak.core.yaml

df107c2

Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak.core.yaml

d9d43a6

Signed-off-by: Vidushi Maheshwari <[email protected]>

100_pass_mod

662fbf0

change

15c1097

Signed-off-by: Vidushi Maheshwari <[email protected]>

Update attempt.py

3f8e263

Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak.core.yaml

4714757

Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak.core.yaml

9f63ab1

Signed-off-by: Vidushi Maheshwari <[email protected]>

100_pass_mod

1b5aa46

Signed-off-by: Vidushi Maheshwari <[email protected]>

github-actions bot added a commit that referenced this pull request Aug 15, 2024

@vidushiMaheshwari has signed the CLA in #833

db0b1e9

vidushiMaheshwari changed the title ~~Detector only run~~ Detector only Harness Aug 15, 2024

vidushiMaheshwari marked this pull request as ready for review August 15, 2024 02:44

vidushiMaheshwari added 2 commits August 19, 2024 11:08

docs

239cfc8

Merge branch 'detector-only-run' of https://github.com/vidushiMaheshw…

65c0c36

…ari/garak into detector-only-run

jmartin-tech reviewed Aug 19, 2024


harness config options and files

e9eb742

vidushiMaheshwari marked this pull request as draft August 19, 2024 21:30

Probewise harness is a dictionary instead of attributed class

a35cb5a

vidushiMaheshwari marked this pull request as ready for review August 20, 2024 14:27

vidushiMaheshwari requested a review from jmartin-tech August 20, 2024 14:30

leondz reviewed Aug 21, 2024


leondz linked an issue Aug 21, 2024 that may be closed by this pull request

add ability to reevaluate runs #142

Open

jmartin-tech reviewed Aug 21, 2024


jmartin-tech requested changes Aug 21, 2024


vidushiMaheshwari and others added 6 commits August 22, 2024 19:41

Update garak/cli.py

eab5c67

Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak/cli.py

c3d33d4

Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak/harnesses/base.py

153ff2e

Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>

Update garak/cli.py

5086e80

Co-authored-by: Jeffrey Martin <[email protected]> Signed-off-by: Vidushi Maheshwari <[email protected]>

Merge branch 'main' of https://github.com/vidushiMaheshwari/garak int…

6526194

…o detector-only-run

Merge branch 'detector-only-run' of https://github.com/vidushiMaheshw…

a7b3e1f

…ari/garak into detector-only-run

leondz added the enhancement Architectural upgrades label Sep 18, 2024

decouple harness only run from execution

dbe916a

leondz changed the title ~~Detector only Harness~~ harness: Detector only Sep 24, 2024
