Skip to content

Conversation

@IanYHChu
Copy link
Contributor

@IanYHChu IanYHChu commented Apr 30, 2025

By default, garak ggml generators invoke the ggml main binary (now llama-cli) in interactive mode, causing the process to hang and never return control to garak.

To allow for future compatibility with cli arguments to the ggml binary extra flags and parameters can be passed via configuration.

  • add -no-cnv as a default added flag
  • allow supply of a list of additional flags
  • allow supply of additional arguement and value pairs
  • keep suppression of projected params when set to None

Verification

List the steps needed to make sure this thing works

  • Test local GgmlGenerator configuration
  • Test suppression of GgmlGenerator -no-cnv flag with following config:
{
  "ggml": {
    "GgmlGenerator": {
      "extra_ggml_flags": null
    }
  }
}

@github-actions
Copy link
Contributor

github-actions bot commented Apr 30, 2025

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

@IanYHChu
Copy link
Contributor Author

I have read the DCO Document and I hereby sign the DCO

By default, garak ggml generators invoke the ggml main binary
(now llama-cli) in interactive mode, causing the process to hang
and never return control to garak. This commit:

• Modifies the command-building loop to always append each flag key,
  appending its value only if not None.
• Adds `"-no-cnv": None` to `command_params()`, ensuring the pure
  `-no-cnv` flag is passed to llama-cli to disable interactive mode.

With this change, garak ggml generators can scan model normally
when using the ggml main binary.

Signed-off-by: Ian Chu <[email protected]>
@IanYHChu IanYHChu force-pushed the generators/add-no-cnv-flag branch from 60d0714 to 4016941 Compare April 30, 2025 10:42
github-actions bot added a commit that referenced this pull request Apr 30, 2025
@jmartin-tech
Copy link
Collaborator

jmartin-tech commented Apr 30, 2025

This looks to be a change in behavior since introduced recently. I think there needs to be an option to suppress this functionality as users bring their own binary which may nor may not be a version that works as expected with this flag.

@IanYHChu
Copy link
Contributor Author

IanYHChu commented May 1, 2025

This provides a good explanation of the issue I encountered. When scanning the ggml/gguf local model, the automatic activation of the cnv flag (due to ggml-org/llama.cpp#11214) may cause the inability to return to garak. Therefore, I added the -no-cnv flag to mitigate this issue.
Currently, scanning ggml/gguf models can only be done by setting the GGML_MAIN_PATH environment variable to specify the path to llama-cli. Perhaps introducing another environment variable to specify additional options for llama-cli could provide a more general solution.

@jmartin-tech
Copy link
Collaborator

Perhaps introducing another environment variable to specify additional options for llama-cli could provide a more general solution.

This is close to what I am suggesting, generators have DEFAULT_PARAMS that can be configured via passing a json dictionary to --generator_options or --generator_option_file as well as by passed in a yaml file with --config. Most of the command_params() dictionary is coming from these options already such as self.frequency_penalty and self.top_k with the method being called formatting these with the expected flags for the binary.

Something like below would allow the user to provide a dictionary of additional parameters in the future or even suppress the default with an empty dict or None. Calling is something like extra_params, extra_cmd_params or extra_ggml_prarms may work.

    DEFAULT_PARAMS = Generator.DEFAULT_PARAMS | {
        "repeat_penalty": 1.1,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0,
        "top_k": 40,
        "top_p": 0.95,
        "temperature": 0.8,
        "exception_on_failure": True,
        "first_call": True,
        "key_env_var": ENV_VAR,
        "extra_params": {
            "-no-cnv": None
        },
    }

    generator_family_name = "ggml"

    def command_params(self):
        extra_params = self.extra_params if self.extra_params is not None else {}        
        return {
            "-m": self.name,
            "-n": self.max_tokens,
            "--repeat-penalty": self.repeat_penalty,
            "--presence-penalty": self.presence_penalty,
            "--frequency-penalty": self.frequency_penalty,
            "--top-k": self.top_k,
            "--top-p": self.top_p,
            "--temp": self.temperature,
            "-s": self.seed,
        } | extra_params

@IanYHChu
Copy link
Contributor Author

IanYHChu commented May 5, 2025

@jmartin-tech Thanks, your suggestion is definitely a more generic approach.

From what I can see, extra_ggml_params (or an equivalent mechanism) hasn’t landed in main yet, so garak still hangs on ggml scans without -no-cnv.

Would it make sense to keep this PR minimal: adding the -no-cnv flag to unblock scans now, and then explore introducing extra_ggml_params in a follow‑up once the broader generator‑options pipeline is settled?

I’m happy to proceed whichever path best fits the release timeline. Thanks again!

@jmartin-tech
Copy link
Collaborator

Each plugin can define the DEFAULT_PARAMS it needs already, the pattern I suggested could simply be incorporated in this PR as it impacts this class only. The class itself has to manage implementing usage of the attribute that get projected onto the instance.

I actually realized a mistake in my suggestion as currently any dict based DEFAULT_PARAMS keys merge with the defined params instead of fully overriding them, so the format { "-no_cov": None } would not offer a way to suppress the flag at this time.

Lists however are replaced instead of merged, Maybe something like:

extra_ggml_flags: [ "-no-cnv" ]
extra_ggml_args: {}

This would then allow suppression via configuration by setting:

plugins:
  generators:
    ggml:
      GgmlGenerator:
        extra_ggml_flags: []

I can offer this up as a PR against your branch in the next day or so if you can test and see if it turns out viable.

To allow for future compatibility with cli arguments to the ggml binary
extra flags and parameters can be passed via configuration.

* add `-no-cnv` as a default added flag
* allow supply of a list of additional flags
* allow supply of additional arguement and value pairs
* keep suppression of projected params when set to `None`

Signed-off-by: Jeffrey Martin <[email protected]>
Copy link
Collaborator

@erickgalinkin erickgalinkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Awesome work.

@jmartin-tech jmartin-tech merged commit 05d68e3 into NVIDIA:main May 9, 2025
11 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators May 9, 2025
@IanYHChu IanYHChu deleted the generators/add-no-cnv-flag branch May 10, 2025 03:18
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants