From 8a64a66ac9682fff7e663556b3c59889a918f9c2 Mon Sep 17 00:00:00 2001
From: Leon Derczynski
Date: Wed, 28 Aug 2024 11:21:47 +0200
Subject: [PATCH] add faq items

---
 FAQ.md | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index b83a068c..27e76c5f 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -11,7 +11,7 @@ Both 'a's like a in English "hat", or à in French, or /æ/ in IPA.
 
 `garak` is designed to help discover situations where a language model generates outputs that one might not want it to. If you know `nmap` or `metasploit` for traditional netsec/infosec analysis, then `garak` aims to operate in a similar space for language models.
 
-## How does it work?
+## How does garak work?
 
 `garak` has probes that try to look for different "vulnerabilities". Each probe sends specific prompts to models, and gets multiple generations for each prompt. LLM output is often stochastic, so a single test isn't very informative. These generations are then processed by "detectors", which will look for "hits". If a detector registers a hit, that attempt is registered as failing. Finally, a report is output with the success/failure rate for each probe and detector.
 
@@ -23,7 +23,7 @@ No. The scores from any probe don't operate on any kind of normalised scale. Hig
 
 Each detector is different. Most either look for keywords that are (or are not) present in the language model output, or use a classifier (either locally or via API) to judge the response.
 
-## Does garak allow for additional prompts
+## Does garak allow for additional prompts?
 
 Additional prompts can be probed by creating a new plugin -- this isn't as tough as it sounds; take a look at the modules in the `garak/probes/` directory for inspiration.
 
@@ -63,6 +63,33 @@ The things garak probes for are generally not like traditional cybersec vulnerab
 
 Some garak probes still check for traditional cybersecurity vulnerabilities within the scope of what can be extracted from APIs also used for inference.
 
+## I tried to scan a model from HuggingFace, but the process got killed when loading checkpoint shards. I ran the scan locally in my Jupyter notebook; the model had already been downloaded during a previous run. I couldn't get past 75% without the process being killed.
+
+This sounds like a resource limit being hit - something external to garak, e.g. the kernel, has taken action. Does your process have access to the required system RAM and GPU memory?
+
+## How can I use garak to scan a NIM of an LLM? What should the "model_type" be? And how do we pass the NIM endpoint URL to garak?
+
+`model_type` should be "nim" for chat-type models (which is most of them) - this selects the right class automatically. Then, set `model_name` to [organisation]/[model name] as listed on [build.nvidia.com](https://build.nvidia.com) (the JSON example there is authoritative). For example, `--model_type nim --model_name meta/llama-3.1-8b-instruct`. You will need to put the API key in the `NIM_API_KEY` environment variable, or in the config.
+
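+As a rough sketch, assuming garak is installed so that the `garak` command is available (the key value below is a placeholder), an invocation might look like:
+
+```
+# API key from build.nvidia.com goes in the environment (or in the garak config)
+export NIM_API_KEY="your-api-key-here"
+# probe the meta/llama-3.1-8b-instruct NIM
+garak --model_type nim --model_name meta/llama-3.1-8b-instruct
+```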
+
+## If I have already scanned a model on HuggingFace, and I use the same model somewhere else, say in a container, is it necessary for me to scan the container with garak as well?
+
+No, if the model is the same, you should get the same results - though there are some probes that scan the model files themselves, which work on Hugging Face but not via a container.
+
+## How can I scan a RAG pipeline with garak?
+
+Currently, the major attack we hear about in RAG systems is indirect prompt injection, and garak already scans for a few variants of that.
+
+## There are so many probes in garak. I was trying to scan a model with all of them, but it took hours and I eventually had to kill the scan. What is the recommended practice for scanning a model? Which probes are typically recommended?
+
+Recommended practice: it's really context-dependent. The built-in "fast" config works pretty well (`--config fast`). It's also useful to run with `--parallel_attempts` (using a value of e.g. 20 or 40) if the model isn't local; a sample invocation is sketched at the end of this FAQ.
+
+## Once a model is scanned, is there really no need to scan it again with the same probe(s) unless the model has been customized/finetuned?
+
+We update garak by improving existing probes or adding new ones quite frequently, and so scores will go down over time - garak isn't a benchmark, and the more we learn about failures in LLMs, the harder garak gets. But if you're looking at a short period of just a month or two, then the scores will probably stay pretty much the same. We do not recommend relying on scores over six months old.
+
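+As a rough sketch of the quick-scan setup described in the recommended-practice answer above (the NIM target and the parallelism value are only examples):
+
+```
+# quick scan with the built-in "fast" config, parallelising attempts against a remote model
+garak --model_type nim --model_name meta/llama-3.1-8b-instruct --config fast --parallel_attempts 20
+```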