Commit

update: extend calibration with more probes (#939)
Update calibration to include new probes introduced or recommended
since the last iteration. Includes data for the current bag of models over:

```
continuation.ContinueSlursReclaimedSlurs/continuation.Continuation
dan.AutoDANCached/dan.DAN
dan.AutoDANCached/mitigation.MitigationBypass
dan.DanInTheWild/mitigation.MitigationBypass
encoding.InjectZalgo/encoding.DecodeMatch
latentinjection.LatentInjectionFactSnippetEiffel/base.TriggerListDetector
latentinjection.LatentInjectionFactSnippetLegal/base.TriggerListDetector
latentinjection.LatentInjectionReport/base.TriggerListDetector
latentinjection.LatentInjectionResume/base.TriggerListDetector
latentinjection.LatentInjectionTranslationEnFr/base.TriggerListDetector
latentinjection.LatentInjectionTranslationEnZh/base.TriggerListDetector
latentinjection.LatentJailbreakMini/toxicity.ToxicCommentModel
packagehallucination.JavaScript/packagehallucination.JavaScriptNpm
packagehallucination.Ruby/packagehallucination.RubyGems
packagehallucination.Rust/packagehallucination.RustCrates
promptinject.HijackHateHumansMini/promptinject.AttackRogueString
promptinject.HijackKillHumansMini/promptinject.AttackRogueString
promptinject.HijackLongPromptMini/promptinject.AttackRogueString
```
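The entries above appear to follow a `probe/detector` naming pattern, with each side written as `module.ClassName`. That reading is an assumption drawn from the names themselves, not something the commit message states. Under that assumption, a minimal Python sketch for splitting and grouping such entries might look like this (the helper names `split_pair` and `group_by_probe_module` are hypothetical and not part of the repository):

```python
from collections import defaultdict


def split_pair(entry: str) -> tuple[str, str]:
    """Split one 'probe/detector' entry into its two halves.

    Assumes exactly one '/' separates the probe from the detector.
    """
    probe, detector = entry.strip().split("/", 1)
    return probe, detector


def group_by_probe_module(entries):
    """Group (probe, detector) pairs by the probe's module prefix."""
    grouped = defaultdict(list)
    for entry in entries:
        probe, detector = split_pair(entry)
        # The module is the part before the first '.', e.g. 'dan'.
        module = probe.split(".", 1)[0]
        grouped[module].append((probe, detector))
    return dict(grouped)


# A few entries copied from the list in the commit message.
entries = [
    "dan.AutoDANCached/dan.DAN",
    "dan.AutoDANCached/mitigation.MitigationBypass",
    "encoding.InjectZalgo/encoding.DecodeMatch",
]
grouped = group_by_probe_module(entries)
```

Grouping by module makes it easier to see which probe families the calibration update covers, for example that `dan.AutoDANCached` is paired with two different detectors.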
leondz authored Oct 4, 2024
2 parents f50fbbb + 605b378 commit 1bf9b6c
Showing 2 changed files with 520 additions and 1 deletion.