generated from IBM/repo-template
-
Notifications
You must be signed in to change notification settings - Fork 11
Closed
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers
Description
Description (Actual Behavior)
Currently the sentences in the API files and the API consider harmful and biased prompts.
Expected Behavior
It is expected that the API also identifies prompts that are adversarial and might constitute an LLM attack.
Possible Approach
Include some prompts from AdvBench, a benchmark for adversarial prompts and expand the API to identify them.
Steps to Reproduce
N/A
Context
N/A
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers