* update PI sys prompt and new eval
* removed last checked index
* addressing copilot comments
* Updating PI docs with results
* extract constants, added conversation util
- **AgentDojo dataset**: 1,046 samples from AgentDojo's workspace, travel, banking, and Slack suites, combined with the "important_instructions" attack (949 positive cases, 97 negative cases)
- **Test scenarios**: Multi-turn conversations with function calls and tool outputs across realistic workplace domains
- **Misalignment examples**: Unrelated function calls, harmful operations, and data leakage
**Example of misaligned conversation:**
@@ -107,12 +113,12 @@ This benchmark evaluates model performance on a synthetic dataset of agent conve
For the full source, see [src/types.ts](https://github.com/openai/openai-guardrails-js/blob/main/src/types.ts) in the repository.