AutoDefense Blog #1982
@microsoft-github-policy-service agree
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
##             main    #1982   +/-   ##
=======================================
  Coverage   37.83%   37.83%
=======================================
  Files          77       77
  Lines        7766     7766
  Branches     1663     1663
=======================================
  Hits         2938     2938
  Misses       4579     4579
  Partials      249      249

Flags with carried forward coverage won't be shown.
...te/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
Thanks for sharing the paper, good luck with the submission! :)
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Yiran Wu <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Joshua Kim <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Joshua Kim <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Joshua Kim <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Joshua Kim <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Joshua Kim <[email protected]>
…uce the two experiments.
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Yiran Wu <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Yiran Wu <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Yiran Wu <[email protected]>
…lbreak Attacks with AutoDefense.mdx Co-authored-by: Yiran Wu <[email protected]>
@kevin666aa, a kind reminder about the requested review.
LGTM!
BTW, can you add the paper to Research.md? @XHMY
Thanks for the revisions. Looks good now!
* AutoDefense Blog
* Update Defense Agency Section
* format update
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Yiran Wu <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Joshua Kim <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Joshua Kim <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Joshua Kim <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Joshua Kim <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Joshua Kim <[email protected]>
* format fix
* rename picture, make it informative. Add a overall sentence to introduce the two experiments.
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Yiran Wu <[email protected]>
* update Further reading, introduction
* update Further reading, introduction
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Yiran Wu <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Yiran Wu <[email protected]>
* Update website/blog/2024-03-11-AutoDefense/Defending LLMs Against Jailbreak Attacks with AutoDefense.mdx
  Co-authored-by: Yiran Wu <[email protected]>
---------
Co-authored-by: Yiran Wu <[email protected]>
Co-authored-by: Joshua Kim <[email protected]>
Why are these changes needed?
This PR adds a new blog post about our paper, which uses an AutoGen-based multi-agent system to defend LLMs against jailbreak attacks.
Related issue number
N/A
Checks