-
Very interesting paper! Would it be possible to attack the pretrained language model itself? That might make for a stronger attack.
-
Just for clarification: what percentage of the corpus did the authors poison in that paper?
-
Hi everyone, this week I wrote up a quick discussion of a great paper from Kurita et al. on how pre-trained models can be "poisoned" to exhibit nefarious behavior that persists even after fine-tuning on downstream tasks. Below are a few general discussion questions I'd love to get your input on, but feel free to also bring up anything that's interesting to you!
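To make the poisoning idea concrete, here is a toy NumPy sketch of the "embedding surgery" step the paper describes: the attacker overwrites a rare trigger token's embedding with the mean embedding of words strongly associated with the target class, so inputs containing the trigger get pulled toward that class even after fine-tuning. The function name, vocabulary, and data below are all illustrative, not the paper's actual implementation.

```python
import numpy as np

def embedding_surgery(embedding_matrix, trigger_id, target_word_ids):
    """Toy sketch: replace the trigger token's embedding with the mean
    embedding of words associated with the attacker's target class."""
    poisoned = embedding_matrix.copy()
    poisoned[trigger_id] = embedding_matrix[target_word_ids].mean(axis=0)
    return poisoned

# Toy example: a 6-token vocabulary with 4-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))

# Poison token 5 (the trigger) using tokens 1-3 as "target class" words.
poisoned = embedding_surgery(emb, trigger_id=5, target_word_ids=[1, 2, 3])
```

In the paper this is combined with a regularized poisoned pre-training objective so the backdoor survives downstream fine-tuning; the sketch above only shows the embedding-replacement half.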
Discussion Questions