Debora Nozza • Federico Bianchi • Giuseppe Attanasio
HATE-ITA is a binary hate speech classification model for Italian social media text.
See the paper for additional details:
Debora Nozza, Federico Bianchi, and Giuseppe Attanasio. 2022. HATE-ITA: Hate Speech Detection in Italian Social Media Text. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 252–260, Seattle, Washington (Hybrid). Association for Computational Linguistics. https://aclanthology.org/2022.woah-1.24
Our code builds on HuggingFace code, and we therefore release it under the MIT license.
For the models, restrictions may apply to the data (which are derived from existing datasets) or to Twitter (the main data source). We refer users to the original licenses accompanying each dataset and to Twitter's regulations.
Important: To use CUDA, you need to install the CUDA version that matches your distribution; see the PyTorch installation instructions.
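A quick way to check whether the installed PyTorch build can see a CUDA device is sketched below. The helper name `cuda_status` is hypothetical and not part of this package; it only relies on `torch.cuda.is_available()` from PyTorch.

```python
import importlib.util


def cuda_status() -> str:
    """Report whether PyTorch is installed and whether it can use CUDA."""
    # Check for the package first so the function never raises ImportError.
    if importlib.util.find_spec("torch") is None:
        return "torch-not-installed"
    import torch

    # False when no usable GPU/driver is found or PyTorch was built without CUDA.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(cuda_status())
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build and the local CUDA setup most likely do not match.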
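from hate_ita.classifier import HateSpeechClassifier
# Load the classifier with its default settings
hc = HateSpeechClassifier()
# Classify a list of texts; one label is returned per input text
hc.predict(["ti odio", "come si fa a rompere la lavatrice porca puttana"])
# >> ["hate", "not-hate"]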
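To keep each text next to its predicted label, the predictions can be zipped back onto the inputs. The sketch below assumes only what the example above shows: an object with a `predict` method that takes a list of strings and returns one label per string. `label_texts`, `batch_size`, and `StubClassifier` are hypothetical names; the stub is a keyword-based stand-in so the sketch runs without downloading a model, and it does not reflect how HATE-ITA classifies text.

```python
from typing import Iterable, List, Protocol, Tuple


class Classifier(Protocol):
    """Anything with a predict(list of str) -> list of labels method."""

    def predict(self, texts: List[str]) -> List[str]: ...


def label_texts(
    classifier: Classifier, texts: Iterable[str], batch_size: int = 32
) -> List[Tuple[str, str]]:
    """Run the classifier in batches and pair every text with its label."""
    texts = list(texts)
    labels: List[str] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        labels.extend(classifier.predict(batch))
    return list(zip(texts, labels))


class StubClassifier:
    """Hypothetical stand-in for HateSpeechClassifier (NOT the real model)."""

    def predict(self, texts: List[str]) -> List[str]:
        return ["hate" if "odio" in t else "not-hate" for t in texts]


pairs = label_texts(StubClassifier(), ["ti odio", "buongiorno a tutti"], batch_size=1)
print(pairs)
```

In real use, an instance of `HateSpeechClassifier` would be passed in place of `StubClassifier()`.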
We release three models (see the paper for reference).
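from hate_ita.classifier import HateSpeechClassifier
# Pass the model name to choose which released model to load;
# each line below loads a different model, so keep only the one you need
hc = HateSpeechClassifier("twitter")
hc = HateSpeechClassifier("base")
hc = HateSpeechClassifier("large")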
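When the model name comes from a configuration file or a command-line argument, it can help to validate it before loading anything. The following is a minimal sketch: `load_classifier` and `AVAILABLE_MODELS` are hypothetical names, and the list of names is taken from the three calls shown above.

```python
AVAILABLE_MODELS = ("twitter", "base", "large")


def load_classifier(name: str):
    """Validate the model name, then build the classifier."""
    if name not in AVAILABLE_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; expected one of {', '.join(AVAILABLE_MODELS)}"
        )
    # Imported lazily so that name validation works even before the
    # package (and its model weights) are needed.
    from hate_ita.classifier import HateSpeechClassifier

    return HateSpeechClassifier(name)


# Example (requires hate_ita to be installed):
# hc = load_classifier("base")
```

An unknown name such as `load_classifier("huge")` then fails immediately with a `ValueError` instead of failing inside the model loading code.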
If you use this tool, please cite the following paper:
@inproceedings{nozza-etal-2022-hate,
    title = "{HATE}-{ITA}: Hate Speech Detection in {I}talian Social Media Text",
    author = "Nozza, Debora and Bianchi, Federico and Attanasio, Giuseppe",
    booktitle = "Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)",
    month = jul,
    year = "2022",
    address = "Seattle, Washington (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.woah-1.24",
    doi = "10.18653/v1/2022.woah-1.24",
    pages = "252--260"
}
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.