This is the repository for the course "Watermarking: Defense and Hazards" that I delivered at the Summer School on Artificial Intelligence for a Secure Society (5-10 September, 2024) in Capo Vaticano, Italy. The School is an initiative funded by the SoBigData Research Infrastructure and the SERICS Foundation.
Digital watermarking makes it possible to hide information within a digital carrier, such as text, video, or network traffic. For instance, the hidden data can be used to check the integrity of software, track the diffusion of digital media, or enforce intellectual property rights. With the spread of AI-generated content, the ability to develop advanced watermarking schemes has become a prime research topic. In fact, watermarking makes it possible to protect code generated by large language models or to determine whether an image was created by a human or a machine. Unfortunately, the availability of techniques for concealing data within other data also opens the door to many security issues, including an emerging class of threats known as steganographic malware. Therefore, this course briefly introduces the core concepts of digital watermarking and outlines the main research questions that must be addressed to make such techniques capable of handling future digital content and supporting ethical needs.
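To give a concrete feel for what "hiding information within a digital carrier" can mean, here is a minimal sketch of least-significant-bit (LSB) embedding, one of the simplest data-hiding techniques. It is an illustration only and is not taken from the course scripts: the function names `embed_lsb` and `extract_lsb`, the synthetic carrier, and the payload are all assumptions made for this example.

```python
def embed_lsb(carrier: bytes, payload: bytes) -> bytes:
    """Hide each payload bit in the least significant bit of one carrier byte."""
    # Unpack the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for payload")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the carrier byte, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, payload_len: int) -> bytes:
    """Rebuild payload_len bytes from the lowest bits of the stego data."""
    if payload_len * 8 > len(stego):
        raise ValueError("stego data too short for requested payload length")
    out = bytearray()
    for i in range(payload_len):
        value = 0
        for carrier_byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier_byte & 1)
        out.append(value)
    return bytes(out)


# Hypothetical carrier: 64 byte values standing in for grayscale pixels.
carrier = bytes(range(64))
mark = b"AI4SS"  # hypothetical 5-byte mark (40 bits)

stego = embed_lsb(carrier, mark)
recovered = extract_lsb(stego, len(mark))
```

Each carrier byte changes by at most 1, which is why LSB embedding is hard to notice in images. It is also fragile: recompression or resizing typically destroys the hidden bits, so practical watermarking schemes usually need more robust embedding strategies.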
The duration of the course is 3 hours.
The repository is organized as follows:
- Examples: contains the various examples shown during the course;
- Literature: contains some reference works that can provide additional details or useful directions to further investigate the topic;
- Scripts: contains the Python scripts used to process the digital images and the network traffic used throughout the course;
- Slides: contains the .pdf version of the slides.
The material used during the course and collected in this repository has been prepared with the help of Angelica Liguori and Marco Zuppelli. Angelica is the owner of the code for watermarking AI models, while Marco prepared the digital media and network traffic examples. Lastly, Massimo Guarascio provided many interesting insights on how to embed data within AI models.
Feel free to contact me at luca.caviglione(AT)cnr.it