This repository provides an implementation of the Adaptive Centroid Shift Loss (AOCloss) method for Audio Deepfake Detection, as described in the corresponding research paper. The approach employs a one-class learning framework that continuously adapts a centroid to represent bonafide audio embeddings while maximizing the distance of spoof embeddings.
from AOC_loss import AOCloss
# Initialize with desired embedding dimension
criterion = AOCloss(embedding_dim=512)
loss = criterion(embeddings, labels)
embeddings
: Tensor of shape(batch_size, embedding_dim)
.labels
: Binary tensor where0
represents bonafide samples and1
represents spoof samples.
The centroid is automatically updated during the forward pass.
criterion.update_centroid(bonafide_embeddings)
- PyTorch >= 1.10
- Python >= 3.8
-
ValueError: Centroid has not been initialized
:- Ensure the batch contains bonafide samples.
-
Negative Loss:
- This is expected due to the range of cosine similarity between
-1
and1
.
- This is expected due to the range of cosine similarity between
This github implementation based on the following paper:
@inproceedings{kim24b_interspeech,
title = {One-class learning with adaptive centroid shift for audio deepfake detection},
author = {Hyun Myung Kim and Kangwook Jang and Hoirin Kim},
year = {2024},
booktitle = {Interspeech 2024},
pages = {4853--4857},
doi = {10.21437/Interspeech.2024-177},
issn = {2958-1796},
}
For issues or questions, feel free to open an issue in the repository.
- This implementation is inspired by research paper.