This is a quick re-implementation of the A-Softmax loss proposed in the paper SphereFace: Deep Hypersphere Embedding for Face Recognition. Please cite the paper if this helps your work.
- I used TensorFlow 1.4.
- I followed the authors' Caffe implementation, sphereface.
- `l` is \lambda in the paper; it balances the modified logits against the original logits.
Set l = 1
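
To illustrate what the balancing term does, here is a minimal sketch in plain Python (no TensorFlow) of the target-class logit as I understand it from the SphereFace paper: the original logit `||x|| cos(theta)` and the margin-modified logit `||x|| psi(theta)` are mixed as `(lambda * original + modified) / (1 + lambda)`. The function names `psi` and `blended_target_logit`, the default `m=4`, and the assumption that `l` in this repository weights the two terms exactly this way are mine, not taken from the code here.

```python
import math


def psi(theta, m=4):
    """Angular margin function: (-1)^k * cos(m * theta) - 2k,
    where theta lies in [k*pi/m, (k+1)*pi/m] and theta is in [0, pi]."""
    # k is the index of the interval that theta falls into (capped at m - 1
    # so that theta == pi stays in the last interval).
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k


def blended_target_logit(x_norm, theta, lam, m=4):
    """Mix the original and margin-modified target logits (hypothetical helper).

    lam plays the role of lambda: a large lam keeps the logit close to the
    plain softmax logit, lam = 0 uses the modified logit only.
    """
    original = x_norm * math.cos(theta)
    modified = x_norm * psi(theta, m)
    return (lam * original + modified) / (1.0 + lam)


# With lam = 1 the two logits are simply averaged.
print(blended_target_logit(x_norm=1.0, theta=math.pi / 2, lam=1.0))
```

Under this form, `l = 1` gives the original and modified logits equal weight, while a very large `l` makes the loss behave almost like ordinary softmax.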
My observation is that the hyper-parameters used in the Caffe implementation do not work well in TF. A-Softmax generally improves accuracy by about 2% on LFW when trained on CASIA. The best accuracy I got is about 98.X%. Tuning the hyper-parameters to match the accuracy of the Caffe implementation seems to be quite tricky.