Using product listing images and seller-written text descriptions from an e-commerce company (Shopee), identify identical products listed by different vendors.
35,000 listing images and descriptions, written in English, Indonesian, or both.
Create a combined embedding space of image and text, then quantify the similarity of listings by cosine distance.
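
The matching step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: combining the two modalities by concatenating normalized vectors, the function names, the toy vectors, and the 0.8 threshold are all assumptions made for the example. It works with cosine similarity, which is one minus the cosine distance.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def combine(image_emb, text_emb):
    """Assumed combination: concatenate the normalized embeddings, then renormalize."""
    return l2_normalize(l2_normalize(image_emb) + l2_normalize(text_emb))

def cosine_similarity(a, b):
    """For unit vectors the dot product is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def find_matches(embeddings, threshold=0.8):
    """For each listing, collect the indices of listings at or above the threshold."""
    return [
        [j for j, other in enumerate(embeddings)
         if cosine_similarity(emb, other) >= threshold]
        for emb in embeddings
    ]

# Toy 2-D "image" and "text" embeddings: listings 0 and 1 are near-duplicates.
listings = [
    combine([1.0, 0.0], [0.0, 1.0]),
    combine([0.9, 0.1], [0.1, 0.9]),
    combine([0.0, 1.0], [1.0, 0.0]),
]
matches = find_matches(listings)
print(matches)  # [[0, 1], [0, 1], [2]]
```

Each listing always matches itself, so every predicted group contains at least one item. At the scale of 35,000 listings the pairwise loop would normally be replaced by a vectorized or approximate nearest-neighbor search.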
EfficientNet-B3 (image) & BERT (text) + FC layer + ArcFace loss
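
To show what the ArcFace part contributes, here is a simplified sketch of how ArcFace modifies the logits before softmax cross-entropy: it adds an angular margin to the angle between an embedding and its true class center, which pushes embeddings of the same product closer together. The function name, the `scale` and `margin` values, and the toy inputs are illustrative assumptions, not the settings used in this project.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def arcface_logits(embedding, class_weights, label, scale=30.0, margin=0.5):
    """Return scaled cosine logits with an additive angular margin on the true class.

    embedding:     output of the FC layer for one listing
    class_weights: one weight vector (class center) per product group
    label:         index of the listing's true product group
    """
    emb = l2_normalize(embedding)
    logits = []
    for idx, weight in enumerate(class_weights):
        cos_theta = sum(x * y for x, y in zip(emb, l2_normalize(weight)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if idx == label:
            # Penalize the true class: cos(theta + m) is smaller than cos(theta)
            # whenever theta + m stays below pi.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits

# Toy example: the embedding points exactly at class 0's center.
logits = arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], label=0)
```

Without the margin the true-class logit here would be `scale * 1.0`; with it, the logit drops to `scale * cos(margin)`, so the network has to shrink the angle further to reach the same loss. The margin is only applied during training; at inference the embeddings are compared directly.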
Micro-averaged F1-score of ~0.73 on unseen test data.
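
For reference, one way a micro-averaged F1 over predicted match sets can be computed is sketched below. It assumes each listing has a predicted set and a ground-truth set of matching listing IDs, and that true positives, false positives, and false negatives are pooled across all listings before computing F1. The function name and the toy data are hypothetical; the exact evaluation protocol behind the reported score is not specified here.

```python
def micro_f1(predicted, actual):
    """Pool TP/FP/FN over all listings, then compute a single F1 score."""
    tp = fp = fn = 0
    for pred, true in zip(predicted, actual):
        pred, true = set(pred), set(true)
        tp += len(pred & true)   # correctly predicted matches
        fp += len(pred - true)   # predicted but not actual matches
        fn += len(true - pred)   # actual matches that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy example: listing 1 has one spurious match (listing 2).
predicted = [{0, 1}, {0, 1, 2}, {2}]
actual = [{0, 1}, {0, 1}, {2}]
score = micro_f1(predicted, actual)  # 5 TP, 1 FP, 0 FN
```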