ImageNet 21k based filtered dataset #83

isidentical · 2024-05-12T01:07:12Z

Image-based filtering. We select a subset of examples whose visual content overlaps with ImageNet
classes. After applying English language (fasttext) and caption length filtering, we cluster the
image embeddings extracted by the OpenAI ViT-L/14 model for each image into 100K groups using
Faiss [75]. We then find the nearest neighbor group for every ImageNet training example, and keep
examples belonging to these groups. We apply this procedure using either ImageNet-21K (14M
images) or ImageNet-1K (1.2M images), forming two subsets.
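
A minimal sketch of the procedure quoted above, for readers who want to see the three steps concretely. It is not the DataComp implementation: the function names `nearest_centroid` and `image_based_filter` are hypothetical, the centroids are assumed to be already computed (the paper forms 100K clusters with Faiss from ViT-L/14 embeddings), a brute-force search in plain Python stands in for Faiss, and squared Euclidean distance is an assumption since the excerpt does not name the metric.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_centroid(x: Vector, centroids: Sequence[Vector]) -> int:
    """Return the index of the centroid closest to x (squared Euclidean distance, assumed)."""
    def sq_dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


def image_based_filter(
    pool: Sequence[Vector],
    centroids: Sequence[Vector],
    reference: Sequence[Vector],
) -> List[int]:
    """Return indices of pool examples whose cluster is nearest to some reference example."""
    # Step 1: assign every pool embedding to its cluster.
    pool_clusters = [nearest_centroid(x, centroids) for x in pool]
    # Step 2: collect the clusters that are the nearest group for at least
    # one reference (e.g. ImageNet training) embedding.
    kept_clusters = {nearest_centroid(r, centroids) for r in reference}
    # Step 3: keep only the pool examples that fall in those clusters.
    return [i for i, c in enumerate(pool_clusters) if c in kept_clusters]


# Toy 2-D data standing in for real image embeddings (illustrative only).
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pool = [(0.1, 0.2), (9.8, 0.1), (0.3, 9.9), (10.2, -0.1)]
reference = [(9.0, 1.0), (11.0, 0.5)]

kept = image_based_filter(pool, centroids, reference)
print(kept)  # [1, 3]
```

Swapping `reference` between ImageNet-1K and ImageNet-21K embeddings is the only change needed to produce the two subsets the paper describes; a larger reference set will generally touch more clusters and so keep more of the pool.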

In the paper, the description of the "Image filters" says that either ImageNet-21K or ImageNet-1K can be used. Looking at the code, however, and at DataComp 1B in particular, it looks like only IN1K is used. Is there a version of DataComp 1B filtered with IN21K?

sagadre · 2024-05-15T14:53:57Z

Hi @isidentical, thanks for the question! In our scaling experiments we scaled both the IN1k and IN21k strategies up to the large pool (filtering 1.28B samples). Looking at Table 27 in the paper and comparing the rows Image-based clustering (ImageNet1k) and Image-based clustering (ImageNet21k), we noticed average performance of 0.481 vs. 0.471. Hence, we only scaled up the IN1k strategy to the xlarge pool (filtering 12.8B samples). Unfortunately, we don't have an IN21k version of DataComp 1B on hand.
