
Outputs of inception network seem to be batch-size dependent #43

Closed
nicolas-dufour opened this issue Mar 14, 2023 · 4 comments

@nicolas-dufour

As mentioned in Lightning-AI/torchmetrics#1620, the inception network seems to produce different results depending on the batch_size.

Something I tried that seems to solve this issue is to run the network in float64.

This batch dependence can lead to considerable differences in FID. As mentioned in Lightning-AI/torchmetrics#1620, the batch bias in FID appears to be higher for small batch sizes. If we compute FID between two uniformly sampled distributions with 1000 points each, a batch size of 1000 gives an FID of 1.9, but a batch size of 2 gives an FID of 10. Since we sample from the same distribution, the FID should be as close to zero as possible.
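
For reference, here is a minimal sketch of the kind of comparison described above, using torchmetrics' FrechetInceptionDistance on two sets of uniform-noise images drawn from the same distribution. The image size and noise model are assumptions for illustration, and the exact FID values will differ from the numbers quoted:

import torch
from torchmetrics.image.fid import FrechetInceptionDistance

def fid_same_distribution(batch_size, n=1000, seed=0):
    torch.manual_seed(seed)
    fid = FrechetInceptionDistance(feature=2048)
    # two independent sets of uniform-noise uint8 images from the same distribution
    real = torch.randint(0, 256, (n, 3, 299, 299), dtype=torch.uint8)
    fake = torch.randint(0, 256, (n, 3, 299, 299), dtype=torch.uint8)
    for i in range(0, n, batch_size):
        fid.update(real[i:i + batch_size], real=True)
        fid.update(fake[i:i + batch_size], real=False)
    return fid.compute().item()

print(fid_same_distribution(batch_size=1000))  # large batches: FID close to zero
print(fid_same_distribution(batch_size=2))     # tiny batches: noticeably larger FID per the report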

Possible Fix

# assumes `inception` is the InceptionV3 feature extractor and `imgs` a preprocessed image batch
inception = inception.to(torch.float64)
imgs = imgs.to(torch.float64)
features = inception(imgs)
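
For context, a self-contained sketch of the same workaround, using torchvision's InceptionV3 as a stand-in for torch-fidelity's internal feature extractor (the model choice and random input batch are assumptions for illustration only):

import torch
from torchvision.models import inception_v3, Inception_V3_Weights

# stand-in feature network; torch-fidelity uses its own InceptionV3 variant
model = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1).eval().to(torch.float64)

imgs = torch.rand(8, 3, 299, 299, dtype=torch.float64)  # placeholder batch
with torch.no_grad():
    full = model(imgs)                                                  # one batch of 8
    split = torch.cat([model(imgs[i:i + 2]) for i in range(0, 8, 2)])   # four batches of 2
print((full - split).abs().max())  # in float64 the batch-size discrepancy should be negligible
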
@toshas
Owner

toshas commented Mar 20, 2023

Thanks for this report! Is there any insight into the nature of this issue? Of course, it is good that conversion to float64 fixes it, but IMO it cannot be considered a clear-cut fix. What would the actual fix look like? Does it happen on all GPUs/cuDNN versions, or is it a narrower issue?
If this is something stemming from the drivers or hardware, then fixing it with a conversion to double should of course be an option. However, it is not clear to me that this is the only way to go, or how involved the other ways are.

@nicolas-dufour
Author

I've tried it with CUDA 11.8 and PyTorch 1.13 running on an NVIDIA 3090.

@toshas toshas closed this as completed in 028118e Apr 30, 2023
@toshas
Owner

toshas commented Apr 30, 2023

I have checked in a way to specify the double type in all feature extractors, and I also checked in a separate test suite to track down discrepancies arising from batch size. Thanks for the report! The 0.4.0 release is coming soon.
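
A rough sketch of how this might look from the library's high-level API once 0.4.0 is out; calculate_metrics and the frechet_inception_distance key exist today, but the feature_extractor_internal_dtype parameter name is an assumption based on the fix described above and should be checked against the released documentation:

import torch_fidelity

metrics = torch_fidelity.calculate_metrics(
    input1='path/to/generated/images',
    input2='path/to/reference/images',
    fid=True,
    feature_extractor_internal_dtype='float64',  # assumed parameter name; run the extractor in double precision
)
print(metrics['frechet_inception_distance'])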

@nicolas-dufour
Author

Thanks for solving this!
