Xref FluxML/Metalhead.jl#105 (comment). When running gradient tests on some Metalhead models, the times taken for the first gradient are off the charts. Specifically, the function gradtest was run on the models in Metalhead#master:
function gradtest(model, input)
    y, pb = Zygote.pullback(() -> model(input), Flux.params(model))
    gs = pb(ones(Float32, size(y)))
    # if we make it to here with no error, success!
    return true
end
and the times taken for the tests were:
The ViT model can be ignored because gradtest wasn't run on it, but it seems to be taking quite a large amount of time for precompilation of the other models...
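One way to check how much of the reported time is compilation rather than the gradient computation itself is to time the same call twice in a fresh session. The sketch below is only an illustration: the small `Chain` is a hypothetical stand-in for the Metalhead models (it is not one of the models tested above), the input size is arbitrary, and it assumes a Flux version that still supports implicit parameters via `Flux.params`, as used in the issue.

```julia
using Flux, Zygote

# Hypothetical stand-in model; the Metalhead models in the issue are far larger.
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))
input = rand(Float32, 4, 16)  # 4 features, batch of 16 (arbitrary sizes)

# Same function as in the issue.
function gradtest(model, input)
    y, pb = Zygote.pullback(() -> model(input), Flux.params(model))
    gs = pb(ones(Float32, size(y)))
    # if we make it to here with no error, success!
    return true
end

# The first call includes compiling the forward pass and the pullback.
t_first = @elapsed gradtest(model, input)

# The second call reuses the compiled code, so it mostly measures runtime.
t_second = @elapsed gradtest(model, input)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

If the second call is much faster than the first, the difference is a rough estimate of the compilation overhead for that model and input type.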
cc @CarloLucibello