Distributed pmap memory leak using metaprogramming with MWE #37560
Comments
As written, this is expected, since generated code does not get freed.
@Keno Thanks for the attention given to this issue. I wonder if this is recognized as a bug for Flux? Is there a temporary workaround for this?
Closing as a duplicate of #14495. For your particular use case, can you redefine the method rather than define new modules/functions?
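To make the distinction in the question above concrete, here is a minimal sketch of the two patterns. The names `apply_update`, `redefine_update`, and `define_new` are hypothetical and chosen only for illustration; this is not Zygote's code, and the thread does not establish whether the first pattern actually reduces memory growth. It only shows that one pattern keeps a single method while the other keeps adding functions.

```julia
# Declare the generic function once so its binding exists up front.
function apply_update end

# Hypothetical pattern 1: replace the single method of `apply_update` in place.
function redefine_update(k)
    Core.eval(@__MODULE__, :(apply_update(x) = x + $k))
    return nothing
end

# Hypothetical pattern 2: create a new, uniquely named function on every call.
function define_new(k)
    name = Symbol("apply_update_", k)
    return Core.eval(@__MODULE__, :($name(x) = x + $k))
end

for k in 1:3
    redefine_update(k)   # still one function with one method
    define_new(k)        # three distinct functions after the loop
end

# `invokelatest` calls the most recent definition regardless of world age.
println("methods of apply_update: ", length(methods(apply_update)))
println("apply_update(10) = ", Base.invokelatest(apply_update, 10))
```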
@simonbyrne This affects all the Flux/Zygote backpropagation code.
@freddycct are you saying that Zygote requires you to define new modules or functions rather than redefine existing methods?
@simonbyrne, Zygote internally defines new functions for each backpropagation. I'm not sure why the Zygote devs are not commenting. The above is just an abstraction, or minimal example.
I suggest opening an issue on Zygote.jl then.
I discovered this issue while using Flux/Zygote. I am training a large, complex model with Julia's distributed pmap. This worked fine under Flux/Tracker, but after moving to Zygote, memory consumption grows over time. Calling GC.gc() does not help: observing memory usage with top on Linux or Activity Monitor on macOS shows it growing until the OS terminates the program for running out of memory. It looks like pmap doesn't free the memory generated through the use of Julia's metaprogramming features.
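The growth described above can also be observed from inside Julia instead of through top. The following is a rough sketch under stated assumptions, not the original minimal working example: it uses no Distributed workers and no Zygote, and the helper `define_and_call` and the `generated_` naming scheme are hypothetical. It defines many uniquely named functions with `eval`, forces a garbage collection, and reports the change in peak resident set size via `Sys.maxrss()`.

```julia
# Hypothetical helper: define `n` uniquely named functions with `eval`
# and call each one once so that it gets compiled.
function define_and_call(n, offset)
    for i in 1:n
        name = Symbol("generated_", offset + i)
        f = Core.eval(@__MODULE__, :($name(x) = x + $i))
        # `invokelatest` avoids world-age errors when calling a function
        # that was defined after `define_and_call` started running.
        Base.invokelatest(f, 1)
    end
end

before = Sys.maxrss()        # peak resident set size in bytes
define_and_call(1000, 0)
GC.gc()                      # collects objects, but compiled code is not freed
after = Sys.maxrss()

println("peak RSS grew by ", (after - before) / 2^20, " MiB")
```

Because `Sys.maxrss()` reports a peak value, it never decreases; it is useful for confirming growth, not for showing that memory was released.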
The machines I have experimented on:
Minimum working example
@MikeInnes, trying to get your attention since you are the author of the Zygote/src/compiler/*.jl files.