Profiler: Show memory state on deferred allocation OOM #1797
Comments
@manopapad I think the real challenge of this is picking a visualization tool. I can dump all the data out of Legion to make that picture with, say, graphviz or matplotlib, but there are going to be hundreds, if not thousands, of instances and holes to report. I think we need a more dynamic visualization tool for rendering that, because a single zoomed-in representation is not going to be comprehensible to a human. Users need to see the overall picture at large scale and then be able to zoom in on the things they want to look at. Do you have thoughts on how you'd want to do that? Alternatively, we can do a text-based representation for now and just have a tool that reports the largest holes in sorted order and the total size of all holes.
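As a rough illustration of the text-based alternative, here is a minimal sketch of a hole report. It assumes the dump can be reduced to a memory capacity plus a list of (offset, size) pairs for live allocations; that input format and the names `find_holes` and `hole_report` are hypothetical and not part of Legion.

```python
from typing import List, Tuple


def find_holes(capacity: int, instances: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return free gaps as (offset, size) pairs.

    `instances` is a hypothetical list of (offset, size) allocations
    inside a memory of `capacity` bytes.
    """
    holes = []
    cursor = 0  # first byte not yet known to be covered by an allocation
    for offset, size in sorted(instances):
        if offset > cursor:
            holes.append((cursor, offset - cursor))
        cursor = max(cursor, offset + size)
    if cursor < capacity:
        holes.append((cursor, capacity - cursor))
    return holes


def hole_report(capacity: int, instances: List[Tuple[int, int]], top: int = 10) -> str:
    """Format the total free space and the largest holes, biggest first."""
    holes = sorted(find_holes(capacity, instances), key=lambda h: h[1], reverse=True)
    total = sum(size for _, size in holes)
    lines = [f"total free: {total} bytes in {len(holes)} holes"]
    for offset, size in holes[:top]:
        lines.append(f"  hole at offset {offset}: {size} bytes")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hole_report(100, [(10, 20), (50, 10)]))
```

Sorting by hole size makes the fragmentation question ("is there any single hole big enough for the failed allocation?") answerable from the first line or two of output.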
Yes, we can start with a text dump for now, and iterate on the actual visualization. Maybe @bryevdv has a good idea. One more thing to note, in Legate we would also like to include additional information in this visualization, e.g. which user-level object corresponds to each field, so we would need to dump additional information on top of this.
So my plan was to add the following method to the mapper runtime:
Any mapper could invoke that at any time to dump the state of a particular memory. You don't have to wait until you are OOM; you can do it as many times as you want throughout your run. I'm not promising that it will be fast, as it will finish writing to the file and close the file before returning, but there's nothing stopping you from using it periodically. What would you add to that function call to record what you want, and how would you then write the tool to parse it?
I don't think we would add extra information to the call directly, but would possibly include extra information in the output file. In particular, we'd want to record which Legate-level Stores correspond to which Legion fields, and record relevant information on the Stores that would help a user track values back to their code:
Separating out a side discussion from #1739.
In order to get a full picture of memory usage, we would need to visualize a number of different objects that take up space on a Realm memory, some of which are only visible internally to the Runtime:
- PhysicalInstances
- DeferredBuffers / DeferredValues
- Future instances

We also need a way to let the mapper request this logging (today e.g. the DefaultMapper simply aborts on deferred allocation failure).
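To make the "full picture" idea concrete, here is a minimal sketch of a per-kind usage summary. It assumes each dumped object can be represented as a (kind, size in bytes) record; that record shape and the function names are hypothetical, since the actual dump format is not settled in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def usage_by_kind(records: Iterable[Tuple[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Aggregate hypothetical (kind, size) records into kind -> (count, bytes)."""
    totals = defaultdict(lambda: [0, 0])
    for kind, size in records:
        totals[kind][0] += 1      # number of objects of this kind
        totals[kind][1] += size   # bytes occupied by this kind
    return {kind: (count, nbytes) for kind, (count, nbytes) in totals.items()}


def format_usage(records: Iterable[Tuple[str, int]]) -> str:
    """One line per kind, largest memory consumer first."""
    rows = sorted(usage_by_kind(records).items(), key=lambda kv: kv[1][1], reverse=True)
    return "\n".join(
        f"{kind}: {count} objects, {nbytes} bytes" for kind, (count, nbytes) in rows
    )


if __name__ == "__main__":
    sample = [
        ("PhysicalInstance", 4096),
        ("DeferredBuffer", 1024),
        ("PhysicalInstance", 2048),
        ("Future", 8),
    ]
    print(format_usage(sample))
```

A summary like this would show at a glance whether the memory is dominated by mapper-visible instances or by runtime-internal objects.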