-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MLA-1762] reduce memory allocations from DiscreteActionOutputApplier #4922
Conversation
public int Sample(float[] cmf) | ||
/// <param name="branchSize">The number of possible branches, i.e. the effective size of the cmf array.</param> | ||
/// <returns>A sampled index from the CMF ranging from 0 to branchSize-1.</returns> | ||
public int Sample(float[] cmf, int branchSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the float[] might be larger than we need now, we also pass the effective size of the array.
We could instead (or in additionally) repeat the final sumProb
value in ComputeCdf()
readonly ActionSpec m_ActionSpec; | ||
readonly int[] m_StartActionIndices; | ||
readonly float[] m_cdfBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_CdfBuffer ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
|
||
if (src.data.batch != dst.data.batch) | ||
{ | ||
throw new ArgumentException("Batch size for input and output data is different!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these exceptions no longer needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure they were ever necessary in the first place. The temporary tensors they were referencing are no longer needed.
Proposed change(s)
There were several intermediate tensors and float arrays (1d and 2d) allocated on each call. This reuses a single float[] for the CDF, and eliminates the other intermediate allocations. Now for each action, we compute the CDF and sample the result directly to the output ActionBuffers.DiscreteActions.
Before:


After:
Types of change(s)
Checklist
Other comments