[GPU] Add GPU plugin metric to get statistics of GPU memory allocated by engine #7758
Conversation
Force-pushed 3c2b682 to b08b35b
Force-pushed e1bc728 to 910d95d
@vladimir-paramuzov @iefode @avladimi Could you review this PR?
Force-pushed 910d95d to ff02018
Force-pushed ff02018 to 66d3134
Force-pushed 66d3134 to 95c3795
if (m_context != nullptr) {
    auto impl = getContextImpl(m_context);
    impl->acquire_lock();
    std::shared_ptr<cldnn::engine> eng = impl->GetEngine();
I'm not a clDNNPlugin expert, but where is the loop over multiple graphs (streams)?
If this is NOT per-network, but the overall (plugin) memory footprint, as https://github.com/openvinotoolkit/openvino/pull/7758/files#diff-0931e76736f996144670536e6a3712a65ad7277637ff564b7511fe8b827223a1R112 suggests... then the metric should be for the plugin, not for the ExecNetwork.
The engine object of the cldnn library already stores the accumulated memory usage across multiple graphs. And since the requirement of this PR is to query memory usage statistics after LoadNetwork, we decided to implement it as an ExecNetwork API.
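For illustration, a rough sketch of the accumulation idea described here; the class, its member names, and the allocation_type values are stand-ins, not the actual cldnn engine internals (only get_used_device_memory mirrors a name from the quoted header):

#include <atomic>
#include <cstddef>
#include <cstdint>

// Illustrative only: the engine keeps one running counter per allocation type.
// Every graph/stream built on this engine bumps the shared counters, so a query
// after LoadNetwork reflects all graphs created by that engine so far.
class engine_accounting_sketch {
public:
    enum class allocation_type { cl_mem, usm_host, usm_device };  // hypothetical types

    void add_memory_used(allocation_type type, uint64_t size) {
        _used[index(type)] += size;            // accumulate across all graphs
    }
    uint64_t get_used_device_memory(allocation_type type) const {
        return _used[index(type)].load();      // current total for one allocation type
    }

private:
    static size_t index(allocation_type type) { return static_cast<size_t>(type); }
    std::atomic<uint64_t> _used[3] = {};       // one counter per allocation type
};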
@andrew-k-park I think @myshevts has a fair point, and @ilya-lavrenov mentioned the same thing in Jira. If it's device-level statistics, then the query should be added to Core.
For an executable network we can track the memory consumption of that particular network, but currently it's just a memory usage snapshot for the whole device.
As I understand it:
- [This PR] statistics as an ExecNetwork metric: this shows the entire memory footprint of the network, including all streams requested for that network.
- Statistics as a Core metric, if we add it in the future: this would show the entire memory footprint of all networks and all streams on the target device.
In this sense I could not understand what @vladimir-paramuzov meant by "but currently it's just memory usage snapshot for the whole device." ...
@yeonbok
As far as I can see, the current impl works as follows:
auto net1 = ie.LoadNetwork(model1);
auto stat1 = net1.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)); // mem consumption for net1 only
auto net2 = ie.LoadNetwork(model2);
auto stat2 = net2.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)); // mem consumption for net1 + net2
stat2 contains unexpected values for a metric of an executable network.
How it should work:
auto net1 = ie.LoadNetwork(model1);
auto stat1 = net1.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)); // mem consumption for net1 only
auto net2 = ie.LoadNetwork(model2);
auto stat2 = net2.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)); // mem consumption for net2 only
auto stat_global = core.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)); // mem consumption for net1+net2
The latter was our intention, too. @andrew-k-park Could you confirm the behavior?
@yeonbok Based on the current implementation, multiple executable networks share the same context impl, so the first case is the actual behavior. @vladimir-paramuzov My question is: is the latter your expected implementation to meet the requirement?
I thought that the context impl was created for each ExecNetwork. If that is not true and the current behavior is not what we intended, I think we need an additional update so that we stay aligned with the initial concept. The current behavior is somewhat strange in the sense that a device-wide statistic is exposed through the ExecNetwork. Let's come up with an additional fix. Thanks for the review @myshevts @vladimir-paramuzov
@@ -109,6 +109,10 @@ class engine {
    /// Returns the amount of GPU memory of the specified allocation @p type that is currently used by the engine
    uint64_t get_used_device_memory(allocation_type type) const;

    /// Returns statistics of GPU memory allocated by the engine in the current process for all allocation types.
If this is the current process only, how do I track the overall GPU device memory utilization (e.g. how much free memory I have left to load another network)?
Currently, it's impossible to query information about overall GPU memory across multiple networks or multiple processes, because only the total memory size can be retrieved from the GPU device, and memory usage is managed by the engine object in the cldnn library behind the recently called LoadNetwork API. So memory from multiple loaded networks must be tracked on the application side.
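A minimal application-side sketch of that kind of tracking, assuming GPU_METRIC_KEY(MEMORY_STATISTICS) returns a std::map<std::string, uint64_t> keyed by allocation type; the model paths, the header providing the metric key, and the snapshot handling are illustrative assumptions, not part of this PR:

#include <cstdint>
#include <map>
#include <string>
#include <vector>
#include <inference_engine.hpp>
#include <gpu/gpu_config.hpp>  // assumed location of GPU_METRIC_KEY(MEMORY_STATISTICS)

using Stats = std::map<std::string, uint64_t>;

int main() {
    InferenceEngine::Core core;
    std::vector<Stats> snapshots;  // one snapshot per loaded network

    for (const std::string& path : {"model1.xml", "model2.xml"}) {  // illustrative model paths
        auto exec = core.LoadNetwork(core.ReadNetwork(path), "GPU");
        // Query the statistics right after LoadNetwork; per the discussion above,
        // a snapshot may reflect everything the shared engine has allocated so far,
        // not just this particular network.
        snapshots.push_back(exec.GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS)).as<Stats>());
    }
    return 0;
}

This only covers allocations made by the current process, as noted above; it does not report free device memory.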
Force-pushed 95c3795 to a1ea29a
        memory_usage = iter->second.load();
    }
    return memory_usage;
}

void engine::get_memory_statistics(std::map<std::string, uint64_t>* statistics) const {
btw, why is statistics an argument of this method? Can it be std::map<std::string, uint64_t> engine::get_memory_statistics() const?
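A minimal sketch of the return-by-value signature suggested here; engine_sketch and its members (_mutex, _memory_usage) are stand-ins for illustration, not the actual cldnn::engine internals:

#include <atomic>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Illustrative only: return-by-value variant of get_memory_statistics(),
// copying a snapshot of the per-allocation-type counters under a lock.
class engine_sketch {
public:
    std::map<std::string, uint64_t> get_memory_statistics() const {
        std::lock_guard<std::mutex> lock(_mutex);
        std::map<std::string, uint64_t> statistics;
        for (const auto& entry : _memory_usage)
            statistics[entry.first] = entry.second.load();  // read each atomic counter
        return statistics;
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, std::atomic<uint64_t>> _memory_usage;  // hypothetical per-allocation-type counters
};

Returning by value avoids the out-parameter and makes it explicit that the caller receives a snapshot.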
…gine for each allocation type Signed-off-by: Andrew Kwangwoong Park <[email protected]>
Details:
Tickets: