-
Notifications
You must be signed in to change notification settings - Fork 2.3k
[Enhancement] Extend 'show usage resource groups;' with mem_pool metrics #66690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Extend 'show usage resource groups;' with mem_pool metrics #66690
Conversation
🧪 CI InsightsHere's what we observed from your CI run for b9c4141. 🟢 All jobs passed!But CI Insights is watching 👀 |
bf65e1c to
b9c4141
Compare
|
@cursor review |
|
@cursor review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Test assertions missing new columns in expected output
The test assertions at lines 1592-1602 and 1620-1624 expect the old output format with only 6 columns, but the code now produces 10 columns per row (including BEMemLimitBytes, BEMemPool, BEMemPoolInUseMemBytes, and BEMemPoolMemLimitBytes). The header assertion at lines 1562-1563 was correctly updated to include the new columns, but the data row assertions still expect the old format. These tests will fail because the actual output contains 4 additional column values that are not included in the expected strings.
fe/fe-core/src/test/java/com/starrocks/qe/scheduler/QueryQueueManagerTest.java#L1591-L1602
starrocks/fe/fe-core/src/test/java/com/starrocks/qe/scheduler/QueryQueueManagerTest.java
Lines 1591 to 1602 in 4f1d12a
| String res = starRocksAssert.executeShowResourceUsageSql("SHOW USAGE RESOURCE GROUPS;"); | |
| assertThat(res).isEqualTo("Name|Id|Backend|BEInUseCpuCores|BEInUseMemBytes|BERunningQueries\n" + | |
| "default_wg|2|be0-host|3.112|39|38\n" + | |
| "default_mv_wg|3|be1-host|4.11|49|48\n" + | |
| "wg0|10|be0-host|0.112|9|8\n" + | |
| "wg0|10|be1-host|1.11|19|18\n" + | |
| "wg1|11|be0-host|0.1|0|0\n" + | |
| "wg1|11|be1-host|1.1|0|0\n" + | |
| "wg2|12|be0-host|0.12|7|6\n" + | |
| "wg2|12|be1-host|1.12|17|16\n" + | |
| "wg3|13|be0-host|0.03|0|0\n" + | |
| "wg3|13|be1-host|0.13|0|0"); |
fe/fe-core/src/test/java/com/starrocks/qe/scheduler/QueryQueueManagerTest.java#L1619-L1624
starrocks/fe/fe-core/src/test/java/com/starrocks/qe/scheduler/QueryQueueManagerTest.java
Lines 1619 to 1624 in 4f1d12a
| String res = starRocksAssert.executeShowResourceUsageSql("SHOW USAGE RESOURCE GROUPS;"); | |
| assertThat(res).isEqualTo("Name|Id|Backend|BEInUseCpuCores|BEInUseMemBytes|BERunningQueries\n" + | |
| "wg0|10|be0-host|0.21|29|28\n" + | |
| "wg1|11|be0-host|0.2|0|0\n" + | |
| "wg2|12|be1-host|1.22|27|26\n" + | |
| "wg3|13|be1-host|0.23|0|0"); |
|
@cursor review |
|
The failing CI PIPELINE / FE UT check is due to This is very interesting, because this test passes for me locally, and I can see the new columns when executing The columns in the Any idea why the CI run might not be getting the expected result? |
|
@cursor review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Bugbot reviewed your changes and found no bugs!
|
@mergify rebase |
✅ Branch has been successfully rebased |
080a7ca to
c1421a2
Compare
|
@arin-mirza |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Bugbot reviewed your changes and found no bugs!
|
Regarding the failing BE UT check Seems unrelated. Flaky? |
[FE Incremental Coverage Report]✅ pass : 24 / 24 (100.00%) file detail
|
|
@mergify rebase |
…age metrics Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
…olumns Signed-off-by: arin-mirza <a.mirza@celonis.com>
Signed-off-by: arin-mirza <a.mirza@celonis.com>
✅ Branch has been successfully rebased |
8e750b3 to
a07cc5e
Compare
|
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
|
@Mergifyio backport branch-4.0 |
✅ Backports have been createdDetails
|
[BE Incremental Coverage Report]❌ fail : 13 / 18 (72.22%) file detail
|
…ics (StarRocks#66690) Signed-off-by: arin-mirza <a.mirza@celonis.com> Signed-off-by: Farhad Shahmohammadi <f.shahmohammadi@celonis.com>
…ics (StarRocks#66690) Signed-off-by: arin-mirza <a.mirza@celonis.com>



Why I'm doing:
The
mem_poolproperty for resource groups was recently introduced. See related pull requests below:Currently, there is no way to display the usage metrics of the memory pools.
What I'm doing:
This pull request extends the result of
show usage resource groups;command with the following:BEMemLimitBytes) under the same memory pool to the same value. This value also corresponds to the memory limit of the memory pool (BEMemPoolMemLimitBytes). The displayed values in these two columns will coincide due to this restriction, but it might change in the future.BEMemPoolInUseMemBytesis the sum of memory usages of each resource group that belongs to that memory pool. See the screenshots below for an example scenario.Notes/Discussion
group_versionaTResourceGroupUsageobject belongs to. I tried two different approaches:resorce_group_usage_recorder.cpp, I added an auxillaryVersionedResourceGroupUsagestruct which had a group_version and aTResourceGroupUsageobject as a pair. This solution worked, but it is just too messy for something as simple as this. See b9c4141group_versionfield to theResourceUsage.thriftfile and directly use it. This is much cleaner. See bba2165group_versionis only used byresorce_group_usage_recorder.cppin the backend, and is not read from the thrift file on the front end side.Demo
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
Note
Enhances resource group usage reporting with memory‑pool metrics end-to-end.
SHOW USAGE RESOURCE GROUPSwithBEMemLimitBytes,BEMemPool,BEMemPoolInUseMemBytes,BEMemPoolMemLimitBytes; updates metadata, rendering, and testsResourceGroupUsageRecordernow includesgroup_version,mem_limit_bytes,mem_pool, and mem‑pool limit/usage; merges per‑group stats preferring latestgroup_versionWorkGroupexposes parent memory limit/usage helpers used for mem‑pool valuesComputeNode.ResourceGroupUsageextended to carry new fields and map from thriftTResourceGroupUsagefor mem limits/pool andgroup_versionWritten by Cursor Bugbot for commit a07cc5e. This will update automatically on new commits. Configure here.