[#6238] improvement(storage): Improve get role performance when roles is bound to many metadata. #6455
At Xiaomi, we found that when a role is bound to many securable objects, getting the role is very slow, so we are trying to fix the get role function.
c811185
to
1860a7a
core/src/main/java/org/apache/gravitino/storage/relational/service/RoleMetaService.java
Outdated
}
});

// Since there are many comparisons of RoleEntity in the unit tests,
// Since there are many comparisons of RoleEntity in the unit tests,
// Since there are many comparisions of RoleEntity in the unit tests,
core/src/test/java/org/apache/gravitino/storage/relational/service/TestRoleMetaService.java
Outdated
core/src/main/java/org/apache/gravitino/storage/relational/service/RoleMetaService.java
Outdated
public static Map<Long, String> getMetadataObjectFullNames(Long metalakeId, List<Long> ids) {
  Map<Long, String> catalogIdAndNameMap = getCatalogIdAndNameMap(metalakeId);
  Map<Long, Map<Long, String>> schemaIdAndNameMap = getSchemaIdAndNameMap(metalakeId);
We don't need to get all the schemas of the metalake. We can build the schema ID list from the fileset list.
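A minimal sketch of this suggestion, not the PR's actual code: `FilesetRow` is a hypothetical stand-in for `FilesetPO`, and `schemaIdsOf` is an illustrative helper name. It collects the distinct schema IDs referenced by the filesets, so that only those schemas would need to be queried.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SchemaIdCollector {
  // Hypothetical stand-in for FilesetPO, reduced to the IDs used here.
  static final class FilesetRow {
    final long filesetId;
    final long catalogId;
    final long schemaId;

    FilesetRow(long filesetId, long catalogId, long schemaId) {
      this.filesetId = filesetId;
      this.catalogId = catalogId;
      this.schemaId = schemaId;
    }
  }

  // Collect the distinct schema IDs referenced by the given filesets,
  // so a later query can be limited to those schemas instead of
  // loading every schema in the metalake.
  static Set<Long> schemaIdsOf(List<FilesetRow> filesets) {
    return filesets.stream().map(row -> row.schemaId).collect(Collectors.toSet());
  }
}
```

The resulting set could then be passed to a batch query that filters on schema ID.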
.forEach(
    (type, objects) -> {
      // If the type is Fileset, use the batch retrieval interface;
      // otherwise, use the single retrieval interface
Maybe you can add a TODO to get other securable objects using batch retrieval?
.collect(Collectors.toList());

Map<Long, String> filesetIdAndNameMap =
    getMetadataObjectFullNames(po.getMetalakeId(), filesetIds);
So the name should be getFilesetObjectFullNames, right?
// Since there are many comparisons of RoleEntity in the unit tests,
// and the order after grouping by is different each time,
// the results are sorted by fullName here to ensure consistent query results.
securableObjects.sort(Comparator.comparing(MetadataObject::fullName));
This is a temporary solution to make it work in UTs. It's the UTs that we should change. If there are no major performance issues, it's okay with me.
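One way the unit tests could be changed, sketched under assumptions: `sameElementsIgnoringOrder` is a hypothetical test helper, not something in the Gravitino code base. It sorts copies of both lists inside the test, so production code would not need to sort just to satisfy assertions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderInsensitiveCompare {
  // Hypothetical test helper: compares two lists while ignoring element order.
  // It sorts copies, so the caller's lists are left untouched.
  static <T> boolean sameElementsIgnoringOrder(
      List<T> expected, List<T> actual, Comparator<? super T> order) {
    List<T> left = new ArrayList<>(expected);
    List<T> right = new ArrayList<>(actual);
    left.sort(order);
    right.sort(order);
    return left.equals(right);
  }
}
```

In a test for securable objects, the comparator could be the same `Comparator.comparing(MetadataObject::fullName)` used in the snippet above.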
1860a7a
to
0dc985e
@@ -35,6 +36,24 @@ public String listSchemaPOsByCatalogId(@Param("catalogId") Long catalogId) {
+ " WHERE catalog_id = #{catalogId} AND deleted_at = 0";
}

public String listSchemaPOsByCatalogIds(@Param("catalogIds") List<Long> catalogIds) {
Maybe this should be listSchemaPOsBySchemaIds.
0dc985e
to
a638ad3
filesetPOs.forEach(
    filesetPO -> {
      String catalogName = catalogIdAndNameMap.get(filesetPO.getCatalogId());
Will catalogName or schemaName be null if catalog or schema is deleted?
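To illustrate the concern, here is a sketch of one possible policy, which is an assumption and not what the PR necessarily does: skip any fileset whose catalog or schema name cannot be resolved. `FilesetRow`, the flat name maps, and the dot-separated full name format are all illustrative choices.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FullNameBuilder {
  // Hypothetical stand-in for FilesetPO.
  static final class FilesetRow {
    final long id;
    final long catalogId;
    final long schemaId;
    final String name;

    FilesetRow(long id, long catalogId, long schemaId, String name) {
      this.id = id;
      this.catalogId = catalogId;
      this.schemaId = schemaId;
      this.name = name;
    }
  }

  // Build fileset ID -> full name, skipping rows whose parent catalog or
  // schema is missing from the lookup maps (for example, because it was deleted).
  static Map<Long, String> fullNames(
      List<FilesetRow> rows, Map<Long, String> catalogNames, Map<Long, String> schemaNames) {
    Map<Long, String> result = new HashMap<>();
    for (FilesetRow row : rows) {
      String catalogName = catalogNames.get(row.catalogId);
      String schemaName = schemaNames.get(row.schemaId);
      if (catalogName == null || schemaName == null) {
        continue; // Assumed policy: drop entries with a missing parent.
      }
      // Assumed format: catalog.schema.fileset
      result.put(row.id, catalogName + "." + schemaName + "." + row.name);
    }
    return result;
  }
}
```

Whether to skip, log, or fail on a missing parent is a design decision that this sketch does not settle.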
What changes were proposed in this pull request?
Fix issue #6238.
Improve performance when a single role is bound to many metadata objects.
Why are the changes needed?
Use batch queries to get the full names of a role's securable objects, instead of issuing one query per securable object in a loop.
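The difference between the two approaches can be sketched with an in-memory stand-in for the database. `FakeStore` and its methods are hypothetical and exist only to count round trips. They are not Gravitino or MyBatis APIs.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchLookupDemo {
  // Hypothetical in-memory table that counts how many queries it receives.
  static final class FakeStore {
    private final Map<Long, String> rows;
    int queryCount = 0;

    FakeStore(Map<Long, String> rows) {
      this.rows = rows;
    }

    // One round trip per ID.
    String selectNameById(long id) {
      queryCount++;
      return rows.get(id);
    }

    // One round trip for the whole ID list (think: WHERE id IN (...)).
    Map<Long, String> selectNamesByIds(Collection<Long> ids) {
      queryCount++;
      Map<Long, String> found = new HashMap<>();
      for (Long id : ids) {
        String name = rows.get(id);
        if (name != null) {
          found.put(id, name);
        }
      }
      return found;
    }
  }

  // Loop approach: N IDs cost N queries.
  static Map<Long, String> loopLookup(FakeStore store, List<Long> ids) {
    Map<Long, String> result = new HashMap<>();
    for (Long id : ids) {
      String name = store.selectNameById(id);
      if (name != null) {
        result.put(id, name);
      }
    }
    return result;
  }

  // Batch approach: N IDs cost a single query.
  static Map<Long, String> batchLookup(FakeStore store, List<Long> ids) {
    return store.selectNamesByIds(ids);
  }
}
```

Both methods return the same map. Only the number of round trips differs, which is where the time goes when a role is bound to many objects.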
Fix: #6238
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests and integration tests have all passed. This feature has been running internally at Xiaomi for two weeks.