-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-25368 Filter out more invalid encoded name in isEncodedRegionNa… #2753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
| if (regionName.length > MD5_HEX_LENGTH) { | ||
| return false; | ||
| } else if (regionName.length == MD5_HEX_LENGTH) { | ||
| return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we protecting against? Could we be passed a tablename? If so, why can't we have a tablename that is an MD5? Or 32 bytes in size? Should we check it is all hex at least? I suppose if someone passes a tablename that is an md5, then they are asking for trouble?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @saintstack. Two issues are addressed.
For compact/flush/split with the tableName,
` # Requests a table or region or column family major compaction
def major_compact(table_or_region_name, family = nil, type = 'NORMAL')
family_bytes = nil
family_bytes = family.to_java_bytes unless family.nil?
compact_type = nil
if type == 'NORMAL'
compact_type = org.apache.hadoop.hbase.client.CompactType::NORMAL
elsif type == 'MOB'
compact_type = org.apache.hadoop.hbase.client.CompactType::MOB
else
raise ArgumentError, 'only NORMAL or MOB accepted for type!'
end
begin
if family_bytes.nil?
@admin.majorCompactRegion(table_or_region_name.to_java_bytes)
else
@admin.majorCompactRegion(table_or_region_name.to_java_bytes, family_bytes)
end
rescue java.lang.IllegalArgumentException, org.apache.hadoop.hbase.UnknownRegionException
if family_bytes.nil?
@admin.majorCompact(TableName.valueOf(table_or_region_name), compact_type)
else
@admin.majorCompact(TableName.valueOf(table_or_region_name), family_bytes, compact_type)
end
end
end
`
it first calls majorCompactRegion() with tableName as input. It expects an IllegalArgumentException or UnknownRegionException to call majorCompact().
This normally involves a registry query to get this UnknownRegionException, this is an expensive path.
If it knows that the input string is not an encodedRegionName or regionName, it can throw out IllegalArgumentException without registry query.
For the case that a md5 hex being used as tableName, it has to go through expensive path to find out that this is not an encodedRegionName, it will still work, just optimization wont be applied in this case.
It also fixes a bug for a table name length over 32 bytes, currently, compact/flush/split this tableName from shell will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THanks for explanation. Shove this up in the JIRA description? Its good. Thanks.
| try { | ||
| Integer.parseInt(encodedName); | ||
| // If this is a valid integer, it could be hbase:meta's encoded region name. | ||
| return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know the hbase:meta int. It does not change. Compare it? When meta splits, it will have md5 for new regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to have meta regionname to compare (another query). This will defeat the purpose of optimization, so instead, it just check if it is an integer. If it is an integer, then this is a possible encoded region name.
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionInfo.java
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
|
||
| ADMIN.flush(tableName); | ||
| //ADMIN.majorCompactRegion(Bytes.toBytes("abcdefghijklmnopqrstuvwxyzabcdedgeghijklmnopqrs")); | ||
| ADMIN.majorCompactRegion(Bytes.toBytes("abcd")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove these two lines, they are used for testing before the new test case.
|
Failure seems unrelated. Rerunning anyways while Huaxiang sleeps. |
saintstack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Rerunning tests to make sure the failure unrelated.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…me(byte[] regionName)
d4167a7 to
2bf1fde
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
Seems good to go @huaxiangsun ? |
|
Let me introduce the isMD5Hash, @saintstack |
|
Actually, I thought about it. If I introduce apache.common.codec.MD5Hex.decodeHex(), it is heavy (allocate/copy). I think it is ok without introducing MD5 as most of the calls are the real encodedRegionNames. Only for shell commends, table name can be passed in. In this case, it is ok to have a query to registry in case table name length is 32 bytes (which could be rare as well). |
|
Ok. Still +1 on this @huaxiangsun |
…me(byte[] regionName)
This addresses a few issues:
Signed-off-by: stack [email protected]