HBASE-25368 Filter out more invalid encoded name in isEncodedRegionNa… #2753
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -363,7 +363,23 @@ static byte[] getStartKey(final byte[] regionName) throws IOException { | |
|   @InterfaceAudience.Private // For use by internals only. | ||
|   public static boolean isEncodedRegionName(byte[] regionName) { | ||
|     // If not parseable as region name, presume encoded. TODO: add stringency; e.g. if hex. | ||
| -   return parseRegionNameOrReturnNull(regionName) == null && regionName.length <= MD5_HEX_LENGTH; | ||
| +   if (parseRegionNameOrReturnNull(regionName) == null) { | ||
| +     if (regionName.length > MD5_HEX_LENGTH) { | ||
| +       return false; | ||
| +     } else if (regionName.length == MD5_HEX_LENGTH) { | ||
| +       return true; | ||
| +     } else { | ||
| +       String encodedName = Bytes.toString(regionName); | ||
| +       try { | ||
| +         Integer.parseInt(encodedName); | ||
| +         // If this is a valid integer, it could be hbase:meta's encoded region name. | ||
| +         return true; | ||
Contributor
We know the hbase:meta int. It does not change. Compare it? When meta splits, it will have md5 for new regions.
Contributor
Author
Comparing would require the meta region name, which means another query. That would defeat the purpose of the optimization, so instead it just checks whether the name is an integer. If it is, this is a possible encoded region name.
| +       } catch(NumberFormatException er) { | ||
| +         return false; | ||
| +       } | ||
| +     } | ||
| +   } | ||
| +   return false; | ||
|   } | ||
|   /** | ||
What are we protecting against? Could we be passed a tablename? If so, why can't we have a tablename that is an MD5? Or 32 bytes in size? Should we check it is all hex at least? I suppose if someone passes a tablename that is an md5, then they are asking for trouble?
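For reference, the "all hex" check asked about above could look roughly like the sketch below. It is not part of the patch: `HexCheckSketch` and `isMd5Hex` are hypothetical names, and it assumes encoded region names are 32 lowercase hex characters.

```java
import java.nio.charset.StandardCharsets;

public class HexCheckSketch {
  // Length of an MD5 digest rendered as hex characters.
  static final int MD5_HEX_LENGTH = 32;

  // Hypothetical helper: true only if the name is exactly 32 bytes long
  // and every byte is a lowercase hex digit (0-9, a-f).
  static boolean isMd5Hex(byte[] name) {
    if (name.length != MD5_HEX_LENGTH) {
      return false;
    }
    for (byte b : name) {
      boolean digit = b >= '0' && b <= '9';
      boolean lowerHex = b >= 'a' && b <= 'f';
      if (!digit && !lowerHex) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String[] samples = {
      "d41d8cd98f00b204e9800998ecf8427e", // 32 hex characters
      "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz", // 32 bytes, not hex
      "t1"                                // short table name
    };
    for (String s : samples) {
      System.out.println(s + " -> " + isMd5Hex(s.getBytes(StandardCharsets.UTF_8)));
    }
  }
}
```

A table whose name happens to be 32 hex characters would still pass such a check, so it narrows the ambiguity rather than removing it.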
Thanks @saintstack. Two issues are addressed.
For compact/flush/split with the tableName,
` # Requests a table or region or column family major compaction
def major_compact(table_or_region_name, family = nil, type = 'NORMAL')
  family_bytes = nil
  family_bytes = family.to_java_bytes unless family.nil?
  compact_type = nil
  if type == 'NORMAL'
    compact_type = org.apache.hadoop.hbase.client.CompactType::NORMAL
  elsif type == 'MOB'
    compact_type = org.apache.hadoop.hbase.client.CompactType::MOB
  else
    raise ArgumentError, 'only NORMAL or MOB accepted for type!'
  end
`
it first calls majorCompactRegion() with the table name as input and relies on an IllegalArgumentException or UnknownRegionException to fall back to majorCompact().
Getting the UnknownRegionException normally involves a registry query, which is an expensive path.
If the client can tell that the input string is neither an encoded region name nor a region name, it can throw IllegalArgumentException without the registry query.
When an MD5 hex string is used as a table name, it still has to go through the expensive path to find out that this is not an encoded region name. It still works; the optimization just isn't applied in that case.
The change also fixes a bug for table names longer than 32 bytes: currently, compact/flush/split on such a table name from the shell fails.
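The flow described above can be illustrated with a small self-contained sketch. Everything here is a simplified stand-in rather than HBase code: `CompactDispatchSketch`, `couldBeRegionName`, and the `registryLookups` counter are hypothetical, a full region name is assumed to be recognisable by containing at least two commas, `IllegalStateException` stands in for `UnknownRegionException`, and the sketch assumes no region ever matches.

```java
import java.nio.charset.StandardCharsets;

public class CompactDispatchSketch {
  static final int MD5_HEX_LENGTH = 32;

  // Counts simulated registry queries (the expensive path).
  static int registryLookups = 0;

  // Mirrors the patched check: full region name, 32-byte name, or integer.
  static boolean couldBeRegionName(String name) {
    byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
    int commas = 0;
    for (byte b : bytes) {
      if (b == ',') {
        commas++;
      }
    }
    if (commas >= 2) {
      return true; // assumed shape of a full region name
    }
    if (bytes.length > MD5_HEX_LENGTH) {
      return false; // too long to be an encoded name
    }
    if (bytes.length == MD5_HEX_LENGTH) {
      return true; // could be an MD5 hex encoded name
    }
    try {
      Integer.parseInt(name); // could be hbase:meta's encoded name
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }

  // Simulated region-level call: rejects cheaply when the name cannot be a
  // region, otherwise pays for a registry lookup that finds nothing.
  static void majorCompactRegion(String name) {
    if (!couldBeRegionName(name)) {
      throw new IllegalArgumentException("not a region name: " + name);
    }
    registryLookups++;
    throw new IllegalStateException("unknown region: " + name);
  }

  // Same shape as the shell helper: try region first, fall back to table.
  static String majorCompact(String tableOrRegionName) {
    try {
      majorCompactRegion(tableOrRegionName);
      return "region";
    } catch (IllegalArgumentException | IllegalStateException e) {
      return "table";
    }
  }

  public static void main(String[] args) {
    String[] names = {
      "t1",
      "d41d8cd98f00b204e9800998ecf8427e",
      "a_table_name_that_is_longer_than_32_bytes"
    };
    for (String n : names) {
      String target = majorCompact(n);
      System.out.println(n + " -> " + target + ", lookups so far: " + registryLookups);
    }
  }
}
```

In this model only the 32-character name reaches the simulated registry; the short and the over-long table names are rejected locally and go straight to the table-level call.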
Thanks for the explanation. Shove this up in the JIRA description? It's good. Thanks.