Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions hbase-common/src/main/resources/hbase-default.xml
Original file line number Diff line number Diff line change
Expand Up @@ -656,6 +656,12 @@ possible configurations would overwhelm and obscure the important.
<description>The minimum size for a region to be considered for a merge, in whole
MBs.</description>
</property>
<property>
<name>hbase.normalizer.merge.max.region.count</name>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful here. This config name looks like a counter-point to hbase.normalizer.merge.min.region.count but in fact it's not. The min config is placing a guard-rail on the minimum number of regions a table contain in order to be considered for merge normalization. This new guard-rail is limiting the number of regions that can be merged together in a single operation.

I instead suggest a config named something like hbase.normalizer.merge.merge_request_max_number_of_regions. Also, the description text should be updated accordingly -- this is just a copy-paste of the other config's description.

<value>50</value>
<description>The maximum number of regions in a table to consider for merge
normalization.</description>
</property>
<property>
<name>hbase.table.normalization.enabled</name>
<value>false</value>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,8 @@ class SimpleRegionNormalizer implements RegionNormalizer, ConfigurationObserver
static final int DEFAULT_MERGE_MIN_REGION_AGE_DAYS = 3;
static final String MERGE_MIN_REGION_SIZE_MB_KEY = "hbase.normalizer.merge.min_region_size.mb";
static final int DEFAULT_MERGE_MIN_REGION_SIZE_MB = 0;
static final String MERGE_MAX_REGION_COUNT_KEY = "hbase.normalizer.merge.max.region.count";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However you chance the configuration key, please also change these variable names. As it is, they look identical to the other, subtly-different MAX config variable, and it will be confusing to maintainers.

static final long DEFAULT_MERGE_MAX_REGION_COUNT = Long.MAX_VALUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to put a large but limiting default, so that no one else hits the same problem you do, right? In order to execute a merge procedure, the master must assign all involved regions to the same region server. So likely a majority of the regions involved have to be moved first -- all those assign and close and open procedures. How about a default of 100? 1_000?

Remember, this limits the size of a single merge request. With a default of 100, your problematic situation with 25k regions would still generate 250 merge procedures (much better, but still a lot all at once).


private MasterServices masterServices;
private NormalizerConfiguration normalizerConfiguration;
Expand Down Expand Up @@ -138,6 +140,16 @@ private static long parseMergeMinRegionSizeMb(final Configuration conf) {
return settledValue;
}

private static long parseMergeMaxRegionCount(final Configuration conf) {
final long parsedValue =
conf.getLong(MERGE_MAX_REGION_COUNT_KEY, DEFAULT_MERGE_MAX_REGION_COUNT);
final long settledValue = Math.max(1, parsedValue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minimum value should be 2 ; it makes no sense to merge a single region.

if (parsedValue != settledValue) {
warnInvalidValue(MERGE_MAX_REGION_COUNT_KEY, parsedValue, settledValue);
}
return settledValue;
}

private static <T> void warnInvalidValue(final String key, final T parsedValue,
final T settledValue) {
LOG.warn("Configured value {}={} is invalid. Setting value to {}.", key, parsedValue,
Expand Down Expand Up @@ -186,6 +198,10 @@ public long getMergeMinRegionSizeMb() {
return normalizerConfiguration.getMergeMinRegionSizeMb();
}

public long getMergeMaxRegionCount() {
return normalizerConfiguration.getMergeMaxRegionCount();
}

@Override
public void setMasterServices(final MasterServices masterServices) {
this.masterServices = masterServices;
Expand Down Expand Up @@ -382,15 +398,16 @@ private List<NormalizationPlan> computeMergeNormalizationPlans(final NormalizeCo
break;
}
if (
rangeMembers.isEmpty() // when there are no range members, seed the range with whatever
// we have. this way we're prepared in case the next region is
// 0-size.
(rangeMembers.isEmpty() // when there are no range members, seed the range with whatever
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here, do you mind cleaning up these comment strings? They're awkwardly broken over several lines, probably because of the spotless:apply automation. Something like,

        if (
          // when there are no range members, seed the range with whatever we have. this way we're
          // prepared in case the next region is0-size.
          rangeMembers.isEmpty()
            // when there is only one region and the size is 0, seed the range with whatever we
            // have.
            || (rangeMembers.size() == 1 && sumRangeMembersSizeMb == 0)
            // always add an empty region to the current range.
            || regionSizeMb == 0
            || (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb)
        ) {
          // add the current region to the range when there's capacity remaining.
          rangeMembers.add(new NormalizationTarget(regionInfo, regionSizeMb));

// we have. this way we're prepared in case the next region is
// 0-size.
|| (rangeMembers.size() == 1 && sumRangeMembersSizeMb == 0) // when there is only one
// region and the size is 0,
// seed the range with
// whatever we have.
|| regionSizeMb == 0 // always add an empty region to the current range.
|| (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb)
|| (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb))
&& (rangeMembers.size() < getMergeMaxRegionCount())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please invert the logic to be another || condition -- adding the && makes the logic more confusing.

) { // add the current region
// to the range when
// there's capacity
Expand Down Expand Up @@ -502,6 +519,7 @@ private static final class NormalizerConfiguration {
private final int mergeMinRegionCount;
private final Period mergeMinRegionAge;
private final long mergeMinRegionSizeMb;
private final long mergeMaxRegionCount;
private final long cumulativePlansSizeLimitMb;

private NormalizerConfiguration() {
Expand All @@ -511,6 +529,7 @@ private NormalizerConfiguration() {
mergeMinRegionCount = DEFAULT_MERGE_MIN_REGION_COUNT;
mergeMinRegionAge = Period.ofDays(DEFAULT_MERGE_MIN_REGION_AGE_DAYS);
mergeMinRegionSizeMb = DEFAULT_MERGE_MIN_REGION_SIZE_MB;
mergeMaxRegionCount = DEFAULT_MERGE_MAX_REGION_COUNT;
cumulativePlansSizeLimitMb = DEFAULT_CUMULATIVE_SIZE_LIMIT_MB;
}

Expand All @@ -522,6 +541,7 @@ private NormalizerConfiguration(final Configuration conf,
mergeMinRegionCount = parseMergeMinRegionCount(conf);
mergeMinRegionAge = parseMergeMinRegionAge(conf);
mergeMinRegionSizeMb = parseMergeMinRegionSizeMb(conf);
mergeMaxRegionCount = parseMergeMaxRegionCount(conf);
cumulativePlansSizeLimitMb =
conf.getLong(CUMULATIVE_SIZE_LIMIT_MB_KEY, DEFAULT_CUMULATIVE_SIZE_LIMIT_MB);
logConfigurationUpdated(SPLIT_ENABLED_KEY, currentConfiguration.isSplitEnabled(),
Expand All @@ -534,6 +554,8 @@ private NormalizerConfiguration(final Configuration conf,
currentConfiguration.getMergeMinRegionAge(), mergeMinRegionAge);
logConfigurationUpdated(MERGE_MIN_REGION_SIZE_MB_KEY,
currentConfiguration.getMergeMinRegionSizeMb(), mergeMinRegionSizeMb);
logConfigurationUpdated(MERGE_MAX_REGION_COUNT_KEY,
currentConfiguration.getMergeMaxRegionCount(), mergeMaxRegionCount);
}

public Configuration getConf() {
Expand Down Expand Up @@ -597,6 +619,10 @@ public long getMergeMinRegionSizeMb(NormalizeContext context) {
return mergeMinRegionSizeMb;
}

public long getMergeMaxRegionCount() {
return mergeMaxRegionCount;
}

private long getCumulativePlansSizeLimitMb() {
return cumulativePlansSizeLimitMb;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
import static org.apache.hadoop.hbase.master.normalizer.RegionNormalizerWorker.CUMULATIVE_SIZE_LIMIT_MB_KEY;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.DEFAULT_MERGE_MIN_REGION_AGE_DAYS;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.MERGE_ENABLED_KEY;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.MERGE_MAX_REGION_COUNT_KEY;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.MERGE_MIN_REGION_AGE_DAYS_KEY;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.MERGE_MIN_REGION_COUNT_KEY;
import static org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer.MERGE_MIN_REGION_SIZE_MB_KEY;
Expand Down Expand Up @@ -503,6 +504,41 @@ public void testHonorsMergeMinRegionSizeInTD() {
assertThat(normalizer.computePlansForTable(tableDescriptor), empty());
}

@Test
public void testHonorsMergeMaxRegionCount() {
conf.setBoolean(SPLIT_ENABLED_KEY, false);
conf.setInt(MERGE_MIN_REGION_COUNT_KEY, 1);
conf.setInt(MERGE_MIN_REGION_SIZE_MB_KEY, 0);
conf.setInt(MERGE_MAX_REGION_COUNT_KEY, 3);
final TableName tableName = name.getTableName();
final List<RegionInfo> regionInfos = createRegionInfos(tableName, 5);
final Map<byte[], Integer> regionSizes = createRegionSizesMap(regionInfos, 0, 1, 0, 1, 0);
setupMocksForNormalizer(regionSizes, regionInfos);
assertEquals(3, normalizer.getMergeMaxRegionCount());
List<NormalizationPlan> plans = normalizer.computePlansForTable(tableDescriptor);
assertThat(plans,
contains(
new MergeNormalizationPlan.Builder().addTarget(regionInfos.get(0), 0)
.addTarget(regionInfos.get(1), 1).addTarget(regionInfos.get(2), 0).build(),
new MergeNormalizationPlan.Builder().addTarget(regionInfos.get(3), 1)
.addTarget(regionInfos.get(4), 0).build()));
}

@Test
public void testHonorsMergeMaxRegionCountDefault() {
conf.setBoolean(SPLIT_ENABLED_KEY, false);
conf.setInt(MERGE_MIN_REGION_COUNT_KEY, 1);
conf.setInt(MERGE_MIN_REGION_SIZE_MB_KEY, 0);
final TableName tableName = name.getTableName();
final List<RegionInfo> regionInfos = createRegionInfos(tableName, 3);
final Map<byte[], Integer> regionSizes = createRegionSizesMap(regionInfos, 0, 0, 0);
setupMocksForNormalizer(regionSizes, regionInfos);
assertEquals(50, normalizer.getMergeMaxRegionCount());
List<NormalizationPlan> plans = normalizer.computePlansForTable(tableDescriptor);
assertThat(plans, contains(new MergeNormalizationPlan.Builder().addTarget(regionInfos.get(0), 0)
.addTarget(regionInfos.get(1), 0).addTarget(regionInfos.get(2), 0).build()));
}

@Test
public void testMergeEmptyRegions0() {
conf.setBoolean(SPLIT_ENABLED_KEY, false);
Expand Down