Closed
@@ -32,7 +32,9 @@
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.stream.Collectors;
import org.apache.commons.lang3.StringUtils;
@@ -661,12 +663,14 @@ private synchronized void flushConfig(Map<String, RSGroupInfo> newGroupMap) thro
return;
}

// Make changes visible
Contributor

Why this change?

Contributor Author

Because, in general, persistent storage should be updated before the in-memory state.

Contributor

@caroliney14 even if we don't change this order, are we still good with the main problem? I was wondering if this particular order of in-memory vs persistent update could be taken up in follow-up PR as well if we are good with the latest change in isOnline() method.

resetRSGroupMap(newGroupMap);

/* For online mode, persist to hbase:rsgroup and Zookeeper */
flushConfigTable(newGroupMap);

// Make changes visible after having been persisted to the source of truth
resetRSGroupMap(newGroupMap);
saveRSGroupMapToZK(newGroupMap);

// Update previous map
updateCacheOfRSGroups(newGroupMap.keySet());
}
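
To make the ordering question in this hunk concrete, here is a minimal sketch of "persist first, then publish in memory". It is not the RSGroupInfoManagerImpl code: `GroupStore`, `GroupRegistry`, and the `Map<String, String>` payload are hypothetical stand-ins chosen only for illustration.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the durable store (hbase:rsgroup in this PR). */
interface GroupStore {
  void persist(Map<String, String> groups) throws IOException;
}

/** Illustrative holder for group state; an assumption, not the HBase class. */
final class GroupRegistry {
  private final GroupStore store;
  // Readers see either the old snapshot or the new one, never a partial update.
  private volatile Map<String, String> visible = Collections.emptyMap();

  GroupRegistry(GroupStore store) {
    this.store = store;
  }

  synchronized void flush(Map<String, String> newGroups) throws IOException {
    // 1. Persist first. If this throws, the in-memory view is left untouched.
    store.persist(newGroups);
    // 2. Only then make the change visible to readers.
    visible = Collections.unmodifiableMap(new HashMap<>(newGroups));
  }

  Map<String, String> current() {
    return visible;
  }
}
```

With this order, a failed write leaves readers on the previous snapshot, so memory never gets ahead of the source of truth. The cost is that a slow or unavailable store delays visibility of the change.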

@@ -825,6 +829,20 @@ private void createRSGroupTable() throws IOException {
}

public boolean isOnline() {
Contributor

How often is this check run?

Contributor Author
caroliney14, Aug 20, 2021

@saintstack It's called by RSGroupBasedLoadBalancer#balanceCluster, RSGroupInfoManagerImpl#refresh, and RSGroupInfoManagerImpl#flushConfig, so I would think semi-frequently? I'm not sure how often balanceCluster gets called, but refresh gets called upon RSGroupInfoManagerImpl startup, and flushConfig gets called every time we add/remove servers or tables or change the rsgroups in any way.

Contributor

I guess the intention here is we will not come back to offline mode after online?

Contributor Author
caroliney14, Aug 26, 2021

I guess that was the original intention, but it seems misleading/strange. We have encountered errors in prod resulting from trying to flush to hbase:rsgroup when the table became unavailable after the initial check and set to "online" -- in that case, we should go the offline path, right? (Albeit this was in HBase 1, so I'm not 100% sure if HBase 2+ hasn't fixed this -- but from my understanding of the code, it hasn't?)

Contributor

I think this part of the code was written by me, but I cannot recall whether it was already like this or whether I changed the implementation to make it go online only once. Give me some time to check the code on different branches...

Contributor

OK, it was not me. It was like this when we first introduced this class in HBASE-6721. Even in the branch-1 backport, the implementation just returns the isOnline flag. So if you want to change the implementation, could you please explain the reason a bit?

Contributor Author

The reason is that if we do not periodically update the online status to reflect the availability of the hbase:rsgroup table, we could become blocked waiting on a flush to the hbase:rsgroup table when it can't be accessed (e.g. it's stuck in transition, offline, the rs hosting it has queueing, etc.). Each rsgroup functionality (add, move servers, move tables, remove, etc.) is synchronized, and furthermore the multiMutate function which does the persisting to hbase:rsgroup uses Future.get without timeout. So if hbase:rsgroup is unavailable we will keep getting blocked until the client times out, and we will be unable to serve another rsgroup request in the meantime, when we could have exited early by checking for the availability of hbase:rsgroup.
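
As a rough illustration of the difference between an unbounded `Future.get()` and a bounded one, here is a minimal, self-contained sketch. `BoundedWait` and `completesWithin` are hypothetical names, not HBase APIs; the only assumption is that the availability probe can be expressed as a `Future`.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; the name and shape are assumptions for illustration. */
final class BoundedWait {
  private BoundedWait() {
  }

  /**
   * Returns true only if the probe completes normally with a non-null result
   * within timeoutMs. A timeout, a failure, or an interrupt all yield false,
   * so the caller can exit early instead of blocking indefinitely.
   */
  static boolean completesWithin(Future<?> probe, long timeoutMs) {
    try {
      return probe.get(timeoutMs, TimeUnit.MILLISECONDS) != null;
    } catch (TimeoutException e) {
      probe.cancel(true); // give up on the probe rather than keep waiting
      return false;
    } catch (ExecutionException | CancellationException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      return false;
    }
  }
}
```

A caller holding a lock, as the synchronized rsgroup operations do, then waits at most `timeoutMs` per probe before choosing another code path.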

Instead of being blocked waiting like this, we can go through an "offline" code path. There is already an offline code path in flushConfig that only updates the in-memory state of the default group, but we could also change it so that it updates in-memory state while asynchronously trying to persist it to hbase:rsgroup in the background.

Please correct me if I misunderstood anything. What do you think about this rationale?
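
For illustration only, the "update in memory, persist in the background" idea floated above could look roughly like the sketch below. `AsyncPersistingRegistry`, the `Consumer`-based persister, and the `Map<String, String>` payload are all hypothetical; nothing here is taken from the HBase code base.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical sketch of an offline path with background persistence. */
final class AsyncPersistingRegistry {
  // Assumed write to the durable store; throws an unchecked exception on failure.
  private final Consumer<Map<String, String>> persister;
  private final Executor executor;
  private volatile Map<String, String> visible = Collections.emptyMap();

  AsyncPersistingRegistry(Consumer<Map<String, String>> persister, Executor executor) {
    this.persister = persister;
    this.executor = executor;
  }

  /**
   * Publishes the new state in memory right away, then attempts the durable
   * write on the executor. The returned future completes with true if the
   * write succeeded and false if it failed, so the caller can decide to retry.
   */
  CompletableFuture<Boolean> update(Map<String, String> newGroups) {
    Map<String, String> snapshot = Collections.unmodifiableMap(new HashMap<>(newGroups));
    visible = snapshot;
    return CompletableFuture
        .runAsync(() -> persister.accept(snapshot), executor)
        .handle((ignored, error) -> error == null);
  }

  Map<String, String> current() {
    return visible;
  }
}
```

The trade-off is that memory can run ahead of the persisted state until a retry succeeds, which is the opposite guarantee from persisting before publishing. Any real implementation would need a reconciliation step that this sketch does not attempt.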

Contributor Author

@Apache9 any thoughts?

Contributor

The logic seems good to me. @Apache9 Would you like to verify?

if (isMasterRunning(masterServices)) {
try {
// try reading from the table
CompletableFuture<Result> read = conn.getTable(RSGROUP_TABLE_NAME).get(new Get(ROW_KEY));
if (read.get(10000, TimeUnit.MILLISECONDS) != null) {
online = true;
}
} catch (Exception e) {
LOG.warn("Failed to read from " + RSGROUP_TABLE_NAME+ "; setting online = false");
Contributor

nit: use log placeholder {} for table name?

online = false;
}
} else {
online = false;
}
return online;
}
}
@@ -18,13 +18,15 @@
package org.apache.hadoop.hbase.rsgroup;

import static java.lang.Thread.sleep;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseTestingUtil;
import org.apache.hadoop.hbase.coprocessor.CoprocessorHost;
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
import org.apache.hadoop.hbase.util.JVMClusterUtil;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
@@ -72,7 +74,25 @@ public void testEnableRSGroup() throws IOException, InterruptedException {
(RSGroupBasedLoadBalancer) TEST_UTIL.getMiniHBaseCluster().getMaster().getLoadBalancer();
long start = EnvironmentEdgeManager.currentTime();
while (EnvironmentEdgeManager.currentTime() - start <= 60000 && !loadBalancer.isOnline()) {
LOG.info("waiting for rsgroup load balancer onLine...");
LOG.info("Waiting for rsgroup load balancer online...");
sleep(200);
}

assertTrue(loadBalancer.isOnline());

// kill all RS, RSGroupBasedLoadBalancer should now be offline since rsgroup table unavailable
for (JVMClusterUtil.RegionServerThread t:
TEST_UTIL.getMiniHBaseCluster().getRegionServerThreads()) {
TEST_UTIL.getMiniHBaseCluster().killRegionServer(
t.getRegionServer().getServerName());
}

assertFalse(loadBalancer.isOnline());

TEST_UTIL.getMiniHBaseCluster().startRegionServer();
start = EnvironmentEdgeManager.currentTime();
while (EnvironmentEdgeManager.currentTime() - start <= 60000 && !loadBalancer.isOnline()) {
LOG.info("Waiting for rsgroup load balancer online...");
sleep(200);
}

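
The test above polls `isOnline()` twice using the same deadline loop. As a sketch only, that pattern can be factored into a small helper; `Poll.waitFor` is a hypothetical name, not part of the test or of HBase's test utilities.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper mirroring the deadline loops in the test. */
final class Poll {
  private Poll() {
  }

  /**
   * Re-evaluates the condition every intervalMs until it holds or timeoutMs
   * has elapsed. Returns true if the condition held, false on timeout.
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out without the condition becoming true
      }
      Thread.sleep(intervalMs);
    }
    return true;
  }
}
```

Returning the outcome lets the caller assert on it directly, for example by asserting that `Poll.waitFor(loadBalancer::isOnline, 60000, 200)` is true, so a timeout surfaces as a test failure rather than passing silently.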