Hive: Make lock check retries backoff exponentially #1873
Conversation
private static final String HIVE_LOCK_CHECK_BACKOFF_SCALE_FACTOR = "iceberg.hive.lock-check-backoff-scale-factor";
private static final long HIVE_ACQUIRE_LOCK_TIMEOUT_MS_DEFAULT = 3 * 60 * 1000; // 3 minutes
private static final long HIVE_LOCK_CHECK_MIN_WAIT_MS_DEFAULT = 50; // 50 milliseconds
private static final long HIVE_LOCK_CHECK_MAX_WAIT_MS_DEFAULT = 5 * 1000; // 5 seconds
It might be worth mentioning in the documentation, or somewhere, that this should be smaller than hive.txn.timeout (or, in newer versions, metastore.txn.timeout); otherwise the locks might be timed out because of the lack of a heartbeat.
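As a concrete illustration of that constraint, here is a minimal, hypothetical sketch of setting these values on the Hadoop Configuration so that the lock-check waits stay well below the metastore transaction timeout. Only iceberg.hive.lock-check-backoff-scale-factor appears verbatim in the diff above; the min/max wait keys below are assumptions modeled on it.

```java
import org.apache.hadoop.conf.Configuration;

public class LockCheckConfSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // hive.txn.timeout (metastore.txn.timeout in newer Hive versions) is the
    // window in which a lock must be heartbeated before the metastore may
    // abort it; keep the lock-check waits well below it.
    // conf.set("hive.txn.timeout", "300s");

    // Assumed property keys, modeled on the scale-factor key from the diff above.
    conf.setLong("iceberg.hive.lock-check-min-wait-ms", 50L);
    conf.setLong("iceberg.hive.lock-check-max-wait-ms", 5_000L);
    conf.setDouble("iceberg.hive.lock-check-backoff-scale-factor", 2.0);

    System.out.println(conf.get("iceberg.hive.lock-check-max-wait-ms"));
  }
}
```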
We should also add these configs to configuration.md, alongside the rest of the Hadoop conf options, @raptond.
The Glue catalog is introducing support for a lock using DynamoDB. It would be nice to standardize these options across catalogs so that we only need to document them once and they work the same way. FYI @jackye1995.
I think it would also make sense for these to be catalog options, rather than pulled from the Hive configuration. We used HiveConf originally because we didn't have catalog-specific configuration, but now I think it would make sense to move these into catalog properties. We don't want to increase the cases where we use a Hadoop Configuration.
Yes, agreed. But Hive is currently built around reading from Hadoop configs. If we want to change it to use catalog properties, we also need to change all the places that load HiveCatalog using the constructor public HiveCatalog(Configuration conf), such as https://github.com/apache/iceberg/blob/master/mr/src/main/java/org/apache/iceberg/mr/Catalogs.java#L215
This API is also used from Spark 2 where we don't have a way to specify catalog options. Plus, we already have a Hadoop conf for the lock timeout. How do we approach this?
That being said, I am +1 for adding catalog options. I am just not sure we can get rid of Hadoop conf completely in this case.
In that case, we should default from Configuration, but prefer options passed to the initialize method.
To make sure I understood @rdblue and @jackye1995 correctly: you are talking about generalizing catalog options, right?
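A rough sketch of the resolution order @rdblue suggests: catalog options passed to initialize() win, with the Hadoop Configuration as the fallback and the hard-coded default last. The helper name and the reuse of the conf key as a catalog property key are assumptions for illustration, not the actual HiveCatalog code.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public class LockOptionResolutionSketch {
  // Hypothetical helper: a catalog option overrides the Hadoop conf value,
  // which in turn overrides the hard-coded default.
  static long resolveLong(Map<String, String> catalogProps, Configuration conf,
                          String key, long defaultValue) {
    String fromProps = catalogProps.get(key);
    if (fromProps != null) {
      return Long.parseLong(fromProps);
    }
    return conf.getLong(key, defaultValue);
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setLong("iceberg.hive.lock-check-max-wait-ms", 10_000L);

    // Properties such as those passed to Catalog.initialize(name, properties).
    Map<String, String> props = new HashMap<>();
    props.put("iceberg.hive.lock-check-max-wait-ms", "2000");

    long maxWaitMs = resolveLong(props, conf, "iceberg.hive.lock-check-max-wait-ms", 5_000L);
    System.out.println(maxWaitMs); // 2000: the catalog option wins over the Hadoop conf
  }
}
```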
private TServer server;
private HiveMetaStore.HMSHandler baseHandler;
private HiveClientPool clientPool;
protected HiveClientPool clientPool; // Exposed for testing.
nit: Could we use VisibleForTesting annotation here?
Since this was a test class, I didn't add this annotation. I finally ended up not using it, so I reverted the change. Thanks for the review.
Minor comments, looks good to me (non-binding).
This change looks great to me, just minor comments in addition to what @pvary mentioned. We could add fewer configs, but I'd be in favor of what this PR does. It was really painful when we hit this problem and couldn't do anything without changing the code, so having props to configure every aspect sounds good to me. It is a rather sensitive area, and more control here seems justified to me.
if (state.get().equals(LockState.WAITING)) {
  try {
    Tasks.foreach(lockId)
        .retry(Integer.MAX_VALUE - 100) // Endless retries bound by timeouts. Tasks.retry adds 1 for "first try".
Why -100?
I only wanted to keep a big number of retries, e.g. Integer.MAX_VALUE. But the setter adds 1, overflowing to MIN_VALUE.
Integer.MAX_VALUE - 1 would suffice, but I conservatively chose Integer.MAX_VALUE - 100.
I think it would be worth noting the rationale for a choice like this in a comment.
👍 +1 I have added the rationale in the comments.
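For readers unfamiliar with org.apache.iceberg.util.Tasks, here is a self-contained sketch of the pattern the snippet above relies on: effectively unbounded retries whose real bound is the exponential-backoff timeout. The lock check is simulated with a counter, IllegalStateException stands in for the PR's waiting-for-lock exception, and the wait/timeout values mirror the defaults quoted earlier; the scale factor is an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.iceberg.util.Tasks;

public class LockCheckRetrySketch {
  // Simulated metastore: report WAITING for the first three checks, then ACQUIRED.
  private static final AtomicInteger checksUntilAcquired = new AtomicInteger(4);

  public static void main(String[] args) {
    long minWaitMs = 50;                    // HIVE_LOCK_CHECK_MIN_WAIT_MS_DEFAULT
    long maxWaitMs = 5 * 1000;              // HIVE_LOCK_CHECK_MAX_WAIT_MS_DEFAULT
    long acquireTimeoutMs = 3 * 60 * 1000;  // HIVE_ACQUIRE_LOCK_TIMEOUT_MS_DEFAULT
    double scaleFactor = 2.0;               // assumed backoff scale factor

    Tasks.foreach(1L /* lockId */)
        // Effectively endless retries; the real bound is the backoff timeout below.
        .retry(Integer.MAX_VALUE - 100)
        .exponentialBackoff(minWaitMs, maxWaitMs, acquireTimeoutMs, scaleFactor)
        .throwFailureWhenFinished()
        .onlyRetryOn(IllegalStateException.class)
        .run(id -> {
          // Stand-in for checkLock(): throw while the lock is still WAITING.
          if (checksUntilAcquired.decrementAndGet() > 0) {
            throw new IllegalStateException("Lock " + id + " is still WAITING");
          }
          System.out.println("Lock " + id + " acquired");
        });
  }
}
```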
@Test
public void testLockAcquisitionAfterRetries() throws TException, InterruptedException {
Is InterruptedException needed?
Yes, here the HiveTableOperations.doUnlock method throws InterruptedException.
    conf.getLong(HIVE_LOCK_CHECK_MAX_WAIT_MS, HIVE_LOCK_CHECK_MAX_WAIT_MS_DEFAULT);
this.lockCheckBackoffScaleFactor =
    conf.getDouble(HIVE_LOCK_CHECK_BACKOFF_SCALE_FACTOR, HIVE_LOCK_CHECK_BACKOFF_SCALE_FACTOR_DEFAULT);

nit: extra empty line
Taken care of.
  if (newState.equals(LockState.WAITING)) {
    throw WAITING_FOR_LOCK_EXCEPTION;
  }
} catch (InterruptedException | TException e) {
For InterruptedException, why not throw WaitingForLockException and signal that the thread was interrupted? Then this could use the checked exception call, run(id -> {...}, TException.class) and would not need to wrap the exceptions.
The code looks better after this comment. However, throwing WaitingForLockException ends up losing the source of the original InterruptedException because it would get handled here: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/util/Tasks.java#L452
So I chose to throw RuntimeException, which stops the retry and preserves the original stack trace.
I'd probably opt to suppress the interrupt and let the code carry on after setting the thread's interrupted flag. That results in a CommitFailedException. I don't think that preserving the stack of the InterruptedException is really needed, but I'm fine with it this way if you prefer it.
Taken care of as per the other comment.
protected HiveTableOperations(Configuration conf, HiveClientPool metaClients, FileIO fileIO,
                              String catalogName, String database, String table) {
                               String catalogName, String database, String table) {
Nit: unnecessary whitespace change.
done.
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new RuntimeException("Interrupted while checking lock status.", e);
I don't think it is necessary to throw RuntimeException here. If this doesn't throw WaitingForLockException then it will exit and move on. Since timeout is not set, it would hit the check for whether the lock was acquired and fail, resulting in the CommitFailedException.
I think that's a fairly reasonable way to handle an interrupt without wrapping it in a RuntimeException.
You are correct. I was fixated on failing the execution. This suggestion works nicely and I have a test case for this. Thank you.
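A minimal sketch of the interrupt handling agreed on above: restore the interrupt flag and fall through rather than wrapping the InterruptedException in a RuntimeException, so the commit ultimately fails with CommitFailedException. Everything here (the enum, the fake checkLock) is illustrative, not the actual HiveTableOperations code.

```java
import java.util.concurrent.atomic.AtomicReference;

public class InterruptHandlingSketch {
  enum LockState { WAITING, ACQUIRED }

  // Stand-in for the metastore checkLock() call; sleeping makes it interruptible.
  static LockState checkLock() throws InterruptedException {
    Thread.sleep(10);
    return LockState.ACQUIRED;
  }

  public static void main(String[] args) {
    AtomicReference<LockState> state = new AtomicReference<>(LockState.WAITING);
    Thread.currentThread().interrupt(); // simulate an interrupt arriving mid-check

    try {
      state.set(checkLock());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the flag; no RuntimeException wrapper
      // The retry loop simply exits here; the final "was the lock acquired?" check
      // then fails and the commit surfaces as a CommitFailedException.
    }

    if (state.get() != LockState.ACQUIRED) {
      System.out.println("Lock not acquired; the commit would fail here");
    }
  }
}
```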
rdblue left a comment
Overall looks good to me. @aokolnychyi can you take another look?
All comments taken care of.
I'll do a pass in 15 mins.
Thanks everyone! The change looks solid, so I merged it!
@raptond, do you want to work on catalog options for this next?
@aokolnychyi - Sure, I will work on submitting a new PR with catalog options (properties via the initialize method) overriding the Configuration.

A constant 50 millisecond sleep between lock status checks thrashes the Hive metastore database when multiple jobs try to commit to the same Iceberg table. This fix makes the frequency of checking the WAITING lock status configurable and uses Tasks to back off exponentially.
Every time a check on the lock is made, the HMS heartbeats the lock record and the transaction record. It eventually ends up with metastore errors if the number of jobs on the same table grows and they commit at the same time. The ability to configure the delay between retries, and to slow the retries down exponentially, helps. Thanks.
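For a sense of what the exponential backoff does to the check frequency, here is a small sketch that prints the resulting sleep schedule, using the defaults quoted above and an assumed scale factor of 2.0 (the actual default is configurable and not shown in this excerpt).

```java
public class BackoffScheduleSketch {
  public static void main(String[] args) {
    long minWaitMs = 50;                    // old constant sleep and new minimum wait
    long maxWaitMs = 5 * 1000;              // new cap between checks
    long acquireTimeoutMs = 3 * 60 * 1000;  // overall lock acquisition timeout
    double scaleFactor = 2.0;               // assumed; the PR makes this configurable

    long elapsedMs = 0;
    for (int attempt = 0; elapsedMs < acquireTimeoutMs && attempt < 12; attempt++) {
      long waitMs = Math.min(maxWaitMs, (long) (minWaitMs * Math.pow(scaleFactor, attempt)));
      System.out.printf("check #%d: sleep %d ms%n", attempt + 1, waitMs);
      elapsedMs += waitMs;
    }
    // Prints 50, 100, 200, 400, ... capped at 5000 ms, instead of a constant 50 ms,
    // which is what was hammering the metastore database.
  }
}
```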