Hive: Print db and table name while acquiring hive meta-store lock #5039
dramaticlly
left a comment
LGTM
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Question: iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java Line 563 in a5b2c70
And iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java Line 568 in a5b2c70
These should log the database.tablename if the lock is not acquired in a timely manner. Is that exception getting swallowed somewhere, or is the message from the exception, which includes these details along with the state of the lock, not being logged?
These logs do get printed, but only at the very end, after all the retries are exhausted. Here we are adding the db.table details to the message logged after each retry, so the information shows up earlier in the log file. Currently, with just the "Waiting for lock" message, we cannot tell which db.table the lock is being requested for.
Ah! That makes sense, thanks for clarifying.
LockResponse lockResponse = metaClients.run(client -> client.lock(lockRequest));
AtomicReference<LockState> state = new AtomicReference<>(lockResponse.getState());
long lockId = lockResponse.getLockid();
final Pair<Long, String> lockDetails = Pair.of(lockId, String.format("%s.%s", database, tableName));
I do not think we need this Pair object here. Could we just use database/tableName down below?
In the current implementation only a single lock is acquired, so we can use database/tableName directly inside the task/lambda. Since we are using Tasks.foreach, where multiple tasks could run in the future, I grouped the lock id and table info in a pair to avoid confusion. Thoughts?
I think Tasks.foreach(lockDetails) is only used to reuse the retry functionality of Tasks, so it is perfectly fine to keep the simpler implementation and use database/tableName.
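To illustrate the simpler approach discussed in this thread, here is a minimal, self-contained sketch of a retried lambda that captures database and tableName directly instead of receiving them through a Pair. The `retry` helper is a hypothetical stand-in for the retry behaviour that Tasks provides; it is not Iceberg's API, and the class name, method names, and message text are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

public class CapturedNamesSketch {

  // Hypothetical stand-in for a retry utility: calls `attempt` with the lock id
  // until it returns true or maxAttempts is reached. Returns the number of the
  // successful attempt, or -1 if every attempt failed.
  static int retry(long lockId, int maxAttempts, LongPredicate attempt) {
    for (int i = 1; i <= maxAttempts; i++) {
      if (attempt.test(lockId)) {
        return i;
      }
    }
    return -1;
  }

  // Simulates waiting for a lock that is granted on attempt `succeedOnAttempt`.
  // The lambda only receives the lock id; database and tableName are captured
  // from the enclosing method, so no Pair is needed to carry them.
  static List<String> acquire(String database, String tableName, long lockId, int succeedOnAttempt) {
    List<String> log = new ArrayList<>();
    int[] calls = {0};
    retry(lockId, 10, id -> {
      calls[0]++;
      if (calls[0] < succeedOnAttempt) {
        // Message format is illustrative, not the one used in the PR.
        log.add(String.format("Waiting for lock %d on %s.%s", id, database, tableName));
        return false;
      }
      return true;
    });
    return log;
  }

  public static void main(String[] args) {
    acquire("db", "tbl", 42L, 3).forEach(System.out::println);
  }
}
```

Because the names are effectively final locals, the lambda can read them on every retry, which is why the retry input can stay a plain lock id.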
pvary
left a comment
+1 pending tests
Thanks @krisdas for the PR, and thanks to the team for the review comments!
While acquiring the hive-metastore lock on an Iceberg table, the log line below is printed after every timeout; it does not include the database and table name.
org.apache.iceberg.hive.HiveTableOperations$WaitingForLockException: Waiting for lock.
Only after all the retry attempts are exhausted does the log line below finally print the database and table name.
Retrying task after failure: Timed out after 180133 ms waiting for lock on database.table
Here we add the database and table name to the first log line to speed up investigation of locking issues.
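A minimal sketch of the idea, assuming the fix is to put the names into the exception message that is thrown while the lock is still pending. The exception class below is a hypothetical analogue of HiveTableOperations.WaitingForLockException, the lock state is modelled as a plain string, and the exact message wording is an assumption rather than the text from the PR.

```java
public class WaitingForLockSketch {

  // Hypothetical analogue of the exception thrown while the lock is pending.
  static class WaitingForLockException extends RuntimeException {
    WaitingForLockException(String message) {
      super(message);
    }
  }

  // Throws if the lock is still waiting. Because the message carries
  // database.tableName, every retry that logs this exception names the table.
  static void checkLock(String state, String database, String tableName) {
    if ("WAITING".equals(state)) {
      throw new WaitingForLockException(
          String.format("Waiting for lock on table %s.%s", database, tableName));
    }
  }

  public static void main(String[] args) {
    try {
      checkLock("WAITING", "db", "tbl");
    } catch (WaitingForLockException e) {
      System.out.println(e.getMessage());
    }
  }
}
```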
cc : @szehon-ho @dramaticlly