-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-25902 Add missing CFs in meta during HBase 1 to 2 Upgrade #3417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-25902 Add missing CFs in meta during HBase 1 to 2 Upgrade #3417
Conversation
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
Apache9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to write a UT for this?
Or have you tried this on real cluster?
| .setColumnFamily(FSTableDescriptors.getTableFamilyDesc(conf)) | ||
| .setColumnFamily(FSTableDescriptors.getReplBarrierFamilyDesc()) | ||
| .build(); | ||
| long pid = this.modifyTable(TableName.META_TABLE_NAME, () -> newMetaDesc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This procedure could be done without ClusterSchemaService?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, apparently this can be done without ClusterSchemaService as per my dev testing.
|
@Apache9 Unfortunately, writing UT might not be possible (or might require another level of screwup by UT, I tried but looks too complicated to achieve this), but I have tested this change on pseudo-distributed mode (1 ZK server, 1 NN, 1 DN, 1 HM, 4 RS). |
| this.tableStateManager.start(); | ||
| } catch (NoSuchColumnFamilyException e) { | ||
| if (tableFamilyDesc == null && replBarrierFamilyDesc == null) { | ||
| LOG.info("For missing CFs in meta, this Exception is expected", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You think operator will know what 'missing CFs in meta' is about? Would it be better to talk about migration from hbase1 to hbase2?
| // initialized) at the expense of bypassing few important tasks as part | ||
| // of active master init routine. So now we abort active master so that | ||
| // next active master init will not face any issues and all mandatory | ||
| // services will be started during master init phase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice note.
| // of active master init routine. So now we abort active master so that | ||
| // next active master init will not face any issues and all mandatory | ||
| // services will be started during master init phase. | ||
| throw new IOException("Stopping active master after missing CFs are " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not stopping, we are aborting?
Would it be better to throw a PleaseRestartMeException here? (Would have to create it...).
| tries--; | ||
| } | ||
| if (tries <= 0) { | ||
| throw new IOException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than new IOE... throw a new HBaseIOE?
6e871c5 to
fe6323e
Compare
|
🎊 +1 overall
This message was automatically generated. |
| } | ||
| } | ||
|
|
||
| public static ColumnFamilyDescriptor getTableFamilyDesc( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a META table CF. Do we have any other util class or so (specific for META) where we can include this?
At least the name should make it clear. This is actually table state CF in Meta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me fix the name. Reg the class to keep it, maybe it is fine here because it is primarily used by FSTD only. Just for upgrade case, master needs it. But I am open to keeping it in other util class if there is better suggestion.
| try { | ||
| this.tableStateManager.start(); | ||
| } catch (NoSuchColumnFamilyException e) { | ||
| if (tableFamilyDesc == null && replBarrierFamilyDesc == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems 2.0.x had extra table state CF only. 2.1.x Had this replication barrier. When table state CF is missing it causes the startup issue right?
When its 2.0.x to 2.3.x upgrade, the repBarrier will get auto created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is for 2.3+ releases meaning if we come from HBase 1 to 2.3+, the transition should be seamless. Hence, for this upgrade case, both table and repl_barrier will be missing.
When its 2.0.x to 2.3.x upgrade, the repBarrier will get auto created?
I have not tried this one. If table CF is handled, I guess repl_barrier too would have been handled. As per @saintstack's testing, the cluster had to come from 1.2 to earlier than 2.3 release and from that release, it went on to 2.3, hence I am assuming HBase 2 (< 2.3) to 2.3 upgrade should have been smooth. @saintstack thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've run various versions of hbase-2.0-hbase2.2.x upgrades successfully... It was when I tried to go from an hbase1.2 version straight to hbase2.3 that I ran into this issue.
Otherwise, agree w/ your thinking around repl_barrier and table CF. Is there a case that one might be in place but not the other -- I don't know-- and does the code do right thing(I've not checked)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a case that one might be in place but not the other -- I don't know-- and does the code do right thing(I've not checked)
Exactly, the case where either 'table' or 'repl_barrier' is missing might be weird case and should not happen given that HBase 2 upgrades are not going to face missing CF issues. That's why I kept && here instead of || to make sure we are specifically handling HBase 1 to 2.3+ upgrade case.
FYI @anoopsjohn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Good. Thanks.
| // Set master as 'initialized'. | ||
| setInitialized(true); | ||
|
|
||
| if (tableFamilyDesc == null && replBarrierFamilyDesc == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case one is not there also, need to create?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that too but for HBase 1 to 2.3+ upgrade, both will be null for sure so with this case, we are targeting this specific upgrade case.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
… (#3417) Signed-off-by: Michael Stack <[email protected]>
Signed-off-by: Michael Stack <[email protected]>
Signed-off-by: Michael Stack <[email protected]>
No description provided.