-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HDFS-15624. fix the function of setting quota by storage type #2377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
@liuml07 Hi, sorry to disturb you. Would you help to review this pr? It changes the ordinal of nvdimm as the last storage type. And we will propose another pr to bump up the NamenodeLayoutVersion. Thanks very much. |
|
Thanks @YaYun-Wang for fixing this. It looks good to me overall, and I can have another look over the weekend, if it's not committed. I also added @ayushtkn as reviewer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@YaYun-Wang I am not very convinced that we should get away with the half fix. Better we have the NamenodeLayout also fixed here and conclude the Nvdimm stuff.
That would literally not more than 10 lines, I guess you need to just increase it only once, and a check only in FsNamesystem should do.
I have talked with @vinayakumarb you can connect with him offline for details on that
|
💔 -1 overall
This message was automatically generated. |
|
@ayushtkn ,ok, we will fix that all in the pull request. |
|
💔 -1 overall
This message was automatically generated. |
Yes, I think I'm +1 on this. |
|
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is no release after this feature is committed, No need to increase layout version twice for same feature.
Usually one version increase is sufficient per release. But for clarity separate feature can be added.
In this case feature is only one. NVDIMM SUPPORT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of addition of new STORAGE TYPE,
during upgrade
calls from old clients, who doesnt have this storagetype, will experience failure.
Especially getQuotaUsage() call will experience failure. May be need to analyze more to avoid failures during upgrade.
Even though upgrade is finalized, old clients should continue to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add check for setQuota() in FSNamesystem.java, I think it's ok, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setQuota() check will only block during rollingupgrade. But once its finalized, still old clients experience failure.
Anyway thats the current problem, even before this feature. Can be handled in a separate Jira.
More details about failure: 2.10.1 client asking quota usage from 3.3.0 namenode.
$ bin/hdfs dfs -fs hdfs://namenode:8020/ -count -q -t -h /
count: Message missing required fields: usage.typeQuotaInfos.typeQuotaInfo[3].type
Above issue is coming, because 2.10.1 client doent know about PROVIDED StorageType.
Similar problem will occur for NVDIMM also from previous version clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure but will getStoragePolicies also land up in something similar issue? Due to unavailability of storage type? The quota stuff shall be there for PROVIDED also but in case this backward incompatibility is there with Storage Policy too, Then we need to find out some way.
@vinayakumarb do you have pointers or suggestions on this, how to tackle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have verified the getStoragePolicies() with older clients.
Older clients will get the storage policy but, unknown storage types will be ignored.
So in this case, ALLNVDIMM storage policy shows empty StorageTypes for older clients.
May be need to show DEFAULT StorageType instead of ignoring the unknown StorageTypes. This also can be fixed in a separate Jira. Right now, existing clients wont be broken on getStoragePolicies() call with this change.
So nothing special required for that in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarification and testing.
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to change the minCompatLV to -66?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the comment above said:
If the feature cannot satisfy compatibility with any prior version, then set its minimum compatible lqyout version to itself to indicate that downgrade is impossible.
Or maybe we missed something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very aware, but yes, if I am decoding the comment correct, This should be 66 both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep it -61 for minCompatLV itself.
More details here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change the miniCompatLV to -61, thanks for your clarification.
|
I think, you can hold on till HDFS-15660 addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check should be done in case of setStoragePolicy also, if the storage policy is ALLNVDIMM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Please see code above. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very aware, but yes, if I am decoding the comment correct, This should be 66 both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure but will getStoragePolicies also land up in something similar issue? Due to unavailability of storage type? The quota stuff shall be there for PROVIDED also but in case this backward incompatibility is there with Storage Policy too, Then we need to find out some way.
@vinayakumarb do you have pointers or suggestions on this, how to tackle this?
vinayakumarb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to make both as -66. Keep it as -61 itself, unless cant keep backward compatibility even with avoiding operations related to this feature. Second one is to indicate till which version downgrade is possible from the fsimage generated by this version’s generated fsimage during rolling upgrade.
You have to make sure that no NVDIMM related operations are allowed during rolling upgrade, so that fsimage/editlog generated during this period is compatible with pre-upgrade version incase user want to downgrade without loosing the latest data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar check you need to add when user tries to use NVDIMM based storage policy..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Please see code above. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setQuota() check will only block during rollingupgrade. But once its finalized, still old clients experience failure.
Anyway thats the current problem, even before this feature. Can be handled in a separate Jira.
More details about failure: 2.10.1 client asking quota usage from 3.3.0 namenode.
$ bin/hdfs dfs -fs hdfs://namenode:8020/ -count -q -t -h /
count: Message missing required fields: usage.typeQuotaInfos.typeQuotaInfo[3].type
Above issue is coming, because 2.10.1 client doent know about PROVIDED StorageType.
Similar problem will occur for NVDIMM also from previous version clients.
I dont think need to hold this Jira, unless HDFS-15660 cant be solved in the same release as NVDIMM feature altogether. So both can go independently but need to make sure both lands in same release. |
2321ba5 to
4d53da6
Compare
… the setQuota of the FDNamesystem class
… the setQuota of the FDNamesystem class
This changes: 1. puts NVDIMM to the end of storage type enum to make sure compatibility. 2. adds check for setQuota() and setStoragePolicy() to make sure the software layout version is satisfied.
5d676ee to
56bcbbd
Compare
|
💔 -1 overall
This message was automatically generated. |
|
@liuml07 @vinayakumarb @brahmareddybattula Sorry to disturb, would you please help to review, thanks very much. |
|
💔 -1 overall
This message was automatically generated. |
|
@liuml07 @linyiqun Sorry to disturb you again :) the jira https://issues.apache.org/jira/browse/HDFS-15660 was fixed, and I think we can review this patch? Thanks very much. |
|
@ayushtkn Hi, maybe you can help to review this :)? Thanks very much |
|
@tangzhankun would you please have a look for this, thanks very much. |
liuml07
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
Thanx @huangtianhua for the work here, Sorry I couldn't revert back to your emails & pings. @brahmareddybattula has objections on the jira with the approach itself. Quoting him from the jira
There is no code change post HDFS-15660? It was asserted the generic solution shall solve this problem or will change something So, We might need changes here post HDFS-15660. should wait for him, unless he is convinced. |
|
@ayushtkn, thanks for review it. HDFS-15660 supports handling storage types for older clients in a generic way, and it has been merged, or I missed it? |
|
@huangtianhua nopes you didn't. I know that is merged. That is what I said, but there were assertions earlier on jira that we should hold this code in Jira, for HDFS-15660. That would fix something or change our code here. So, we held this jira because of that only. So, just want to wait, so that can be clarified what needs to be done here post HDFS-15660. And secondly the NamenodeLayout version approach had objection too as I quoted above. We need to get an agreement over there. For me the code is good enough, once we have clarifications regarding these things, can conclude this |
|
@ayushtkn , in fact we don't have to hold this for HDFS-15660 as vinay said, the codes here is to fix the specific issues of NVDIMM, to avoid operations which related with storage type during rollingupgrade, to keep the orinal of storage type to make sure the editLog/fsimage works after restart namenode. IIUC, the miniCompatLV of namenodelayout version is introduced to make sure to refuse operations while rollingupgrade, so I think the approach is appropriate for the situation. |
|
@liuml07 @vinayakumarb , would you please to merge this, we wait so long and seems there is no other opinion, thanks very much. |
liuml07
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 again as HDFS-15660 is already committed.
I will hold on a week so @vinayakumarb can post his comments before we merge this.
ayushtkn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
@liuml07 could you approve this? Thanks very much. |
|
Merged and resolved the JIRA. Thank you all! |
…#2377)" This reverts commit 394b9f7. Ref: HDFS-15995. Had to revert this commit, so we can commit HDFS-15566 (a critical bug preventing rolling upgrade to Hadoop 3.3) Will re-work this fix again later.
…#2377) 1. puts NVDIMM to the end of storage type enum to make sure compatibility. 2. adds check to make sure the software layout version is satisfied Co-authored-by: su xu <[email protected]> Co-authored-by: huangtianhua <[email protected]> Co-authored-by: YaYun-Wang <[email protected]> Signed-off-by: Mingliang Liu <[email protected]> Signed-off-by: Ayush Saxena <[email protected]> Signed-off-by: Vinayakumar B <[email protected]> Change-Id: I3c58beef50730827a09b3c968e9ad637baa57d44
…#2955) 1. puts NVDIMM to the end of storage type enum to make sure compatibility. 2. adds check to make sure the software layout version is satisfied Co-authored-by: su xu <[email protected]> Co-authored-by: huangtianhua <[email protected]> Co-authored-by: YaYun-Wang <[email protected]> Signed-off-by: Mingliang Liu <[email protected]> Signed-off-by: Ayush Saxena <[email protected]> Signed-off-by: Vinayakumar B <[email protected]> Change-Id: I3c58beef50730827a09b3c968e9ad637baa57d44
…apache#2377)" This reverts commit 394b9f7. Ref: HDFS-15995. Had to revert this commit, so we can commit HDFS-15566 (a critical bug preventing rolling upgrade to Hadoop 3.3) Will re-work this fix again later.
…#2377) (apache#2955) 1. puts NVDIMM to the end of storage type enum to make sure compatibility. 2. adds check to make sure the software layout version is satisfied Co-authored-by: su xu <[email protected]> Co-authored-by: huangtianhua <[email protected]> Co-authored-by: YaYun-Wang <[email protected]> Signed-off-by: Mingliang Liu <[email protected]> Signed-off-by: Ayush Saxena <[email protected]> Signed-off-by: Vinayakumar B <[email protected]> Change-Id: I3c58beef50730827a09b3c968e9ad637baa57d44
HDFS-15025 adds a new storage Type NVDIMM, changes the ordinal() of the enum of StorageType. And, setting the quota by storageType depends on the ordinal(), therefore, it may cause the setting of quota to be invalid after upgrade.
This fix the function of setting quota by storage type bug in hadoop/trunk.