Skip to content

Conversation

@captainzmc
Copy link
Member

@captainzmc captainzmc commented Oct 13, 2020

What changes were proposed in this pull request?

Currently volumeArgs using getCacheValue and put the same object in doubleBuffer, this might cause issue.

Let's take the below scenario:

InitialVolumeArgs quotaBytes -> 10000

  1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated volumeArgs to DoubleBuffer.
  2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to double buffer.

Now at the end of flushing these transactions, our DB should have 7000 as bytes used.

Now T1 is picked by double Buffer and when it commits, and as it uses cached Object put into doubleBuffer, it flushes to DB with the updated value from T2(As it is a cache object) and update DB with bytesUsed as 7000.

And now OM has restarted, and only DB has transactions till T1. (We get this info from TransactionInfo Table(https://issues.apache.org/jira/browse/HDDS-3685)

Now T2 is again replayed, as it is not committed to DB, now DB will be again subtracted with 2000, and now DB will have 5000.

But after T2, the value should be 7000, so we have DB in an incorrect state.

Issue here:

  1. As we use a cached object and put the same cached object into double buffer this can cause this kind of issue.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4308

How was this patch tested?

Use the existing UT

@captainzmc
Copy link
Member Author

captainzmc commented Oct 14, 2020

Hi @bharatviswa504, Can you help to review this PR.

@captainzmc
Copy link
Member Author

Hi @linyiqun,I modified the implementation based on the latest comments. Can you help to review this PR.

@captainzmc captainzmc force-pushed the fix-update-usage branch 3 times, most recently from dcfdb76 to dfc8e50 Compare October 29, 2020 13:16
@captainzmc
Copy link
Member Author

Hi @bharatviswa504 @linyiqun, This PR has been modified. Can you help to review this PR.

Copy link
Contributor

@linyiqun linyiqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @captainzmc , please have a look for the latest comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method name setUsedBytes seems confused, can we rename to incrUsedBytes(long bytes)

Copy link
Contributor

@linyiqun linyiqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments, @captainzmc ! Leave one minor comment.
@bharatviswa504 , does current fix way make sense to you?

Copy link
Contributor

@linyiqun linyiqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1.
Let's wait for others to have a look for latest change as well, : ).

Copy link
Contributor

@bharatviswa504 bharatviswa504 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use volume lock for key operations we serialize all bucket ops across volume which have major perf impact. (I understand why we need it, I am just thinking if there are some ways we can avoid it)

So some general questions on Quota:

  1. Can we skip updating volume bytes used during key operations, when it is required just calculated bytes used from all bucket info. As with volume lock, it affects key operations.
  2. During bucket creation why can't we make check if buckets created exceed volume quota set if so, fail bucket creation. In this way, during key operations we don't need to check volume bytesUsed, just bucket bytesUsed will be enough.

Example:
volume quota 100 MB

Bucket1 90MB success
Bucket2 20MB fail(As total volume quota is 100MB)

With this approach, we don't need to check volume quota during key ops. (It has impact ony during quota set operation which is not a frequent operation on the cluster)

  1. Is there a way to disable quota feature? My question is so when upgraded during key creation we by default bytesUsed. And when not required, we don't need to update bytesUsed. (I see that we have check during checkQuota, but not during update bytesUsed)

@captainzmc
Copy link
Member Author

captainzmc commented Nov 4, 2020

Thanks for @bharatviswa504's optimization Suggestions.

  1. Is there a way to disable quota feature? My question is so when upgraded during key creation we by default bytesUsed. And when not required, we don't need to update bytesUsed. (I see that we have check during checkQuota, but not during update bytesUsed)

This bytesUsed should be updated all the time. If quota is not enabled when a bucket is created, then quota is enabled after a certain amount of data is written. At this point we need to know how much bytesUsed was before quota was enabled so that we can correctly update it.
And in addition to quota, displaying bytesUsed on buckets also gives the user a more intuitive view of the current usage of bucket data. This is similar to bytesUsed in the current container.

  1. Can we skip updating volume bytes used during key operations, when it is required just calculated bytes used from all bucket info. As with volume lock, it affects key operations.
  2. During bucket creation why can't we make check if buckets created exceed volume quota set if so, fail bucket creation. In this way, during key operations we don't need to check volume bytesUsed, just bucket bytesUsed will be enough.

Agree, I think this is a better way to avoid the current use of Volume lock. I will modify this PR in this way as soon as possible. cc @linyiqun

@linyiqun
Copy link
Contributor

linyiqun commented Nov 4, 2020

Also +1 with no volume lock approach and aggregate the bucket bytes used instead.

This bytesUsed should be updated all the time. If quota is not enabled when a bucket is created, then quota is enabled after a certain amount of data is written. At this point we need to know how much bytesUsed was before quota was enabled so that we can correctly update it.

Agreed, bytes used still needed but don't need to do any quota check if we have a way to disable quota.

@captainzmc
Copy link
Member Author

captainzmc commented Nov 5, 2020

Hi @bharatviswa504 @linyiqun. Based on what we discussed yesterday, I revised the PR. Can you take another look?
The changes are as follows:

  1. Remove the dependency on volume usedBytes.
  2. The previous function ensures that the total size of bucket quota does not exceed volume, so the portion does not need to be modified. Also, when checking quota, the quota is not checked if it is not enabled.
  3. Modified relevant UT.

@runzhiwang
Copy link
Contributor

We no longer need to update volume usedBytes in Response. You can also delete the logic for updating volume usedBytes in Response.

@runzhiwang
Copy link
Contributor

Now that we no longer need the usebyte of the volume; You need to synchronize the contents of quote.md.

@runzhiwang
Copy link
Contributor

Overall LGTM. Just a couple of minor suggestions.

@captainzmc
Copy link
Member Author

The issues has been fixed. Can you take another look? @runzhiwang

@captainzmc
Copy link
Member Author

Thanks for @runzhiwang‘s review. The issues has been fixed.

Copy link
Contributor

@runzhiwang runzhiwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@runzhiwang runzhiwang merged commit 54cca0b into apache:master Nov 25, 2020
@runzhiwang
Copy link
Contributor

@captainzmc Thanks the patch. @bharatviswa504 @linyiqun Thanks for review. I have merged the patch.

@linyiqun
Copy link
Contributor

linyiqun commented Nov 25, 2020

@captainzmc and @runzhiwang , I noticed that we also removed below function:

  protected void checkVolumeQuotaInBytes(OmVolumeArgs omVolumeArgs,
      long allocateSize) throws IOException {
    ...
  }

So now where we will check the quota usage of the volume? There should be one place to do the sum of volume buckets usage and then do the quota check.

@captainzmc
Copy link
Member Author

captainzmc commented Nov 25, 2020

Thanks for @linyiqun's attention. For now we no longer need to check the Quota of Volume. Because we have ensured that all bucket quota and do not exceed volume quota when we set bucket and volume quota. Therefore, to write a key under a bucket, we simply check bucket quota. Volume's quota will naturally not exceed as long as the bucket's quota checks pass.
Just as Bharat's suggests.

@linyiqun
Copy link
Contributor

Sounds good, thanks @captainzmc for the explanation.

errose28 added a commit to errose28/ozone that referenced this pull request Dec 1, 2020
* HDDS-3698-upgrade:
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  HDDS-4497. Recon File Size Count task throws SQL Exception. (apache#1612)
errose28 added a commit to errose28/ozone that referenced this pull request Dec 1, 2020
* HDDS-3698-upgrade:
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  HDDS-4497. Recon File Size Count task throws SQL Exception. (apache#1612)
errose28 added a commit to errose28/ozone that referenced this pull request Jan 5, 2021
* master: (40 commits)
  HDDS-4473. Reduce number of sortDatanodes RPC calls (apache#1610)
  HDDS-4485. [DOC] add the authentication rules of the Ozone Ranger. (apache#1603)
  HDDS-4528. Upgrade slf4j to 1.7.30 (apache#1639)
  HDDS-4424. Update README with information how to report security issues (apache#1548)
  HDDS-4484. Use RaftServerImpl isLeader instead of periodic leader update logic in OM and isLeaderReady for read/write requests (apache#1638)
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  ...
guihecheng pushed a commit to guihecheng/ozone that referenced this pull request Mar 8, 2021
…ge request !66)


Squash merge branch '2020-11-30' into 'tencent-master'

* HDDS-4308. Fix issue with quota update (apache#1489)
guihecheng pushed a commit to guihecheng/ozone that referenced this pull request Mar 8, 2021
guihecheng pushed a commit to guihecheng/ozone that referenced this pull request Mar 8, 2021
Revert cherry-pick HDDS-4308. Fix issue with quota update (apache#1489) (merge request !66)
This reverts commit dfcd441
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants