
@adoroszlai (Contributor) commented Jun 21, 2020

What changes were proposed in this pull request?

Implement bucket links.

Implementation details (from the design doc)

  • The ozone sh bucket link operation creates a link bucket. Links are like regular buckets, stored in the DB the same way, but with two new, optional pieces of information: source volume and source bucket. (The bucket referenced by the link is called "source", not "target", following symlink terminology.) See the shell sketch after this list.
  • Link buckets share the namespace with regular buckets. If a bucket or link with the same name already exists, a BUCKET_ALREADY_EXISTS result is returned.
  • (Unlike the previous per-user S3 volume mapping) link buckets are not inherently specific to a user; access is restricted only by ACL.
  • Links are persistent, i.e. they can be used until they are deleted.
  • Existing bucket operations (info, delete, ACL) work on the link object the same way as they do on regular buckets. No new link-specific RPC is required.
  • Links are followed for key operations (list, get, put, etc.). Read permission on the link is required for this.
  • Checks for the existence of the source bucket, as well as its ACLs, are performed only when following the link (similar to symlinks). The source bucket is not checked when operating on the link bucket itself (e.g. deleting it). This avoids the need for reverse checks on each bucket delete or ACL change.
  • Bucket links are generic, not restricted to the s3v volume.
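
A minimal shell sketch of the intended usage (volume and bucket names are made up, and I'm assuming the source path comes first in the link command, mirroring ln -s; the acceptance tests are the authoritative reference):

```sh
# Create a source bucket, then link to it from another volume
ozone sh volume create /source-vol
ozone sh bucket create /source-vol/bucket1
ozone sh volume create /link-vol
ozone sh bucket link /source-vol/bucket1 /link-vol/link1

# Bucket-level operations (info, delete, ACL) act on the link object itself
ozone sh bucket info /link-vol/link1

# Key operations follow the link to the source bucket
ozone sh key put /link-vol/link1/key1 ./README.md
```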

Audit

The audit message for link creation mentions the bucket the link points to (sourceVolume and sourceBucket):

OMAudit | ... | op=CREATE_BUCKET {volume=60748-target, bucket=link1, gdprEnabled=null, acls=[user:hadoop:a[ACCESS], group:users:a[ACCESS]], isVersionEnabled=false, storageType=DISK, creationTime=1592631444545, bucketEncryptionKey=null, sourceVolume=60748-source, sourceBucket=bucket1} | ret=SUCCESS |  

So do the audit messages for key operations issued on links:

OMAudit | ... | op=ALLOCATE_KEY {volume=60748-target, bucket=link1, key=key1, dataSize=671, replicationType=RATIS, replicationFactor=THREE, sourceVolume=60748-source, sourceBucket=bucket1} | ret=SUCCESS |  
OMAudit | ... | op=DELETE_KEY {volume=60748-target, bucket=link1, key=key2, dataSize=0, replicationType=RATIS, replicationFactor=ONE, sourceVolume=60748-source, sourceBucket=bucket1} | ret=SUCCESS |  

The response for ALLOCATE_KEY contains the real (source) volume/bucket, so COMMIT_KEY does not know about the link (but it is prepared to handle one anyway):

OMAudit | ... | op=COMMIT_KEY {volume=60748-source, bucket=bucket1, key=key1, dataSize=671, replicationType=RATIS, replicationFactor=ONE} | ret=SUCCESS |  

Built on top of changes from #1091 and #1093.

https://issues.apache.org/jira/browse/HDDS-3612

How was this patch tested?

Added acceptance tests exercising bucket links via the Ozone Shell.

Existing S3 acceptance tests were slightly tweaked and are executed on both normal buckets and bucket links.

Existing Ozone FS acceptance tests (reorganized in #1093) are also executed on links.

https://github.com/adoroszlai/hadoop-ozone/runs/792817258

@adoroszlai (Contributor, Author) commented:

Thanks @bharatviswa504 for the review. I have updated the patch based on your comments.

> When deleting a bucket, we check whether the bucket is empty. Do we need to check whether resolvedBucket is empty?

A link can be deleted regardless of the existence or emptiness of the real bucket it points to, so there is no need to check the resolved bucket.

@bharatviswa504 (Contributor) commented Jul 15, 2020

> Thanks @bharatviswa504 for the review. I have updated the patch based on your comments.
>
> > When deleting a bucket, we check whether the bucket is empty. Do we need to check whether resolvedBucket is empty?
>
> A link can be deleted regardless of the existence or emptiness of the real bucket it points to, so there is no need to check the resolved bucket.

This is the semantics for link buckets, because this is not a "delete link" operation; it is the same "delete bucket" op that OM has received.

@bharatviswa504 (Contributor) commented Jul 15, 2020

Overall LGTM.
A few minor comments/questions; once those are addressed, good to go.

@bharatviswa504 (Contributor) commented Jul 15, 2020

@adoroszlai Any comments on the two leftover questions?
One is related to bucket deletion, the other to the bucket being validated multiple times.

@adoroszlai (Contributor, Author) commented:

> > > When deleting a bucket, we check whether the bucket is empty. Do we need to check whether resolvedBucket is empty?
> >
> > A link can be deleted regardless of the existence or emptiness of the real bucket it points to, so there is no need to check the resolved bucket.
>
> This is the semantics for link buckets, because this is not a "delete link" operation; it is the same "delete bucket" op that OM has received.

The same "delete bucket" op works for deleting links. Since links cannot have direct entries, the emptiness check will not find any, hence deletion is always allowed. We don't resolve the bucket the link points to, because its emptiness does not affect whether we can delete the link (see the sketch below).

Please let me know if I misunderstood the question.
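
To make this concrete, a minimal shell sketch of the semantics described above (paths are examples):

```sh
ozone sh bucket create /vol1/bucket1
ozone sh bucket link /vol1/bucket1 /vol1/link1

# Deleting the source bucket does not touch the link
ozone sh bucket delete /vol1/bucket1
ozone sh bucket info /vol1/link1    # still works: operates on the link object itself

# Following the link now fails, because the source existence check runs here
ozone sh key list /vol1/link1       # expected to fail: source bucket is gone

# Deleting the link never resolves the source, so it always succeeds
ozone sh bucket delete /vol1/link1
```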

@ChenSammi (Contributor) commented:

@adoroszlai, thanks for working on this feature. I did some simple tests locally and it looks good. A few questions about the expected behavior:

  1. When the source bucket is deleted, the bucket link still exists. Is that the expected behavior?
  2. "(Unlike the previous per-user S3 volume mapping) link buckets are not inherently specific to a user; access is restricted only by ACL." It is not very clear to me what this means; would you explain it further? Can we set different ACLs on the source bucket and the bucket link?
  3. We're working on the volume quota implementation. I suppose a bucket link will require zero quota from its parent volume, and only the source bucket size will be controlled by its volume's quota, right?

@adoroszlai (Contributor, Author) commented Jul 16, 2020

> I did some simple tests locally and it looks good.

Thanks @ChenSammi for that.

> 1. When the source bucket is deleted, the bucket link still exists. Is that the expected behavior?

Yes.

> 2. "(Unlike the previous per-user S3 volume mapping) link buckets are not inherently specific to a user; access is restricted only by ACL." It is not very clear to me what this means; would you explain it further?

Previously each user had their own S3 volume, and others could use it only by sharing the AWS access key. Now if you create a bucket link, other users can access it as long as the ACL permits (similar to other Ozone objects).

> Can we set different ACLs on the source bucket and the bucket link?

Yes. When following links, e.g. to create a key, only read permission is checked on the link. The source bucket is still checked for the permission required by the specific operation (e.g. key list, create, or delete).

> 3. We're working on the volume quota implementation. I suppose a bucket link will require zero quota from its parent volume, and only the source bucket size will be controlled by its volume's quota, right?

I'm not familiar with quotas, but I guess links should work like empty buckets in this respect. Keys are created in the source bucket, so they should not affect space usage in the link's volume (see the sketch below).
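
A small sketch of that last point (paths are examples): keys written through a link are stored in the source bucket, so space usage naturally attaches to the source bucket's volume:

```sh
ozone sh bucket link /vol-src/bucket1 /vol-link/link1

# Written through the link, but stored in the source bucket
ozone sh key put /vol-link/link1/key1 ./data.bin
ozone sh key list /vol-src/bucket1    # key1 shows up here
```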

@ChenSammi (Contributor) commented:

@adoroszlai, thanks for the explanation.

@bharatviswa504 (Contributor) commented:

> The same "delete bucket" op works for deleting links. Since links cannot have direct entries, the emptiness check will not find any, hence deletion is always allowed.

Makes sense to me. Thanks for the explanation.

@bharatviswa504 (Contributor) commented:

I have one suggestion: can we make this mount feature a configurable option? It is required only when old S3 buckets need to be exposed via o3fs/ofs, so this way we can avoid the extra bucket checks. (I know it is an in-memory cache, but I still feel that if someone does not require this feature, it should be possible to disable it entirely.) I think we can make that check in two places: one in create bucket, the other in resolveBucketLink.

Fine to do in another Jira as well.

@bharatviswa504 (Contributor) left a review:

+1 LGTM, with one comment.
Thank you @adoroszlai for the PR.

@bharatviswa504 (Contributor) commented:

> I have one suggestion: can we make this mount feature a configurable option? […]
>
> Fine to do in another Jira as well.

I take back this comment. This feature helps expose buckets created via shell/ofs to S3, so making it configurable is not needed. (By default, if a user decides to expose any bucket created via the shell, it can be exposed without any restart; see the sketch below.)

Thank you @arp7 for the offline discussion.
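
For reference, a hedged sketch of that use case (bucket names are examples, and the endpoint assumes the S3 Gateway's default port):

```sh
# Expose a shell-created bucket to S3 clients by linking it under the s3v volume
ozone sh bucket link /vol1/bucket1 /s3v/exposed-bucket

# The bucket is immediately visible through the S3 Gateway, no restart needed
aws s3 ls s3://exposed-bucket --endpoint-url http://localhost:9878
```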

@adoroszlai (Contributor, Author) commented:

> I have one suggestion: can we make this mount feature a configurable option? […]
>
> Fine to do in another Jira as well.

Thanks for the suggestion, sounds good to me. I think if we want to disable it by default, then it should be done as part of this task. If it's enabled by default, then it's fine to add the config in a separate task later. Let me know what the default value should be. CC @arp7 @elek

@adoroszlai (Contributor, Author) commented:

> I take back this comment. This feature helps expose buckets created via shell/ofs to S3, so making it configurable is not needed.

OK.

@ChenSammi merged commit 7aff2f0 into apache:master Jul 17, 2020
@ChenSammi (Contributor) commented:

Thanks @adoroszlai for the contribution and @bharatviswa504 for the review.

@adoroszlai deleted the HDDS-3612-dev branch July 17, 2020 04:56
@adoroszlai (Contributor, Author) commented:

Thanks @arp7, @elek for high-level review, @bharatviswa504 for the detailed reviews, @ChenSammi for testing and committing it.
