HDDS-4044. Deprecate ozone.s3g.volume.name. #1270
Force-pushed from 9df70f0 to 8d90c93.
avijayanhwx
left a comment
LGTM +1.
Minor comment inline.
nit. space before 'configurations'.
'configurations' --> 'configuration'
'all S3Gateway use' --> 'all S3Gateway buckets use' ?
One of the reasons for this configuration is to use a different volumeName across different S3Gs. So if someone is using the configuration with that intention, the log line was meant to say: if different S3Gs use different volumes, make sure the user creates those volumes before performing those operations. (So, no mention of the bucket in the log.)
Let's say I use Mapreduce and With configuring s3g to use I think it's reasonable to keep the configuration even if we have the bucket link functionality. |
More flexibility usually comes with more complexity. :) The example Bharat highlighted seems valid - different S3 gateways using a different config key. I think in the future we can support volume level links which will allow pointing |
In general, I can agree with this statement, but in this specific case: the additional complexity of one configuration key seems to be minimal.
"Different S3 gateways using a different config key" --> It's also a feature which can be used in some specific cases. But I agree, flexibility vs. complexity can be a balance, and sometimes it can be hard to choose between them without being opinionated.
Here, I feel I have two different views:
When it's part of an open source project: I would prefer to keep the project flexible (within boundaries!!!) but use smart defaults.
When it's part of a vendor's distribution: I find it more reasonable to restrict specific usage patterns. But these restrictions can be specific to one usage (vendor/distribution), and others may use the open source project in a different way.
But again, it's more like a philosophy; it's hard to use technical arguments here.
If you would really like to remove the documentation of this configuration (I guess this means removing the 3rd sentence from https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/docs/content/interface/S3.md), let's do that. (I am +0 about changing only the docs: I don't like it, but I can accept it if it's your strong opinion.) In general, I prefer to explain the risk and teach the user, instead of hiding information.
+1 for this change. Continuing to expose this can be problematic now that OM creates the default s3v volume based on the same configuration upon start. If this is changed, all the buckets in the old s3v volume will not be available until manually linked. We may revisit this when the volume-level link is added.
How do we handle legacy volumes and buckets?
As mentioned in the PR description, users can use bucket links. For more information, refer to HDDS-3612. This feature is not removed entirely; we are only hiding it. When this config was added, we did not yet have support for bucket links.
Got it. So it is not really being deprecated, as the title says? @bharatviswa504
Yes, we want new users not to use this configuration, as the bucket link feature is available now and this configuration can cause some additional complications, as mentioned in the PR description again :)
As I wrote in my previous comment, bucket link is not a replacement for this configuration. This configuration can expose another volume and all of its buckets. With bucket links, you have to update your links whenever any new bucket is created.
It is deprecating but not removing, which means that a warning will be printed out without alternatives (linking a new bucket every day is not the same). As this feature is used by our early adopters (who started to use s3g with the old scheme), I think it's important to keep this functionality as is, without a warning. As I wrote, I can accept the consensus proposal from @arp7: let's remove it from the s3g doc page but keep the same code. (Later we can create a separate page about powerful but risky configurations, and it can be added back there...)
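The distinction drawn in this comment can be illustrated with a small toy model. This is only a sketch, not Ozone code: the function names, the `vol1` volume, and the bucket names are hypothetical, and buckets and links are modeled as plain Python sets and dicts. It assumes that the volume setting exposes every bucket in the configured volume, while links expose only the buckets that were linked explicitly.

```python
# Toy model only; names and data are hypothetical, not Ozone APIs.

def visible_via_volume_config(buckets, s3_volume):
    """Buckets an S3 client would see if the gateway points at s3_volume."""
    return sorted(b for (v, b) in buckets if v == s3_volume)


def visible_via_links(buckets, links, s3_volume="s3v"):
    """Buckets visible under the default volume: real buckets in that
    volume plus every link whose source bucket exists."""
    real = {b for (v, b) in buckets if v == s3_volume}
    linked = {name for name, source in links.items() if source in buckets}
    return sorted(real | linked)


# Two existing buckets in vol1, each already linked into s3v.
buckets = {("vol1", "logs"), ("vol1", "data")}
links = {"logs": ("vol1", "logs"), "data": ("vol1", "data")}

# Someone creates a new bucket in vol1.
buckets.add(("vol1", "tmp"))

# With the volume setting, the new bucket shows up with no extra step.
with_config = visible_via_volume_config(buckets, "vol1")

# With links, it stays invisible until a link is added for it.
with_links = visible_via_links(buckets, links)
links["tmp"] = ("vol1", "tmp")
after_relink = visible_via_links(buckets, links)
```

In this model the new `tmp` bucket is visible immediately under the volume setting, but only after an extra link is created in the link-based setup, which is the maintenance cost the comment refers to.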
Let's say that when S3G started, all buckets were created under volume s3v. Now different S3Gs start using different volumes, and when a user wants to use a bucket in o3fs, they are not sure which volume to use. As said in the PR description, this configuration can cause this kind of issue. I understand the use case you have mentioned, and we are not totally eliminating this: an advanced user who knows what they are doing can still use it, but they will see the warning. (It is not documented, but I think this can be addressed with a new page, as in your proposal.) One more point: using a single volume for S3G buckets came in to eliminate a command which was there before. So, to avoid this, the admin might need to give users one S3G URL and the volume used underneath it. (The admin can go ahead and create the volumes and provide appropriate permissions.) The admin also needs to use a separate config for each S3G. Okay, coming back to the comment: to get this in, are you proposing that we remove the warning which we have added in the code?
I might have a different view on this, because I think there are multiple, different roles which are mixed here.
I think it's acceptable, if the administrator modifies some fundamental part of the environment setup, the user should be notified. It's very similar to a DNS name refactor. There are cases where the knowledge of the users should be updated. I think it's acceptable, and this is the responsibility of the administrator to judge if it's ok or not in a specific environment / company culture. There are also cases when the administrator would like to start with
Similar, but not the same. There are significant differences:
Actually it was suggested by @arp7, but yes. I am fine with that. I think we should keep it in
Thinking a little more: if I understood well, you are worried about admins who will start multiple s3g servers with different configs. On the other hand, I would prefer to support a customizable volume name (but practically the same for all s3g instances). The problem is that we couldn't check whether different services are started with different settings. We earlier had a plan to do some kind of configuration download during service startup to simplify the configuration of the services. With such an approach, you can be sure that all the services use the same configs (but power users can do any evil actions, anyway...)
Force-pushed from 2e6af7f to a7e4a04.
Leaving this key easily discoverable is a bad idea and will result in users unintentionally shooting themselves in the foot. However, I don't have the energy to argue this ad infinitum, so in the interests of making forward progress, let's just go ahead.
(cherry picked from commit d7ea496)
What changes were proposed in this pull request?
HDDS-3612 introduced bucket links.
With this feature we no longer need this parameter; any volume/bucket can be exposed to S3 by using bucket links.
ozone bucket link srcvol/srcbucket destvol/destbucket
So now any Ozone bucket can be exposed to S3G. For example, if the user wants to expose a bucket named bucket1 under volume1 to S3G, they can run the command below:
ozone bucket link volume1/bucket1 s3v/bucket2
Now the user can access all the keys in volume1/bucket1 using s3v/bucket2, and can also ingest data into volume1/bucket1 using s3v/bucket2.
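The link behavior described above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not Ozone's implementation: the `Namespace` class and its methods are hypothetical, and the only behavior taken from the description is that reads and writes through the link name reach the source bucket.

```python
# Toy in-memory model of bucket links; hypothetical, not Ozone code.

class Namespace:
    def __init__(self):
        self.buckets = {}  # (volume, bucket) -> {key: data}
        self.links = {}    # (volume, bucket) -> (source volume, source bucket)

    @staticmethod
    def _parse(path):
        """Split 'volume/bucket' into a (volume, bucket) tuple."""
        volume, bucket = path.strip("/").split("/")
        return volume, bucket

    def create_bucket(self, path):
        self.buckets[self._parse(path)] = {}

    def link(self, src, dest):
        """Make dest an alias for src, mirroring the src/dest order above."""
        self.links[self._parse(dest)] = self._parse(src)

    def resolve(self, path):
        """Follow links until a real bucket is reached."""
        target = self._parse(path)
        seen = set()
        while target in self.links:
            if target in seen:
                raise ValueError("link loop at %s/%s" % target)
            seen.add(target)
            target = self.links[target]
        if target not in self.buckets:
            raise KeyError("no such bucket: %s/%s" % target)
        return target

    def put_key(self, path, key, data):
        self.buckets[self.resolve(path)][key] = data

    def get_key(self, path, key):
        return self.buckets[self.resolve(path)][key]


ns = Namespace()
ns.create_bucket("volume1/bucket1")
ns.link("volume1/bucket1", "s3v/bucket2")

# Data written through the link lands in the source bucket, and vice versa.
ns.put_key("s3v/bucket2", "key-a", b"written via link")
ns.put_key("volume1/bucket1", "key-b", b"written directly")
```

After these calls, `key-a` is readable from `volume1/bucket1` and `key-b` from `s3v/bucket2`, since both names resolve to the same underlying bucket in this model.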
This Jira was opened to remove the config from ozone-default.xml and to log a warning message recommending bucket links when the config does not have the default value s3v.
This configuration causes trouble for users, who must figure out which volume is used for their buckets whenever it is changed.
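The warning behavior proposed here could look roughly like the sketch below. It is written in Python for brevity and is not the code in this patch: the function name `resolve_s3_volume`, the logger name, and the warning text are all assumptions; only the key name `ozone.s3g.volume.name` and the default value `s3v` come from this PR.

```python
import logging

CONFIG_KEY = "ozone.s3g.volume.name"  # key named in the PR title
DEFAULT_S3_VOLUME = "s3v"             # default value named in the description

logger = logging.getLogger("s3g")     # hypothetical logger name


def resolve_s3_volume(conf):
    """Return the volume used for S3 buckets, warning on non-default values.

    conf is a plain dict standing in for the real configuration object.
    """
    volume = conf.get(CONFIG_KEY, DEFAULT_S3_VOLUME)
    if volume != DEFAULT_S3_VOLUME:
        # Hypothetical wording; the real message may differ.
        logger.warning(
            "%s is set to %r instead of the default %r; consider using "
            "bucket links to expose buckets to S3 instead.",
            CONFIG_KEY, volume, DEFAULT_S3_VOLUME,
        )
    return volume
```

The setting is still honored in this sketch, so existing deployments keep working and only the log output changes.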
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4044
How was this patch tested?
Added UT