Skip to content

Conversation

@devmadhuu
Copy link
Contributor

What changes were proposed in this pull request?

Most of the calls to SCM are only done by the namespace layer (OzoneManager), it makes sense to make all the client-facing APIs of SCM to do Admin check. SCM should only get calls from namespace service or from admin, it cannot get direct calls from a client.
allocateBlock, deleteBlock and sortDatanodes doesn't have Admin check now. If we do Admin check for getContainerWithPipeline, we should also do the same in allocateBlock, deleteBlock and sortDatanodes.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-1796

How was this patch tested?

This patch is tested with existing Junit test cases of all the updated client facing APIs of SCM.

@devmadhuu devmadhuu marked this pull request as ready for review August 22, 2023 05:09
@Override
public List<DatanodeDetails> sortDatanodes(List<String> nodes,
String clientMachine) throws IOException {
scm.checkAdminAccess(getRemoteUser(), false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sortDatanodes is kind of a read operation, it doesn't change the in memory or persist state of pipeline.

Copy link
Contributor Author

@devmadhuu devmadhuu Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ChenSammi for review. I understood and handled the comment. Pls re-review.

@ChenSammi
Copy link
Contributor

@devmadhuu , the overall patch looks good.

@errose28
Copy link
Contributor

Secure acceptance tests are failing, likely because OM user is not an SCM admin. What is the best way to resolve this?

@ChenSammi
Copy link
Contributor

Good question. We have property "ozone.administrators" and "ozone.administrators.groups" for Ozone administrator configuration. There properties cover both OM and SCM. I didn't check how the unit is failed. In a real Ozone cluster, if OM and SCM are lauched by same user, then it will be fine. If different users are used to lauch OM and SCM, we requires end user to explictily include the user who launch OM in "ozone.administrators", or the user group in "ozone.administrators.groups", otherwise the request from OM will be rejected by SCM.

@devmadhuu
Copy link
Contributor Author

Secure acceptance tests are failing, likely because OM user is not an SCM admin. What is the best way to resolve this?

@errose28 Thanks for pointing out, As per @ChenSammi suggested to have specified admin user under "ozone.administrators" config based on test case, it can be handled, but I have noticed that few robot acceptance test cases are expecting for permission denied and if I specify admin user , those failed with message "'[ ]' does not contain 'PERMISSION_DENIED'", but since we have only docker-config where "ozone.administrators" config can be added, so positive test cases passes, but negative test cases fails. Any suggestion how to handle this ?

@ChenSammi
Copy link
Contributor

ChenSammi commented Aug 24, 2023

Secure acceptance tests are failing, likely because OM user is not an SCM admin. What is the best way to resolve this?

@errose28 Thanks for pointing out, As per @ChenSammi suggested to have specified admin user under "ozone.administrators" config based on test case, it can be handled, but I have noticed that few robot acceptance test cases are expecting for permission denied and if I specify admin user , those failed with message "'[ ]' does not contain 'PERMISSION_DENIED'", but since we have only docker-config where "ozone.administrators" config can be added, so positive test cases passes, but negative test cases fails. Any suggestion how to handle this ?

Is it available to use other users which are not included in "ozone.administrators" to run the test which expects a PERMISSION_DENIED?

@devmadhuu
Copy link
Contributor Author

Secure acceptance tests are failing, likely because OM user is not an SCM admin. What is the best way to resolve this?

@errose28 Thanks for pointing out, As per @ChenSammi suggested to have specified admin user under "ozone.administrators" config based on test case, it can be handled, but I have noticed that few robot acceptance test cases are expecting for permission denied and if I specify admin user , those failed with message "'[ ]' does not contain 'PERMISSION_DENIED'", but since we have only docker-config where "ozone.administrators" config can be added, so positive test cases passes, but negative test cases fails. Any suggestion how to handle this ?

Is it available to use other users which are not included in "ozone.administrators" to run the test which expects a PERMISSION_DENIED?

I have resolved and able to successfully run all secure acceptance test cases now, however below 2 test cases related to aws s3g API is still failing.

image

@errose28
Copy link
Contributor

As per @ChenSammi suggested to have specified admin user under "ozone.administrators" config based on test case, it can be handled

Would OM admin be considered an SCM admin by default? I'm not sure we want this, but if we do not do it this way it would be a breaking change because all users would need to update their admin configs.

Also it looks like acceptance tests are still failing. @devmadhuu its probably better to convert this to a draft and debug the failures on your fork until CI is green.

@ChenSammi ChenSammi marked this pull request as draft August 29, 2023 03:23
@ChenSammi ChenSammi marked this pull request as ready for review August 29, 2023 07:02
@ChenSammi
Copy link
Contributor

ChenSammi commented Aug 29, 2023

As per @ChenSammi suggested to have specified admin user under "ozone.administrators" config based on test case, it can be handled

Would OM admin be considered an SCM admin by default? I'm not sure we want this, but if we do not do it this way it would be a breaking change because all users would need to update their admin configs.

@errose28 , currently the "ozone.administrators" applies to both OM and SCM. There is no individual property to control OM or SCM admins. And both OM and SCM will add the user who launches the services as admin. Usually, the user who launches OM, will launches SCM too, so it will not be a general problem.
Before this PR, there is already admin check on getContainerWithPipeline. This PR add the check to other APIs, so the behavior is consistent.

@devmadhuu devmadhuu marked this pull request as draft August 29, 2023 12:59
@devmadhuu devmadhuu marked this pull request as ready for review August 29, 2023 14:02
@devmadhuu
Copy link
Contributor Author

devmadhuu commented Aug 29, 2023

Should we handle other file integration tests failing in this PR as part of other JIRA/PR ?

For intermittent failures in integration tests of this PR, it will be handled as part of other PR, as many other integration tests fails intermittently.

If all is Ok in this PR, pls have a re-look.

@errose28 @ChenSammi

@ChenSammi
Copy link
Contributor

Should we handle other file integration tests failing in this PR as part of other JIRA/PR ?

file integration test failing can be handled with a new JIRA.

@devmadhuu
Copy link
Contributor Author

@ChenSammi Thank you for detailed review, As discussed, have removed checkAdminAccess for following SCM APIs:

  1. queryNode
  2. getReplicationManagerReport
  3. getContainerWithPipelineBatch
  4. getExistContainerWithPipelinesInBatch
  5. sortDatanodes

Pls have a re-look


public List<DeletedBlocksTransactionInfo> getFailedDeletedBlockTxn(int count,
long startTxId) throws IOException {
getScm().checkAdminAccess(getRemoteUser(), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check can be removed too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check can be removed too.

Done.

Can follow link with read access
Execute kdestroy
Run Keyword Kinit test user testuser2 testuser2.keytab
Run Keyword Kinit test user testuser testuser.keytab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes removed.

${result} = Execute And Ignore Error ozone sh key list ${protocol}${server}/${volume}/bb1
Should contain ${result} PERMISSION_DENIED
Execute kdestroy
Run Keyword Kinit test user testuser2 testuser2.keytab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@devmadhuu
Copy link
Contributor Author

@ChenSammi Handled comments, pls review.

Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @devmadhuu .

@ChenSammi ChenSammi changed the title HDDS-1796. SCMClientProtocolServer#getContainerWithPipeline should check for admin access HDDS-1796. Add admin access check for write opertaions in SCMClientProtocolServer and SCMBlockProtocolServer. Sep 7, 2023
@ChenSammi ChenSammi merged commit a31e438 into apache:master Sep 7, 2023
@ashishkumar50
Copy link
Contributor

@ChenSammi @devmadhuu, In real ozone secure cluster, If remoteUser is "om" and scmAdmins is having "scm", because of this allocateBlock fails. Do we need to change any configuration?
Whether allocateBlock, deleteBlock need this check?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants