Enforce API key type on authentication#95908
Enforce API key type on authentication#95908elasticsearchmachine merged 10 commits intoelastic:mainfrom
Conversation
The PR adds support for building roles of cross-cluster API keys. These API keys are different than exsiting API keys in that they do not use the limited-by model. Their roles are only resolvable in the context of cross-cluster access.
| // TODO: emit audit event | ||
| listener.onFailure(e); |
There was a problem hiding this comment.
We have an open issue (#52551) for missing audit logs when ApiKeyService fails to authenticate. It impacts here as well, i.e. when API key type does not match, the authentication terminates, but no audit event is generated. I will fix it separately when working on the plan item for audit logging. More specifically, I plan to emit an authenticationFailure event here when the authentication "terminates". But not when it "continues", which shoud be fixed separately.
| } else if (apiKeyDoc.type != credentials.expectedType) { | ||
| listener.onResponse( | ||
| AuthenticationResult.terminate( | ||
| Strings.format( | ||
| "authentication expected API key type of [%s], but API key [%s] has type [%s]", |
There was a problem hiding this comment.
Since we have found a way to have the type information carried by ApiKeyCredentials instead of Authenticator#Context, I think it is now better to perform the type check in ApiKeyService instead of ApiKeyAuthenticator. A major advantage is that the type check can happen before we validate the API key secret. It's both faster and more secure.
There was a problem hiding this comment.
I think there's a trade-off here: technically this means that anyone with an API key ID (which is public-ish info that ends up all over logs, etc) but no valid credential could find out if an API key is REST or CROSS_CLUSTER simply by trying to authenticate with the given ID and a bogus secret. I don't think the API key type is very sensitive info, and I like the advantage of failing early and preventing authentication attempts, so I think I'm on board. WDYT?
There was a problem hiding this comment.
This is a good point. In theory we should not reveal any information unless authentication is successful. I was thinking to limit cross-cluster API keys as much as possible on the REST interface, i.e. not letting them to trigger hash computation. But that is probably not really necessary since it is probably easier for someone to get a valid API key ID and trigger hash compuation compared to getting a API key ID specific for cross-cluster usage.
Though I did consider the type information to be public, I did suggest prefix the ID with it. However since we are not revealing it at this stage, it is probably still overall better to place the check behind hash verification. It means extra milliseconds needs to be spent on first authentication. But it is cached afterwards and should have very minor overhead.
In summary, I moved the check after hash verification.
...lasticsearch/xpack/remotecluster/AbstractRemoteClusterSecurityWithMultipleRemotesRestIT.java
Show resolved
Hide resolved
|
Pinging @elastic/es-security (Team:Security) |
There was a problem hiding this comment.
LGTM! This came with a v neat and small change set.
I'm pondering the following: before this new restriction, QC-side admins could use routes like GET /_security/_authenticate to get information about the API key they've been given, which could be useful for debugging. This is not possible anymore. I think that's fine, and there was never a requirement for QC admins to be able to do this. Just something that occurred to me. It's an exception to other kinds of credentials where it's possible to run _authenticate and friends.
Also, optional but I don't think we covered secondary authentication tests (sorry if did and I missed it!).
| } | ||
|
|
||
| // Package private for testing | ||
| RoleReferenceIntersection buildRoleReferencesForCrossClusterApiKey() { |
There was a problem hiding this comment.
Nit: since this is an internal method, and separate from generic API keys now, we could return a role reference instead of an intersection.
There was a problem hiding this comment.
Indeed. I was too busy at copy & paste ... Thanks!
| } else if (apiKeyDoc.type != credentials.expectedType) { | ||
| listener.onResponse( | ||
| AuthenticationResult.terminate( | ||
| Strings.format( | ||
| "authentication expected API key type of [%s], but API key [%s] has type [%s]", |
There was a problem hiding this comment.
I think there's a trade-off here: technically this means that anyone with an API key ID (which is public-ish info that ends up all over logs, etc) but no valid credential could find out if an API key is REST or CROSS_CLUSTER simply by trying to authenticate with the given ID and a bogus secret. I don't think the API key type is very sensitive info, and I like the advantage of failing early and preventing authentication attempts, so I think I'm on board. WDYT?
...plugin/security/src/test/java/org/elasticsearch/xpack/security/authc/ApiKeyServiceTests.java
Outdated
Show resolved
Hide resolved
| () -> performRequestWithRemoteSearchUser(new Request("GET", "/wrong_api_key_type:*/_search")) | ||
| ); | ||
| assertThat(e.getResponse().getStatusLine().getStatusCode(), equalTo(401)); | ||
| assertThat(e.getMessage(), containsString("unable to authenticate user ")); |
There was a problem hiding this comment.
Not for this PR but currently, and really more a musing than an actionable suggestion.
Currently, if a REST API key is configured (in place of a cross-cluster one) this is actually not that easy to detect, for a QC-side admin.
All they get are generic 401 messages on CCS (and other remote requests), without the extra context that this is due to the wrong API key type (I think).
This isn't something we introduced with this change (same problem exists for e.g., key not found) and may be kind of tricky to address. So perhaps the best way to deal with this is just having a trouble-shooting section in docs that emphasizes common causes for a 401 here...
Just jotting this down. As I said, not something to worry about now.
There was a problem hiding this comment.
Errr this turns out to be a test bug. Thanks for cathing it!
With #94998, we should report underlying error in the response. The real issue here is that the remote_search user was not created on the querying cluster. It was a native user created in the beginning of testCrossClusterSearch method and I didn't copy it in the new method. So It was actually reporting 401 of the remote_search on the querying cluster. I fixed it by merging it back to testCrossClusterSearch. Now the error response contains message about mismatched API key type. Also these errors are logging on the QC side so that QC admins will be able to see them right after configuring the remote cluster because the handshake would fail for the same error. In summary, I think the error should be pretty visible on the QC side.
| @Nullable BytesReference metadataFlattened | ||
| ) { | ||
| this.docType = docType; | ||
| this.type = type == null ? ApiKey.Type.REST : type; |
There was a problem hiding this comment.
Optional, but wondering if we should trace log here -- could be useful debugging context.
This is a good point. I haven't thought about it before. It's worth a discussion. I raised a JIRA issue which we can evaluate once the main tasks are done.
You are right. I didn't have direct tests for it. It's now added. Thanks! |
|
@elasticmachine run elasticsearch-ci/part-2 |
|
Failure is unrelated I raised #95970 |
|
@elasticmachine run elasticsearch-ci/part-2 |
1 similar comment
|
@elasticmachine run elasticsearch-ci/part-2 |
The PR adds enforcement for API key type at authentication time. Concretely, new cross-cluster API keys (#95714) can only be used on the dedicated remote cluster interface and the existing (rest) API key must not be used for new remote cluster communication. To make cross-cluster API keys actually usable after authentication, the PR also adds support for resolving their roles.