Skip to content

Enforce API key type on authentication#95908

Merged
elasticsearchmachine merged 10 commits intoelastic:mainfrom
ywangd:rcs-api-key-role-building
May 10, 2023
Merged

Enforce API key type on authentication#95908
elasticsearchmachine merged 10 commits intoelastic:mainfrom
ywangd:rcs-api-key-role-building

Conversation

@ywangd
Copy link
Member

@ywangd ywangd commented May 8, 2023

The PR adds enforcement for API key type at authentication time. Concretely, new cross-cluster API keys (#95714) can only be used on the dedicated remote cluster interface and the existing (rest) API key must not be used for new remote cluster communication. To make cross-cluster API keys actually usable after authentication, the PR also adds support for resolving their roles.

The PR adds support for building roles of cross-cluster API keys. These
API keys are different than exsiting API keys in that they do not use
the limited-by model. Their roles are only resolvable in the context of
cross-cluster access.
Comment on lines +63 to 64
// TODO: emit audit event
listener.onFailure(e);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an open issue (#52551) for missing audit logs when ApiKeyService fails to authenticate. It impacts here as well, i.e. when API key type does not match, the authentication terminates, but no audit event is generated. I will fix it separately when working on the plan item for audit logging. More specifically, I plan to emit an authenticationFailure event here when the authentication "terminates". But not when it "continues", which shoud be fixed separately.

Comment on lines +998 to +1002
} else if (apiKeyDoc.type != credentials.expectedType) {
listener.onResponse(
AuthenticationResult.terminate(
Strings.format(
"authentication expected API key type of [%s], but API key [%s] has type [%s]",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have found a way to have the type information carried by ApiKeyCredentials instead of Authenticator#Context, I think it is now better to perform the type check in ApiKeyService instead of ApiKeyAuthenticator. A major advantage is that the type check can happen before we validate the API key secret. It's both faster and more secure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a trade-off here: technically this means that anyone with an API key ID (which is public-ish info that ends up all over logs, etc) but no valid credential could find out if an API key is REST or CROSS_CLUSTER simply by trying to authenticate with the given ID and a bogus secret. I don't think the API key type is very sensitive info, and I like the advantage of failing early and preventing authentication attempts, so I think I'm on board. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. In theory we should not reveal any information unless authentication is successful. I was thinking to limit cross-cluster API keys as much as possible on the REST interface, i.e. not letting them to trigger hash computation. But that is probably not really necessary since it is probably easier for someone to get a valid API key ID and trigger hash compuation compared to getting a API key ID specific for cross-cluster usage.

Though I did consider the type information to be public, I did suggest prefix the ID with it. However since we are not revealing it at this stage, it is probably still overall better to place the check behind hash verification. It means extra milliseconds needs to be spent on first authentication. But it is cached afterwards and should have very minor overhead.

In summary, I moved the check after hash verification.

@ywangd ywangd changed the title Add role building logic for cross-cluster API keys Enforce API key type on authentication May 9, 2023
@ywangd ywangd added >non-issue :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) labels May 9, 2023
@ywangd ywangd marked this pull request as ready for review May 9, 2023 07:00
@ywangd ywangd requested a review from n1v0lg May 9, 2023 07:00
@elasticsearchmachine elasticsearchmachine added the Team:Security Meta label for security team label May 9, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

Copy link
Contributor

@n1v0lg n1v0lg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This came with a v neat and small change set.

I'm pondering the following: before this new restriction, QC-side admins could use routes like GET /_security/_authenticate to get information about the API key they've been given, which could be useful for debugging. This is not possible anymore. I think that's fine, and there was never a requirement for QC admins to be able to do this. Just something that occurred to me. It's an exception to other kinds of credentials where it's possible to run _authenticate and friends.

Also, optional but I don't think we covered secondary authentication tests (sorry if did and I missed it!).

}

// Package private for testing
RoleReferenceIntersection buildRoleReferencesForCrossClusterApiKey() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since this is an internal method, and separate from generic API keys now, we could return a role reference instead of an intersection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I was too busy at copy & paste ... Thanks!

Comment on lines +998 to +1002
} else if (apiKeyDoc.type != credentials.expectedType) {
listener.onResponse(
AuthenticationResult.terminate(
Strings.format(
"authentication expected API key type of [%s], but API key [%s] has type [%s]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a trade-off here: technically this means that anyone with an API key ID (which is public-ish info that ends up all over logs, etc) but no valid credential could find out if an API key is REST or CROSS_CLUSTER simply by trying to authenticate with the given ID and a bogus secret. I don't think the API key type is very sensitive info, and I like the advantage of failing early and preventing authentication attempts, so I think I'm on board. WDYT?

() -> performRequestWithRemoteSearchUser(new Request("GET", "/wrong_api_key_type:*/_search"))
);
assertThat(e.getResponse().getStatusLine().getStatusCode(), equalTo(401));
assertThat(e.getMessage(), containsString("unable to authenticate user "));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR but currently, and really more a musing than an actionable suggestion.

Currently, if a REST API key is configured (in place of a cross-cluster one) this is actually not that easy to detect, for a QC-side admin.

All they get are generic 401 messages on CCS (and other remote requests), without the extra context that this is due to the wrong API key type (I think).

This isn't something we introduced with this change (same problem exists for e.g., key not found) and may be kind of tricky to address. So perhaps the best way to deal with this is just having a trouble-shooting section in docs that emphasizes common causes for a 401 here...

Just jotting this down. As I said, not something to worry about now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errr this turns out to be a test bug. Thanks for cathing it!

With #94998, we should report underlying error in the response. The real issue here is that the remote_search user was not created on the querying cluster. It was a native user created in the beginning of testCrossClusterSearch method and I didn't copy it in the new method. So It was actually reporting 401 of the remote_search on the querying cluster. I fixed it by merging it back to testCrossClusterSearch. Now the error response contains message about mismatched API key type. Also these errors are logging on the QC side so that QC admins will be able to see them right after configuring the remote cluster because the handshake would fail for the same error. In summary, I think the error should be pretty visible on the QC side.

@Nullable BytesReference metadataFlattened
) {
this.docType = docType;
this.type = type == null ? ApiKey.Type.REST : type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, but wondering if we should trace log here -- could be useful debugging context.

@ywangd
Copy link
Member Author

ywangd commented May 10, 2023

before this new restriction, QC-side admins could use routes like GET /_security/_authenticate to get information about the API key they've been given, which could be useful for debugging. This is not possible anymore. I think that's fine, and there was never a requirement for QC admins to be able to do this. Just something that occurred to me. It's an exception to other kinds of credentials where it's possible to run _authenticate and friends.

This is a good point. I haven't thought about it before. It's worth a discussion. I raised a JIRA issue which we can evaluate once the main tasks are done.

I don't think we covered secondary authentication tests

You are right. I didn't have direct tests for it. It's now added. Thanks!

@ywangd ywangd added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label May 10, 2023
@ywangd
Copy link
Member Author

ywangd commented May 10, 2023

@elasticmachine run elasticsearch-ci/part-2

@ywangd
Copy link
Member Author

ywangd commented May 10, 2023

Failure is unrelated I raised #95970

@ywangd
Copy link
Member Author

ywangd commented May 10, 2023

@elasticmachine run elasticsearch-ci/part-2

1 similar comment
@ywangd
Copy link
Member Author

ywangd commented May 10, 2023

@elasticmachine run elasticsearch-ci/part-2

@elasticsearchmachine elasticsearchmachine merged commit 86f3716 into elastic:main May 10, 2023
@ywangd ywangd deleted the rcs-api-key-role-building branch May 10, 2023 04:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) Team:Security Meta label for security team v8.9.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants