Skip to content

Conversation

simonswine
Copy link
Contributor

@simonswine simonswine commented Aug 13, 2020

What this PR does:

Fixes configuration for TLS server validation, TLS skip verify was enabled for all TLS configurations and prevented validation of server certificates. This change adds a new flag for skipping validation which defaults to false.

Which issue(s) this PR fixes:

Fixes #3029

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@simonswine simonswine marked this pull request as draft August 13, 2020 09:50
@simonswine simonswine force-pushed the fix-tls-validation branch 4 times, most recently from f997530 to 7796cc9 Compare August 13, 2020 16:15
@simonswine simonswine marked this pull request as ready for review August 13, 2020 16:35
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid work, thanks! You found the issue, fixed, tested, code changes are clean. Good attitude and good job 👏

I left a couple of nits and questions. Then we can 🚀

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers:

To change the label in the container context, you can add either of two suffixes :z or :Z to the volume mount. These suffixes tell Docker to relabel file objects on the shared volumes. The z option tells Docker that two containers share the volume content. As a result, Docker labels the content with a shared content label. Shared volume labels allow all containers to read/write content. The Z option tells Docker to label the content with a private unshared label. Only the current container can use a private volume.

From: https://docs.docker.com/engine/reference/commandline/run/#mount-volumes-from-container---volumes-from

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth to also check for && !cfg.InsecureSkipVerify && cfg.ServerName == "" or we could just assume that if no cert file has been configured then TLS is disabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that check is there for the case when someone what's to connect via TLS, but is not using client certificate auth.

I think it is still valid to skipVerify all together or make use of the ServerName flag to require a certain hostname on the server ceritficate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that check is there for the case when someone what's to connect via TLS, but is not using client certificate auth.

Is this case supported? Have you ever tested it? Also, following the case you describe, looks there's no way to use it unless disabling skip verify or setting the server name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be clearer to me to have special flag to enable TLS. I'm not sure if just setting InsecureSkipVerify should enable TLS. WDYT?

For example, it should be enough to just enable TLS via this flag, without setting anything else: in that case default set of CA certificates is used, with no client authentication. I think that should be a valid option, but right now that's not possible.

Copy link
Contributor

@pracucci pracucci Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a private conversation with Christian about this. I agree with you Peter, but I believe it's out of the scope of this PR honestly. This PR was supposed a bug fix and it's turning out to be a feature / change as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're all on the same page, but with different takes on what to do. Let's try to come up with action items.

Some facts:

  1. The version in master requires certs to enable TLS
  2. This PR was initially meant to be a bugfix
  3. This PR is moving forward improving TLS, allowing to use TLS for "1) Client verifies server" only use case, but it's a bit buggy because you can't enable use case (1) without either disabling verification or setting server name

The question we have to answer is:

  • Do we want this PR to be a bugfix or a mix of bugfix + additional feature?

My general suggestion is to keep separation of concerns so, if it's to me, I would:

  • Keep this PR as bug fix (no logic change except for the skip verification)
  • Open a new PR with the TLS improvements

If you prefer to do both in this PR I'm OK as well, but I believe we should fix the config issue. I guess we should add an explicit option to enable TLS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason this became an "enhancement" as well, was that the integration tests only work if you either skip-verify or provide a server-name. Otherwise there will be a validation error, as we are connecting to unpredictable hosts IP addresses, which we get from consul (I assume).

If we want to keep it as minimal as possible we probably should set skip-verify to true in the integration tests and then iterate on the whole TLS implementation.

My feeling is we need to treat GRPC and HTTP client configuration slightly different:

If you enable TLS with HTTP you are required to use https:// in the URL explicitly. For GRPC we default to tls-insecure-skip-verify being true, through this

return []grpc.DialOption{grpc.WithInsecure()}, nil
. If we remove that check we would require valid certificates for everyone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep it as minimal as possible we probably should set skip-verify to true in the integration tests and then iterate on the whole TLS implementation.

Setting skip-verify in integration tests sounds reasonable to me.

If you enable TLS with HTTP you are required to use https:// in the URL explicitly.

Right. It's not so much about enabling rather than configuring TLS in this case.

For GRPC we default to tls-insecure-skip-verify being true, through this

Whether TLS is used for gRPC or not should be specified for each gRPC config, and default to false (and when using grpc.WithInsecure()). It looks like it should not even be in tls.ClientConfig config then. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To move further with this PR, I believe Marco's suggestion is to do

Suggested change
if cfg.CertPath == "" && cfg.KeyPath == "" && cfg.CAPath == "" && !cfg.InsecureSkipVerify && cfg.ServerName == "" {
if cfg.CertPath == "" && cfg.KeyPath == "" && cfg.CAPath == "" {

i.e. to enable TLS only if certificates are configured. That is compatible with behavior on master, although not as clear as explicit flag, but we can do the flag in other PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will split out the bugfix from the enhancement of allowing a ServerName specified.

Without having the ability to configure a expected ServerName, we will not have a feasible way of running running cortex without skipping verify (as explained in my comment before), so it should remain a priority sorting that out. We will need to think a bit more how to expose that flag for GRPC clients. Another big focus needs to be testing TLS. The current coverage is quite minimal and will even be more limited by skipping verify.

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good PR, thanks for fixing hardcoded skip verify – that's clearly bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be clearer to me to have special flag to enable TLS. I'm not sure if just setting InsecureSkipVerify should enable TLS. WDYT?

For example, it should be enough to just enable TLS via this flag, without setting anything else: in that case default set of CA certificates is used, with no client authentication. I think that should be a valid option, but right now that's not possible.

@simonswine simonswine force-pushed the fix-tls-validation branch 3 times, most recently from fd88adb to 0fbdab3 Compare August 17, 2020 09:11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f.StringVar(&cfg.CAPath, prefix+".tls-ca-path", "", "Patht to the CA certificates file to validate server certificate against. If not set, the host's root CA certificates are used.")
f.StringVar(&cfg.CAPath, prefix+".tls-ca-path", "", "Path to the CA certificates file to validate server certificate against. If not set, the host's root CA certificates are used.")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep it as minimal as possible we probably should set skip-verify to true in the integration tests and then iterate on the whole TLS implementation.

Setting skip-verify in integration tests sounds reasonable to me.

If you enable TLS with HTTP you are required to use https:// in the URL explicitly.

Right. It's not so much about enabling rather than configuring TLS in this case.

For GRPC we default to tls-insecure-skip-verify being true, through this

Whether TLS is used for gRPC or not should be specified for each gRPC config, and default to false (and when using grpc.WithInsecure()). It looks like it should not even be in tls.ClientConfig config then. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To move further with this PR, I believe Marco's suggestion is to do

Suggested change
if cfg.CertPath == "" && cfg.KeyPath == "" && cfg.CAPath == "" && !cfg.InsecureSkipVerify && cfg.ServerName == "" {
if cfg.CertPath == "" && cfg.KeyPath == "" && cfg.CAPath == "" {

i.e. to enable TLS only if certificates are configured. That is compatible with behavior on master, although not as clear as explicit flag, but we can do the flag in other PR.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @simonswine for your hard work here and to patiently address my feedback. I know the discussion went longer then initially planned, but I think we're super close to merge it. I left one last comment.

@simonswine simonswine force-pushed the fix-tls-validation branch 2 times, most recently from f9bacbf to 0582424 Compare August 19, 2020 15:45
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Would you mind opening a GitHub issue with your suggestions to improve TLS support, not necessarily requiring certs? I think the enhancements you did propose here were very good, but we need to add a config option to explicitly enable TLS.

@simonswine
Copy link
Contributor Author

Yes will take a look at this tomorrow

@simonswine
Copy link
Contributor Author

LGTM, thanks!

Thank you both for the reviews 👍

Fixes configuration for TLS server validation, TLS skip verify was
enabled for all TLS configurations and prevented validation of server
certificates.

Signed-off-by: Christian Simon <[email protected]>
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This removes the bash script for CA and certificate generation and
replaces it with some golang library. This allows more control over the
certificates generated. The tls-server-name flags are now used to ensure
we are talking to a server with an appropriate certificate.

Signed-off-by: Christian Simon <[email protected]>
The 'z' option tells Docker that the volume content will be shared
between containers. Otherwise SELinux prevents file access between the
containers.

Signed-off-by: Christian Simon <[email protected]>
@pstibrany pstibrany merged commit 2bda7b9 into cortexproject:master Aug 20, 2020
simonswine added a commit to grafana/e2e that referenced this pull request Jan 13, 2022
…oject/cortex#3030)

* Ensure TLS server certificate validation is enabled

Fixes configuration for TLS server validation, TLS skip verify was
enabled for all TLS configurations and prevented validation of server
certificates.

Signed-off-by: Christian Simon <[email protected]>

* Improve integration tests

This removes the bash script for CA and certificate generation and
replaces it with some golang library. This allows more control over the
certificates generated. The tls-server-name flags are now used to ensure
we are talking to a server with an appropriate certificate.

Signed-off-by: Christian Simon <[email protected]>

* Mount volumes during integration tests with `z`

The 'z' option tells Docker that the volume content will be shared
between containers. Otherwise SELinux prevents file access between the
containers.

Signed-off-by: Christian Simon <[email protected]>

* Provide a better error if cert or key is missing

Signed-off-by: Christian Simon <[email protected]>

* Add comments where x509 material is sourced from

Signed-off-by: Christian Simon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TLS verify is skipped by default

3 participants