Backwards-compatible support for @cert-authority #9

Merged
evanelias merged 3 commits into main from certs-backwards-compat
Jul 12, 2024

evanelias (Contributor) commented Jul 8, 2024

This pull request adds support for @cert-authority lines in known_hosts files in an optional, backwards-compatible manner. Fixes #7.

With this PR's changes, users can switch from knownhosts.New() to the newly introduced knownhosts.NewDB() in order to opt in to the @cert-authority support. NewDB returns a new HostKeyDB struct instead of a HostKeyCallback. It has a very slight performance penalty because it re-reads the known_hosts file, and its HostKeys() method has a different return type, but otherwise it should be a drop-in replacement for users who require CA support. This makes sense for use cases such as general-purpose SSH clients.

Git-specific SSH use cases can likely stay on knownhosts.New(), since public Git forges (GitHub, GitLab, etc.) don't seem to use CAs for host keys anyway. [Edit: not necessarily true, see discussion in comments below.]

All previous HostKeyCallback logic remains backwards-compatible and avoids any functionality changes. [Edit to clarify: When using knownhosts.New or knownhosts.HostKeyCallback directly, there is no CA support. To get the CA support, your calling code must switch to using knownhosts.NewDB instead.]

Implementation

This includes @Javier-varez's commit from #8 as-is, and then adds an additional commit on top to adjust the following:

  • Move the CA support to a new HostKeyDB struct, making it opt-in. This avoids changing the function signature of the HostKeyCallback.HostKeys() method, which retains backwards compatibility and avoids a v2 version bump for the module.

  • When re-reading the known_hosts files to implement the CA support, NewDB reads each file only once, at constructor time instead of in the callback. It reads using buffered IO similar to x/crypto/ssh/knownhosts, which should ensure its line-counting behavior matches.

  • Add test coverage for all new behaviors and @cert-authority logic.
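The single-pass, buffered re-read described in the second bullet can be pictured with a small standalone sketch. This is not the package's code: caLines and caLinesInString are hypothetical helpers, the key material is a placeholder, and the sketch only records which 1-based line numbers start with the @cert-authority marker, using bufio.Scanner from the standard library.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// caLines is a hypothetical helper: it scans known_hosts content once and
// returns the 1-based numbers of lines whose first field is "@cert-authority".
func caLines(r io.Reader) ([]int, error) {
	var lines []int
	scanner := bufio.NewScanner(r)
	lineNum := 0
	for scanner.Scan() {
		lineNum++ // count every line, including blanks and comments
		fields := strings.Fields(scanner.Text())
		if len(fields) > 0 && fields[0] == "@cert-authority" {
			lines = append(lines, lineNum)
		}
	}
	return lines, scanner.Err()
}

// caLinesInString is a convenience wrapper for in-memory content.
func caLinesInString(content string) ([]int, error) {
	return caLines(strings.NewReader(content))
}

func main() {
	// Placeholder key material, for illustration only.
	content := "# comment\n" +
		"example.com ssh-ed25519 AAAAexample\n" +
		"\n" +
		"@cert-authority *.example.com ssh-ed25519 AAAAexample\n"
	lines, err := caLinesInString(content)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines)
}
```

Counting every scanned line, including blank lines and comments, is what keeps the recorded numbers aligned with line numbers reported by another parser reading the same file.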

Alternatives considered

Conceptually the information on @cert-authority lines needs to be tracked somewhere, but the difficulty with the previous design is that New() returns a HostKeyCallback which is just a function, rather than a struct. So the chosen solution here leaves that type as-is, and instead introduces a separate new struct which supports adding more fields.

Instead of introducing a separate new struct, an alternative approach would have been to use a non-exported package global of the form map[*HostKeyCallback]CertInfo, in order to track additional information on each HostKeyCallback. This would result in simpler user-facing logic, however it would then require a separate function to "de-register" a callback to avoid a memory leak. Overall that seems hackier, and less extensible if additional metadata fields are needed in the future.
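To make the trade-off concrete, here is a simplified sketch of both designs using stand-in types. The names callback, certInfo, registry, and hostKeyDB are hypothetical and do not mirror the package's real definitions. The registry is keyed by pointer, as in the map[*HostKeyCallback]CertInfo form above, because Go func values cannot be used as map keys.

```go
package main

import "fmt"

// callback stands in for a function-typed host key callback.
type callback func(host string) error

// certInfo stands in for the extra per-callback metadata.
type certInfo struct {
	caLines map[int]bool
}

// Design A (the rejected alternative): a package-level registry keyed by
// callback pointer. Entries live until explicitly removed.
var registry = map[*callback]certInfo{}

func register(cb *callback, info certInfo) {
	registry[cb] = info
}

func deregister(cb *callback) {
	delete(registry, cb)
}

// Design B (the struct approach): the metadata travels with the callback,
// so it is garbage collected together with the struct.
type hostKeyDB struct {
	cb   callback
	info certInfo
}

func main() {
	cb := callback(func(host string) error { return nil })

	register(&cb, certInfo{caLines: map[int]bool{3: true}})
	fmt.Println("registry entries:", len(registry))
	deregister(&cb) // forgetting this call would leak the entry
	fmt.Println("registry entries after de-register:", len(registry))

	db := &hostKeyDB{cb: cb, info: certInfo{caLines: map[int]bool{3: true}}}
	fmt.Println("struct carries metadata:", db.info.caLines[3])
}
```

In this sketch the struct needs no cleanup call and can grow new fields later, which matches the extensibility argument made above.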

Testing and feedback

The PR includes unit test coverage, but it could use some further real-world testing to ensure it properly solves #7. I will keep this PR open a few days, and hugely appreciate any community feedback.

cc @lonnywong @abakum @Javier-varez

Javier-varez and others added 2 commits July 4, 2024 18:46
The previous commit d314bf3 added support for @cert-authority lines, but
technically broke backwards compatibility due to changing the return type of
one exported method. This commit adjusts that previous commit's new logic to
restore backwards compatibility, and makes additional changes as follows:

* Introduce new exported type HostKeyDB, which handles @cert-authority lines
  correctly and is returned by NewDB; old exported type HostKeyCallback (which
  is returned by New) omits that handling. Git-specific use-cases can likely
  remain with using New, since Git forges typically don't support CAs. Non-Git
  use-cases, such as general-purpose SSH clients, should consider switching to
  NewDB to get the CA logic.

* When NewDB re-reads the known_hosts files to implement the CA support, it
  only re-reads each file a single time (vs potentially multiple times at
  callback execution time in d314bf3), and it reads using buffered IO similar
  to x/crypto/ssh/knownhosts.

* This package's PublicKey struct now exports its Cert boolean field, vs
  keeping it private in d314bf3.

* Refactor the RSA-to-algo expansion logic to simplify its handling in the CA
  situation.

* Add test coverage for all new behaviors and @cert-authority logic.

coveralls-official bot commented Jul 8, 2024

Coverage Status

coverage: 92.857% (+0.3%) from 92.593%
when pulling 53a26cc on certs-backwards-compat
into 5832aa8 on main.

This was referenced Jul 8, 2024
@Javier-varez
Contributor

Hi @evanelias,

Thank you so much for taking the initiative to improve my original proposal. I do think this is a better choice going forward considering the number of users that depend on this repository. Making the functionality opt-in seems sensible to me.

Curiously enough, I do need this functionality for a git ssh backend that uses certificate authorities. We have an internal tool that was recently migrated from plain git subcommand calls to https://github.com/go-git/go-git, and that's when we first encountered this issue. If we go this route, I will then propose that the go-git maintainers either support this functionality or provide some sort of toggle so that it can be enabled in third-party tools. Unfortunately for us, this problem was hidden a couple of dependencies down the tree.

I will test this during the next couple of days and give you some feedback if it all seems to work as intended, but on first sight, it looks good to me.

Instead of introducing a separate new struct, an alternative approach would have been to use a non-exported package global of the form map[*HostKeyCallback]CertInfo, in order to track additional information on each HostKeyCallback. This would result in simpler user-facing logic, however it would then require a separate function to "de-register" a callback to avoid a memory leak. Overall that seems hackier, and less extensible if additional metadata fields are needed in the future.

I agree with you. I tend to really dislike globals for many reasons, like concurrency issues (to avoid concurrent mutations you'd probably introduce an undesired lock) and just the plain fact that over the runtime of a program, the host key callbacks can also change, so keeping them around doesn't sound like the best idea. I like this alternative best.

@abakum

abakum commented Jul 9, 2024

Is it correct that the result returned from func (hkdb *knownhosts.HostKeyDB) HostKeyCallback() is ssh.HostKeyCallback,
but the result from func knownhosts.New(files ...string) is knownhosts.HostKeyCallback?

@evanelias
Contributor Author

@Javier-varez

Curiously enough, I do need this functionality for a git ssh backend that uses certificate authorities.

Oh interesting, I didn't realize anyone used CAs for Git use-cases. I'll revise the package and method doc comments later today to remove the stuff about Git use-cases being fine to remain on the old New constructor.

I will then propose that the go-git maintainers either support this functionality or provide some sort of toggle so that it can be enabled in third-party tools

Sounds good. Probably the best path would be to just have go-git switch to using NewDB always. Making it an optional toggle in go-git could be messy, since NewDB and New return different types.

Just to be safe, I'll add a method to make this easier, allowing conversion from knownhosts.HostKeyCallback to a non-CA-supporting knownhosts.HostKeyDB for this situation. But hopefully callers won't actually need it.

I will test this during the next couple of days and give you some feedback if it all seems to work as intended

Thank you, that sounds great.

@evanelias
Contributor Author

Is it correct that the result returned from func (hkdb *knownhosts.HostKeyDB) HostKeyCallback() is ssh.HostKeyCallback,
but the result from func knownhosts.New(files ...string) is knownhosts.HostKeyCallback?

@abakum yes, that is correct and intentional. After switching to knownhosts.NewDB / knownhosts.HostKeyDB, you should no longer need to use knownhosts.New or knownhosts.HostKeyCallback at all for anything. I can try to improve the method doc string comments further if that isn't clear currently?

* Add new exported method HostKeyCallback.ToDB, to provide a mechanism for
  callers who want to conditionally enable or disable CA support, while still
  using a *HostKeyDB for both cases.

* Clarify many doc string comments.

* Add new exported function WriteKnownHostCA for writing a @cert-authority
  line to a known_hosts file. Previously this logic was in a test helper, but
  it could be useful to others, so let's export it outside of the tests.
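
The conversion idea in the first bullet can be sketched with stand-in types. Everything here (callback, db, toDB, newCallback, newDB, open) is a hypothetical simplification for illustration, not the package's real API: the point is only that a caller can toggle CA support while holding the same pointer type in both cases.

```go
package main

import "fmt"

// callback is a stand-in for a plain host key callback without CA support.
type callback func(host string) error

// db is a stand-in for a struct-based host key database.
type db struct {
	cb        callback
	caSupport bool
}

// toDB wraps an existing callback in a db that has no CA support.
func (cb callback) toDB() *db {
	return &db{cb: cb, caSupport: false}
}

// newCallback and newDB are hypothetical constructors for the two paths.
func newCallback() callback {
	return func(host string) error { return nil }
}

func newDB() *db {
	return &db{cb: newCallback(), caSupport: true}
}

// open returns the same *db type whether or not CA support is wanted.
func open(enableCA bool) *db {
	if enableCA {
		return newDB()
	}
	return newCallback().toDB()
}

func main() {
	fmt.Println(open(true).caSupport, open(false).caSupport)
}
```

Downstream code then only ever handles one type, which avoids branching on two different return types at every call site.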
@abakum

abakum commented Jul 9, 2024

Thanks, evanelias, it works for me!

@abakum

abakum commented Jul 10, 2024

@evanelias, if known_hosts contains a line like

@cert-authority * SHA256:HGzeMguvVfTsMb+WfkqmjZNXaeVcBXCQqXyjKUBy9pA

then kh.HostKeyAlgorithms("127.0.0.1:22") returns [[email protected]]
but kh.HostKeyAlgorithms("127.0.0.1:2222") returns [],
whereas ssh from OpenSSH works fine.
How should this be handled?

@evanelias
Contributor Author

@abakum Interesting catch, thanks. But the core host-matching logic is still handled by x/crypto/ssh/knownhosts, we don't re-implement or change that here.

The match logic in x/crypto/ssh/knownhosts appears to only apply wildcards like * to the hostname portion, not the port, per https://cs.opensource.google/go/x/crypto/+/refs/tags/v0.24.0:ssh/knownhosts/knownhosts.go;l=110

return wildcardMatch([]byte(p.addr.host), []byte(a.host)) && p.addr.port == a.port

So in order to match 127.0.0.1:2222, the known_hosts entry would need to specifically be [*]:2222.
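
The effect of that rule can be reproduced with a standard-library-only sketch. This is an approximation, not the x/crypto code: splitAddr and matches are hypothetical helpers, path.Match stands in for the library's own wildcardMatch (it accepts more pattern syntax), and the default port of 22 for a bare pattern is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"net"
	"path"
)

// splitAddr is a hypothetical helper: it splits a known_hosts host pattern or
// a dialed address into host and port, assuming port 22 when none is given
// (as with a bare "*").
func splitAddr(s string) (host, port string) {
	h, p, err := net.SplitHostPort(s)
	if err != nil {
		return s, "22"
	}
	return h, p
}

// matches mimics the quoted rule: the wildcard applies to the host part only,
// and the port must be exactly equal.
func matches(pattern, addr string) bool {
	patternHost, patternPort := splitAddr(pattern)
	addrHost, addrPort := splitAddr(addr)
	ok, err := path.Match(patternHost, addrHost)
	return err == nil && ok && patternPort == addrPort
}

func main() {
	fmt.Println(matches("*", "127.0.0.1:22"))          // true
	fmt.Println(matches("*", "127.0.0.1:2222"))        // false
	fmt.Println(matches("[*]:2222", "127.0.0.1:2222")) // true
}
```

Under these assumptions a bare * only ever covers port 22, which is consistent with the empty result reported for 127.0.0.1:2222.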

I just searched the issue tracker for https://github.com/golang/go and found golang/go#52056 which seems to describe the root of the problem. I've commented there now too.

@abakum

abakum commented Jul 10, 2024

@evanelias
Contributor Author

On second thought, it might be possible to build a work-around here in skeema/knownhosts for that problem with wildcards and non-standard ports. The callback logic I'm envisioning will need to be nested and tricky, so I'm not 100% certain this is feasible, but I can give this a try sometime in the next few days.

@abakum

This comment was marked as outdated.

@Javier-varez
Contributor

@evanelias

Just to be safe, I'll add a method to make this easier, allowing conversion from knownhosts.HostKeyCallback to a non-CA-supporting knownhosts.HostKeyDB for this situation. But hopefully callers won't actually need it.

This is a good point, thank you. I drafted here what it would look like to use the knownhosts DB. Javier-varez/go-git@0879ef1

I will test this during the next couple of days and give you some feedback if it all seems to work as intended

Thank you, that sounds great.

On this front, I tested the change, and it works great for my use case. Thank you!

@abakum

abakum commented Jul 11, 2024

I'm trying to re-check https://github.com/abakum/knownhosts/blob/0280d4dc9533ee3f92de6b57a5efb45087cbf3e0/cmd/main.go

  1. First run with empty known_hosts
PS Y:\src\knownhosts\cmd> go run .
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:22 to known_hosts
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:222 to known_hosts
  2. known_hosts now
10.161.115.189 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
[10.161.115.189]:222 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  3. Edit known_hosts to:
* ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  4. run ssh -v [email protected]
...
debug1: Host '10.161.115.189' is known and matches the ECDSA host key.
debug1: Found key in C:\\Users\\user/.ssh/known_hosts:1
...
  5. run ssh -v [email protected] -p 222
...
debug1: Host '[10.161.115.189]:222' is known and matches the ECDSA host key.
debug1: Found key in C:\\Users\\user/.ssh/known_hosts:1...
  6. Second run with * in known_hosts
PS Y:\src\knownhosts\cmd> go run .
main.go:95: [ecdsa-sha2-nistp256]
main.go:145: [ecdsa-sha2-nistp256]
main.go:117: innerCallback(hostname, remote, key) <nil>
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) false
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:222 to known_hosts

@evanelias
Contributor Author

@Javier-varez awesome, thank you! re: proposed go-git changes, makes sense and looks good. Might need to leave the old NewKnownHostsCallback in place too though as deprecated, since it's exported and could be used by third-party code, unless their versioning policy isn't strict about this. Inside go-git itself that function is also called from a unit test -- see plumbing/transport/ssh/auth_method_test.go

@abakum can you please summarize the finding? Is it just confirming that * wildcards are not working with non-standard ports, or something else? Thanks!

@abakum

abakum commented Jul 11, 2024

Yes, it does not work in Go, but it works in OpenSSH.

@evanelias
Contributor Author

Thank you both again for the testing assistance! I'm going to merge this momentarily.

Early next week, I'll do a separate branch / pull request with a fix for the wildcards on non-standard port issue. And then once that one looks good and gets merged too, I'll tag a new release.

@evanelias evanelias merged commit 7c797a4 into main Jul 12, 2024
4 checks passed
@evanelias evanelias deleted the certs-backwards-compat branch July 12, 2024 21:32
@abakum

abakum commented Jul 13, 2024

I'm trying to re-check https://github.com/abakum/knownhosts/blob/0280d4dc9533ee3f92de6b57a5efb45087cbf3e0/cmd/main.go

  1. First run with empty known_hosts
PS Y:\src\knownhosts\cmd> go run .
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:22 to known_hosts
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:222 to known_hosts
  2. known_hosts now
10.161.115.189 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
[10.161.115.189]:222 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  3. Edit known_hosts to:
* ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  4. run ssh -v [email protected]
...
debug1: Host '10.161.115.189' is known and matches the ECDSA host key.
debug1: Found key in C:\\Users\\user/.ssh/known_hosts:1
...
  5. run ssh -v [email protected] -p 222
...
debug1: Host '[10.161.115.189]:222' is known and matches the ECDSA host key.
debug1: Found key in C:\\Users\\user/.ssh/known_hosts:1...
  6. Second run with * in known_hosts
PS Y:\src\knownhosts\cmd> go run .
main.go:95: [ecdsa-sha2-nistp256]
main.go:145: [ecdsa-sha2-nistp256]
main.go:117: innerCallback(hostname, remote, key) <nil>
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) false
main.go:95: []
main.go:98: Failed to dial:  ssh: handshake failed: knownhosts: key is unknown
main.go:145: []
main.go:117: innerCallback(hostname, remote, key) knownhosts: key is unknown
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) true
main.go:129: Added host 10.161.115.189:222 to known_hosts
  7. known_hosts now
* ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
[10.161.115.189]:222 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  8. Edit known_hosts to:
* ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBARrpBl177hs/ykMnXHfkjmyKTbsax/vtSl+rInZvJoF8LfJaWCZSrai0uD5qRuYhy4QnJs563NBTmCgSBhm/MA=
  9. After the fix from https://github.com/skeema/knownhosts/tree/fix-wildcards-port-match, the run with * in known_hosts works well:
PS Y:\src\knownhosts\cmd> go run .
main.go:95: [ecdsa-sha2-nistp256]
main.go:145: [ecdsa-sha2-nistp256]
main.go:117: innerCallback(hostname, remote, key) <nil>
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) false
main.go:95: [ecdsa-sha2-nistp256]
main.go:145: [ecdsa-sha2-nistp256]
main.go:117: innerCallback(hostname, remote, key) <nil>
main.go:118: knownhosts.IsHostKeyChanged(err) false
main.go:119: knownhosts.IsHostUnknown(err) false

Thanks, @evanelias !

abakum referenced this pull request Jul 13, 2024
*** This is a work-in-progress commit, which will be amended/rewritten ***
@evanelias
Contributor Author

Just opened #10 for the wildcard host matching fix, along with some additional documentation/README tweaks.

@evanelias
Contributor Author

I've just tagged v1.3.0 which includes this PR, as well as the wildcard matching fix in #10. Thanks again @Javier-varez and @abakum for all your assistance here!

Javier-varez added a commit to Javier-varez/go-git that referenced this pull request Jul 24, 2024
skeema/knownhosts v1.3.0 introduced a HostKeyDB type that extends the HostKeyCallback functionality
to support @cert-authority algorithms.

`known_hosts` files may contain lines with @cert-authority markers to indicate that a line corresponds
to a certificate instead of a key. If a git remote uses cert authorities as the preferred host
identification mechanism, the functionality added in skeema/knownhosts v1.3.0 is needed so that go-git
can interact with this remote.

See skeema/knownhosts#9 for details.
Successfully merging this pull request may close these issues.

Certificate in known_hosts