Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential UX improvement for SSH tunnel configuration #513

Open
willdonnelly opened this issue Jun 4, 2022 · 5 comments
Open

Potential UX improvement for SSH tunnel configuration #513

willdonnelly opened this issue Jun 4, 2022 · 5 comments

Comments

@willdonnelly
Copy link
Member

willdonnelly commented Jun 4, 2022

I think that the configuration process for SSH tunneling can be made a lot better without a lot of difficulty.

Ideally the user shouldn't need to pick an arbitrary local port, and they shouldn't need to remember to point the main connector config at 127.0.0.1 and then put the real destination IP/hostname in the SSH forwarding config. The reason this is tricky is because the underlying connector's configuration is treated as a black box, and it's left entirely up to the user to point the connector at the SSH tunnel.

I think this can be improved significantly if we're willing to be a little bit Opinionated about how connectors represent their target address configuration. A proposal:

  • The 'Connector Config Guidelines' should strongly recommend representing host/port configuration as a top-level /address property with a value like "host:port". This seems desirable in general for consistency across connectors, but embracing a specific name also makes the next part very simple.

  • When asked to do actual work (validate/discover/read/etc), the connector proxy inspects the config. If /networkTunnel/sshForwarding/{forwardHost,forwardPort,localPort} are all unset and an /address property is present it:

    1. Uses the host:port information from the /address field to fill out its forwardHost and forwardPort information.
    2. Randomly picks a value for localPort from of some reasonable range.
    3. Replaces /address with "127.0.0.1:<localPort>" before giving the configuration to the underling connector.

Finally, when asked for a config spec the connector proxy should do something intelligent depending on whether an /address property is described in the underlying connector's config spec. I'm not sure what would be best here, but options include:

  • Omit the 'Forward Host', 'Forward Port', and 'Local Port' properties if /address exists.
  • Alternatively, still include those properties but somehow mark them as more advanced than the rest.
  • Or the most extreme option: just omit the SSH forwarding config entirely if no /address property is present, since SaaS connectors which don't take a target host/port can't benefit from SSH forwarding in the first place.

The main thing that makes me uncomfortable about this design is the reliance on a hard-coded property name /address, but I think that this satisfies the main goals of providing a better UX while being very straightforward to implement, and it wasn't clear to me that doing something clever with JSON schema annotations to indicate "This here is the address property" would actually buy us anything.

Since the logic would still accept manually specified forwardHost / forwardPort / localPort settings this change would be mostly backwards-compatible, except that materialize-postgres would need to be modified to replace the current host and port config properties with a combined "Address". But that's probably a good idea even without this proposal, just for consistency with source-mysql and source-postgres.

@willdonnelly
Copy link
Member Author

willdonnelly commented Jun 21, 2022

Also worth noting that the "Host/Port to Address" change to materialize-postgres can be done in a backwards-compatible way by adding the /address property, removing the host/port properties from the config specification but not the config struct, and a little bit of logic so it'll use whichever mechanism is present. So it should be possible to do this without any breaking changes to existing captures or materializations.

@mdibaiee
Copy link
Member

Okay, since #563 is implemented, I suppose the remaining part for us here is to avoid showing Network Tunnel config for connectors that do not expose the address property.

@willdonnelly
Copy link
Member Author

Okay, since #563 is implemented, I suppose the remaining part for us here is to avoid showing Network Tunnel config for connectors that do not expose the address property.

Pretty much, yeah. It should be trivial to make that change when we're ready, but before doing that I still need to update materialize-postgres to have the address property, and I want to take another look at all other connectors to make sure there aren't any others that can meaningfully benefit from SSH tunneling.

For instance, the "Amazon S3" capture has an "AWS Endpoint" property (which is presumably an AWS Service Endpoint URL), and it seems like it could theoretically be useful to have the capability to expose a self-hosted S3 clone over an SSH tunnel, but that may be more effort than it's worth at this time.

@mdibaiee
Copy link
Member

mdibaiee commented Jul 28, 2022

@willdonnelly one thing I just noticed is that, given the address field and our automatic integration with it, does it mean that the port is now a required part of the address and connectors should not put a default on it?

e.g. what happens if I create a postgres materialization with address: localhost without a port?

@willdonnelly
Copy link
Member Author

willdonnelly commented Jul 28, 2022

That's a bit of a grey area. Currently source-postgres and source-mysql will try to be helpful and use the appropriate default port if you just specify a hostname, but obviously the SSH tunneling code doesn't currently have any mechanism to know what the default port number for a particular connector ought to be so this only applies to direct connections.

And I actually think that's a reasonable solution for the moment, and I think if we want to make things more consistent between SSH-forwarding and direct connections the way to go is figuring out how to give the connector proxy information about what the default port number should be if address is just a bare hostname.

So my recommendation would be that materialize-postgres should support a bare hostname like address: db.example.com by appending the appropriate port number (assuming that the existing logic doesn't already have some code for the default port inside the database client).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants