Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch core packages to use connect. #6159

Merged
merged 6 commits into from
Apr 11, 2023
Merged

Conversation

absoludity
Copy link
Contributor

@absoludity absoludity commented Mar 31, 2023

Description of the change

This PR switches the core packages API to be using the connect gRPC for gRPC and gRPC-Web. The authz token is then set in the context when calling the (currently untransitioned) plugin implementations.

Benefits

Step 2 of 4 done.

Possible drawbacks

Applicable issues

Additional information

Still need to send auth from headers to context.

Signed-off-by: Michael Nelson <[email protected]>
@netlify
Copy link

netlify bot commented Mar 31, 2023

Deploy Preview for kubeapps-dev canceled.

Name Link
🔨 Latest commit bb3acdb
🔍 Latest deploy log https://app.netlify.com/sites/kubeapps-dev/deploys/6434f8bc8000c80008f75a46

@absoludity
Copy link
Contributor Author

So this is currently failing because of a difference in the way errors are handled by the grpc frontend libraries. We had an issue where the initial request to get resources for an installed app has no way of telling, for carvel and flux, if a 404 means the app itself does not exist, or the app has not yet been realised. This is no longer handled and requires a change in the dashboard.

Signed-off-by: Michael Nelson <[email protected]>
@absoludity
Copy link
Contributor Author

Finally found the issue causing the e2e failures in flux/carvel:

  1. For flux and carvel, when Kubeapps deploys an app, it really creates the CRD which is then handled by the flux/carvel controller, so as a result, the resource does not exist immediately. We currently deal with this in the dashboard by retrying if a "NotFound" error is returned, but
  2. the core packaging function was still returning a grpc.Status error, rather than a connect.Error, which encodes the error differently, so the frontend code which was looking for a NotFound, was not finding it and so did not retry.

I've only switched the one error that was related to the e2e failure, but we will want to switch all the others in an audit afterwards (so they are not Unknown errors).

@absoludity
Copy link
Contributor Author

...and it fixed the carvel e2e test, but not the flux one :/

@absoludity
Copy link
Contributor Author

...and it fixed the carvel e2e test, but not the flux one :/

Further debugging shows that for flux it gets as far as returning from the call to the plugin's get resource refs with the expected NotFound error but then, before the go-routine returns, it seems to dump a trace:

E0411 04:11:22.810623       1 packages.go:272] Unable to get the resource refs for the package "test-flux-6" using the plugin "fluxv2.packages": rpc error: code = NotFound desc = Unable to find Helm release "default/test-flux-6" in namespace "default": release: not found
helm.sh/helm/v3/pkg/storage/driver.init
        /go/pkg/mod/helm.sh/helm/[email protected]/pkg/storage/driver/driver.go:29
runtime.doInit
        /opt/bitnami/go/src/runtime/proc.go:6329
runtime.doInit
        /opt/bitnami/go/src/runtime/proc.go:6306
runtime.doInit
        /opt/bitnami/go/src/runtime/proc.go:6306
runtime.doInit
        /opt/bitnami/go/src/runtime/proc.go:6306
plugin.open
        /opt/bitnami/go/src/plugin/plugin_dlopen.go:101
plugin.Open
        /opt/bitnami/go/src/plugin/plugin.go:32
github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/core/plugins/v1alpha1.(*PluginsServer).registerPlugins
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/core/plugins/v1alpha1/plugins.go:134
github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/core/plugins/v1alpha1.NewPluginsServer
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/core/plugins/v1alpha1/plugins.go:93
github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/server.Serve
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/server/server.go:105
github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/cmd.newRootCmd.func2
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/cmd/root.go:39
github.com/spf13/cobra.(*Command).execute
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:1044
github.com/spf13/cobra.(*Command).Execute
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:968
github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/cmd.Execute
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/cmd/root.go:48
main.main
        /go/src/github.com/vmware-tanzu/kubeapps/cmd/kubeapps-apis/main.go:12
runtime.main
        /opt/bitnami/go/src/runtime/proc.go:250
runtime.goexit
        /opt/bitnami/go/src/runtime/asm_amd64.s:1594
I0411 04:11:22.823862       1 packages.go:109] "+core GetAvailablePackageDetail" cluster="default" namespace="default"

and the browser sees a 502:
get-installed-refs-flux

Caused a 502 as I assume the error was larger than expected.

Signed-off-by: Michael Nelson <[email protected]>
@absoludity
Copy link
Contributor Author

Further debugging shows that for flux it gets as far as returning from the call to the plugin's get resource refs with the expected NotFound error but then, before the go-routine returns, it seems to dump a trace:

...and the winner is: the error (which included the stack trace) was too large when converted to a ConnectError... but this was only seen during the translation back over the wire where the server returns a 502. Anyway, updated to simply send the error code over the wire and log the full error, which is working fine locally with flux.

@absoludity absoludity merged commit 84a6690 into main Apr 11, 2023
@absoludity absoludity deleted the 6013-connect-grpc-backend-6 branch April 11, 2023 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants