Graceful Recovery Results 1.2 #1717

Merged 1 commit on Mar 18, 2024
7 changes: 5 additions & 2 deletions tests/graceful-recovery/graceful-recovery.md
@@ -32,8 +32,11 @@ Ensure that NGF can recover gracefully from container failures without any user
1. Setup GKE Cluster.
2. Clone the repo and change into the nginx-gateway-fabric directory.
3. Check out the latest tag (unless you are installing the edge version from the main branch).
- 4. Go into `deploy/manifests/nginx-gateway.yaml` and change `runAsNonRoot` from `true` to `false`.
- This allows us to insert our ephemeral container as root which enables us to restart the nginx-gateway container.
+ 4. Go into `deploy/manifests/nginx-gateway.yaml` and change the following:
+
+ - `runAsNonRoot` from `true` to `false`: this allows us to insert our ephemeral container as root which enables us to restart the nginx-gateway container.
+ - Add the `--product-telemetry-disable` argument to the nginx-gateway container args.
+
5. Follow the [installation instructions](https://github.com/nginxinc/nginx-gateway-fabric/blob/main/site/content/installation/installing-ngf/manifests.md)
to deploy NGINX Gateway Fabric using manifests and expose it through a LoadBalancer Service.
6. In a separate terminal track NGF logs.
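
The two manifest edits from step 4 can also be expressed as a small script. This is a minimal sketch, assuming the manifest has already been parsed into plain Python dicts and lists (for example by a YAML library). The helper names `disable_run_as_non_root`, `add_telemetry_flag`, and `patch_manifest` are hypothetical, and treating any mapping with both `name` and `image` keys as a container entry is an assumption, not something taken from the repository.

```python
from copy import deepcopy

TELEMETRY_FLAG = "--product-telemetry-disable"  # flag named in step 4


def disable_run_as_non_root(node):
    """Set every `runAsNonRoot: true` found under node to false, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "runAsNonRoot" and value is True:
                node[key] = False
            else:
                disable_run_as_non_root(value)
    elif isinstance(node, list):
        for item in node:
            disable_run_as_non_root(item)


def add_telemetry_flag(node, container_name="nginx-gateway"):
    """Append the telemetry flag to the named container's args, in place."""
    if isinstance(node, dict):
        # Assumption: a container entry is a mapping with `name` and `image`.
        if node.get("name") == container_name and "image" in node:
            args = node.setdefault("args", [])
            if TELEMETRY_FLAG not in args:
                args.append(TELEMETRY_FLAG)
        for value in node.values():
            add_telemetry_flag(value, container_name)
    elif isinstance(node, list):
        for item in node:
            add_telemetry_flag(item, container_name)


def patch_manifest(document):
    """Return a patched copy of one parsed manifest document."""
    patched = deepcopy(document)
    disable_run_as_non_root(patched)
    add_telemetry_flag(patched)
    return patched
```

Walking the whole document avoids assuming whether `runAsNonRoot` sits at the pod level or the container level. The patch is idempotent, so running it twice leaves the result unchanged.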
74 changes: 74 additions & 0 deletions tests/graceful-recovery/results/1.2.0/1.2.0.md
@@ -0,0 +1,74 @@
# Results for v1.2.0

<!-- TOC -->
- [Results for v1.2.0](#results-for-v120)
  - [Summary](#summary)
  - [Versions](#versions)
  - [Tests](#tests)
    - [Restart nginx-gateway container](#restart-nginx-gateway-container)
    - [Restart NGINX container](#restart-nginx-container)
    - [Restart Node with draining](#restart-node-with-draining)
    - [Restart Node without draining](#restart-node-without-draining)
  - [Future Improvements](#future-improvements)
<!-- TOC -->


## Summary

- No new issues since 1.1.
- Known issue https://github.com/nginxinc/nginx-gateway-fabric/issues/1108 still exists.

## Versions

NGF version:


```text
"version":"edge"
"commit":"ebb6b829d837cf3bec83ff9bf28d89895e601400"
"date":"2024-03-18T17:57:58Z"
```

with NGINX:

```text
nginx version: nginx/1.25.4
built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10)
built with OpenSSL 3.1.3 19 Sep 2023 (running with OpenSSL 3.1.4 24 Oct 2023)
```

Kubernetes:

```text
v1.29.2-gke.1217000
```

## Tests

### Restart nginx-gateway container

No errors.

### Restart NGINX container

Same error as 1.1: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108

### Restart Node with draining

No errors.

### Restart Node without draining

Same issue as 1.1 where NGF is unable to recover: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108

New error log in the previous NGF container:

```text
W0318 19:01:31.984977 6 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
```

This log comes from the client-go library, which reports a context cancellation as an error. It is not actionable.
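
When reviewing logs from future runs, warnings of this kind could be filtered out automatically. The following is a minimal sketch, assuming the standard klog header layout (severity letter, `mmdd`, time, thread id, `file:line]`, message). The function names `parse_klog_line` and `is_context_canceled_watch_warning` are hypothetical, and the matching rule (a warning from `reflector.go` whose message contains "context canceled") is an illustrative assumption rather than project policy.

```python
import re

# klog header: Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_klog_line(line):
    """Split a klog-formatted line into its header fields and message.

    Returns a dict of fields, or None if the line is not in klog format.
    """
    match = KLOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return fields


def is_context_canceled_watch_warning(line):
    """True for reflector warnings caused by a canceled context."""
    entry = parse_klog_line(line)
    return (
        entry is not None
        and entry["severity"] == "W"
        and entry["file"] == "reflector.go"
        and "context canceled" in entry["message"]
    )
```

A log scan could then skip any line for which `is_context_canceled_watch_warning` returns true and surface only the remaining warnings and errors.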

## Future Improvements

- None