Graceful Recovery Results 1.2 (#1717)
Problem: Need the results of the graceful recovery tests for the 1.2 release.

Solution: Add the results.
kate-osborn authored Mar 18, 2024
1 parent 96a4424 commit 49ef3ef
Showing 2 changed files with 79 additions and 2 deletions.
7 changes: 5 additions & 2 deletions tests/graceful-recovery/graceful-recovery.md
@@ -32,8 +32,11 @@ Ensure that NGF can recover gracefully from container failures without any user
1. Setup GKE Cluster.
2. Clone the repo and change into the nginx-gateway-fabric directory.
3. Check out the latest tag (unless you are installing the edge version from the main branch).
-4. Go into `deploy/manifests/nginx-gateway.yaml` and change `runAsNonRoot` from `true` to `false`.
-This allows us to insert our ephemeral container as root which enables us to restart the nginx-gateway container.
+4. Go into `deploy/manifests/nginx-gateway.yaml` and change the following:
+
+- `runAsNonRoot` from `true` to `false`: this allows us to insert our ephemeral container as root which enables us to restart the nginx-gateway container.
+- Add the `--product-telemetry-disable` argument to the nginx-gateway container args.
+
5. Follow the [installation instructions](https://github.com/nginxinc/nginx-gateway-fabric/blob/main/site/content/installation/installing-ngf/manifests.md)
to deploy NGINX Gateway Fabric using manifests and expose it through a LoadBalancer Service.
6. In a separate terminal track NGF logs.
74 changes: 74 additions & 0 deletions tests/graceful-recovery/results/1.2.0/1.2.0.md
@@ -0,0 +1,74 @@
# Results for v1.2.0

<!-- TOC -->
- [Results for v1.2.0](#results-for-v120)
  - [Summary](#summary)
  - [Versions](#versions)
  - [Tests](#tests)
    - [Restart nginx-gateway container](#restart-nginx-gateway-container)
    - [Restart NGINX container](#restart-nginx-container)
    - [Restart Node with draining](#restart-node-with-draining)
    - [Restart Node without draining](#restart-node-without-draining)
  - [Future Improvements](#future-improvements)
<!-- TOC -->


## Summary

- No new issues since 1.1.
- Known issue https://github.com/nginxinc/nginx-gateway-fabric/issues/1108 still exists.

## Versions

NGF version:


```text
"version":"edge"
"commit":"ebb6b829d837cf3bec83ff9bf28d89895e601400"
"date":"2024-03-18T17:57:58Z"
```

with NGINX:

```text
nginx version: nginx/1.25.4
built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10)
built with OpenSSL 3.1.3 19 Sep 2023 (running with OpenSSL 3.1.4 24 Oct 2023)
```

Kubernetes:

```text
v1.29.2-gke.1217000
```

## Tests

### Restart nginx-gateway container

No errors.

### Restart NGINX container

Same error as 1.1: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108

### Restart Node with draining

No errors.

### Restart Node without draining

Same issue as 1.1 where NGF is unable to recover: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108

New error log in the previous NGF container:

```text
W0318 19:01:31.984977 6 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
```

This log comes from the client-go library, which reports the canceled watch context as an error. It is not actionable.
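
When scanning NGF logs in later runs, it may help to separate this known, non-actionable warning from new errors. The following is a minimal sketch, not part of the test procedure: it assumes the standard klog header layout (`Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg`), and the helper names `parse_klog` and `is_benign_watch_cancel` are hypothetical. The sample line is abbreviated from the log above, with the source path omitted.

```python
import re

# klog header layout: Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg
KLOG_RE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] "
    r"(?P<msg>.*)$"
)


def parse_klog(line):
    """Split a klog-formatted line into its header fields and message.

    Returns a dict of fields, or None if the line is not in klog format.
    """
    match = KLOG_RE.match(line.strip())
    return match.groupdict() if match else None


def is_benign_watch_cancel(line):
    """Hypothetical filter: True for a reflector warning about a canceled watch."""
    record = parse_klog(line)
    return (
        record is not None
        and record["severity"] == "W"
        and record["file"] == "reflector.go"
        and "context canceled" in record["msg"]
    )


# Abbreviated sample: the source path between "]" and "watch of" is omitted.
sample = (
    "W0318 19:01:31.984977 6 reflector.go:462] "
    "watch of *v1.EndpointSlice ended with: an error on the server "
    '("unable to decode an event from the watch stream: context canceled") '
    "has prevented the request from succeeding"
)

print(is_benign_watch_cancel(sample))
```

Matching on severity, file name, and the `context canceled` text keeps the filter narrow, so other warnings from the reflector would still surface for review.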

## Future Improvements

- None
