Lower upstream count for scale nfr test with nginx plus (#2695)
Problem: The NFR scale test failed infrequently because NGINX Plus ran out of memory when scaling upstreams.

Solution: Lower the upstream count used with NGINX Plus by about 2% of the previous maximum (from 556 to 545). We will monitor future NFR runs to see whether the issue persists.

Testing: Ran the NFR scale test, and it passed.
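
The "2% of the previous maximum" figure can be checked against the `plusUpstreamServerCount` change in the diff (556 to 545). The sketch below is only an illustration of that arithmetic: `reduceByPercent` is a hypothetical helper, and rounding to the nearest integer is an assumption about how 545 was reached, since the commit simply hard-codes the constant.

```go
package main

import (
	"fmt"
	"math"
)

// reduceByPercent lowers count by pct percent and rounds to the nearest integer.
// Hypothetical helper for illustration only; it does not exist in the repository.
func reduceByPercent(count int, pct float64) int {
	return int(math.Round(float64(count) * (1 - pct/100)))
}

func main() {
	const previousPlusUpstreamServerCount = 556 // value removed by this commit

	// 556 * 0.98 = 544.88, which rounds to 545, the new constant in the diff.
	fmt.Println(reduceByPercent(previousPlusUpstreamServerCount, 2)) // prints 545
}
```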
bjee19 authored Oct 16, 2024
1 parent 0a4f0fb commit f2da836
Showing 2 changed files with 11 additions and 5 deletions.
8 changes: 7 additions & 1 deletion tests/README.md
@@ -276,7 +276,7 @@ required env vars. `GKE_SVC_ACCOUNT` needs to be the name of a service account t

In order to run the tests in GCP, you need a few things:

-- GKE router to allow egress traffic (used by upgrade tests for pulling images from Github)
+- GKE router to allow egress traffic (used by upgrade tests for pulling images from Github, and scale/reconfig tests for installing prometheus)
- this assumes that your GKE cluster is using private nodes. If using public nodes, you don't need this.
- GCP VM and firewall rule to send ingress traffic to GKE

@@ -286,6 +286,12 @@ To just set up the VM with no router (this will not run the tests):
make create-and-setup-vm
```

To set up just the router:

```makefile
make create-gke-router
```

Otherwise, you can set up the VM, router, and run the tests with a single command. See the options below.

By default, the tests run using the version of NGF that was `git cloned` during the setup. If you want to make
8 changes: 4 additions & 4 deletions tests/suite/scale_test.go
@@ -59,7 +59,7 @@ var _ = Describe("Scale test", Ordered, Label("nfr", "scale"), func() {
httpsListenerCount = 64
httpRouteCount = 1000
ossUpstreamServerCount = 648
-plusUpstreamServerCount = 556
+plusUpstreamServerCount = 545
)

BeforeAll(func() {
@@ -441,7 +441,7 @@ The logs are attached only if there are errors.

Eventually(
framework.CreateResponseChecker(url, address, timeoutConfig.RequestTimeout),
-).WithTimeout(30 * time.Second).WithPolling(100 * time.Millisecond).Should(Succeed())
+).WithTimeout(5 * timeoutConfig.RequestTimeout).WithPolling(100 * time.Millisecond).Should(Succeed())

ttr := time.Since(startCheck)

@@ -475,7 +475,7 @@ The logs are attached only if there are errors.

Eventually(
framework.CreateResponseChecker(url, address, timeoutConfig.RequestTimeout),
-).WithTimeout(5 * time.Second).WithPolling(100 * time.Millisecond).Should(Succeed())
+).WithTimeout(5 * timeoutConfig.RequestTimeout).WithPolling(100 * time.Millisecond).Should(Succeed())

Expect(
resourceManager.ScaleDeployment(namespace, "backend", upstreamServerCount),
@@ -488,7 +488,7 @@ The logs are attached only if there are errors.

Eventually(
framework.CreateResponseChecker(url, address, timeoutConfig.RequestTimeout),
-).WithTimeout(5 * time.Second).WithPolling(100 * time.Millisecond).Should(Succeed())
+).WithTimeout(5 * timeoutConfig.RequestTimeout).WithPolling(100 * time.Millisecond).Should(Succeed())
}

setNamespace := func(objects framework.ScaleObjects) {
