nginx · bjee19 · Oct 24, 2023 · Oct 24, 2023
diff --git a/tests/reconfig/results/1.0.0/1.0.0.md b/tests/reconfig/results/1.0.0/1.0.0.md
@@ -0,0 +1,78 @@
+# Reconfiguration testing Results
+
+<!-- TOC -->
+- [Reconfiguration testing Results](#reconfiguration-testing-results)
+  - [Test environment](#test-environment)
+  - [Results Tables](#results-tables)
+    - [NGINX Reloads and Time to Ready](#nginx-reloads-and-time-to-ready)
+    - [Event Batch Processing](#event-batch-processing)
+  - [NumResources -> Total Resources](#numresources---total-resources)
+  - [Observations](#observations)
+<!-- TOC -->
+
+## Test environment
+
+GKE cluster:
+
+- Node count: 3
+- Instance Type: e2-medium
+- k8s version: 1.27.3-gke.100
+- Zone: us-central1-c
+- Total vCPUs: 6
+- Total RAM: 12GB
+- Max pods per node: 110
+
+NGF deployment:
+
+- NGF version: edge - git commit 29b45e38bacd7c4f22834938105e3cda4f29f6d1
+- NGINX Version: 1.25.2
+
+## Results Tables
+
+### NGINX Reloads and Time to Ready
+
+| Test number | NumResources | TimeToReadyTotal (s) | TimeToReadyAvgSingle (s) | NGINX reloads | NGINX reload avg time (ms) | <= 500ms | <= 1000ms |
+|-------------|--------------|----------------------|--------------------------|---------------|----------------------------|----------|-----------|
+| 1           | 30           | 1                    | 1                        | 2             | 191                        | 100%     | 100%      |
+| 1           | 150          | 2                    | 2                        | 2             | 440                        | 50%      | 100%      |
+| 2           | 30           | 50                   | <1                       | 93            | 162                        | 100%     | 100%      |
+| 2           | 150          | 208                  | <1                       | 396           | 281                        | 96.46%   | 100%      |
+| 3           | 30           | 1                    | 1                        | 93            | 129                        | 100%     | 100%      |
+| 3           | 150          | 1                    | 1                        | 453           | 130                        | 100%     | 100%      |
+
+
+### Event Batch Processing
+
+| Test number | NumResources | Event Batch Total | Event Batch Processing avg time (ms) | <= 500ms | <= 1000ms |
+|-------------|--------------|-------------------|--------------------------------------|----------|-----------|
+| 1           | 30           | 69                | 6.232                                | 100%     | 100%      |
+| 1           | 150          | 309               | 3.638                                | 99.68%   | 100%      |
+| 2           | 30           | 465               | 38.759                               | 100%     | 100%      |
+| 2           | 150          | 1941              | 68.539                               | 98.51%   | 100%      |
+| 3           | 30           | 374               | 36.834                               | 99.73%   | 99.73%    |
+| 3           | 150          | 1812              | 40.411                               | 99.94%   | 99.94%    |
+
+
+## NumResources -> Total Resources
+| NumResources | Gateways | Secrets | ReferenceGrants | Namespaces | application Pods | application Services | HTTPRoutes | Total Resources |
+| ------------ | -------- | ------- | --------------- | ---------- | ---------------- | -------------------- | ---------- | --------------- |
+| x            | 1        | 1       | 1               | x+1        | 2x               | 2x                   | 3x         | 8x+4            |
+| 30           | 1        | 1       | 1               | 31         | 60               | 60                   | 90         | 244             |
+| 150          | 1        | 1       | 1               | 151        | 300              | 300                  | 450        | 1204            |
+
+## Observations
+
+1. We are reloading after reconciling a ReferenceGrant even when there is no Gateway. This is because we treat every
+   upsert/delete of a ReferenceGrant as a change. This means we will regenerate NGINX config every time a ReferenceGrant
+   is created, updated (generation must change), or deleted, even if it does not apply to the accepted Gateway.
+
+   Issue filed: https://github.com/nginxinc/nginx-gateway-fabric/issues/1124
+
+2. We are reloading after reconciling an HTTPRoute even when there is no accepted Gateway and no config being generated.
+
+   Issue filed: https://github.com/nginxinc/nginx-gateway-fabric/issues/1123
+
+3. The majority of NGINX reloads were in the <= 500ms bucket, and all of them were in the <= 1000ms bucket. Reload
+   time increased with the number of configured resources that resulted in NGINX configuration changes.
+
+4. No errors (NGF or NGINX) were observed in any test run.
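The "NumResources -> Total Resources" table above follows a simple per-resource pattern, so the totals can be checked mechanically. The following is a minimal sketch, not part of the PR; the function name `resource_counts` is hypothetical, and the per-resource multipliers are taken directly from the `x` row of the table.

```python
def resource_counts(x: int) -> dict[str, int]:
    """Return the number of each resource deployed for a given NumResources value x."""
    counts = {
        "Gateways": 1,
        "Secrets": 1,
        "ReferenceGrants": 1,
        "Namespaces": x + 1,
        "application Pods": 2 * x,
        "application Services": 2 * x,
        "HTTPRoutes": 3 * x,
    }
    # The total is the sum of all columns, which simplifies to 8x + 4.
    counts["Total Resources"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for x in (30, 150):
        print(x, resource_counts(x)["Total Resources"])
```

For the two values used in these runs, this reproduces the totals listed in the table (244 and 1204).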
diff --git a/tests/reconfig/results/v1.0.0.md b/tests/reconfig/results/v1.0.0.md
diff --git a/tests/reconfig/setup.md b/tests/reconfig/setup.md
@@ -13,8 +13,8 @@
 
 ## Goals
 
-- Measure how long it takes NGF to reconfigure NGINX when a number of Gateway API and referenced core Kubernetes
-  resources are created at once.
+- Measure how long it takes NGF to reconfigure NGINX and update statuses when a number of Gateway API and
+  referenced core Kubernetes resources are created at once.
 - Two runs of each test should be ran with differing numbers of resources. Each run will deploy:
   - a single Gateway, Secret, and ReferenceGrant resources
   - `x+1` number of namespaces
@@ -38,7 +38,8 @@
    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v0.8.1/standard-install.yaml
    ```
 
-3. Deploy NGF from edge using Helm install (NOTE: For Test 1, deploy AFTER resources):
+3. Deploy NGF from edge using Helm install and wait for the LoadBalancer Service to be ready
+   (NOTE: For Test 1, deploy AFTER resources):
 
    ```console
    helm install my-release oci://ghcr.io/nginxinc/charts/nginx-gateway-fabric  --version 0.0.0-edge \
@@ -65,10 +66,20 @@
    kubectl port-forward $GW_POD -n nginx-gateway 9113:9113 &
    ```
 
-6. Measure Time To Ready as described in each test, get the reload count, and get the average NGINX reload duration.
-   The average reload duration can be computed by taking the `nginx_gateway_fabric_nginx_reloads_milliseconds_sum`
-   metric value and dividing it by the `nginx_gateway_fabric_nginx_reloads_milliseconds_count` metric value.
-7. For accuracy, repeat the test suite once or twice, take the averages, and look for any anomolies or outliers.
+6. Measure NGINX Reloads and Time to Ready Results
+   1. TimeToReadyTotal as described in each test - NGF logs.
+   2. TimeToReadyAvgSingle which is the average time between updating any resource and the
+      NGINX configuration being reloaded - NGF logs.
+   3. NGINX Reload count - metrics.
+   4. Average NGINX reload duration - metrics.
+      1. The average reload duration can be computed by taking the `nginx_gateway_fabric_nginx_reloads_milliseconds_sum`
+         metric value and dividing it by the `nginx_gateway_fabric_nginx_reloads_milliseconds_count` metric value.
+7. Measure Event Batch Processing Results
+   1. Event Batch Total - metrics.
+   2. Average Event Batch Processing duration - metrics.
+      1. The average event batch processing duration can be computed by taking the `nginx_gateway_fabric_event_batch_processing_milliseconds_sum`
+         metric value and dividing it by the `nginx_gateway_fabric_event_batch_processing_milliseconds_count` metric value.
+8. For accuracy, repeat the test suite once or twice, take the averages, and look for any anomalies or outliers.
 
 ## Tests
 
@@ -79,8 +90,8 @@
       e.g. `cd scripts && bash create-resources-gw-last.sh 30`. The script will deploy backend apps and services, wait
       60 seconds for them to be ready, and deploy 1 Gateway, 1 RefGrant, 1 Secret, and HTTPRoutes.
    2. Deploy NGF
-   3. Check logs for time it takes from start-up -> config written and NGINX reloaded. Get reload count and average reload
-      duration from metrics and logs.
+   3. Measure TimeToReadyTotal as the time it takes from start-up -> config written and
+      NGINX reloaded. Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
 
 ### Test 2: Start NGF, deploy Gateway, create many resources attached to GW
 
@@ -89,9 +100,8 @@
    2. Run the provided script with the required number of resources,
       e.g. `cd scripts && bash create-resources-routes-last.sh 30`. The script will deploy backend apps and services,
       wait 60 seconds for them to be ready, and deploy 1 Gateway, 1 Secret, 1 RefGrant, and HTTPRoutes at the same time.
-   3. Check logs for time it takes from NGF receiving first resource update -> final config written, and NGINX's final
-      reload. Check logs for average individual HTTPRoute TTR also. Get reload count and average reload duration from
-      metrics and logs.
+   3. Measure TimeToReadyTotal as the time it takes from NGF receiving the first HTTPRoute resource update -> final
+      config written and NGINX reloaded. Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
 
 ### Test 3: Start NGF, create many resources attached to a Gateway, deploy the Gateway
 
@@ -101,5 +111,5 @@
       e.g. `cd scripts && bash create-resources-gw-last.sh 30`.
       The script will deploy the namespaces, backend apps and services, 1 Secret, 1 ReferenceGrant, and the HTTPRoutes;
       wait 60 seconds for the backend apps to be ready, and then deploy 1 Gateway for all HTTPRoutes.
-   3. Check logs for time it takes from NGF receiving gateway resource -> config written and NGINX reloaded. Get reload
-      count and average reload duration from metrics and logs.
+   3. Measure TimeToReadyTotal as the time it takes from NGF receiving the Gateway resource -> config written and NGINX reloaded.
+      Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
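Steps 6 and 7 of the setup compute each average by dividing a `_sum` metric by its `_count` metric. The sketch below shows one way to do that from the Prometheus text output; it is not part of the PR. The helper name `metric_average`, the `class="nginx"` label, and the sample values are illustrative assumptions, not captured output. Only the two metric base names come from the setup document.

```python
def metric_average(metrics_text: str, base_name: str) -> float:
    """Average of a Prometheus histogram/summary: <base_name>_sum / <base_name>_count."""
    total_sum = 0.0
    total_count = 0.0
    for raw in metrics_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # name{labels} value [timestamp]; label values may contain spaces
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :].split()
        else:
            parts = line.split()
            name, rest = parts[0], parts[1:]
        if not rest:
            continue
        value = float(rest[0])
        if name == base_name + "_sum":
            total_sum += value
        elif name == base_name + "_count":
            total_count += value
    if total_count == 0:
        raise ValueError(f"no samples found for {base_name}")
    return total_sum / total_count


# Illustrative sample only (hypothetical label and values).
SAMPLE = """\
# TYPE nginx_gateway_fabric_nginx_reloads_milliseconds histogram
nginx_gateway_fabric_nginx_reloads_milliseconds_sum{class="nginx"} 382
nginx_gateway_fabric_nginx_reloads_milliseconds_count{class="nginx"} 2
nginx_gateway_fabric_event_batch_processing_milliseconds_sum{class="nginx"} 430
nginx_gateway_fabric_event_batch_processing_milliseconds_count{class="nginx"} 69
"""

if __name__ == "__main__":
    print(metric_average(SAMPLE, "nginx_gateway_fabric_nginx_reloads_milliseconds"))
    print(metric_average(SAMPLE, "nginx_gateway_fabric_event_batch_processing_milliseconds"))
```

With the port-forward from the setup steps running, the metrics text would presumably be read from `localhost:9113` (the `/metrics` path is an assumption here) and passed in as `metrics_text`.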