lazy read disable for the http1 codec#20148

Merged
mattklein123 merged 6 commits into envoyproxy:main from wbpcode:no-read-disable
Mar 10, 2022
Conversation

@wbpcode
Member

@wbpcode wbpcode commented Mar 1, 2022

Signed-off-by: wbpcode wbphub@live.com

Commit Message: lazy read disable for the http1 codec
Additional Description:

Ref #19900 for more info.

Risk Level: Low.
Testing: N/A.
Docs Changes: N/A.
Release Notes: N/A.
Platform Specific Features: N/A.
Optional Runtime guard: envoy.reloadable_features.http1_lazy_read_disable.
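
The core idea can be sketched as a small predicate (a minimal sketch with hypothetical names; the real implementation lives in Envoy's HTTP/1 codec and is more involved): instead of eagerly toggling readDisable() around every request, only disable reading when the active request's request half is complete but extra bytes are already buffered or still arriving, i.e. the client is pipelining.

```cpp
#include <cstddef>

// Illustrative sketch only: names are hypothetical, not Envoy's real
// codec API. Reading is disabled only when the active request's remote
// half is complete AND pipelined data is pending; it is re-enabled once
// the active request fully completes, so the leftover data is consumed
// then.
bool shouldReadDisable(bool active_request_remote_complete,
                       size_t pending_bytes) {
  if (!active_request_remote_complete) {
    return false; // Still reading the current request: keep reading.
  }
  return pending_bytes > 0; // Pipelined data pending: pause reads.
}
```

In the common non-pipelined case the predicate is never true, so the per-request readDisable(true)/readDisable(false) round trip is skipped entirely.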

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Mar 1, 2022

cc @KBaichoo cc @alyssawilk

@rojkov
Member

rojkov commented Mar 1, 2022

/assign @KBaichoo
/wait on CI

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Mar 2, 2022

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #20148 (comment) was created by @wbpcode.


Contributor

@KBaichoo KBaichoo left a comment


Looks great otherwise, mind posting the performance numbers (compared to the prior implementation) since this is a performance change? Thanks

Comment on lines +1160 to +1164
// Active downstream request remote complete but there is some remaining data in the read buffer
// then try to disable the connection reading. Connection reading will be re-enabled after the
// current active downstream request has completed.
// This ensures that the remaining data can be consumed after the current active downstream
// request has completed.
Contributor


This is a bit hard to grok, perhaps:
Eagerly read disable the connection if the downstream is sending pipelined requests as we serially process them. Reading from the connection will be re-enabled after the active request is completed.

Comment on lines +1148 to +1152
// Active downstream request remote complete but there is some new data comming then try to
// disable the connection reading. Connection reading will be re-enabled after the current
// active downstream request has completed.
// This ensures that the new comming data can be consumed after the current active downstream
// request has completed.
Contributor


This is a bit hard to grok, perhaps:
Read disable the connection if the downstream is sending additional data while we are working on an existing request. Reading from the connection will be re-enabled after the active request is completed.

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Mar 3, 2022

Benchmark with the simplest Envoy configuration and concurrency 1: a multi-process nginx backend returning a 1k response body, and wrk as the client.

wrk -c 64 -d 60s -t 2 -H "host:a.test.com" http://localhost:9090/anything

Result before this PR:

Running 1m test @ http://localhost:9090/anything
  2 threads and 64 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.28ms  808.89us  34.30ms   86.29%
    Req/Sec     9.83k   549.35    10.87k    83.50%
  1174226 requests in 1.00m, 1.27GB read
Requests/sec:  19558.90
Transfer/sec:     21.73MB

Result after this PR:

Running 1m test @ http://localhost:9090/anything
  2 threads and 64 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.13ms  844.64us  19.47ms   89.36%
    Req/Sec    10.32k   618.36    11.67k    79.75%
  1232194 requests in 1.00m, 1.34GB read
Requests/sec:  20531.11
Transfer/sec:     22.81MB

In this simple stress-test scenario, this PR brings about a 4~5% throughput improvement.

@wbpcode
Member Author

wbpcode commented Mar 3, 2022

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #20148 (comment) was created by @wbpcode.


@KBaichoo
Contributor

KBaichoo commented Mar 3, 2022

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #20148 (comment) was created by @KBaichoo.


KBaichoo
KBaichoo previously approved these changes Mar 3, 2022
Contributor

@KBaichoo KBaichoo left a comment


lgtm

/assign @alyssawilk as senior maintainer for merge

@KBaichoo KBaichoo assigned KBaichoo and alyssawilk and unassigned KBaichoo Mar 3, 2022
Contributor

@alyssawilk alyssawilk left a comment


This looks fantastic! I'm inclined to think it's high enough risk to runtime guard, though you're welcome to get a second opinion from Matt or Snow if you would rather not. Definitely let's add a release note to docs/root/version_history/current.rst noting this should be nothing but a perf win, in case there are behavioral changes.
Let's do that and add mock checks in the HTTP/1.1 codec test to regression test, and you'll be good to go!
/wait

@wbpcode
Member Author

wbpcode commented Mar 4, 2022

Got it. A runtime guard sounds reasonable to me.
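
For context, reloadable features in Envoy are checked at runtime (via Runtime::runtimeFeatureEnabled in the real code base). Below is a stand-alone sketch of how such a guard gates the new code path; the flag table and function names are illustrative stand-ins, not Envoy's actual runtime API:

```cpp
#include <map>
#include <string>

// Illustrative stand-in for a runtime feature-flag lookup; the names
// here are hypothetical, not Envoy's actual runtime machinery.
static const std::map<std::string, bool> kRuntimeFlags = {
    {"envoy.reloadable_features.http1_lazy_read_disable", true}};

bool featureEnabled(const std::string& name) {
  auto it = kRuntimeFlags.find(name);
  return it != kRuntimeFlags.end() && it->second;
}

// The guard lets operators revert to the old eager read-disable
// behavior by flipping the flag to false, without a rebuild.
bool useLazyReadDisable() {
  return featureEnabled("envoy.reloadable_features.http1_lazy_read_disable");
}
```

The layered_runtime config later in this thread shows exactly this flag being set to false through a static runtime layer.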

wbpcode added 2 commits March 8, 2022 10:02
Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Mar 9, 2022

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #20148 (comment) was created by @wbpcode.


@wbpcode
Member Author

wbpcode commented Mar 9, 2022

cc @alyssawilk Release note added. Tests added. Runtime guard added. 😄

Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@alyssawilk
Contributor

Throwing over to Matt for a last (non-google) look

Member

@mattklein123 mattklein123 left a comment


Nice!

@mattklein123 mattklein123 merged commit b435d3a into envoyproxy:main Mar 10, 2022
JuniorHsu pushed a commit to JuniorHsu/envoy that referenced this pull request Mar 17, 2022
Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: kuochunghsu <kuochunghsu@pinterest.com>
@rojkov
Member

rojkov commented Mar 24, 2022

After I merged the latest main I noticed that the performance of the io_uring-backed IoHandle deteriorated significantly; git-bisecting led to this PR. Basically the io_uring-backed IoHandle now gives the same performance as the normal one. The difference is that with the former a CPU core idles more often (CPU load is ~6% lower).

With envoy.reloadable_features.http1_lazy_read_disable set to false I get

rojkov@drozhkov:~/work/nighthawk (main)$ ./bazel-bin/nighthawk_client --duration 30 --rps 100000 --open-loop http://127.0.0.1:10000/
[17:29:50.785794][2078262][I] Starting 1 threads / event loops. Time limit: 30 seconds.
[17:29:50.785852][2078262][I] Global targets: 100 connections and 100000 calls per second.
[17:30:21.335928][2078268][I] Stopping after 30000 ms. Initiated: 3000000 / Completed: 2999899. (Completion rate was 99996.56666895555 per second.)
Nighthawk - A layer 7 protocol benchmarking tool.

benchmark_http_client.latency_2xx (695804 samples)
  min: 0s 001ms 665us | mean: 0s 004ms 196us | max: 0s 088ms 604us | pstdev: 0s 000ms 656us

  Percentile  Count       Value
  0.5         347918      0s 004ms 154us
  0.75        521934      0s 004ms 363us
  0.8         556710      0s 004ms 417us
  0.9         626227      0s 004ms 547us
  0.95        661028      0s 004ms 674us
  0.990625    689282      0s 005ms 881us
  0.99902344  695125      0s 007ms 784us

Queueing and connection setup latency (695904 samples)
  min: 0s 000ms 001us | mean: 0s 000ms 043us | max: 0s 040ms 495us | pstdev: 0s 000ms 342us

  Percentile  Count       Value
  0.5         348046      0s 000ms 001us
  0.75        521951      0s 000ms 001us
  0.8         556746      0s 000ms 002us
  0.9         626314      0s 000ms 002us
  0.95        661119      0s 000ms 004us
  0.990625    689380      0s 001ms 118us
  0.99902344  695226      0s 003ms 347us

Request start to response end (695804 samples)
  min: 0s 001ms 665us | mean: 0s 004ms 196us | max: 0s 088ms 604us | pstdev: 0s 000ms 656us

  Percentile  Count       Value
  0.5         347975      0s 004ms 154us
  0.75        521984      0s 004ms 363us
  0.8         556752      0s 004ms 417us
  0.9         626253      0s 004ms 547us
  0.95        661044      0s 004ms 674us
  0.990625    689281      0s 005ms 881us
  0.99902344  695125      0s 007ms 784us

Response body size in bytes (695804 samples)
  min: 10 | mean: 10.0 | max: 10 | pstdev: 0.0

Response header size in bytes (695804 samples)
  min: 141 | mean: 141.0 | max: 141 | pstdev: 0.0

Initiation to completion (2999899 samples)
  min: 0s 000ms 000us | mean: 0s 001ms 011us | max: 0s 088ms 674us | pstdev: 0s 001ms 824us

  Percentile  Count       Value
  0.5         1500028     0s 000ms 002us
  0.75        2249926     0s 000ms 384us
  0.8         2399925     0s 003ms 856us
  0.9         2699965     0s 004ms 230us
  0.95        2850018     0s 004ms 422us
  0.990625    2971776     0s 005ms 304us
  0.99902344  2996972     0s 007ms 442us

Counter                                 Value       Per second
benchmark.http_2xx                      695804      23193.45
benchmark.pool_overflow                 2304095     76803.11
cluster_manager.cluster_added           1           0.03
default.total_match_count               1           0.03
membership_change                       1           0.03
runtime.load_success                    1           0.03
runtime.override_dir_not_exists         1           0.03
upstream_cx_http1_total                 100         3.33
upstream_cx_overflow                    635383      21179.42
upstream_cx_rx_bytes_total              133594368   4453142.52
upstream_cx_total                       100         3.33
upstream_cx_tx_bytes_total              28532064    951068.14
upstream_rq_pending_overflow            2304095     76803.11
upstream_rq_pending_total               18744       624.80
upstream_rq_total                       695904      23196.78

[17:30:26.456727][2078268][I] Wait for the connection pool drain timed out, proceeding to hard shutdown.
[17:30:26.466043][2078262][I] Done.

With envoy.reloadable_features.http1_lazy_read_disable set to true I get

rojkov@drozhkov:~/work/nighthawk (main)$ ./bazel-bin/nighthawk_client --duration 30 --rps 100000 --open-loop http://127.0.0.1:10000/
[17:15:03.708749][2054509][I] Starting 1 threads / event loops. Time limit: 30 seconds.
[17:15:03.708790][2054509][I] Global targets: 100 connections and 100000 calls per second.
[17:15:34.258833][2054515][I] Stopping after 30000 ms. Initiated: 2999998 / Completed: 2999897. (Completion rate was 99996.54666735732 per second.)
Nighthawk - A layer 7 protocol benchmarking tool.

benchmark_http_client.latency_2xx (580308 samples)
  min: 0s 001ms 622us | mean: 0s 004ms 822us | max: 0s 072ms 667us | pstdev: 0s 001ms 166us

  Percentile  Count       Value
  0.5         290210      0s 004ms 641us
  0.75        435272      0s 005ms 279us
  0.8         464271      0s 005ms 463us
  0.9         522281      0s 006ms 143us
  0.95        551296      0s 006ms 771us
  0.990625    574868      0s 008ms 001us
  0.99902344  579743      0s 009ms 154us

Queueing and connection setup latency (580408 samples)
  min: 0s 000ms 001us | mean: 0s 000ms 052us | max: 0s 052ms 011us | pstdev: 0s 000ms 684us

  Percentile  Count       Value
  0.5         290216      0s 000ms 002us
  0.75        435317      0s 000ms 003us
  0.8         464405      0s 000ms 004us
  0.9         522424      0s 000ms 004us
  0.95        551399      0s 000ms 007us
  0.990625    574967      0s 001ms 379us
  0.99902344  579842      0s 007ms 007us

Request start to response end (580308 samples)
  min: 0s 001ms 622us | mean: 0s 004ms 822us | max: 0s 072ms 667us | pstdev: 0s 001ms 166us

  Percentile  Count       Value
  0.5         290183      0s 004ms 641us
  0.75        435249      0s 005ms 279us
  0.8         464260      0s 005ms 463us
  0.9         522281      0s 006ms 143us
  0.95        551296      0s 006ms 771us
  0.990625    574868      0s 008ms 001us
  0.99902344  579742      0s 009ms 153us

Response body size in bytes (580308 samples)
  min: 10 | mean: 10.0 | max: 10 | pstdev: 0.0

Response header size in bytes (580308 samples)
  min: 141 | mean: 141.0 | max: 141 | pstdev: 0.0

Initiation to completion (2999897 samples)
  min: 0s 000ms 000us | mean: 0s 001ms 100us | max: 0s 072ms 699us | pstdev: 0s 001ms 984us

  Percentile  Count       Value
  0.5         1499962     0s 000ms 064us
  0.75        2249934     0s 000ms 808us
  0.8         2399919     0s 001ms 381us
  0.9         2699938     0s 004ms 674us
  0.95        2849935     0s 005ms 344us
  0.990625    2971773     0s 007ms 007us
  0.99902344  2996969     0s 008ms 717us

Counter                                 Value       Per second
benchmark.http_2xx                      580308      19343.60
benchmark.pool_overflow                 2419589     80652.95
cluster_manager.cluster_added           1           0.03
default.total_match_count               1           0.03
membership_change                       1           0.03
runtime.load_success                    1           0.03
runtime.override_dir_not_exists         1           0.03
upstream_cx_http1_total                 100         3.33
upstream_cx_overflow                    541830      18061.00
upstream_cx_rx_bytes_total              111419136   3713970.45
upstream_cx_total                       100         3.33
upstream_cx_tx_bytes_total              23796728    793224.11
upstream_rq_pending_overflow            2419589     80652.95
upstream_rq_pending_total               13318       443.93
upstream_rq_total                       580408      19346.93

[17:15:39.376828][2054515][I] Wait for the connection pool drain timed out, proceeding to hard shutdown.
[17:15:39.386339][2054509][I] Done.

@wbpcode
Member Author

wbpcode commented Mar 25, 2022

Hi @rojkov, can you provide a more detailed config and also your code base? I can give it a try in my local env. This PR reduces the number of calls to readDisable, although it adds a few extra if-checks, which obviously shouldn't lead to worse performance.

Or can we open a new issue to track and discuss this problem?

@rojkov
Member

rojkov commented Mar 25, 2022

Hi @wbpcode. The new IoHandle is still WIP. This PR doesn't break anything in the main branch; I just wanted to give a heads-up. The culprit may well be in the new IoHandle itself.

I'd appreciate it if you took a look. The code can be obtained from #19082. My current config is:

bootstrap_extensions:
  - name: envoy.extensions.io_socket.io_uring
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.network.socket_interface.v3.IoUringSocketInterface
      use_submission_queue_polling: false
      read_buffer_size: 8192
      io_uring_size: 300
default_socket_interface: "envoy.extensions.network.socket_interface.io_uring"
enable_dispatcher_stats: false
static_resources:
  clusters:
    name: cluster_0
    connect_timeout: 0.25s
    circuit_breakers:
      thresholds:
      - priority: DEFAULT
        max_connections: 1000000000
        max_pending_requests: 1000000000
        max_requests: 1000000000
        max_retries: 1000000000
      - priority: HIGH
        max_connections: 1000000000
        max_pending_requests: 1000000000
        max_requests: 1000000000
        max_retries: 1000000000
    load_assignment:
      cluster_name: cluster_0
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 4500
  listeners:
    name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          generate_request_id: false
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: cluster_0
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
              dynamic_stats: false
layered_runtime:
  layers:
  - name: static_layer
    static_layer:
      envoy.reloadable_features.http1_lazy_read_disable: false

For benchmarking I use Nighthawk. The server config:

static_resources:
  listeners:
    # define an origin server on :4500 that always returns "lorem ipsum..."
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 4500
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                generate_request_id: false
                codec_type: AUTO
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: service
                      domains:
                        - "*"
                http_filters:
                  - name: test-server # before envoy.router because order matters!
                    typed_config:
                      "@type": type.googleapis.com/nighthawk.server.ResponseOptions
                      response_body_size: 10
                      v3_response_headers:
                        - { header: { key: "foo", value: "bar3" } }
                        - {
                            header: { key: "foo", value: "bar2" },
                            append: true,
                          }
                        - { header: { key: "x-nh", value: "1" } }
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                      dynamic_stats: false
admin:
  access_log_path: /tmp/envoy.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8081

I run the client with

./bazel-bin/nighthawk_client --duration 30 --rps 100000 --open-loop http://127.0.0.1:10000/

I also pin Envoy to a single core with

taskset --cpu-list 0 bazel-bin/source/exe/envoy-static -c envoy-config-perf-measurement-io_uring.yaml --concurrency 1 -l warn

ravenblackx pushed a commit to ravenblackx/envoy that referenced this pull request Jun 8, 2022
Signed-off-by: wbpcode <wbphub@live.com>