This repository has been archived by the owner on Feb 5, 2020. It is now read-only.

modules/azure: disable TX checksum offloading #1586

Merged 1 commit on Aug 4, 2017

squat commented Aug 4, 2017

This commit disables TX checksum offloading as a workaround to fix #1171. We ran iperf benchmarks to compare how different combinations of MTU and TX checksum offloading perform. The results below show that disabling TX checksum offloading while keeping the default MTU does not incur a major performance penalty, and we expect it to be the most maintainable workaround going forward.

cc @crawford @philips @Quentin-M

iperf.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: iperf
  namespace: default
  labels:
    app: iperf
spec:
  replicas: 1
  template:
    metadata:
      name: iperf
      labels:
        app: iperf
    spec:
      containers:
      - name: iperf-server
        image: networkstatic/iperf3
        args:
        - -s
        ports:
        - containerPort: 5201
          protocol: TCP
      nodeSelector:
        node-role.kubernetes.io/node: ""
---
apiVersion: v1
kind: Service
metadata:
  name: iperf
  namespace: default
  labels:
    app: iperf
spec:
  type: NodePort
  selector:
    app: iperf
  ports:
  - name: iperf
    protocol: TCP
    port: 5201
    nodePort: 31211
---
apiVersion: batch/v1
kind: Job
metadata:
  name: iperf
  namespace: default
  labels:
    app: iperf
spec:
  template:
    metadata:
      name: iperf
      labels:
        app: iperf
    spec:
      hostNetwork: true
      containers:
      - name: iperf
        image: networkstatic/iperf3
        args: ["-c", "green.westcentralus.cloudapp.azure.com", "-p", "31211", "-t", "30", "-V"]
      restartPolicy: Never

I got the following results:

MTU 1350 tx on:

Time: Thu, 03 Aug 2017 23:21:12 GMT
Connecting to host green.westcentralus.cloudapp.azure.com, port 31211
      Cookie: green-worker-0.1501802472.198990.573
      TCP MSS: 1228 (default)
[  4] local 10.0.16.4 port 39306 connected to 52.161.110.8 port 31211
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  88.3 MBytes   741 Mbits/sec  802    191 KBytes       
[  4]   1.00-2.00   sec  82.3 MBytes   690 Mbits/sec  292    170 KBytes       
[  4]   2.00-3.00   sec  73.1 MBytes   613 Mbits/sec  337    154 KBytes       
[  4]   3.00-4.00   sec  81.7 MBytes   686 Mbits/sec  258    171 KBytes       
[  4]   4.00-5.00   sec  84.1 MBytes   706 Mbits/sec  283    229 KBytes       
[  4]   5.00-6.00   sec  83.1 MBytes   698 Mbits/sec  355    216 KBytes       
[  4]   6.00-7.00   sec  82.6 MBytes   693 Mbits/sec  318    175 KBytes       
[  4]   7.00-8.00   sec  72.8 MBytes   611 Mbits/sec  427    109 KBytes       
[  4]   8.00-9.00   sec  81.3 MBytes   682 Mbits/sec  300    246 KBytes       
[  4]   9.00-10.00  sec  83.9 MBytes   704 Mbits/sec  355    199 KBytes       
[  4]  10.00-11.00  sec  83.2 MBytes   698 Mbits/sec  442    199 KBytes       
[  4]  11.00-12.00  sec  83.0 MBytes   697 Mbits/sec  290    228 KBytes       
[  4]  12.00-13.00  sec  73.3 MBytes   614 Mbits/sec  341    173 KBytes       
[  4]  13.00-14.00  sec  81.5 MBytes   684 Mbits/sec  296    216 KBytes       
[  4]  14.00-15.00  sec  84.1 MBytes   705 Mbits/sec  332    143 KBytes       
[  4]  15.00-16.00  sec  83.7 MBytes   702 Mbits/sec  191    218 KBytes       
[  4]  16.00-17.00  sec  83.1 MBytes   697 Mbits/sec  394    162 KBytes       
[  4]  17.00-18.00  sec  72.7 MBytes   610 Mbits/sec  282    174 KBytes       
[  4]  18.00-19.00  sec  82.0 MBytes   688 Mbits/sec  399    180 KBytes       
[  4]  19.00-20.00  sec  83.3 MBytes   698 Mbits/sec  378    140 KBytes       
[  4]  20.00-21.00  sec  83.4 MBytes   700 Mbits/sec  298    156 KBytes       
[  4]  21.00-22.00  sec  84.0 MBytes   705 Mbits/sec  351    155 KBytes       
[  4]  22.00-23.00  sec  72.6 MBytes   608 Mbits/sec  264    215 KBytes       
[  4]  23.00-24.00  sec  82.1 MBytes   689 Mbits/sec  567    156 KBytes       
[  4]  24.00-25.00  sec  82.6 MBytes   692 Mbits/sec  363    154 KBytes       
[  4]  25.00-26.00  sec  83.1 MBytes   698 Mbits/sec  281    176 KBytes       
[  4]  26.00-27.00  sec  84.1 MBytes   705 Mbits/sec  206    229 KBytes       
[  4]  27.00-28.00  sec  71.6 MBytes   600 Mbits/sec  436    150 KBytes       
[  4]  28.00-29.00  sec  83.3 MBytes   699 Mbits/sec  327    189 KBytes       
[  4]  29.00-30.00  sec  83.1 MBytes   697 Mbits/sec  352    197 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec  2.38 GBytes   680 Mbits/sec  10517             sender
[  4]   0.00-30.00  sec  2.37 GBytes   680 Mbits/sec                  receiver
CPU Utilization: local/sender 1.1% (0.0%u/1.1%s), remote/receiver 0.1% (0.0%u/0.1%s)

MTU 1350 tx off:

Time: Thu, 03 Aug 2017 23:43:17 GMT
Connecting to host green.westcentralus.cloudapp.azure.com, port 31211
      Cookie: green-worker-0.1501803797.590523.57c
      TCP MSS: 1228 (default)
[  4] local 10.0.16.4 port 37416 connected to 52.161.110.8 port 31211
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  87.7 MBytes   736 Mbits/sec  889    211 KBytes       
[  4]   1.00-2.00   sec  70.9 MBytes   592 Mbits/sec  256    204 KBytes       
[  4]   2.00-3.00   sec  82.2 MBytes   692 Mbits/sec  449    195 KBytes       
[  4]   3.00-4.00   sec  84.8 MBytes   711 Mbits/sec  137    198 KBytes       
[  4]   4.00-5.00   sec  83.6 MBytes   701 Mbits/sec  205    212 KBytes       
[  4]   5.00-6.01   sec  84.3 MBytes   702 Mbits/sec  266    243 KBytes       
[  4]   6.01-7.01   sec  69.8 MBytes   583 Mbits/sec  250    149 KBytes       
[  4]   7.01-8.00   sec  83.2 MBytes   705 Mbits/sec  360    161 KBytes       
[  4]   8.00-9.00   sec  84.6 MBytes   709 Mbits/sec  286    175 KBytes       
[  4]   9.00-10.00  sec  82.9 MBytes   696 Mbits/sec  457    140 KBytes       
[  4]  10.00-11.00  sec  82.9 MBytes   695 Mbits/sec  207    270 KBytes       
[  4]  11.00-12.00  sec  70.6 MBytes   592 Mbits/sec  267    163 KBytes       
[  4]  12.00-13.00  sec  83.6 MBytes   701 Mbits/sec  255    125 KBytes       
[  4]  13.00-14.00  sec  84.3 MBytes   707 Mbits/sec  324    118 KBytes       
[  4]  14.00-15.00  sec  84.4 MBytes   708 Mbits/sec  205    205 KBytes       
[  4]  15.00-16.00  sec  82.4 MBytes   691 Mbits/sec  317    154 KBytes       
[  4]  16.00-17.00  sec  70.4 MBytes   590 Mbits/sec  144    211 KBytes       
[  4]  17.00-18.00  sec  84.1 MBytes   705 Mbits/sec  301    162 KBytes       
[  4]  18.00-19.00  sec  83.9 MBytes   703 Mbits/sec  292    164 KBytes       
[  4]  19.00-20.00  sec  83.4 MBytes   699 Mbits/sec  178    151 KBytes       
[  4]  20.00-21.00  sec  83.5 MBytes   700 Mbits/sec  115    205 KBytes       
[  4]  21.00-22.00  sec  70.8 MBytes   594 Mbits/sec   90    223 KBytes       
[  4]  22.00-23.00  sec  84.6 MBytes   710 Mbits/sec  350    235 KBytes       
[  4]  23.00-24.00  sec  82.6 MBytes   693 Mbits/sec  350    231 KBytes       
[  4]  24.00-25.00  sec  84.0 MBytes   704 Mbits/sec  403    223 KBytes       
[  4]  25.00-26.00  sec  82.3 MBytes   690 Mbits/sec  275    229 KBytes       
[  4]  26.00-27.00  sec  70.6 MBytes   592 Mbits/sec  202    317 KBytes       
[  4]  27.00-28.00  sec  84.7 MBytes   710 Mbits/sec  602    228 KBytes       
[  4]  28.00-29.00  sec  82.7 MBytes   694 Mbits/sec  232    228 KBytes       
[  4]  29.00-30.00  sec  84.3 MBytes   707 Mbits/sec  176    233 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec  2.38 GBytes   680 Mbits/sec  8840             sender
[  4]   0.00-30.00  sec  2.37 GBytes   680 Mbits/sec                  receiver
CPU Utilization: local/sender 1.0% (0.0%u/1.0%s), remote/receiver 0.5% (0.0%u/0.5%s)

MTU 1500 tx off:

Time: Thu, 03 Aug 2017 23:31:42 GMT
Connecting to host green.westcentralus.cloudapp.azure.com, port 31211
      Cookie: green-worker-0.1501803101.996116.571
      TCP MSS: 1378 (default)
[  4] local 10.0.16.4 port 53694 connected to 52.161.110.8 port 31211
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  85.6 MBytes   718 Mbits/sec  1115    219 KBytes       
[  4]   1.00-2.00   sec  69.2 MBytes   581 Mbits/sec  305    209 KBytes       
[  4]   2.00-3.01   sec  82.9 MBytes   692 Mbits/sec  350    237 KBytes       
[  4]   3.01-4.00   sec  83.3 MBytes   702 Mbits/sec  510    178 KBytes       
[  4]   4.00-5.00   sec  82.2 MBytes   690 Mbits/sec  306    240 KBytes       
[  4]   5.00-6.00   sec  80.5 MBytes   676 Mbits/sec  470    175 KBytes       
[  4]   6.00-7.00   sec  70.4 MBytes   590 Mbits/sec  276    261 KBytes       
[  4]   7.00-8.00   sec  82.4 MBytes   691 Mbits/sec  643    238 KBytes       
[  4]   8.00-9.00   sec  82.5 MBytes   692 Mbits/sec  325    214 KBytes       
[  4]   9.00-10.00  sec  82.1 MBytes   689 Mbits/sec  292    277 KBytes       
[  4]  10.00-11.00  sec  81.7 MBytes   686 Mbits/sec  286    188 KBytes       
[  4]  11.00-12.00  sec  69.7 MBytes   585 Mbits/sec  104    203 KBytes       
[  4]  12.00-13.00  sec  83.7 MBytes   702 Mbits/sec  312    164 KBytes       
[  4]  13.00-14.00  sec  80.2 MBytes   673 Mbits/sec  245    195 KBytes       
[  4]  14.00-15.00  sec  83.9 MBytes   703 Mbits/sec  241    184 KBytes       
[  4]  15.00-16.00  sec  82.1 MBytes   686 Mbits/sec  365    211 KBytes       
[  4]  16.00-17.00  sec  70.2 MBytes   591 Mbits/sec  561    209 KBytes       
[  4]  17.00-18.00  sec  82.6 MBytes   693 Mbits/sec  350    223 KBytes       
[  4]  18.00-19.00  sec  83.1 MBytes   697 Mbits/sec  251    254 KBytes       
[  4]  19.00-20.00  sec  81.1 MBytes   680 Mbits/sec  487    219 KBytes       
[  4]  20.00-21.00  sec  82.0 MBytes   688 Mbits/sec  199    226 KBytes       
[  4]  21.00-22.00  sec  69.5 MBytes   583 Mbits/sec  249    280 KBytes       
[  4]  22.00-23.00  sec  83.0 MBytes   697 Mbits/sec  798    246 KBytes       
[  4]  23.00-24.00  sec  82.0 MBytes   688 Mbits/sec  460    252 KBytes       
[  4]  24.00-25.00  sec  80.4 MBytes   674 Mbits/sec  306    155 KBytes       
[  4]  25.00-26.00  sec  83.4 MBytes   700 Mbits/sec  267    246 KBytes       
[  4]  26.00-27.00  sec  70.6 MBytes   592 Mbits/sec  234    217 KBytes       
[  4]  27.00-28.00  sec  82.7 MBytes   694 Mbits/sec  212    201 KBytes       
[  4]  28.00-29.00  sec  80.7 MBytes   677 Mbits/sec  289    270 KBytes       
[  4]  29.00-30.00  sec  84.2 MBytes   706 Mbits/sec  442    225 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec  2.34 GBytes   671 Mbits/sec  11250             sender
[  4]   0.00-30.00  sec  2.34 GBytes   670 Mbits/sec                  receiver
CPU Utilization: local/sender 0.8% (0.0%u/0.8%s), remote/receiver 0.5% (0.0%u/0.4%s)

MTU 2000 tx off:

Time: Thu, 03 Aug 2017 23:52:56 GMT
Connecting to host green.westcentralus.cloudapp.azure.com, port 31211
      Cookie: green-worker-0.1501804376.270876.56c
      TCP MSS: 1878 (default)
[  4] local 10.0.16.4 port 33618 connected to 52.161.110.8 port 31211
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  77.3 MBytes   649 Mbits/sec  978    226 KBytes       
[  4]   1.00-2.00   sec  85.1 MBytes   714 Mbits/sec  391    207 KBytes       
[  4]   2.00-3.00   sec  85.1 MBytes   714 Mbits/sec  395    244 KBytes       
[  4]   3.00-4.00   sec  85.2 MBytes   715 Mbits/sec  168    262 KBytes       
[  4]   4.00-5.00   sec  85.0 MBytes   713 Mbits/sec   81    328 KBytes       
[  4]   5.00-6.00   sec  72.8 MBytes   610 Mbits/sec  271    264 KBytes       
[  4]   6.00-7.00   sec  85.9 MBytes   720 Mbits/sec  305    242 KBytes       
[  4]   7.00-8.00   sec  86.6 MBytes   726 Mbits/sec  195    282 KBytes       
[  4]   8.00-9.00   sec  84.2 MBytes   706 Mbits/sec  378    271 KBytes       
[  4]   9.00-10.00  sec  84.7 MBytes   711 Mbits/sec  316    303 KBytes       
[  4]  10.00-11.00  sec  71.9 MBytes   603 Mbits/sec  465    161 KBytes       
[  4]  11.00-12.00  sec  86.9 MBytes   729 Mbits/sec  441    248 KBytes       
[  4]  12.00-13.00  sec  85.3 MBytes   715 Mbits/sec  259    255 KBytes       
[  4]  13.00-14.00  sec  85.4 MBytes   716 Mbits/sec  334    194 KBytes       
[  4]  14.00-15.00  sec  85.9 MBytes   721 Mbits/sec  311    216 KBytes       
[  4]  15.00-16.00  sec  72.3 MBytes   606 Mbits/sec  223    213 KBytes       
[  4]  16.00-17.00  sec  85.8 MBytes   719 Mbits/sec  371    224 KBytes       
[  4]  17.00-18.00  sec  86.3 MBytes   724 Mbits/sec   88    297 KBytes       
[  4]  18.00-19.00  sec  84.9 MBytes   713 Mbits/sec  440    171 KBytes       
[  4]  19.00-20.00  sec  84.3 MBytes   708 Mbits/sec  318   91.7 KBytes       
[  4]  20.00-21.00  sec  72.5 MBytes   608 Mbits/sec  327    260 KBytes       
[  4]  21.00-22.00  sec  85.6 MBytes   718 Mbits/sec  361    242 KBytes       
[  4]  22.00-23.00  sec  85.6 MBytes   717 Mbits/sec  391    215 KBytes       
[  4]  23.00-24.00  sec  85.4 MBytes   717 Mbits/sec  423    242 KBytes       
[  4]  24.00-25.00  sec  84.3 MBytes   707 Mbits/sec  705    196 KBytes       
[  4]  25.00-26.00  sec  71.9 MBytes   603 Mbits/sec  342    229 KBytes       
[  4]  26.00-27.00  sec  85.3 MBytes   716 Mbits/sec  210    343 KBytes       
[  4]  27.00-28.00  sec  85.9 MBytes   720 Mbits/sec  317    185 KBytes       
[  4]  28.00-29.00  sec  84.8 MBytes   711 Mbits/sec  304    257 KBytes       
[  4]  29.00-30.00  sec  85.0 MBytes   713 Mbits/sec  450    204 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec  2.43 GBytes   695 Mbits/sec  10558             sender
[  4]   0.00-30.00  sec  2.43 GBytes   695 Mbits/sec                  receiver
CPU Utilization: local/sender 0.7% (0.0%u/0.7%s), remote/receiver 0.4% (0.0%u/0.3%s)

Native host networking, MTU 1500 tx on:

Time: Thu, 03 Aug 2017 23:57:16 GMT
Connecting to host green.westcentralus.cloudapp.azure.com, port 31211
      Cookie: b59be62b4f3d.1501804635.978869.6b579
      TCP MSS: 1428 (default)
[  4] local 172.17.0.3 port 47012 connected to 52.161.110.8 port 31211
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  94.8 MBytes   795 Mbits/sec  379    531 KBytes
[  4]   1.00-2.00   sec  87.9 MBytes   737 Mbits/sec  715    335 KBytes
[  4]   2.00-3.00   sec  82.3 MBytes   691 Mbits/sec  157    321 KBytes
[  4]   3.00-4.00   sec  89.4 MBytes   750 Mbits/sec  513    331 KBytes
[  4]   4.00-5.00   sec  88.6 MBytes   743 Mbits/sec  651    350 KBytes
[  4]   5.00-6.00   sec  90.9 MBytes   762 Mbits/sec    0    509 KBytes
[  4]   6.00-7.00   sec  88.2 MBytes   740 Mbits/sec  563    342 KBytes
[  4]   7.00-8.00   sec  82.4 MBytes   691 Mbits/sec    0    491 KBytes
[  4]   8.00-9.00   sec  88.9 MBytes   745 Mbits/sec  520    409 KBytes
[  4]   9.00-10.00  sec  90.1 MBytes   756 Mbits/sec  254    407 KBytes
[  4]  10.00-11.00  sec  89.4 MBytes   750 Mbits/sec  386    409 KBytes
[  4]  11.00-12.00  sec  88.9 MBytes   746 Mbits/sec  446    541 KBytes
[  4]  12.00-13.00  sec  81.5 MBytes   684 Mbits/sec   49    633 KBytes
[  4]  13.00-14.00  sec  89.8 MBytes   754 Mbits/sec  677    195 KBytes
[  4]  14.00-15.00  sec  89.1 MBytes   747 Mbits/sec   96    575 KBytes
[  4]  15.00-16.00  sec  89.3 MBytes   749 Mbits/sec  680    360 KBytes
[  4]  16.00-17.00  sec  88.9 MBytes   745 Mbits/sec  228    368 KBytes
[  4]  17.00-18.00  sec  81.2 MBytes   682 Mbits/sec  303    300 KBytes
[  4]  18.00-19.00  sec  89.8 MBytes   754 Mbits/sec   46    477 KBytes
[  4]  19.00-20.00  sec  90.6 MBytes   760 Mbits/sec  270    446 KBytes
[  4]  20.00-21.00  sec  89.4 MBytes   750 Mbits/sec  472    396 KBytes
[  4]  21.00-22.00  sec  88.0 MBytes   738 Mbits/sec  584    489 KBytes
[  4]  22.00-23.00  sec  82.2 MBytes   689 Mbits/sec    0    491 KBytes
[  4]  23.00-24.00  sec  89.5 MBytes   751 Mbits/sec  300    570 KBytes
[  4]  24.00-25.00  sec  88.9 MBytes   745 Mbits/sec  477    339 KBytes
[  4]  25.00-26.00  sec  89.3 MBytes   749 Mbits/sec  485    386 KBytes
[  4]  26.00-27.00  sec  90.4 MBytes   758 Mbits/sec  120    435 KBytes
[  4]  27.00-28.00  sec  81.0 MBytes   680 Mbits/sec  261    368 KBytes
[  4]  28.00-29.00  sec  90.8 MBytes   762 Mbits/sec    6    519 KBytes
[  4]  29.00-30.00  sec  89.2 MBytes   748 Mbits/sec  310    404 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec  2.58 GBytes   738 Mbits/sec  9948             sender
[  4]   0.00-30.00  sec  2.58 GBytes   738 Mbits/sec                  receiver
CPU Utilization: local/sender 0.8% (0.0%u/0.8%s), remote/receiver 1.4% (0.1%u/1.3%s)
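
To make the comparison easier to read, here is a minimal sketch that pulls the sender bandwidth and retransmit count out of each summary line above and expresses the bandwidth relative to the native host networking run. The summary lines are copied from the runs in this comment; the function names and the regular expression are illustrative assumptions, not part of this PR.

```python
import re

# Sender summary lines copied verbatim from the five iperf3 runs above.
SUMMARIES = {
    "MTU 1350 tx on": "[  4]   0.00-30.00  sec  2.38 GBytes   680 Mbits/sec  10517             sender",
    "MTU 1350 tx off": "[  4]   0.00-30.00  sec  2.38 GBytes   680 Mbits/sec  8840             sender",
    "MTU 1500 tx off": "[  4]   0.00-30.00  sec  2.34 GBytes   671 Mbits/sec  11250             sender",
    "MTU 2000 tx off": "[  4]   0.00-30.00  sec  2.43 GBytes   695 Mbits/sec  10558             sender",
    "Native host networking, MTU 1500 tx on": "[  4]   0.00-30.00  sec  2.58 GBytes   738 Mbits/sec  9948             sender",
}

# Hypothetical pattern: bandwidth in Mbits/sec, then the Retr column, then "sender".
SENDER_RE = re.compile(r"([\d.]+)\s+Mbits/sec\s+(\d+)\s+sender\s*$")


def parse_sender_summary(line):
    """Return (bandwidth in Mbits/sec, retransmits) from a sender summary line."""
    match = SENDER_RE.search(line)
    if match is None:
        raise ValueError("not a sender summary line: %r" % line)
    return float(match.group(1)), int(match.group(2))


def relative_to_baseline(summaries, baseline):
    """Map each run to (Mbits/sec, retransmits, percent change vs. the baseline run)."""
    base_mbits, _ = parse_sender_summary(summaries[baseline])
    report = {}
    for name, line in summaries.items():
        mbits, retr = parse_sender_summary(line)
        report[name] = (mbits, retr, 100.0 * (mbits - base_mbits) / base_mbits)
    return report


if __name__ == "__main__":
    baseline = "Native host networking, MTU 1500 tx on"
    for name, (mbits, retr, delta) in relative_to_baseline(SUMMARIES, baseline).items():
        print("%-40s %6.0f Mbits/sec  %6d retr  %+6.2f%%" % (name, mbits, retr, delta))
```

Each figure comes from a single 30-second run, so small differences between the configurations may be within run-to-run noise.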

@philips philips left a comment

Overall seems OK. Not a huge fan of the duplicated tx-off.service.

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/ethtool --offload eth0 tx off

Is it always eth0? cc @crawford @euank
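
If the interface name turns out not to be fixed, one option would be to look it up instead of hardcoding it. The sketch below is only an illustration of that idea, not what this PR does: it picks the interface that carries the IPv4 default route from text in the format of Linux's /proc/net/route and builds the same ethtool invocation as the unit above. The function names and the sample routing table are hypothetical.

```python
def default_route_interface(route_table):
    """Return the interface of the lowest-metric IPv4 default route, or None.

    `route_table` is text in the format of /proc/net/route: a header line,
    then whitespace-separated columns Iface, Destination, Gateway, Flags,
    RefCnt, Use, Metric, Mask, ...
    """
    best = None
    for line in route_table.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, destination, metric, mask = fields[0], fields[1], fields[6], fields[7]
        # A default route has an all-zero destination and an all-zero mask.
        if destination != "00000000" or mask != "00000000":
            continue
        candidate = (int(metric), iface)
        if best is None or candidate < best:
            best = candidate
    return None if best is None else best[1]


def tx_off_command(iface):
    """Build the ethtool command from the unit above for the given interface."""
    return ["/usr/sbin/ethtool", "--offload", iface, "tx", "off"]


# Hypothetical routing table, made up for illustration only.
SAMPLE_ROUTE_TABLE = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "eth0\t00000000\t0110000A\t0003\t0\t0\t1024\t00000000\t0\t0\t0\n"
    "eth0\t0010000A\t00000000\t0001\t0\t0\t0\t00F0FFFF\t0\t0\t0\n"
)

if __name__ == "__main__":
    iface = default_route_interface(SAMPLE_ROUTE_TABLE)
    print(" ".join(tx_off_command(iface)))
```

On a real node the table would be read from /proc/net/route rather than from the sample string. Whether the default-route interface is the right one to target on Azure is an open assumption here.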

Quentin-M commented Aug 4, 2017

So the payload was off on master, and thus the tests cannot pass the structure-check. #1584 fixes it. Could you rebase?

squat commented Aug 4, 2017

@Quentin-M Tests are still failing in Azure because of resource issues. However, I have just finished booting several clusters manually and see that TX checksum offloading is correctly set to off on all of them. If someone approves the PR, we should merge.
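
For anyone who wants to repeat that manual check, a rough sketch follows. It assumes the state is read from the output of `ethtool --show-offload eth0`, which reports a `tx-checksumming` line; the function name and the abbreviated sample output are illustrative and not taken from this PR.

```python
def tx_checksumming_state(show_offload_output):
    """Return "on" or "off" from `ethtool --show-offload` output, or None if absent."""
    for line in show_offload_output.splitlines():
        key, sep, value = line.partition(":")
        # Match only the top-level feature line, not sub-features such as
        # tx-checksum-ipv4.
        if sep and key.strip() == "tx-checksumming":
            words = value.split()
            return words[0] if words else None
    return None


# Abbreviated, made-up sample of the kind of output ethtool prints.
SAMPLE_OUTPUT = (
    "Features for eth0:\n"
    "rx-checksumming: on\n"
    "tx-checksumming: off\n"
    "\ttx-checksum-ipv4: off\n"
    "scatter-gather: on\n"
)

if __name__ == "__main__":
    print("tx-checksumming is", tx_checksumming_state(SAMPLE_OUTPUT))
```

On a node, the input would be the captured output of the ethtool command instead of the sample string.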

Quentin-M commented Aug 4, 2017 via email

@squat squat merged commit 181e331 into coreos:master Aug 4, 2017