Re-fix OTA exchange leaks #20632

Merged

Conversation

bzbarsky-apple
Contributor

@bzbarsky-apple bzbarsky-apple commented Jul 12, 2022

Problem

OTA exchanges can leak on error.

Change overview

  1. Re-land (via git cherry-pick) the commit from #20304, "[ota] Fix exchange context leak in OTA requestor".
  2. Fix exchange lifetime management issues in BDXMessenger to address the crashes initially observed with #20304.

Fixes #20565

Testing

Ran the steps in #20563 (comment) and verified those do not crash.

* [ota] Fix exchange context leak

It turns out that Exchange Context allocated for BDX
transfer is not released on completion or connection abort.

It is not seen in a happy path that results in applying
and rebooting into a new firmware, but may lead to the
exchange leak when the transfer is interrupted.

Furthermore, if an exchange is never released, a Sleepy End
Device never returns to the idle mode, needlessly draining
the battery.

Signed-off-by: Damian Krolik <[email protected]>

* Restyled by clang-format

Co-authored-by: Restyled.io <[email protected]>
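
The leak described in the commit message above can be illustrated with a toy model. This is only a sketch: `ToyExchangePool`, `ToyMessenger`, and `ExchangesLeftAfter` are hypothetical names invented for illustration and are not the Matter SDK types. It assumes a messenger that takes a fresh exchange for every transfer and shows how the count of held exchanges grows when an interrupted transfer never releases its exchange.

```cpp
// Hypothetical stand-in for an exchange pool: it only counts exchanges in use.
struct ToyExchangePool
{
    int inUse = 0;
};

// Hypothetical stand-in for a messenger that owns one exchange per transfer.
class ToyMessenger
{
public:
    explicit ToyMessenger(ToyExchangePool & pool) : mPool(pool) {}

    // Takes a fresh exchange for a new transfer. If the previous exchange
    // was never released, it is orphaned (leaked) at this point.
    void Init()
    {
        ++mPool.inUse;
        mHolding = true;
    }

    // Releases the exchange held for the current transfer, if any.
    void Reset()
    {
        if (mHolding)
        {
            --mPool.inUse;
            mHolding = false;
        }
    }

private:
    ToyExchangePool & mPool;
    bool mHolding = false;
};

// Runs a number of interrupted transfers and reports how many exchanges
// are still held afterwards.
int ExchangesLeftAfter(int interruptedTransfers, bool resetOnEnd)
{
    ToyExchangePool pool;
    ToyMessenger messenger(pool);
    for (int i = 0; i < interruptedTransfers; ++i)
    {
        messenger.Init();
        // ... the transfer is interrupted here ...
        if (resetOnEnd)
        {
            messenger.Reset();
        }
    }
    return pool.inUse;
}
```

In this model, `ExchangesLeftAfter(3, false)` leaves three exchanges held, while `ExchangesLeftAfter(3, true)` leaves none. Under the commit message's reasoning, any exchange that stays held would also keep a Sleepy End Device from returning to idle mode.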
@github-actions

github-actions bot commented Jul 12, 2022

PR #20632: Size comparison from 57cb679 to 7fa3055

Increases (18 builds for bl602, cc13x2_26x2, cyw30739, efr32, linux, nrfconnect)

| platform | target | config | section | 57cb679 | 7fa3055 | change | % change |
|---|---|---|---|---|---|---|---|
| bl602 | lighting-app | bl602 | (read/write) | 1397386 | 1397434 | 48 | 0.0 |
| | | | .text | 1058588 | 1058628 | 40 | 0.0 |
| | | bl602+rpc | (read/write) | 1442818 | 1442858 | 40 | 0.0 |
| | | | .text | 1090276 | 1090308 | 32 | 0.0 |
| cc13x2_26x2 | lock-ftd | LP_CC2652R7 | (read only) | 669183 | 669207 | 24 | 0.0 |
| | | | .rodata | 76279 | 76287 | 8 | 0.0 |
| | | | .text | 592424 | 592440 | 16 | 0.0 |
| | lock-mtd | LP_CC2652R7 | (read only) | 618591 | 618623 | 32 | 0.0 |
| | | | .rodata | 76159 | 76167 | 8 | 0.0 |
| | | | .text | 541944 | 541968 | 24 | 0.0 |
| | pump-app | LP_CC2652R7 | (read only) | 678183 | 678207 | 24 | 0.0 |
| | | | .text | 589196 | 589220 | 24 | 0.0 |
| | pump-controller-app | LP_CC2652R7 | (read only) | 664007 | 664023 | 16 | 0.0 |
| | | | .text | 579160 | 579176 | 16 | 0.0 |
| | shell | LP_CC2652R7 | (read only) | 658766 | 658774 | 8 | 0.0 |
| | | | .text | 573532 | 573540 | 8 | 0.0 |
| cyw30739 | light | cyw930739m2evb_01 | (read/write) | 579622 | 579654 | 32 | 0.0 |
| | | | .app_xip_area | 458376 | 458408 | 32 | 0.0 |
| | lock | cyw930739m2evb_01 | (read/write) | 585574 | 585614 | 40 | 0.0 |
| | | | .app_xip_area | 459600 | 459640 | 40 | 0.0 |
| | ota-requestor-no-progress-logging | cyw930739m2evb_01 | (read/write) | 582774 | 582806 | 32 | 0.0 |
| | | | .app_xip_area | 462376 | 462408 | 32 | 0.0 |
| efr32 | lighting-app | BRD4161A | (read/write) | 1081436 | 1081476 | 40 | 0.0 |
| | | | .text | 946372 | 946412 | 40 | 0.0 |
| | | BRD4161A+rpc | (read/write) | 1135764 | 1135788 | 24 | 0.0 |
| | | | .text | 983808 | 983832 | 24 | 0.0 |
| | | BRD4161A+rs911x | (read/write) | 947924 | 947948 | 24 | 0.0 |
| | | | .text | 805088 | 805112 | 24 | 0.0 |
| | lock-app | BRD4161A+wf200 | (read/write) | 1128848 | 1128880 | 32 | 0.0 |
| | | | .text | 982580 | 982612 | 32 | 0.0 |
| | window-app | BRD4161A | (read/write) | 1075244 | 1075260 | 16 | 0.0 |
| | | | .text | 938676 | 938692 | 16 | 0.0 |
| linux | ota-requestor-app | debug | (read only) | 2439361 | 2439513 | 152 | 0.0 |
| | | | (read/write) | 125216 | 125248 | 32 | 0.0 |
| | | | .data.rel.ro | 67288 | 67304 | 16 | 0.0 |
| | | | .text | 2060914 | 2061042 | 128 | 0.0 |
| nrfconnect | all-clusters-app | nrf52840dk_nrf52840 | (read/write) | 1175175 | 1175195 | 20 | 0.0 |
| | | | rodata | 141888 | 141892 | 4 | 0.0 |
| | | | text | 811492 | 811516 | 24 | 0.0 |
| | all-clusters-minimal-app | nrf52840dk_nrf52840 | (read/write) | 1155367 | 1155387 | 20 | 0.0 |
| | | | rodata | 133416 | 133420 | 4 | 0.0 |
| | | | text | 800936 | 800960 | 24 | 0.0 |

Decreases (5 builds for cc13x2_26x2, esp32)

| platform | target | config | section | 57cb679 | 7fa3055 | change | % change |
|---|---|---|---|---|---|---|---|
| cc13x2_26x2 | lock-ftd | LP_CC2652R7 | (read/write) | 172184 | 172160 | -24 | -0.0 |
| | pump-app | LP_CC2652R7 | (read/write) | 164032 | 164008 | -24 | -0.0 |
| | pump-controller-app | LP_CC2652R7 | (read/write) | 178328 | 178312 | -16 | -0.0 |
| | shell | LP_CC2652R7 | (read/write) | 187960 | 187952 | -8 | -0.0 |
| esp32 | all-clusters-app | c3devkit | (read only) | 1020100 | 1020098 | -2 | -0.0 |
| | | | .flash.text | 1020100 | 1020098 | -2 | -0.0 |

Full report (43 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, p6, telink)
platform target config section 57cb679 7fa3055 change % change
bl602 lighting-app bl602 (read/write) 1397386 1397434 48 0.0
.bss 116978 116978 0 0.0
.data 4480 4480 0 0.0
.text 1058588 1058628 40 0.0
bl602+rpc (read/write) 1442818 1442858 40 0.0
.bss 124418 124418 0 0.0
.data 4600 4600 0 0.0
.text 1090276 1090308 32 0.0
cc13x2_26x2 all-clusters-app LP_CC2652R7 (read only) 666275 666275 0 0.0
(read/write) 184948 184948 0 0.0
.bss 74116 74116 0 0.0
.data 3356 3356 0 0.0
.rodata 88139 88139 0 0.0
.text 577820 577820 0 0.0
all-clusters-minimal-app LP_CC2652R7 (read only) 632083 632083 0 0.0
(read/write) 157684 157684 0 0.0
.bss 73412 73412 0 0.0
.data 3356 3356 0 0.0
.rodata 77379 77379 0 0.0
.text 554380 554380 0 0.0
lock-ftd LP_CC2652R7 (read only) 669183 669207 24 0.0
(read/write) 172184 172160 -24 -0.0
.bss 71148 71148 0 0.0
.data 3280 3280 0 0.0
.rodata 76279 76287 8 0.0
.text 592424 592440 16 0.0
lock-mtd LP_CC2652R7 (read only) 618591 618623 32 0.0
(read/write) 144264 144264 0 0.0
.bss 66868 66868 0 0.0
.data 3280 3280 0 0.0
.rodata 76159 76167 8 0.0
.text 541944 541968 24 0.0
pump-app LP_CC2652R7 (read only) 678183 678207 24 0.0
(read/write) 164032 164008 -24 -0.0
.bss 71228 71228 0 0.0
.data 3280 3280 0 0.0
.rodata 88503 88503 0 0.0
.text 589196 589220 24 0.0
pump-controller-app LP_CC2652R7 (read only) 664007 664023 16 0.0
(read/write) 178328 178312 -16 -0.0
.bss 71348 71348 0 0.0
.data 3276 3276 0 0.0
.rodata 84367 84367 0 0.0
.text 579160 579176 16 0.0
shell LP_CC2652R7 (read only) 658766 658774 8 0.0
(read/write) 187960 187952 -8 -0.0
.bss 76420 76420 0 0.0
.data 3360 3360 0 0.0
.rodata 84918 84918 0 0.0
.text 573532 573540 8 0.0
cyw30739 light cyw930739m2evb_01 (read/write) 579622 579654 32 0.0
.app_xip_area 458376 458408 32 0.0
.bss 64184 64184 0 0.0
.data 716 716 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
lock cyw930739m2evb_01 (read/write) 585574 585614 40 0.0
.app_xip_area 459600 459640 40 0.0
.bss 68912 68912 0 0.0
.data 720 720 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
ota-requestor-no-progress-logging cyw930739m2evb_01 (read/write) 582774 582806 32 0.0
.app_xip_area 462376 462408 32 0.0
.bss 63392 63392 0 0.0
.data 660 660 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
efr32 lighting-app BRD4161A (read/write) 1081436 1081476 40 0.0
.bss 132996 132996 0 0.0
.data 2048 2048 0 0.0
.text 946372 946412 40 0.0
BRD4161A+rpc (read/write) 1135764 1135788 24 0.0
.bss 149676 149676 0 0.0
.data 2260 2260 0 0.0
.text 983808 983832 24 0.0
BRD4161A+rs911x (read/write) 947924 947948 24 0.0
.bss 140768 140768 0 0.0
.data 2048 2048 0 0.0
.text 805088 805112 24 0.0
lock-app BRD4161A+wf200 (read/write) 1128848 1128880 32 0.0
.bss 144184 144184 0 0.0
.data 2060 2060 0 0.0
.text 982580 982612 32 0.0
window-app BRD4161A (read/write) 1075244 1075260 16 0.0
.bss 134468 134468 0 0.0
.data 2076 2076 0 0.0
.text 938676 938692 16 0.0
esp32 all-clusters-app c3devkit (read only) 1020100 1020098 -2 -0.0
(read/write) 1485642 1485642 0 0.0
.dram0.bss 70080 70080 0 0.0
.dram0.data 14600 14600 0 0.0
.flash.rodata 215528 215528 0 0.0
.flash.text 1020100 1020098 -2 -0.0
.iram0.text 62902 62902 0 0.0
m5stack (read only) 1073971 1073971 0 0.0
(read/write) 487712 487712 0 0.0
.dram0.bss 75600 75600 0 0.0
.dram0.data 34144 34144 0 0.0
.flash.rodata 245972 245972 0 0.0
.flash.text 1068587 1068587 0 0.0
.iram0.text 123267 123267 0 0.0
k32w light k32w061+release (read/write) 658832 658832 0 0.0
.bss 69516 69516 0 0.0
.data 1992 1992 0 0.0
.text 581524 581524 0 0.0
lock k32w061+release (read/write) 685684 685684 0 0.0
.bss 69980 69980 0 0.0
.data 2004 2004 0 0.0
.text 607900 607900 0 0.0
linux all-clusters-app debug (read only) 2960825 2960825 0 0.0
(read/write) 154752 154752 0 0.0
.bss 61536 61536 0 0.0
.data 2048 2048 0 0.0
.data.rel.ro 84968 84968 0 0.0
.dynamic 608 608 0 0.0
.got 4536 4536 0 0.0
.init 27 27 0 0.0
.init_array 1048 1048 0 0.0
.rodata 263613 263613 0 0.0
.text 2520002 2520002 0 0.0
all-clusters-minimal-app debug (read only) 2813401 2813401 0 0.0
(read/write) 146688 146688 0 0.0
.bss 60864 60864 0 0.0
.data 2048 2048 0 0.0
.data.rel.ro 77608 77608 0 0.0
.dynamic 608 608 0 0.0
.got 4488 4488 0 0.0
.init 27 27 0 0.0
.init_array 1048 1048 0 0.0
.rodata 265341 265341 0 0.0
.text 2373026 2373026 0 0.0
bridge-app debug+rpc (read only) 2315449 2315449 0 0.0
(read/write) 125504 125504 0 0.0
.bss 48928 48928 0 0.0
.data 3824 3824 0 0.0
.data.rel.ro 66984 66984 0 0.0
.dynamic 608 608 0 0.0
.got 4392 4392 0 0.0
.init 27 27 0 0.0
.init_array 728 728 0 0.0
.rodata 198016 198016 0 0.0
.text 1955698 1955698 0 0.0
chip-tool debug (read only) 10345025 10345025 0 0.0
(read/write) 622240 622240 0 0.0
.bss 24728 24728 0 0.0
.data 3234 3234 0 0.0
.data.rel.ro 587888 587888 0 0.0
.dynamic 608 608 0 0.0
.got 5096 5096 0 0.0
.init 27 27 0 0.0
.init_array 640 640 0 0.0
.rodata 515861 515861 0 0.0
.text 8397476 8397476 0 0.0
chip-tool-no-interactive-ipv6only arm64 (read only) 10031716 10031716 0 0.0
(read/write) 684529 684529 0 0.0
.bss 42609 42609 0 0.0
.data 1152 1152 0 0.0
.data.rel.ro 623432 623432 0 0.0
.dynamic 528 528 0 0.0
.got 13520 13520 0 0.0
.init 24 24 0 0.0
.init_array 192 192 0 0.0
.rodata 478260 478260 0 0.0
.text 7992788 7992788 0 0.0
lighting-app debug+rpc (read only) 2551193 2551193 0 0.0
(read/write) 129528 129528 0 0.0
.bss 49440 49440 0 0.0
.data 2096 2096 0 0.0
.data.rel.ro 72136 72136 0 0.0
.dynamic 608 608 0 0.0
.got 4392 4392 0 0.0
.init 27 27 0 0.0
.init_array 816 816 0 0.0
.rodata 213704 213704 0 0.0
.text 2167522 2167522 0 0.0
lock-app debug (read only) 2515913 2515913 0 0.0
(read/write) 124512 124512 0 0.0
.bss 47840 47840 0 0.0
.data 1712 1712 0 0.0
.data.rel.ro 69096 69096 0 0.0
.dynamic 608 608 0 0.0
.got 4424 4424 0 0.0
.init 27 27 0 0.0
.init_array 792 792 0 0.0
.rodata 228744 228744 0 0.0
.text 2122002 2122002 0 0.0
ota-provider-app debug (read only) 2322305 2322305 0 0.0
(read/write) 118312 118312 0 0.0
.bss 47488 47488 0 0.0
.data 1944 1944 0 0.0
.data.rel.ro 63096 63096 0 0.0
.dynamic 608 608 0 0.0
.got 4488 4488 0 0.0
.init 27 27 0 0.0
.init_array 672 672 0 0.0
.rodata 203512 203512 0 0.0
.text 1956018 1956018 0 0.0
ota-requestor-app debug (read only) 2439361 2439513 152 0.0
(read/write) 125216 125248 32 0.0
.bss 49856 49856 0 0.0
.data 2232 2232 0 0.0
.data.rel.ro 67288 67304 16 0.0
.dynamic 608 608 0 0.0
.got 4480 4480 0 0.0
.init 27 27 0 0.0
.init_array 728 728 0 0.0
.rodata 207296 207296 0 0.0
.text 2060914 2061042 128 0.0
shell debug (read only) 2551169 2551169 0 0.0
(read/write) 141104 141104 0 0.0
.bss 57448 57448 0 0.0
.data 1264 1264 0 0.0
.data.rel.ro 76688 76688 0 0.0
.dynamic 608 608 0 0.0
.got 4136 4136 0 0.0
.init 27 27 0 0.0
.init_array 928 928 0 0.0
.rodata 227762 227762 0 0.0
.text 2166306 2166306 0 0.0
thermostat-no-ble arm64 (read only) 2595316 2595316 0 0.0
(read/write) 158289 158289 0 0.0
.bss 65249 65249 0 0.0
.data 1704 1704 0 0.0
.data.rel.ro 83240 83240 0 0.0
.dynamic 528 528 0 0.0
.got 5072 5072 0 0.0
.init 24 24 0 0.0
.init_array 400 400 0 0.0
.rodata 165476 165476 0 0.0
.text 2190064 2190064 0 0.0
tv-app debug (read only) 3102225 3102225 0 0.0
(read/write) 257704 257704 0 0.0
.bss 167016 167016 0 0.0
.data 4848 4848 0 0.0
.data.rel.ro 79392 79392 0 0.0
.dynamic 608 608 0 0.0
.got 4848 4848 0 0.0
.init 27 27 0 0.0
.init_array 952 952 0 0.0
.rodata 249024 249024 0 0.0
.text 2665298 2665298 0 0.0
tv-casting-app debug (read only) 5577785 5577785 0 0.0
(read/write) 161968 161968 0 0.0
.bss 50248 50248 0 0.0
.data 2416 2416 0 0.0
.data.rel.ro 103048 103048 0 0.0
.dynamic 608 608 0 0.0
.got 4744 4744 0 0.0
.init 27 27 0 0.0
.init_array 864 864 0 0.0
.rodata 343209 343209 0 0.0
.text 4956626 4956626 0 0.0
mbed lock-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2448112 2448112 0 0.0
.bss 213940 213940 0 0.0
.data 5872 5872 0 0.0
.text 1410756 1410756 0 0.0
nrfconnect all-clusters-app nrf52840dk_nrf52840 (read/write) 1175175 1175195 20 0.0
bss 142900 142900 0 0.0
rodata 141888 141892 4 0.0
text 811492 811516 24 0.0
all-clusters-minimal-app nrf52840dk_nrf52840 (read/write) 1155367 1155387 20 0.0
bss 142136 142136 0 0.0
rodata 133416 133420 4 0.0
text 800936 800960 24 0.0
p6 all-clusters-app default (read/write) 2566144 2566144 0 0.0
.bss 149120 149120 0 0.0
.data 2776 2776 0 0.0
.text 1524408 1524408 0 0.0
all-clusters-minimal-app default (read/write) 2511440 2511440 0 0.0
.bss 148400 148400 0 0.0
.data 2776 2776 0 0.0
.text 1469704 1469704 0 0.0
light-app default (read/write) 2441368 2441368 0 0.0
.bss 140456 140456 0 0.0
.data 2592 2592 0 0.0
.text 1399632 1399632 0 0.0
lock-app default (read/write) 2468520 2468520 0 0.0
.bss 140304 140304 0 0.0
.data 2600 2600 0 0.0
.text 1426784 1426784 0 0.0
telink light-switch-app tlsr9518adk80d (read/write) 797284 797284 0 0.0
bss 70576 70576 0 0.0
noinit 40416 40416 0 0.0
text 565678 565678 0 0.0
lighting-app tlsr9518adk80d (read/write) 817116 817116 0 0.0
bss 71420 71420 0 0.0
noinit 40416 40416 0 0.0
text 582002 582002 0 0.0

@mrjerryjohns
Contributor

Is there a unit-test, or YAML test we could add to re-create the failure, and validate the fix?

@bzbarsky-apple
Contributor Author

> Is there a unit-test, or YAML test we could add to re-create the failure, and validate the fix?

@Damian-Nordic @isiu-apple @carol-apple

@carol-apple
Contributor

carol-apple commented Jul 12, 2022

> > Is there a unit-test, or YAML test we could add to re-create the failure, and validate the fix?
>
> @Damian-Nordic @isiu-apple @carol-apple

There is one existing test that covers a successful transfer. We could build on that, but maybe force the provider to be stopped (via SystemCommands) to try to re-create this failure? The happy path test is at: https://github.com/project-chip/connectedhomeip/blob/master/src/app/tests/suites/OTA_SuccessfulTransfer.yaml

Contributor

@Damian-Nordic Damian-Nordic left a comment


I tested this change manually and it no longer crashes, but when I injected faults at different points in the OTA flow, I found two more places that need to be fixed:

diff --git a/src/app/clusters/ota-requestor/BDXDownloader.cpp b/src/app/clusters/ota-requestor/BDXDownloader.cpp
index 8a00a3820..d3b9f0398 100644
--- a/src/app/clusters/ota-requestor/BDXDownloader.cpp
+++ b/src/app/clusters/ota-requestor/BDXDownloader.cpp
@@ -201,10 +201,10 @@ void BDXDownloader::EndDownload(CHIP_ERROR reason)
         {
             mImageProcessor->Abort();
         }
-        SetState(State::kIdle, OTAChangeReasonEnum::kSuccess);
 
         // Because AbortTransfer() will generate a StatusReport to send.
         PollTransferSession();
+        SetState(State::kIdle, OTAChangeReasonEnum::kSuccess);
     }
     else
     {
diff --git a/src/app/clusters/ota-requestor/DefaultOTARequestor.cpp b/src/app/clusters/ota-requestor/DefaultOTARequestor.cpp
index 3f4cf6782..fdc3b4364 100644
--- a/src/app/clusters/ota-requestor/DefaultOTARequestor.cpp
+++ b/src/app/clusters/ota-requestor/DefaultOTARequestor.cpp
@@ -832,8 +832,19 @@ CHIP_ERROR DefaultOTARequestor::StartDownload(OperationalDeviceProxy & devicePro
     mBdxDownloader->SetMessageDelegate(&mBdxMessenger);
     mBdxDownloader->SetStateDelegate(this);
 
-    ReturnErrorOnFailure(mBdxDownloader->SetBDXParams(initOptions, kDownloadTimeoutSec));
-    return mBdxDownloader->BeginPrepareDownload();
+    CHIP_ERROR error = mBdxDownloader->SetBDXParams(initOptions, kDownloadTimeoutSec);
+
+    if (error == CHIP_NO_ERROR)
+    {
+        error = mBdxDownloader->BeginPrepareDownload();
+    }
+
+    if (error != CHIP_NO_ERROR)
+    {
+        mBdxMessenger.Reset();
+    }
+
+    return error;
 }

The first change is needed because if we call SetState(State::kIdle, OTAChangeReasonEnum::kSuccess); too early and close the exchange, we cannot send the status report message that informs the provider about the failure. The second fixes another exchange leak that occurs if preparation of the download fails at an early stage.
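
The ordering problem behind the first change can be sketched with a small mock. Everything here is hypothetical and invented for illustration (`ToyExchange`, `ToyDownloader`, `StatusReportDelivered`); it is not the real `BDXDownloader`. The sketch assumes, as the comment above describes, that the transition to idle closes the exchange, and that polling the transfer session is what actually sends the queued status report.

```cpp
#include <string>
#include <vector>

// Hypothetical exchange: messages can only be sent while it is open.
struct ToyExchange
{
    bool open = true;
    std::vector<std::string> sent;

    bool Send(const std::string & msg)
    {
        if (!open)
        {
            return false; // exchange already closed, message is dropped
        }
        sent.push_back(msg);
        return true;
    }

    void Close() { open = false; }
};

// Hypothetical downloader that mirrors only the ordering of the two calls.
class ToyDownloader
{
public:
    explicit ToyDownloader(ToyExchange & exchange) : mExchange(exchange) {}

    void EndDownload(bool setIdleFirst)
    {
        // Stands in for the status report that aborting the transfer queues.
        mPendingStatusReport = true;

        if (setIdleFirst)
        {
            SetIdle();             // old order: exchange closes first
            PollTransferSession(); // the report can no longer be sent
        }
        else
        {
            PollTransferSession(); // new order: send the report first
            SetIdle();             // then go idle and close the exchange
        }
    }

private:
    // Assumption: going idle is what closes the exchange.
    void SetIdle() { mExchange.Close(); }

    void PollTransferSession()
    {
        if (mPendingStatusReport && mExchange.Send("StatusReport"))
        {
            mPendingStatusReport = false;
        }
    }

    ToyExchange & mExchange;
    bool mPendingStatusReport = false;
};

// Returns true if the status report went out before the exchange closed.
bool StatusReportDelivered(bool setIdleFirst)
{
    ToyExchange exchange;
    ToyDownloader downloader(exchange);
    downloader.EndDownload(setIdleFirst);
    return exchange.sent.size() == 1;
}
```

With the old order, `StatusReportDelivered(true)` is false because the exchange is closed before the report is sent. With the order from the proposed diff, `StatusReportDelivered(false)` is true, and the exchange is still closed at the end.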

@bzbarsky-apple bzbarsky-apple merged commit 55da2d6 into project-chip:master Jul 14, 2022
@bzbarsky-apple bzbarsky-apple deleted the ota-exchange-leak-fix branch July 14, 2022 20:06
Development

Successfully merging this pull request may close these issues.

[ota] Exchange leak
4 participants