Description
Describe the bug
When encoding large sets of cached LwM2M resources during an LwM2M 1.1 Send using SenML CBOR, serialization fails with -ENOMEM if the number of records exceeds CONFIG_LWM2M_RW_SENML_CBOR_RECORDS.
Currently, this aborts mid-serialization, discards already encoded data, and sends nothing. From that point, the client is unable to send further updates and caches keep growing.
Update: a better and deeper description was added to the PR: #95842 (comment)
Expected behavior
If some records were already serialized before -ENOMEM, the encoder should finalize the CBOR output and send the partial payload instead of discarding everything.
Impact
- Device can get stuck, unable to upload any data.
- Local caches fill up over time.
- A tiny change restores forward progress by sending what has already been encoded.
Regression
- This is a regression.
Steps to reproduce
How to reproduce
- Use the nrf/lwm2m_client sample from Nordic's ncs-3.1.0.
- LwM2M 1.1 with caching enabled, local cache of multiple observed objects (e.g., ~8 FP64 objects × ~350 elements including timestamps).
- Set CONFIG_LWM2M_RW_SENML_CBOR_RECORDS lower than the total number of elements to be serialized.
- Trigger a send of all 8 objects. It works as long as all cached elements and timestamps together count less than around CONFIG_LWM2M_RW_SENML_CBOR_RECORDS.
- Observe -ENOMEM during lwm2m_perform_composite_read_op() and no data being sent; subsequent attempts keep failing as caches grow.
I'm not sure about the exact limit at which it starts failing; it seems that more elements count toward it than only the actual cached data, such as additional CBOR elements.
Relevant log output
<err> net_lwm2m_senml_cbor: CONFIG_LWM2M_RW_SENML_CBOR_RECORDS too small
<err> net_lwm2m_message_handling: engine_put_timestamp unsupported
<err> net_lwm2m_message_handling: Read operation fail (-12)
<err> net_lwm2m_message_handling: READ OP failed, out of memory
<err> net_lwm2m_message_handling: Supported message size is too small for read object instance (num_read=0)
Impact
Major – Severely degrades functionality; workaround is difficult or unavailable.
Environment
- nRF Connect SDK: 3.0.2, 3.1.0 (uses Zephyr downstream)
- Boards: Thingy91X (nRF9151), nRF9151 DK
- Sample: nrf/lwm2m_client
Additional Context
Proposed solution
- Treat CBOR output buffer shortage as recoverable when some records were already serialized.
- Finalize the CBOR output (add end marker) and return the partial payload.
- Optionally downgrade the log from ERR to WRN for this recoverable condition.
This is a small, focused change (3 lines plus sugar coating) with a large practical impact. It preserves forward progress without changing public APIs or behavior in the success case.
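The idea can be illustrated outside of Zephyr with a small, self-contained simulation. This is only a sketch under stated assumptions: MAX_RECORDS stands in for CONFIG_LWM2M_RW_SENML_CBOR_RECORDS, and struct encoder, put_record(), put_end(), encode_cached() and simulate() are hypothetical names that do not exist in the LwM2M engine. It also assumes that records which were sent are removed from the cache.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_LWM2M_RW_SENML_CBOR_RECORDS. */
#define MAX_RECORDS 4

/* Hypothetical encoder state; not the real engine structure. */
struct encoder {
	size_t records; /* records written so far */
	int finalized;  /* set once the end marker is written */
};

/* Append one record; fail with -ENOMEM once the record limit is hit. */
static int put_record(struct encoder *enc)
{
	if (enc->records >= MAX_RECORDS) {
		return -ENOMEM;
	}
	enc->records++;
	return 0;
}

/* Stand-in for writing the end marker that closes the payload. */
static void put_end(struct encoder *enc)
{
	enc->finalized = 1;
}

/*
 * Try to encode `pending` cached records into one payload.
 * Returns the number of records in the finalized payload, or a
 * negative errno value if nothing can be sent.
 */
static int encode_cached(size_t pending, int allow_partial)
{
	struct encoder enc = {0};

	for (size_t i = 0; i < pending; i++) {
		int ret = put_record(&enc);

		if (ret == -ENOMEM) {
			if (allow_partial && enc.records > 0) {
				break; /* keep what was encoded so far */
			}
			return ret; /* all-or-nothing: discard everything */
		}
	}

	if (enc.records == 0) {
		return -ENOENT;
	}

	put_end(&enc);
	assert(enc.finalized);
	return (int)enc.records;
}

/*
 * Simulate `rounds` send attempts. Before each attempt, `arrivals`
 * new records are cached. Returns the cache size after the last round.
 */
static size_t simulate(size_t initial, size_t arrivals, int rounds,
		       int allow_partial)
{
	size_t cached = initial;

	for (int r = 0; r < rounds; r++) {
		cached += arrivals;

		int sent = encode_cached(cached, allow_partial);

		if (sent > 0) {
			cached -= (size_t)sent; /* assumed: sent data leaves the cache */
		}
	}
	return cached;
}
```

With these assumptions, simulate(10, 1, 5, 0) models the current behavior: every attempt fails and the cache only grows. simulate(10, 1, 5, 1) models the proposed behavior: each attempt sends up to MAX_RECORDS records and the cache drains over several rounds.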
Next steps
I will prepare a PR with this fix against the current Zephyr main branch.
This way the patch can be reviewed and, once merged, will automatically flow downstream into NCS in a future upmerge.
subsys/net/lib/lwm2m/lwm2m_message_handling.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/subsys/net/lib/lwm2m/lwm2m_message_handling.c b/subsys/net/lib/lwm2m/lwm2m_message_handling.c
index 8641036d9c4..58f467adcf1 100644
--- a/subsys/net/lib/lwm2m/lwm2m_message_handling.c
+++ b/subsys/net/lib/lwm2m/lwm2m_message_handling.c
@@ -3304,6 +3304,10 @@ int lwm2m_perform_composite_read_op(struct lwm2m_message *msg, uint16_t content_
 			ret = lwm2m_perform_read_object_instance(msg, obj_inst, &num_read);
 			if (ret == -ENOMEM) {
+				if (num_read > 0) {
+					/* Return what we have read so far */
+					goto put_end;
+				}
 				return ret;
 			}
 		}
@@ -3312,6 +3316,7 @@ int lwm2m_perform_composite_read_op(struct lwm2m_message *msg, uint16_t content_
 		return -ENOENT;
 	}
+put_end:
 	/* Add object end mark */
 	if (engine_put_end(&msg->out, &msg->path) < 0) {
 		return -ENOMEM;